
Comments (17)

behnamm commented on May 29, 2024

Hi @liulalala
Sorry for the very late reply. The ns2 code is only used for PIAS and pFabric comparisons. Homa is not simulated in ns2, only in omnet++.

If you are interested in the pFabric or PIAS simulations, you need to check out the ns2_pfabric or pias branches. These branches contain the code for those simulations and populate the RpcTransportDesign/ns2_Simulations/scripts/ directory with simulation scripts.

I'll update the README to include this information.

Cheers,

from homasimulation.

liulalala commented on May 29, 2024

Thanks a lot for replying to me! I am interested in your Homa project! I looked into the omnet++ code, and:
1. I cannot figure out what cbf means (as in getRemainSizeCdfCbf, getCbfFromCdf).
2. What does defaultReqBytes mean? Does it indicate that the sender sends request packets first (containing the number of requested packets)? But the paper says (in 3.2): "an initial unscheduled portion, followed by a scheduled portion", and no requested portion is involved.
Looking forward to your reply! Thanks :)


liulalala commented on May 29, 2024

Hi behnamm, sorry to disturb you again. I'm trying to run the Homa code. I followed the README to set up, but I wonder what the default input file is (./homatransport xxx.ini)? I also wonder about the structure of the Homa code. Looking forward to your reply!


behnamm commented on May 29, 2024

@liulalala
First, make sure you know enough about omnet++ and how to configure and run simulations from the command line (refer to the omnet++ manual on the omnet++ website). Then, in order to run Homa, after you build the simulation package, go to the RpcTransportDesign/OMNeT++Simulation/homatransport/src/dcntopo folder and run your simulation scenario from there. Here is an example of how to run a single configuration:

../homatransport -u Cmdenv -c WorkloadHadoop -r 6 -n ..:../../simulations:../../../inet/examples:../../../inet/src -l ../../../inet/src/INET homaTransportConfig.ini

"-u Cmdenv" tells OMNeT++ not to run the simulation in the GUI. homaTransportConfig.ini at the end of the command is the configuration file we use, and "-c WorkloadHadoop" asks omnet to use the parameters specified in the WorkloadHadoop section of the config file. "-r 6" specifies that run number 6 within that section is to be simulated.


liulalala commented on May 29, 2024

Thanks a lot for replying to me. I wonder whether the receiver will send grants for each unscheduled packet and request packet, or only for the last unscheduled packet?


behnamm commented on May 29, 2024

Grant packets are transmitted one packet at a time, for every single data packet that arrives. So, for each unscheduled packet (including the request packet) that arrives at the receiver, a new grant packet is sent. However, grants are only sent for a message if the message belongs to the high priority set of messages that the receiver is actively granting. Please read the paper for more information.
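
In code terms, the per-packet grant decision might look roughly like the following Python sketch. This is illustrative only; the class and method names (Receiver, on_data_packet, etc.) are hypothetical and not the simulator's actual API:

```python
class Receiver:
    """Illustrative sketch of per-packet granting (hypothetical names)."""

    def __init__(self, overcommitment_level, grant_bytes):
        self.overcommitment_level = overcommitment_level
        self.grant_bytes = grant_bytes    # bytes granted per grant packet
        self.remaining = {}               # msg_id -> bytes left to receive
        self.grants_sent = []             # log of (msg_id, granted_bytes)

    def new_message(self, msg_id, total_bytes):
        self.remaining[msg_id] = total_bytes

    def active_set(self):
        # The messages with the fewest remaining bytes (SRPT order),
        # limited to the overcommitment level.
        live = sorted((m for m, r in self.remaining.items() if r > 0),
                      key=lambda m: self.remaining[m])
        return live[:self.overcommitment_level]

    def on_data_packet(self, msg_id, pkt_bytes):
        # One grant per arriving data packet (including the request and
        # other unscheduled packets), but only if the message belongs to
        # the receiver's active (high priority) set.
        self.remaining[msg_id] -= pkt_bytes
        if self.remaining[msg_id] > 0 and msg_id in self.active_set():
            self.grants_sent.append((msg_id, self.grant_bytes))
```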

Cheers,


liulalala commented on May 29, 2024

Thanks a lot for your reply! And I am sorry to bother you again... I am reading the paper and code carefully, but I still cannot figure out some details.

  1. In HomaTransport::initialize(), uint32_t maxDataBytes = MAX_ETHERNET_PAYLOAD_BYTES - IP_HEADER_SIZE - UDP_HEADER_SIZE - dataPkt.headerSize(); if (homaConfig->grantMaxBytes > maxDataBytes) { homaConfig->grantMaxBytes = maxDataBytes; } means that a grant will grant at most one MTU's worth of bytes, according to the meaning of grantMaxBytes (and I cannot find the value of maxDataBytes in homaTransportConfig.ini). But in the paper, section 3.3 (Flow Control), it says that the offset is chosen so that there are always RTTbytes of data in the message that have been granted but not yet received. That seems to mean a grant can grant multiple data packets?
  2. In HomaTransport::ReceiveScheduler::processReceivedPkt(), an oversubscription period is mentioned. Does that mean oversubscription can be opened or closed depending on network conditions? Could you explain it in detail?
    Looking forward to your reply :)


behnamm commented on May 29, 2024

@liulalala
Happy to help. Find the responses inline.

Thanks a lot for your reply! And I am sorry to bother you again... I am reading the paper and code carefully, but I still cannot figure out some details.

  1. In HomaTransport::initialize(), uint32_t maxDataBytes = MAX_ETHERNET_PAYLOAD_BYTES - IP_HEADER_SIZE - UDP_HEADER_SIZE - dataPkt.headerSize(); if (homaConfig->grantMaxBytes > maxDataBytes) { homaConfig->grantMaxBytes = maxDataBytes; } means that a grant will grant at most one MTU's worth of bytes, according to the meaning of grantMaxBytes (and I cannot find the value of maxDataBytes in homaTransportConfig.ini). But in the paper, section 3.3 (Flow Control), it says that the offset is chosen so that there are always RTTbytes of data in the message that have been granted but not yet received. That seems to mean a grant can grant multiple data packets?

Note that a grant may allow transmission of multiple data packets, but that doesn't mean we don't send grants on a per packet basis. As I said before, grants are sent on a per packet basis and in the common case when grants are not delayed, we expect that a new scheduled packet is sent for every new grant packet that arrives at the sender. What you are referring to in the paper is an optimization for when two grant packets G1 and G2 are reordered in the network and the later grant G2 arrives earlier than G1 at the sender. To compensate for the reordering of the grants, with arrival of G2, we allow transmission of two scheduled packets instead of one. The offset you refer to is a way to implement this effect. That said, while we have implemented this optimization in the RAMCloud implementation, we didn't implement this in the simulations. So in the simulations, we can only transmit one scheduled packet for every new grant.
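
As a rough illustration of the offset effect described above (a hypothetical Python sketch; RTT_BYTES and the function names are made up, and the RAMCloud code is the authoritative version):

```python
# Each grant carries an absolute byte offset: the sender may transmit
# the message up to that offset. The receiver picks the offset so that
# granted-but-not-yet-received data always amounts to RTT_BYTES.

RTT_BYTES = 10000  # assumed round-trip's worth of bytes

def next_grant_offset(bytes_received):
    # Keep exactly RTT_BYTES granted but not yet received.
    return bytes_received + RTT_BYTES

def sender_granted_limit(current_limit, grant_offset):
    # Because offsets are absolute, a reordered (stale) grant is
    # harmless: the sender keeps the largest offset seen so far, so a
    # later grant arriving first simply authorizes more data at once.
    return max(current_limit, grant_offset)
```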

  2. In HomaTransport::ReceiveScheduler::processReceivedPkt(), an oversubscription period is mentioned. Does that mean oversubscription can be opened or closed depending on network conditions? Could you explain it in detail?

This is a feature I added for collecting statistics and computing the wasted bandwidth. It doesn't have any effect on the algorithm and Homa mechanisms. You don't need to worry about this.

Looking forward to your reply :)


liulalala commented on May 29, 2024

Got it! Thank you so much! Another question: how do you determine the priority of the scheduled packets? As far as I understand:
First, maintain a candidate list whose length is the overcommitment level, containing the flows with the fewest remaining bytes to receive.
Each time a data packet is received, update it.
Then, every time a data packet is received, check whether the head of the flows (the one with the highest priority) has a grant to send (its bytes on the wire are smaller than RTTbytes); if not, check the second one, and so on. If no flow in the candidate list can send a grant, this data packet will not trigger any grant.
I am not sure whether my understanding is right. And I wonder how to decide the priority of scheduled packets, in other words, the prio field in the grant's header. For example, if we send a grant whose list number is sId, will the prio field be set to sId + #unscheduled priorities? Then what does "always use the lowest scheduled priorities" mean?
And in HomaTransport::ReceiveScheduler::SenderState::sendAndScheduleGrant, there is grantPrio = std::min(grantPrio, (uint32_t)resolverPrio); may this result in scheduled packets using the priority of unscheduled packets?


behnamm commented on May 29, 2024

@liulalala
responses inlined

Got it! Thank you so much! Another question: how do you determine the priority of the scheduled packets? As far as I understand:
First, maintain a candidate list whose length is the overcommitment level, containing the flows with the fewest remaining bytes to receive.
Each time a data packet is received, update it.
Then, every time a data packet is received, check whether the head of the flows (the one with the highest priority) has a grant to send (its bytes on the wire are smaller than RTTbytes); if not, check the second one, and so on. If no flow in the candidate list can send a grant, this data packet will not trigger any grant.

That should work, although this is not exactly how I have implemented it in the simulator. The simulator also sends grants based on a timer: when one packet time has passed, we check whether we can send a grant for any of the active messages, subject to conditions such as the message being among the top scheduled messages and having less than one RTTBytes of outstanding data. The simulation code contains more than what we discussed in the paper, which may make it difficult to understand. I would suggest looking at the RAMCloud implementation of Homa for cleaner code.
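
The timer-driven check described here might be sketched like this (hypothetical Python, not the simulator's code; RTT_BYTES, GRANT_SIZE, and the dict fields are assumptions):

```python
# Every packet time, grant the highest-ranked active message whose
# outstanding (granted-but-unreceived) bytes are below RTT_BYTES.

RTT_BYTES = 10000   # assumed round-trip's worth of bytes
GRANT_SIZE = 1500   # assumed bytes granted per grant packet

def grant_on_timer(messages, top_k):
    """messages: list of dicts with 'remaining' and 'outstanding' bytes.
    Returns the message chosen for a grant this tick, or None."""
    # SRPT order: fewest remaining bytes first, keep the top_k set.
    active = sorted((m for m in messages if m['remaining'] > 0),
                    key=lambda m: m['remaining'])[:top_k]
    for m in active:
        if m['outstanding'] < RTT_BYTES:
            m['outstanding'] += GRANT_SIZE
            return m
    return None  # every active message already has a full RTT outstanding
```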

I am not sure whether my understanding is right. And I wonder how to decide the priority of scheduled packets, in other words, the prio field in the grant's header. For example, if we send a grant whose list number is sId, will the prio field be set to sId + #unscheduled priorities? Then what does "always use the lowest scheduled priorities" mean?

It basically means that as the top-priority message in the list completes, you push the remaining messages up in the list. That means a new slot opens up at the lowest priority level, so if a new message arrives, it is inserted into the list at the lowest priority level. Section 3.4 of the paper explains this.
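
A minimal sketch of that list behavior, with hypothetical names:

```python
# Scheduled-priority slots as an ordered list: index 0 is the highest
# scheduled priority. When the top message completes, the rest shift
# up, and a new message enters at the lowest scheduled priority level.

def complete_top(sched_list):
    """Remove the finished top-priority message; the others move up."""
    return sched_list[1:]

def admit(sched_list, new_msg, num_levels):
    """A newly arriving message takes the lowest open level."""
    if len(sched_list) < num_levels:
        return sched_list + [new_msg]
    return sched_list  # no open slot; the message waits
```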

And in HomaTransport::ReceiveScheduler::SenderState::sendAndScheduleGrant, there is grantPrio = std::min(grantPrio, (uint32_t)resolverPrio); may this result in scheduled packets using the priority of unscheduled packets?

This relates to an optimization that may not be explained in the paper. Because of this optimization, the last RTTBytes of a scheduled message gets an unscheduled priority level. That makes sense because, from the perspective of a receiver doing SRPT, the last RTTbytes of a message is as important as the first RTTBytes of the message. So we assign an unscheduled priority level to the last RTTBytes of the scheduled portion. Hope this makes sense.
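
Putting the two rules together, the grant priority might be computed roughly as below (a hypothetical Python sketch; the simulator's actual expression is the std::min shown above):

```python
RTT_BYTES = 10000  # assumed round-trip's worth of bytes

def grant_prio(sched_level, num_unsched_prios, remaining, resolver_prio):
    """sched_level: the message's position in the scheduled list.
    resolver_prio: the unscheduled priority the resolver would assign.
    Lower numbers are higher priority."""
    prio = num_unsched_prios + sched_level      # normal scheduled prio
    if remaining <= RTT_BYTES:
        # Last RTTBytes of the message: promote to the unscheduled
        # priority level if that is higher (numerically smaller).
        prio = min(prio, resolver_prio)
    return prio
```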


liulalala commented on May 29, 2024

Thank you so much! Sorry to bother you again.
Do you mean that for scheduled packets that do not belong to the last RTT, the priority is calculated as sId + #unscheduled priorities, and for the last RTTbytes, the priority is calculated as grantPrio = std::min(sId + #unscheduled priorities, (uint32_t)resolverPrio)?
Another detailed question: how do the W1-W5 workloads in the paper correspond to the workload files? There are so many workload files in the sizeDistributions folder, such as FABRICATED_HEAVY_MIDDLE.txt and FABRICATED_HEAVY_HEAD.txt. And how many flows are there in one simulation?


behnamm commented on May 29, 2024

Thank you so much! Sorry to bother you again.
Do you mean that for scheduled packets that do not belong to the last RTT, the priority is calculated as sId + #unscheduled priorities, and for the last RTTbytes, the priority is calculated as grantPrio = std::min(sId + #unscheduled priorities, (uint32_t)resolverPrio)?

Yes, that's correct.

Another detailed question: how do the W1-W5 workloads in the paper correspond to the workload files? There are so many workload files in the sizeDistributions folder, such as FABRICATED_HEAVY_MIDDLE.txt and FABRICATED_HEAVY_HEAD.txt. And how many flows are there in one simulation?

W1 -> FacebookKeyValueMsgSizeDist.txt
W2 -> Google_SearchRPC.txt
W3 -> Google_AllRPC.txt
W4 -> Facebook_HadoopDist_All.txt
W5 -> DCTCP_MsgSizeDist.txt
To save space in the paper, results for the rest of the workloads in that folder were not reported.


liulalala commented on May 29, 2024

Thanks! A few small questions that I got confused about:

  1. Do the workload files give flow sizes in bytes or in packets? I guess in bytes?
    But for W5, the cdf file (DCTCP_MsgSizeDist.txt) is:
    [screenshot of DCTCP_MsgSizeDist.txt omitted]
    According to the paper, it seems that the flows in DCTCP are large, so the sizes in DCTCP_MsgSizeDist.txt seem to represent flow size in packets?
  2. The custom switch queue has 8 priorities, with 1, the highest, for signals like grants and requests, and the other 7 priorities left for unscheduled and scheduled packets, right? But in the DCTCP flow pattern, adaptiveSchedPrioLevels = 7, which means unscheduled and scheduled flows will share one priority? As we discussed before, that seems to make sense.


behnamm commented on May 29, 2024

Thanks! A few small questions that I got confused about:

  1. Do the workload files give flow sizes in bytes or in packets? I guess in bytes?
    But for W5, the cdf file (DCTCP_MsgSizeDist.txt) is:
    [screenshot of DCTCP_MsgSizeDist.txt omitted]
    According to the paper, it seems that the flows in DCTCP are large, so the sizes in DCTCP_MsgSizeDist.txt seem to represent flow size in packets?

Correct! The original DCTCP search workload from the DCTCP paper is specified in terms of packet counts rather than bytes. That's why the file for this workload is also in terms of packet counts, but the simulator takes care of transforming the workload from packets to bytes.
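
A minimal sketch of such a packets-to-bytes conversion (hypothetical Python; the header sizes, in particular DATA_PKT_HEADER, are assumptions and may not match the simulator's constants):

```python
# Convert a packet-count CDF (like the DCTCP workload file) to message
# sizes in bytes, assuming a fixed per-packet payload.

MAX_ETHERNET_PAYLOAD = 1500
IP_HEADER = 20
UDP_HEADER = 8
DATA_PKT_HEADER = 30   # assumed transport data-packet header size

PAYLOAD_PER_PKT = (MAX_ETHERNET_PAYLOAD - IP_HEADER - UDP_HEADER
                   - DATA_PKT_HEADER)

def packets_to_bytes(cdf_points):
    """cdf_points: list of (size_in_packets, cumulative_probability)."""
    return [(pkts * PAYLOAD_PER_PKT, prob) for pkts, prob in cdf_points]
```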

  2. The custom switch queue has 8 priorities, with 1, the highest, for signals like grants and requests, and the other 7 priorities left for unscheduled and scheduled packets, right? But in the DCTCP flow pattern, adaptiveSchedPrioLevels = 7, which means unscheduled and scheduled flows will share one priority? As we discussed before, that seems to make sense.

No, there is no distinct priority reserved for grants. Grants share the highest priority level with some of the highest-priority unscheduled packets, those belonging to the smallest messages. The load that grant packets put on the network is taken into account when dividing the priorities among the unscheduled and scheduled packets. For example, in the DCTCP workload, the unscheduled packets, the grants, and the last RTTBytes of scheduled packets all share the single highest priority level of the 8 total priority levels available. The remaining 7 priority levels are all used for scheduled packets (i.e. adaptiveSchedPrioLevels = 7). Hope this makes things clear.


liulalala commented on May 29, 2024

Thanks! Another question: what does the 99% slowdown for each x-axis value mean?
[slowdown plot omitted]


liulalala commented on May 29, 2024

As far as I understand, the 99% slowdown is: first sort the flows in ascending order, then take the 99th-percentile flow's completion time divided by its oracle completion time. But I cannot figure out what the 99% slowdown means for each particular flow size (as the x-axis shows: 2 3 5 11...).


behnamm commented on May 29, 2024

So, imagine we run the experiment at a specific load factor (e.g. an 80% load factor) for long enough that, for every single message size in the workload, we have generated thousands of instances of that size and measured the latency of each instance. Now we sort the latencies for that message size and find the 99th-percentile and minimum latencies among them. Divide the 99th-percentile latency by the minimum latency and you have the 99%ile slowdown for that message size.
This was explained in the paper, but if it wasn't clear there, I hope this makes it clear.
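
The computation described above can be sketched in a few lines (plain Python with a nearest-rank percentile; an illustration, not the authors' analysis script):

```python
def p99_slowdown(latencies):
    """latencies: completion times of many instances of one message
    size. Returns 99th-percentile latency over the minimum latency."""
    s = sorted(latencies)
    rank = -(-99 * len(s) // 100)   # ceil(0.99 * n), nearest-rank method
    return s[rank - 1] / s[0]       # s[0] is the minimum latency
```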

