Newsgroups: comp.dcom.lans.ethernet
From: mart@csri.toronto.edu (Mart Molle)
Subject: 15% collisions on Ethernet OK? (was: Switching vs. Routing)
Message-ID: <1993Apr30.173455.12194@jarvis.csri.toronto.edu>
References: <1rnrlu$nd0@access.digex.net>
Date: 30 Apr 93 21:34:55 GMT
Lines: 202

rl@access.digex.net (Rich) writes:

>On 04-27-19, Vjs@rhyolite.wpd.sgi.com had this to say to All:

>V>In article <1993Apr26.122114.28216@cyantic.com>, mark@cyantic.com (Mark
>V>> ...
>V>> The only visible problem on the network is a high level of collisions
>V>> the accounting system, sometimes reaching 15% during operations peaks
>V>
>V>hmmph.  classic "gotta have the latest technology" disease.
>V>15% collisions is not a "high level" by any stretch, unless it means
>V>15% of the packets suffer 16 consecutive collisions and are dropped, so
>V>that the per-transmission-attempt collision rate is close to 100%.
>V>(Yes, an ethernet is very sick at much less than 100% collisions, but
>V>if I did the arithmetic right, 15% "excessive" collisions can't happen
>V>much below 100%.)

>Normally you and I are in sync, Vernon, but I have to question on this
>one (BTW folks, this is my new address, previously you knew me as
>rich@grebyn.com, and yes, I work for Synoptics communications and the
>following is not official advice from myself as a Synoptics employee
>and is not to be misconstrued as Synoptics policy or recommendation,
>etc etc etc).

>Consider the following graph:

>          Where you would want to be, hopefully
>              \/
>Efficiency
>----------
> |              XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> |           XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> |         XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> |       XXOOOOOOXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> |     XOOO......OOOOXXXXXXXXXXXXXXXXXXXXXXXX
> |   XOOO..............OOOXXXXXXXXXXXXXXXXXXX
> |  OO......................XXXXXXXXXXXXXXXXX
> | O...........................XXXXXXXXXXXXXX
> |........................................XXXX
> --------------------------------------------------
>             Utilization ( % bandwidth )

>Where X=Collisions, O=Packets, .=Actual data
>(sorry about the lousy scaling)

Interesting ASCII graphics.  But impossible to interpret without
knowing the scales.  Normally these sorts of curves plot "G" ("offered
load", in transmission attempts per unit time; note that this measure
may count individual packets more than once if they suffer collisions)
on the horizontal axis, and "S" ("throughput", in successful packet
transmissions per unit time; this *never* counts an individual packet
more than once) on the vertical axis.  If the time unit is taken to be
the transmission time for an average packet (rather than a second,
millisecond, etc.), normalized throughput is a number between 0 and 1
and is sometimes called efficiency.  Is that what you mean here????

>Of course, this shows the stochastic nature of ethernet which I
>know you are already familiar with.  The only reason I draw it is
>we appear to disagree on what is an acceptable delta between the
>number of packets transmitted and the percentage of those packets
>that are collisions.  In the above graph, this would be the delta
>between the top level of the "O" curve and the top of the "X" curve.

This is an understatement!

>Obviously, and as shown in the graph, collisions are an expected
>result of traffic and therefore are no cause for immediate alarm.

Obviously!

>The point we should be interested in is the marked top of the curve,
>where the highest efficiency for the least cost in collisions is
>obtained.  The popular figure for that point is around 2-3% (!).
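Before I take this apart, let me pin down what such a load/throughput
curve looks like in the simplest textbook setting.  Here is a minimal
sketch in C that tabulates the unslotted-Aloha formula S = G*exp(-2G)
-- chosen purely because it is the simplest closed form with this
shape, and emphatically *not* because it models Ethernet (more on that
below):

    /* Tabulate the textbook unslotted-Aloha curve S = G * exp(-2G).
     * This is *not* a model of Ethernet; it is simply the easiest
     * closed-form example of an offered-load vs. throughput curve.
     */
    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        double g;   /* offered load, in attempts per packet time */
        for (g = 0.1; g <= 2.01; g += 0.1)
            printf("G = %.1f   S = %.3f\n", g, g * exp(-2.0 * g));
        return 0;
    }

Run it and you will see S peak at G = 0.5 (where S is still below
0.19) and fall steadily thereafter: pushing the offered load past the
knee of such a curve buys you nothing but collisions.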
There is something terribly wrong with what you are saying.  Consider
the following:

1) Assuming I understand what your axes mean, you are saying that the
correct operating point for the system (i.e., the rate at which your
hosts should attempt to schedule packet transmissions) is where the
throughput curve (i.e., the number of *successful* packet transmissions
completed per unit time) is maximized.  Am I right?  Well, in that case,
your system would behave like a saturated queue, where the waiting times
blow up to unacceptable levels.  (Remember, we're talking about the
*average* arrival rate of new packets being set equal to the *average*
departure rate of successful packets...)  If you don't want to get
swamped with queueing delays, you need to make the arrival rate
*smaller* than the departure rate.

2) Again, since the vertical axis of your graph is labelled
"efficiency", I understand your comment to mean that at your suggested
operating point, 2-3% OF THE CHANNEL TIME is being spent on collisions
(as opposed to idle periods or successful packet transmissions).  This
is NOT the same as saying that only 2-3% of the packet transmission
attempts result in a collision.  Remember that each collision event
takes up a small amount of time (on the order of a few hundred bit
times), whereas a successful packet transmission can occupy the channel
for much longer (on the order of thousands of bit times, for a
reasonable segment size).

For example, consider a scenario where every packet is 10,000 bits long
and every collision event occupies the channel for 1,000 bit times
(pretty much the worst case, since you know that everyone will have
detected the collision and aborted their transmissions in half that
time...).  If the collision rate is 50% (i.e., half the time you
succeed and half the time you suffer a collision when attempting to
send a packet), the proportion of time occupied by collisions in this
worst case would only be:

                 50% * 1,000 bit times
    -----------------------------------------------   =  1/11,
    50% * 1,000 bit times + 50% * 10,000 bit times

or about 9%.

>That is, 2-3% of your total packets should be collisions, and you
>are then at the top of that curve.  My own unscientific surveys of
>multi-protocol networks suggest that this is higher - 5%, maybe
>more depending on protocol, but I've never considered it to be 15%.
>In the above graph, 15% would still be a useable network but well
>on the way to the right of the optimal point.  Once you pass that
>point, you are headed along the exponential curve of additional
>collisions.

This isn't reasonable.  Right now, when I typed "netstat -i" on my Sun
workstation, I got:

Name  Mtu   Net/Dest     Address     Ipkts    Ierrs  Opkts   Oerrs  Collis  Queue
le0   1500  csri2-ether  genie.csri  1658181  0      608422  4      52114   0
lo0   1536  loopback     localhost   115487   0      115487  0      0       0

Modulo various comments about the inaccuracies of the collision counter
(since it may only report the number of packets that suffered one or
more collisions, not the total number of collisions...), it can be seen
that roughly 8.5% of the time my workstation tried to send a packet
over our ether, a collision occurred.  This is not a busy network: the
average daytime utilization is under 5%, and peak utilizations over
1-minute intervals are under 20-30%.
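For anyone who wants to repeat that arithmetic on their own interface
counters, it is a single division.  A trivial sketch, with the le0
values from the netstat output above plugged in (the same undercounting
caveat applies, so if anything the true figure is a bit higher):

    /* Collision rate per transmission attempt, from netstat counters. */
    #include <stdio.h>

    int main(void)
    {
        long opkts  = 608422;   /* Opkts:  output packets sent */
        long collis = 52114;    /* Collis: collision counter   */

        printf("collisions per output packet: %.1f%%\n",
               100.0 * (double)collis / opkts);   /* prints 8.6% */
        return 0;
    }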
>I'm basing my much lower percentage on the "LAN and Enterprise
>Network Performance Queuing Model", SST Associates, and "Performance
>of a Stochastically Optimized CSMA Network" from Mitre (also
>summarized fairly well in Nemzow's "The Ethernet Management Guide").

>I'm aware that these are biased towards TCP/IP, but the original
>poster is naming that as one of his primary protocols.

I've never heard of any of these guys.  Show me where I can see the
measurements and/or analysis to back up these numbers.  (Remember, the
world is full of snake oil salesmen who quote the throughput formula
for unslotted Aloha, or for non-persistent CSMA *without collision
detection*, and say it is Ethernet...)

>In any case, do you have a more recent study that you are
>obtaining the 15% figure from for multi-protocol networks?  My own
>empirical evidence suggests that 15% as acceptable would be a rare
>case.

>V>15% collisions is an almost idle network.

I agree with Vernon here.  15% collisions *is* an almost idle network,
especially if it has a bunch of machines with fast Ethernet interfaces
on it (like SGIs), where one host sending, say, the fragments of an NFS
page as multiple packets goes around colliding with everybody else who
was deferring to him...

>If 15% of the number of packets sent are collisions, it's almost
>certainly *not* an idle network, no matter if you agree with the
>curve above or not, wouldn't you say?

Not at all.  You really have to think more about what happens on a
shared contention medium like Ethernet.  If the network is mostly busy,
then almost every time *I* decide to generate a new packet, I'll find
the network busy and have to defer my transmission until end-of-carrier.
Now, if we are talking about a "real" network with a mixture of long
(data) and short (acknowledgement) packets, it is much more likely that
I will find the network busy with a long packet than with a short one.
(Trust me, or go look up anybody's measurement study showing the
histograms of % packets versus % bytes sent in packets of various
lengths -- and remember that if you decide to xmit at a random time,
you are equally likely to do so while a random *byte*, rather than a
random *packet*, is being sent.)  Because of the short packets in the
traffic mix, there is a good chance that more than one station will
decide to transmit a packet during the transmission time of a long one.
QED, you suffer a collision.

In other words, the only way *not* to have many collisions is for most
of the packets arriving to the system to find the channel mostly idle,
since it is the deferred packets (common when the network is busy) that
suffer most of the collisions.
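If the qualitative argument leaves you unconvinced, here is a
back-of-the-envelope sketch of that pile-up effect.  Assume, purely
for illustration, that new packets arrive as a Poisson process at rate
G (in packets per packet-transmission-time), that every arrival
finding the channel busy defers to end-of-carrier, and that two or
more deferrers waiting there means a collision; backoff and
retransmission scheduling are ignored entirely.  The collision
probability at end-of-carrier is then 1 - e^-G - G*e^-G:

    /* Caricature of the deferral effect: probability that >= 2
     * Poisson(G) arrivals pile up behind one packet transmission and
     * so collide at end-of-carrier.  Not real CSMA/CD -- no backoff,
     * no capture, no slot-time details.
     */
    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        double g;   /* offered load, in packets per packet time */
        for (g = 0.1; g <= 0.91; g += 0.2) {
            double p = 1.0 - exp(-g) - g * exp(-g);
            printf("G = %.1f:  P(pile-up collision) = %4.1f%%\n",
                   g, 100.0 * p);
        }
        return 0;
    }

Even this caricature climbs from well under 1% at G = 0.1 to over 20%
at G = 0.9: the collisions come from the pile-up at end-of-carrier,
not from packets arriving to an idle channel.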
> People may differ on what they
>think the optimal "break point" on the curve is, but everybody seems
>to agree that the slope starts out very small and grows slowly until
>it reaches a break point.

Sorry, it's not that simple.  If it were, analysts like me would have
more respect on the net...  :-(

Mart L. Molle
Computer Systems Research Institute
University of Toronto
Toronto, Canada  M5S 1A4
(416)978-4928