Speedtests, SamKnows, and Fantasy vs. Reality at the FCC

Far too many people seem to think that when they go to Speedtest.net to test their connection, the number they get back bears some relation to reality. For most of us, it simply doesn't. The same is true of other large connection-measurement tools. And it has important policy implications, because the FCC contracted with a company called SamKnows to measure the wireline speeds available to Americans (I'm a volunteer in that project). SamKnows explains:
SamKnows has been awarded a ground breaking contract by the Federal Communications Commission (FCC) to begin a new project researching and collecting data on American fixed-line broadband speeds delivered by Internet Service Providers (ISP's) - until now, something that has never been undertaken in the USA. The project will see SamKnows recruit a team of Broadband Community members who will, by adding a small 'White Box' to their home internet set up, automatically monitor their own connection speeds throughout the period of the project.
Unfortunately, SamKnows appears to be documenting fantasy, not reality. To explain, let's start with a question Steve Gibson recently answered on his amazing netcast, Security Now (available via the TWiT network). A listener asked why he gets such large variation in repeated visits to Speedtest.net. Steve answered the question as an engineer, with a technical explanation involving the TCP/IP protocol and dropped packets. But he missed the much larger issue.

Packets are dropped because the "pipes" are massively oversubscribed at various places within the network (from the wires outside your house to those closer to the central office or headend). What this means is that the cable company (and, to a lesser extent, the DSL company) takes 100Mbps of capacity and sells hundreds of people 20Mbps or 30Mbps or whatever -- hence the "up to" hedge in their advertisements. The actual capacity available to you depends on what your neighbors (cable) or others in the network (DSL) are doing; if, say, 200 subscribers share that 100Mbps and even a tenth of them are active at once, each one sees roughly 5Mbps. Dropped packets in TCP often result from the congestion that comes with high oversubscription ratios.

This gets us to why Speedtest.net and SamKnows deliver fantasy numbers. Large operators know where the Speedtest.net and SamKnows servers are, and they find ways of prioritizing those connections. The result they give you is often the maximum line capacity you could get, like travel times on roads at 2AM. Unfortunately, every other site you visit gets stuck in rush hour.

I have access to a few very fast, non-rate-limited servers, and I can never come close to the transfer speeds that SamKnows or Speedtest.net tells me I am getting most of the time. SamKnows suggests that I am regularly getting 30Mbps or more downstream, but I have never seen that in practice, regardless of whether I am connected to a content distribution network or another site capable of sending very high throughput.

[Recent SamKnows results]

Heck, I cannot make it through a single show on Hulu to my TV at the 2Mbps stream without several pauses to buffer. I pay Comcast for something like "up to" 30Mbps downstream and 4Mbps up. Speedtest.net and SamKnows tell me that I am actually getting 6-9Mbps upstream, suggesting that Comcast is giving me more than I pay for. But when I upload a large file to a server in Utah, I find that Comcast only delivers between 2.5 and 3Mbps. If I go to my Comcast connection in an office across town, I can regularly achieve around 4.5Mbps with the same file to the same server (we pay for 5Mbps upstream there). This suggests my experienced speed is controlled by Comcast, not by the connections between Comcast and Utah. Further, I tried the same test while I was in Lafayette, Louisiana: I uploaded the file to the same Utah server at 20Mbps on the muni fiber network from a friend's house (for which I believe he pays less than I do for my much slower connection).

SamKnows is telling the FCC that my cable connection is significantly faster than what Comcast really delivers, which I suspect is a systematic overstatement of real speeds for ISPs that are gaming the system. It makes cable and DSL companies appear to be doing a much better job than they actually are. In my case, it roughly doubles my real upload speed and doubles or triples my experienced download speed. The problem is that cable and DSL companies can optimize their networks for known sites like Speedtest.net and the SamKnows servers.
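If you want to run the same kind of sanity check I describe above, here is a minimal sketch in Python: time a transfer of a large file against a server you control and compute the throughput yourself, rather than trusting a test site's number. The URL below is a placeholder, not a real test server -- point it at any large file hosted on a fast, non-rate-limited machine.

    # Minimal sketch: measure real download throughput by timing a large transfer.
    # TEST_URL is a placeholder; substitute a large file on a server you control.
    import time
    import urllib.request

    TEST_URL = "https://example.com/testfiles/100MB.bin"  # placeholder URL

    def measure_download_mbps(url, chunk_size=64 * 1024):
        """Download the file in chunks and return the average throughput in Mbps."""
        start = time.monotonic()
        total_bytes = 0
        with urllib.request.urlopen(url) as response:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                total_bytes += len(chunk)
        elapsed = time.monotonic() - start
        return (total_bytes * 8) / (elapsed * 1_000_000)  # bits/sec -> Mbps

    if __name__ == "__main__":
        print(f"Average download speed: {measure_download_mbps(TEST_URL):.1f} Mbps")

Run it at different times of day; if the numbers consistently fall well short of what Speedtest.net or SamKnows reports for your line, you are seeing the same gap I describe above.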
If we want to get a better sense of what cable and DSL companies are truly delivering, there is a solution -- and, interestingly enough, it comes from a virus. A few years ago, Steve Gibson described the Conficker computer virus in intricate detail. For a modern virus to be useful, it has to get instructions from its owner (to send spam or do other botnet things). If the virus always looked for instructions at jerkstore.com, it would be very easy to isolate and destroy. So Conficker used an algorithm to determine which domain it would check for instructions (probably ending up at something like wqerpvxvblkj.biz). On any given day, the virus would check one site out of a potential list of 50,000, making it very hard to predict which site it would check. The good guys could not register 50,000 sites every day to block it, but the bad guys only had to register the few sites they knew the algorithm would pick.

This is how honest speedtests should be done. A company receiving a large contract from the FCC to measure real-world speeds rather than fantasyland connections should scatter servers throughout the country and use an algorithm to prevent ISPs from telling the difference between a user engaging in normal activities and a SamKnows router recording the actual speeds one would get going to a normal website. Perhaps the ISPs would find additional ways to game this approach, using deep packet inspection or other tools, but I think it would be a dramatic improvement over making it so easy for companies like Comcast to present their fantasy to the FCC as fact. Fantasy statistics are great for cable and DSL companies trying to convince DC that they are doing a great job and that we have no need for policies that would give Americans a real choice in providers. If we are spending public money to gather data, let's get real data, not fantasies. I would love to see the great folks at M-Lab do this, but I don't know whether they have the capacity or connections necessary for a nationally distributed server setup hidden behind many hundreds of different IPs.

Addendum: I am aware of the "speedboost" or "turboboost" or "superamazingcoolfrigginawesomeboost" that some ISPs use, and that alone should not account for any of the above discrepancies. By testing a variety of file sizes, one should be able to get accurate results regardless of such short-term enhancements used by ISPs.
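To make the idea concrete, here is a rough sketch of how a measurement client could combine the Conficker-style domain trick with the file-size point from the addendum. Everything in it is an assumption for illustration: the endpoint names, the secret key, and the file sizes are hypothetical, and a real deployment would need servers actually scattered behind many different IPs.

    # Rough sketch: pick an unpredictable measurement endpoint each day (as a DGA
    # picks domains) and test several file sizes so short-lived "boost" features
    # don't dominate the average. All names and values here are hypothetical.
    import datetime
    import hashlib
    import random

    # Hypothetical pool of measurement endpoints, each behind its own hostname/IP.
    ENDPOINTS = [f"https://node{i:03d}.example-measurement.net" for i in range(500)]
    FILE_SIZES_MB = [1, 10, 100]   # small, medium, and large test files
    SECRET = b"operator-only-key"  # assumption: a key the ISP does not know

    def todays_endpoint(pool):
        # Seed the pick with today's date plus the secret key, so the ISP cannot
        # precompute which of the hundreds of hosts will carry today's measurement.
        seed = hashlib.sha256(datetime.date.today().isoformat().encode() + SECRET).hexdigest()
        return random.Random(seed).choice(pool)

    def todays_test_urls():
        # Multiple file sizes, per the addendum above.
        host = todays_endpoint(ENDPOINTS)
        return [f"{host}/testfiles/{size}MB.bin" for size in FILE_SIZES_MB]

    if __name__ == "__main__":
        for url in todays_test_urls():
            print(url)  # feed each URL to a throughput timer like the earlier sketch

The point of the design, as argued above, is that an ISP cannot cheaply prioritize hundreds of unremarkable hostnames, so the measurement traffic looks like any other download.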