2009/12/09

ADSL modem soap opera :)

In the last few weeks, I migrated from a rock-solid Internet connection to the typical uncertainty that most ADSL users seem to be plagued by. But crisis leads to solutions, and to bits of new knowledge.

Everything began when I lost a discount on my phone bill for my old ADSL plan, which used to be the "best in the city": 1.5Mbps. (Don't laugh, lots of people in Brazil would have an orgasm over access to such a plan!) Well, I called to complain about the bill, and they said the discount was a time-limited offer... but the region had gotten brand new ADSL2+ ports, so I could migrate to a new, faster, and cheaper plan. I could choose between 2, 4 and 8 Mbps; I chose 4. (I don't understand why ADSL2+ plans are cheaper. I guess it may be related to the fact that ADSL2 can "hibernate", saving electrical power, while ADSL1 runs full-throttle all the time. If someone knows the answer, please comment.)

Flashback

My line was said to be marginal by the telco guy that came here when ADSL was first installed. The D-Link 500b modem reported a high downstream attenuation (55dB). It did not cause ADSL link drops, it did not increase latency, but it seemed to cause loss at ATM level, something around 0.3%. Problem is, one 1500-byte IP packet is carried in about 32 ATM cells (48 payload bytes each), and AAL5 does not do any retransmission, so losing a single cell loses the whole packet. The loss is magnified roughly 30 times, resulting in nearly 10% packet loss at IP level for big packets. So, losses at ATM level must be *really* low. ADSL works a lot to avoid those losses, but they still happen sometimes.
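The magnification can be checked with a few lines of Python. This is only a sketch: it assumes cell losses are independent, uses the standard 48-byte cell payload and 8-byte AAL5 trailer, and ignores PPPoE/LLC encapsulation overhead (which depends on the provider's setup). The function names and the `encap_overhead` parameter are mine, for illustration only.

```python
import math

ATM_PAYLOAD = 48   # payload bytes carried by one ATM cell
AAL5_TRAILER = 8   # trailer bytes AAL5 appends before padding to whole cells


def cells_per_packet(ip_bytes, encap_overhead=0):
    """Number of ATM cells needed for one IP packet (assumed overhead = 0)."""
    return math.ceil((ip_bytes + encap_overhead + AAL5_TRAILER) / ATM_PAYLOAD)


def ip_loss(cell_loss, ip_bytes, encap_overhead=0):
    """Probability that at least one cell of the packet is lost.

    Assumes independent cell losses; one lost cell kills the whole packet
    because AAL5 does not retransmit.
    """
    n = cells_per_packet(ip_bytes, encap_overhead)
    return 1.0 - (1.0 - cell_loss) ** n


if __name__ == "__main__":
    for size in (84, 1500):  # a default-sized ping vs. a full-sized packet
        print(size, cells_per_packet(size), round(100 * ip_loss(0.003, size), 2))
```

With 0.3% cell loss, a full-sized packet comes out a bit above 9% loss under these assumptions, while a small ping that fits in two cells loses well under 1%. That gap is what makes the problem easy to miss with a plain ping.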

The way I found to test for ATM-related packet loss was to ping the nearest telco router in two ways: plain, and with -s 1400 (that is, with a big packet). If packet loss differs between the two, ATM, and therefore link quality, may be the problem. A variation is to ping a distant site (like google.com), so the plain ping shows the "baseline" packet loss -- but then you cannot be sure whether the "big ping" packet loss is really an ADSL problem. (Actually, google.com is not a good ping target because it doesn't answer "big pings" with packets of the same size. Since ADSL tends to have more problems downstream, due to the higher frequencies involved, it is better to choose a site that "pongs" packets as big as the ones that were pinged.)
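The two-ping comparison is easy to automate. The sketch below assumes a Unix-like `ping` that accepts `-c` (count) and `-s` (payload size) and prints a summary line containing "% packet loss"; the helper names are hypothetical, and the target address is whatever router you choose.

```python
import re
import subprocess

# Matches the summary line printed by common Unix ping implementations,
# e.g. "10 packets transmitted, 9 received, 10% packet loss, time 9012ms".
LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def parse_loss(ping_output):
    """Extract the packet loss percentage from ping's summary text."""
    match = LOSS_RE.search(ping_output)
    if match is None:
        raise ValueError("no packet loss summary found in ping output")
    return float(match.group(1))


def measure_loss(host, count=100, size=None):
    """Run ping and return the reported loss percentage."""
    cmd = ["ping", "-c", str(count)]
    if size is not None:
        cmd += ["-s", str(size)]
    cmd.append(host)
    result = subprocess.run(cmd, capture_output=True, text=True)
    return parse_loss(result.stdout)


def compare(host, count=100, big=1400):
    """Return (small-packet loss, big-packet loss) for the same host."""
    return measure_loss(host, count), measure_loss(host, count, big)
```

Calling `compare()` with the address of the nearest telco router returns both percentages. A big-packet loss much higher than the small-packet one points at the line; similar figures point somewhere else.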

As most people know, TCP/IP assumes that packet loss is caused by congestion, so any packet loss makes throughput fall sharply, something like losing 90% of the speed under a 10% packet loss. That's why wireless layer 2 protocols like 802.11 actually do packet retransmission, even though this task does not belong to this layer, not in the classroom at least... The symptom here was that things like YouTube did not work well, since they stream video over a single TCP/IP connection. But loading a page worked well because browsers use several parallel and short-lived TCP connections, which mitigates the effects of packet loss. (The SCTP protocol has the concept of "streams", each delivered in order independently of the others, plus an "unordered" flag for messages, so that a loss in one stream does not stall the rest.)
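To get a feeling for how hard loss hits a single TCP connection, one common rule of thumb is the Mathis approximation, where the throughput ceiling is proportional to MSS / (RTT * sqrt(loss)). The sketch below is just that rule of thumb, not a measurement: the MSS and RTT values are assumptions, and at high loss rates real TCP does even worse because of retransmission timeouts.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss):
    """Approximate upper bound on steady-state TCP throughput, in bits/s.

    Mathis et al. approximation: rate <= (MSS / RTT) * sqrt(3/2) / sqrt(p).
    Valid only for small, random loss; it ignores timeouts entirely.
    """
    if not 0.0 < loss < 1.0:
        raise ValueError("loss must be strictly between 0 and 1")
    return (mss_bytes * 8 / rtt_s) * math.sqrt(1.5) / math.sqrt(loss)


if __name__ == "__main__":
    # Assumed values: 1460-byte MSS, 40 ms round-trip time.
    for p in (0.0001, 0.01, 0.05, 0.10):
        mbps = mathis_throughput_bps(1460, 0.040, p) / 1e6
        print(f"loss {p:.2%}: at most ~{mbps:.1f} Mbps")
```

Because the ceiling scales with 1/sqrt(loss), a hundred times more loss costs a factor of ten in throughput. Under these assumed numbers, 1% loss still leaves room for a few Mbps, while 10% does not.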

Well, fiddling a lot with the 500b configuration, I found that DMT mode had the biggest packet loss at IP level: 5%. I don't remember exactly, but I think that G.lite had 3% and ANSI had 1% -- the latter is within the limit that TCP/IP tolerates without delivering lackluster bandwidth.

Other random facts: my father's ADSL router, which looks like the cheapest in the Universe, maintained a perfectly lossless link here. And the 500b modem behaved very well on my father's phone line. And the most important one: upload speed was NEVER affected at any moment. It was always the contracted one, probably because upstream uses the lower frequencies of the phone line, which suffer less from line losses.

But the ultimate solution was to put my old and trusted 3Com HomeConnect modem back into service. For some reason, this modem showed undetectable packet loss over the same line. I never worried about this again, until some weeks ago.

WiFi

After two years of veeeeery smooth Internet experience, some problems with heavy packet loss began to happen from time to time, in particular when powering up the iMac. Disabling WiFi and re-enabling it tended to solve the problem. If not at the first attempt, then after two or three toggles things went straight again.

No other WiFi device here has this problem, so it must be some incompatibility between the iMac's WiFi and my access points. This one I could never solve completely, but abolishing WDS improved things a lot. Now, exactly one turn off/turn on solves the problem on the iMac, while in the past I sometimes had to do it 20 times. Let's just pray my next router pleases the iMac better and does away with this little problem.

Packet losses

Some weeks before my phone bill problem, I began to have serious performance issues while communicating with certain sites, but not with others. Ping to the telco router was normal, and packet loss was the same for any packet size, so the loss was happening either at the remote end or at some intermediate router. Indeed, traceroute showed an intermediate router losing packets.

Funny thing is that going to epx.com.br or to google.com means taking a different route as early as the 4th hop. The loss was happening at the 5th hop on the path to epx.com.br. The path to google.com never had problems. After I heard that anycast exists for IPv4 too, and that Google uses it actively to serve content in Brazil from Brazilian servers, the problem boiled down to a simple case of international link overcommitment :) This conclusion is corroborated by the fact that losses increased in "peak" hours (which, for Internet usage, means from 18:00 to 00:00).

At least, this particular problem seems to have been solved these days. Just as well, because the ATM packet loss made a comeback:

ADSL2+

Ok, so I migrated to the 4Mbps ADSL2+ plan. Thinking that my 3Com modem would not be able to connect, I plugged my D-Link 500b into the line. The link got the contracted speed, and the modem said my line could go up to 9Mbps... but the packet loss problem was back.

After I heard that ADSL2+ equipment should be backwards-compatible, and that ADSL1 goes up to 8Mbps, I put the 3Com modem back in service once more, and the packet loss was once more gone. The only (and new) problem: the link tended to drop often, in particular when no traffic was being exchanged. This 3Com modem always had a problem with phone line disconnections: when the link drops (e.g. when you unplug the phone connector and plug it in again), it sometimes does not re-establish communication, even if the light goes green. You need to power-cycle the modem.

Perhaps the telco equipment was "hibernating" the connection a la ADSL2, which exercised this particular 3Com bug? Or the line actually worsened in those 3 years and the modem was showing an instance of the "cliff effect", the all-or-nothing nature of communications with forward error correction. Or the line sometimes goes bad because of some intermittent noise source, like a big motor somewhere. I'm still looking for answers...

Fiddling a bit with the 500b configuration again, I found that disabling ADSL2+, leaving only ADSL2 and RE-ADSL2 enabled, reduced the packet loss problem to 0.5% at IP level, which means of course much less at ATM level. And this modem did not have the problem of disconnecting often. The downside: downstream did not go over 1.1Mbps in this mode.
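How much less at ATM level? A rough back-calculation, assuming the IP loss was measured with full-sized 1500-byte packets, that cell losses are independent, and ignoring encapsulation overhead (the function name is mine):

```python
import math


def cell_loss_from_ip_loss(ip_loss, ip_bytes=1500):
    """Estimate per-cell loss from the observed packet loss.

    Inverts ip_loss = 1 - (1 - cell_loss) ** cells, where the packet plus
    an 8-byte AAL5 trailer is split into 48-byte cell payloads.
    """
    cells = math.ceil((ip_bytes + 8) / 48)
    return 1.0 - (1.0 - ip_loss) ** (1.0 / cells)


if __name__ == "__main__":
    print(f"{cell_loss_from_ip_loss(0.005):.6%}")
```

Under those assumptions, 0.5% at IP level corresponds to a cell loss in the order of 0.015%, about twenty times lower than the 0.3% seen years before.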

So I had the choice of using the old modem that delivered maximum speed but disconnected from time to time (and my VPN does not like this), or using the newer modem with reduced speed and non-zero packet loss.

By the way, RE-ADSL2, or ADSL2 Annex L, is an extended-range mode which uses more power and more redundancy. Some users have reported better link conditions with ADSL2 than with ADSL2+, either because of RE-ADSL2 or because ADSL2+ uses frequencies up to about 2.2MHz, which suffer more from noise and line losses.

So I bought a SpeedStream 4200 modem (Siemens), which has a local reputation of working well on long, noisy telephone lines. Unfortunately it let me down. The packet loss was about the same as with the 500b, and it synced at a lower speed (around 2.5Mbps). Great deal... the only parameter that was better than before was the latency (11ms RTT).

Moreover, the line condition seemed to change all the time. Time to try the same trick as I did with the 500b: force some particular connection mode other than ADSL2+. The SpeedStream 4200 does not offer this kind of configuration via Web, but it has a telnet CLI mode. After trying several modes, and reviewing all phone connections at home (and soldering connections that seemed to be rusty/corroded), I settled on RE-ADSL2:

> cfg dsl{mode=red2
> cfg save
> do reboot

This mode made my link settle in the following condition: 2.4Mbps, packet loss is absolute zero, latency is a bit jittery (40ms RTT with 30ms stddev, for small packets). The link is not dropping "for free". Upstream is the contracted 400Kbps -- as I mentioned before, upstream is always perfect (ironically, upload is the very thing I don't care about, I am not a Torrent guy).

Not the 4Mbps I was expecting, but good enough, and paying less than I used to pay for 1.5Mbps. Too bad that a future upgrade to 8Mbps or more will have to deal with the attenuation problem.

UPDATE 10/Dec: I had a sharp decline in line quality this morning, which only allowed a stable link using G.lite (which limits speed to 1.5Mbps). After one hour, the line got ok again. Actually, it got better than in the last few weeks. (Has the telco fixed something? I called the telco today to change my plan to 2Mbps due to those problems, and they said that the former change to 4Mbps was still being processed by the tech team, which I hope means that they are monitoring problems too.) Now it runs at a full 4Mbps connection again in ADSL2. ADSL2+ still goes just to 2Mbps. Packet loss is around zero (1 in 10000 packets) in either mode.

Next steps

Since the telco guy said back then that buildings with underground phone lines tended to have high attenuation, and I have to reroute the line anyway (it is currently laid over the ground so it reaches my office, coming from the original point), the current plan for next year is to replace the whole line all the way to the pole. I didn't like the "rusty" look of the wires today; I thought that copper would not rust! Maybe electrolytic corrosion? Or low-quality wire?

If this does not improve the attenuation sharply, then it will be time for a call to the telco asking for a line replacement. And who knows what can happen in 1 or 2 years, maybe the telcos will show up here on a sunny morning with a fiber optic cable? Or they may decide to put a cabinet between the central office and distant phones like mine. We'll see.