Wednesday, January 27, 2010

Could it Really Be That Easy?

Hmm... my ISP's service is really bad. The connection quality sucks: it's not just that the connection is slow (expected), but that too many packets get dropped. Watching streaming video is out of the question. Downloading files larger than (say) 1 MB is possible only with wget (wget rules!) or DownThemAll (Firefox extension). Manually pausing and restarting downloads sometimes works, but who wants to do that? Both Firefox and Chrome exhibit similar behaviour. Without wget (or DownThemAll!), downloads just stall: no timeouts, no nothing. Extremely frustrating.

Since a lot of tech presentations (& fun stuff) are videos on YouTube, I wrote a small program a while back, which, given a YouTube video URL, could generate a link to a downloadable video file, on which wget goes to work. Problem solved, eh?

Then, in classical Top Gear fashion, I began wondering: How hard can it be (to write a robust downloading component)?

So Monday night I began writing some exploratory code; I was traveling Tuesday, so got no work done; today I finished my proof-of-concept (POC). Turns out, if you set a small-ish value for your read timeout (30-45 seconds) and resume from where you left off (gotta know your HTTP!), it's not that hard to write a reliable downloading component!
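To make the idea concrete, here's a minimal sketch of that approach in Java. This is not the actual HttpDownloader class from the POC, and the names (`ResumingDownloader`, `rangeHeader`, the timeout and retry constants) are hypothetical: set a small read timeout so a stalled connection surfaces as an exception instead of hanging forever, then reconnect and ask the server to resume from the bytes already on disk via the HTTP `Range` header.

```java
import java.io.*;
import java.net.*;

// Hypothetical sketch of a resumable downloader; not the post's actual code.
public class ResumingDownloader {

    static final int TIMEOUT_MS = 30_000; // small-ish, per the post: 30-45 s
    static final int MAX_RETRIES = 50;    // a real component would also back off

    // Value for the Range request header, e.g. "bytes=1024-" resumes at offset 1024.
    static String rangeHeader(long bytesAlreadyHave) {
        return "bytes=" + bytesAlreadyHave + "-";
    }

    public static void download(String url, File dest) throws IOException {
        for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
            long have = dest.exists() ? dest.length() : 0;
            try {
                HttpURLConnection conn =
                        (HttpURLConnection) new URL(url).openConnection();
                conn.setConnectTimeout(TIMEOUT_MS);
                conn.setReadTimeout(TIMEOUT_MS); // stalled reads now throw, not hang
                if (have > 0) {
                    conn.setRequestProperty("Range", rangeHeader(have));
                }
                // 206 Partial Content: the server honoured the Range request, so
                // append; anything else (e.g. 200) means it restarted from byte 0.
                boolean append =
                        conn.getResponseCode() == HttpURLConnection.HTTP_PARTIAL;
                try (InputStream in = conn.getInputStream();
                     OutputStream out = new FileOutputStream(dest, append)) {
                    byte[] buf = new byte[8192];
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        out.write(buf, 0, n);
                    }
                }
                return; // stream ended cleanly: done
            } catch (SocketTimeoutException | SocketException e) {
                // Connection stalled or dropped: loop, reconnect, and resume
                // from dest.length().
            }
        }
        throw new IOException("giving up after " + MAX_RETRIES + " attempts: " + url);
    }
}
```

A robust version would do more (verify the final size against Content-Length, handle servers that ignore `Range`, back off between retries), but the core loop really is this small.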

Which brings me to the question: why do so many downloading components in such major pieces of software as Firefox, Chrome, Eclipse (3.4) etc. find it so hard to download reliably in the face of a bad connection?

Of course, my HttpDownloader class is nowhere near robust enough for widespread use (yet), but the POC does show that basic reliability can be had in less than a day's work. There must be more to this than I've found out so far: there must be some very good reasons why reliable downloaders are so few and far between. It can't be that easy.

Keep checking this space (or follow me on Twitter): I will update when I know more. With the POC complete, this does go on the back-burner though.



  1. I too don't understand why it's hard for Chrome and Firefox to do this.

    I'm thinking there are yet-to-be-discovered circumstances that Chrome and Firefox are accounting for to ensure they aren't corrupting the partially downloaded file.

    A simple first approach to a "reliable HTTP download mechanism" really is as simple as sending the "Range" header, but since it seems like Firefox and Chrome have a hard time with this, there must be more to it?

  2. I am thinking along similar lines, that's why I have a hard time accepting why such major software has so much trouble doing a seemingly simple task. Hence the continuation of the experiment (as & when I find the time).

    I already have a re-usable component which, with a little more spit & polish and some more convenience methods, could be released as open source, but I wonder how many people would be interested :-)? Most people are lucky enough not to have such a sucky ISP.