Saturday, 5 January 2008

Messing about with a python implementation of wget


wget is a really wicked tool - I really miss it when I have to use other systems. So I thought, "why not try to implement wget in python?" - seeing as python is easy to install on lots of systems - even Nokia's S60 can handle it. Of course, writing something with anywhere near the functionality of wget is pretty much impossible - and anyway, I don't particularly need all of its functionality. The concrete requirement I have is to build a downloader module for currxchange (to download xml files containing currency exchange rates).

After messing about with python's urllib, I decided to use urllib2 - I only really needed urllib2.urlopen, as the object it returns has an info() method which can spit out the metadata about the upstream file - things like file_descriptor.info()["Content-Length"] are thus easy to access.
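
To make that concrete, here is a minimal sketch of the metadata lookup. It is written for Python 3, where urllib2.urlopen lives on as urllib.request.urlopen. The helper name remote_size is made up for illustration, and since not every server sends a Content-Length header, the sketch falls back to None in that case.

```python
from urllib.request import urlopen  # Python 3 home of urllib2.urlopen


def remote_size(url):
    """Return the Content-Length of url as an int, or None if it is not sent.

    remote_size is a hypothetical helper, not part of any library.
    """
    response = urlopen(url)
    try:
        # info() returns the response headers; header lookup is case-insensitive.
        length = response.info()["Content-Length"]
    finally:
        response.close()
    return int(length) if length is not None else None
```

Calling remote_size with the URL of one of the exchange-rate xml files should then give the size in bytes before anything is downloaded, assuming the server reports it.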

I have the script kind of working with Kelvie Wong's cool ProgressBar module from http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/168639 (note: the class posted as a comment right down at the bottom of the page). There is another pretty cool python class available for doing more or less exactly what I want to do (even with a gui if you're into that) at http://www.python-forum.de/topic-9647.html. It's threaded and downloads the file in three different parts. Methinks it can resume downloads as well - pretty neat. Still, I'm going to keep messing about with my own class so I can tailor it for currxchange. From all this messing about you really appreciate how much work went into a utility like wget - amazing!
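
For the curious, the basic shape of such a downloader might look roughly like the sketch below. This is not my actual script and it does not use the ProgressBar recipe: download and print_progress are made-up names, the progress display is a bare percentage, and it is written for Python 3 (urllib.request instead of urllib2). It is single-threaded and cannot resume.

```python
import sys
from urllib.request import urlopen  # Python 3 home of urllib2.urlopen


def download(url, dest, chunk_size=8192, report=None):
    """Copy url to the local path dest in chunks; return the bytes written.

    report, if given, is called as report(done, total) after each chunk;
    total is None when the server sends no Content-Length header.
    """
    response = urlopen(url)
    try:
        length = response.info()["Content-Length"]
        total = int(length) if length is not None else None
        done = 0
        with open(dest, "wb") as out:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:  # empty read means end of stream
                    break
                out.write(chunk)
                done += len(chunk)
                if report is not None:
                    report(done, total)
    finally:
        response.close()
    return done


def print_progress(done, total):
    """Very plain stand-in for a real progress bar (hypothetical helper)."""
    if total:
        sys.stderr.write("\r%3d%%" % (done * 100 // total))
    else:
        sys.stderr.write("\r%d bytes" % done)
```

Something like download(url, "rates.xml", report=print_progress) would then fetch a file while printing a running percentage, where url and the file name are placeholders.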

