Comment Walk the DOM directly (Score 1) 104
Once I had to collect a lot of info from a website. I used Java, wget, and a Java HTML parser library (possibly JTidy). Anyway, the code was very short and clean. I'd recommend DOM walking over other solutions when the data isn't trivial.
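
For anyone wondering what DOM walking looks like in practice, here is a minimal sketch. It is not the original code: it assumes the page has already been fetched (e.g. with wget) and is well-formed XHTML, so the JDK's built-in XML parser can stand in for JTidy. Real-world HTML is usually messier and would need a tidying parser first. The class name `DomWalk`, the method `collectLinks`, and the sample page are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical example class, not taken from the original comment.
class DomWalk {
    // Parse well-formed XHTML into a DOM tree and return every <a href> value
    // in document order.
    static List<String> collectLinks(String xhtml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
        List<String> links = new ArrayList<>();
        walk(doc.getDocumentElement(), links);
        return links;
    }

    // Depth-first walk: inspect this node, then recurse into each child.
    static void walk(Node node, List<String> links) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            Element el = (Element) node;
            if (el.getTagName().equalsIgnoreCase("a") && el.hasAttribute("href")) {
                links.add(el.getAttribute("href"));
            }
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            walk(child, links);
        }
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample page; a real run would read the file saved by wget.
        String page = "<html><body><p>See <a href=\"/a\">A</a> and "
                    + "<a href=\"/b\">B</a>.</p></body></html>";
        System.out.println(collectLinks(page)); // prints [/a, /b]
    }
}
```

The same recursive `walk` can collect table cells, headings, or anything else by changing the element test, which is why this tends to stay short even when the data is nested.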