Ok, so here is a review of a golden oldie. WinHTTrack has been around for literally as long as I can remember and has been useful to me many times in my professional life. Basically, it crawls a website and mirrors it on your hard drive so you can browse it, or scrape it, later on.
The plus point of downloading a website before scraping it is that you won't waste bandwidth while testing out your scraping script, and the scraping can run at the same time as you download another site, meaning you can maximise your time.
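Once a site is mirrored locally, a scraping script can be run against the files on disk as many times as you like without touching the live server. A minimal sketch in Python (standard library only; the `./mirror` directory name and the title-extraction task are just illustrative assumptions, not anything HTTrack itself provides) that walks a mirrored tree and pulls the `<title>` out of each page:

```python
import os
from html.parser import HTMLParser

class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data

def scrape_mirror(root):
    """Walk a local mirror and map each HTML file to its page title."""
    results = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.endswith((".html", ".htm")):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="replace") as f:
                    parser = TitleParser()
                    parser.feed(f.read())
                    results[path] = parser.title.strip()
    return results

if __name__ == "__main__":
    # "./mirror" is a hypothetical path where HTTrack saved the site
    for path, title in scrape_mirror("./mirror").items():
        print(path, "->", title)
```

Swap the `TitleParser` logic for whatever data you actually want; the point is that every test run hits the disk, not the site.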
There are a couple of issues with applications such as HTTrack, whether WinHTTrack or WebHTTrack (the Linux/Unix version). First, they will download the entire site, not just the pages you need (ways around this to come later on). Second, they will struggle with sites that don't have standard navigation routes, such as search-driven sites, AJAX/JS navigation, etc. There is an option for inputting login details (username and password), so you can download membership-based sites.
HTTrack is a multi-threaded, system-hogging machine when it wants to be, but as such it doesn't mess around in doing what it is meant to do. I really like it and find that its shortcomings are far outweighed by its positives. The best thing about it is that it is free, so anyone can check it out.