Basics
Wget is one of the most powerful command-line tools for downloading files from the internet. You can do a lot with it; the most basic use is downloading a single file.
To download a file, just type:
wget http://your-url-to/file
But this alone cannot resume a broken download. Use the -c option to make downloads resumable:
wget -c http://your-link-to/file
You can also mask the program as a web browser using the -U option.
This helps when a site doesn't allow download managers.
wget -c -U Mozilla http://your-link-to/file
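Putting -c and -U together, a small retry loop can keep going at a flaky connection until the file is complete (a sketch; the URL and the User-Agent string are placeholders):

```shell
#!/bin/sh
# Retry a resumable download until wget exits successfully.
# The URL and User-Agent string below are illustrative placeholders.
url="http://your-link-to/file"
until wget -c -U "Mozilla/5.0" "$url"; do
    echo "download interrupted, retrying in 10s..." >&2
    sleep 10
done
```

wget's own --tries=0 --retry-connrefused options achieve much the same thing without the shell loop.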
Download Entire Website
You can download an entire website using the -r (recursive) option.
wget -r http://your-site.com
But be careful: it downloads the entire website for you. Since recursive retrieval can put a heavy load on servers, wget obeys robots.txt by default. You can mirror a site onto your local drive using the -m option.
wget -m http://your-site.com
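For a mirror you actually want to browse offline, -m is usually combined with a few more options (a sketch; the URL is a placeholder):

```shell
# -m   mirror (recursion plus timestamping)
# -k   convert links so the local copy is browsable offline
# -p   also fetch the images/CSS needed to render each page
# -E   save HTML pages with an .html extension
wget -m -k -p -E http://your-site.com
```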
You can limit how many levels deep wget digs into the site using the -l option.
wget -r -l3 http://your-site.com
This will download only up to 3 levels. If you want to download only a subfolder of a website, use the --no-parent option. With it, wget downloads only that folder and its subfolders and ignores the parent folders.
wget -r --no-parent http://your-site.com/subfldr/subfolder
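If you don't want the saved subfolder buried under the site's full directory tree, -nH and --cut-dirs flatten the paths (a sketch; the URL is a placeholder):

```shell
# --no-parent   stay inside the given folder
# -nH           don't create a your-site.com/ host directory
# --cut-dirs=1  drop the leading subfldr/ path component
wget -r --no-parent -nH --cut-dirs=1 http://your-site.com/subfldr/subfolder/
```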
Now coming to the terrible ideas... to hell with webmasters who don't allow you to download their website: type this to ignore robots.txt.
wget -r -U Mozilla -e robots=off http://url-to-site/
p.s. Masquerading as a browser may be against the terms of service, or even the law, in some places... or something like that, I have heard on the net.
Fooling the Webmasters
Do you think the webmaster can't stop you after the above command? To slip under the radar, use
wget -r -U Mozilla -e robots=off -w 5 --limit-rate=20k http://url-to-site/
Here -w 5 instructs wget to wait 5 seconds before downloading the next file, and --limit-rate=20k caps the download speed at 20 KB/s (note the k suffix; a bare 20 would mean 20 bytes per second). So you can fool the webmaster...
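wget can also vary the delay so the requests look less mechanical; --random-wait multiplies the -w value by a random factor (a sketch; the URL is a placeholder):

```shell
# -w 5 --random-wait  wait a randomized multiple of 5 seconds between requests
# --limit-rate=20k    cap bandwidth at 20 KB/s
wget -r -U Mozilla -e robots=off -w 5 --random-wait --limit-rate=20k http://url-to-site/
```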
Download all PDFs
You can download all files of a particular format, such as all the PDFs linked from a webpage:
wget -r -l1 -A pdf --no-parent http://url-to-webpage-with-pdfs/
This is especially useful for students: when they find a professor's webpage full of lecture notes, this command grabs all the PDFs at once.
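To end up with a single flat folder of PDFs instead of a directory tree, add -nd (a sketch; the URL is a placeholder):

```shell
# -r -l1      follow links one level deep
# -A pdf      accept only files ending in .pdf
# -nd         save everything into the current directory, no subfolders
wget -r -l1 -A pdf -nd --no-parent http://url-to-webpage-with-pdfs/
```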
Visit the wget man page for more details. Wget is also available for Windows, with all the same powerful features; get it here
More On Website Mirroring
To download URLs containing UTF-8 characters, use the --restrict-file-names=nocontrol option with wget.
Files will then be saved with their correct names on your filesystem.
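For example (a sketch; the URL and filename are placeholders):

```shell
# Without nocontrol, non-ASCII bytes in the URL get %-escaped in the
# saved filename; with it, UTF-8 names are written through as-is.
wget --restrict-file-names=nocontrol http://your-site.com/文档.pdf
```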