Monday, December 20, 2010

Windows wget

SkyHi @ Monday, December 20, 2010

wget is a command-line download utility that ships with most Linux distributions and can be downloaded for Windows (see also GNU WGet for Windows, which covers Windows 7, Vista, XP, etc.). wget handles many download situations, including large files, recursive downloads, non-interactive downloads, and multiple-file downloads.



Note: options ARE case sensitive (for example, -O and -o do different things).



1. Download a single file with wget using no options.

wget http://ftp.gnu.org/gnu/wget/wget-latest.tar.gz
While downloading, wget will display a progress bar with the following information:

  • % of download completion
  • Download progress in bytes
  • Current download speed
  • Estimated time remaining
[Screenshot: download in progress]

[Screenshot: completed download]

2. Download a file and save it under a different name using wget -O

wget http://www.vim.org/scripts/download_script.php?src_id=7701
Without the -O switch, the file is saved under the name download_script.php?src_id=7701, even though it is a zip archive. (Windows builds of wget replace the ? with @ by default, because ? is not allowed in Windows file names.)

To change this behavior, specify the output file name using the -O option.

wget -O taglist.zip http://www.vim.org/scripts/download_script.php?src_id=7701
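In a POSIX-style shell (Linux, or Cygwin/Git Bash on Windows), characters such as ? and & in a URL have a special meaning, so it is safer to quote the URL. The sketch below is only an illustration under that assumption: default_name and fetch_as are hypothetical helper names, not part of wget. default_name prints the last path component of a URL, which is roughly what wget uses as the default file name for a simple URL like this one, and fetch_as wraps the wget -O call with the URL quoted.

```shell
# Hypothetical helper: print the last path component of a URL, which is
# roughly the default file name wget picks for a simple URL.
default_name() {
    printf '%s\n' "${1##*/}"
}

# Hypothetical helper: download the URL in $2 and save it as the file in $1.
# The double quotes keep the shell from treating ? or & in the URL specially.
fetch_as() {
    wget -O "$1" "$2"
}

# Show the name that would be used without -O.
default_name 'http://www.vim.org/scripts/download_script.php?src_id=7701'
```

Calling fetch_as taglist.zip 'http://www.vim.org/scripts/download_script.php?src_id=7701' would then run the same download as the command above, saving the result as taglist.zip.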
3. Limit the download speed / download rate using wget --limit-rate

By default, wget tries to use all available bandwidth. You can cap the download speed with the --limit-rate switch; the example below limits it to 200 kilobytes per second.

wget --limit-rate=200k http://ftp.gnu.org/gnu/wget/wget-latest.tar.gz
4. Resume a download that stopped partway through using wget -c.

wget -c http://ftp.gnu.org/gnu/wget/wget-latest.tar.gz
5. Download in the background with wget -b

wget -b http://ftp.gnu.org/gnu/wget/wget-latest.tar.gz

The download starts and the shell prompt returns immediately. wget writes its output to wget-log in the current directory, so you can check the status of the download at any time using tail -f (Linux only).

tail -f wget-log
6. Mask the user agent so that wget appears to be a browser, using wget --user-agent

Some websites refuse to serve their pages when they detect that the user agent is not a browser. You can mask the user agent with the --user-agent option so that wget identifies itself as a browser.

wget --user-agent="Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.3) Gecko/2008092416 Firefox/3.0.3" http://ftp.gnu.org/gnu/wget/wget-latest.tar.gz
7. Test a URL using wget --spider. This checks that the file exists without performing the download.

wget --spider http://ftp.gnu.org/gnu/wget/wget-latest.tar.gz
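Because wget exits with status 0 on success and a non-zero status on failure, --spider can also be used in a script to test a URL before acting on it. A minimal sketch, assuming a POSIX-style shell; url_exists and check_url are hypothetical helper names, and -q (quiet) is added only to suppress wget's own output:

```shell
# Hypothetical helper: succeed if wget --spider reports the URL as reachable.
url_exists() {
    wget -q --spider "$1"
}

# Hypothetical helper: print a one-line report for a URL without downloading it.
check_url() {
    if url_exists "$1"; then
        echo "ok: $1"
    else
        echo "failed: $1"
    fi
}
```

For example, check_url http://ftp.gnu.org/gnu/wget/wget-latest.tar.gz prints an "ok:" line if the file can be reached. A "failed:" line can mean either a missing file or a network problem.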

8. Increase the total number of retry attempts using wget --tries (the default is 20).

wget --tries=75 http://ftp.gnu.org/gnu/wget/wget-latest.tar.gz
9. Download multiple files / URLs using wget -i



First, store the URLs to download in a text file (here download-file-list.txt), one per line:

URL1
URL2
URL3
URL4

Next, pass download-file-list.txt to wget using the -i option.

wget -i download-file-list.txt
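The list file can also be created from the shell. A minimal sketch, assuming a POSIX-style shell, that writes the two URLs used earlier in this post into download-file-list.txt, one per line, and then counts the entries:

```shell
# Write one URL per line. Quoting 'EOF' stops the shell from
# expanding anything inside the list.
cat > download-file-list.txt <<'EOF'
http://ftp.gnu.org/gnu/wget/wget-latest.tar.gz
http://www.vim.org/scripts/download_script.php?src_id=7701
EOF

# Count the non-empty lines in the list.
grep -c . download-file-list.txt
```

The file is then ready to be passed to wget -i as shown above.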
10. Download a full website using wget --mirror

wget --mirror -p --convert-links -P ./LOCAL-DIR WEBSITE-URL
  • --mirror: turn on the options suitable for mirroring (recursion and time-stamping)
  • -p: download all files that are necessary to properly display a given HTML page
  • --convert-links: after the download, convert the links in the documents for local viewing
  • -P ./LOCAL-DIR: save all the files and directories to the specified directory
11. Skip certain file types during a recursive download using wget --reject. To download all content except .gif images, use the following.

wget -r --reject=gif WEBSITE-TO-BE-DOWNLOADED
12. Log messages to a log file instead of stderr using wget -o. This redirects wget's output to a log file instead of the terminal.

wget -o download.log DOWNLOAD-URL
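13. Stop downloading once a size quota is exceeded using wget -Q. The example below sets a quota of 5 MB. The quota never affects a single file download; it applies to recursive downloads and to URL lists read with -i.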

wget -Q5m -i FILE-WHICH-HAS-URLS
14. Download only certain file types using wget -r -A



You can use this in situations such as:

  • Download all images from a website
  • Download all videos from a website
  • Download all PDF files from a website
wget -r -A.pdf http://url-to-webpage-with-pdfs/
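-A takes a comma-separated list of file name suffixes, so several file types can be fetched in one run, which is useful for the "all images" case. A minimal sketch, assuming a POSIX-style shell; fetch_types is a hypothetical helper name and the suffix list in the usage line is only an example:

```shell
# Hypothetical helper: recursively download only files whose names end in
# one of the comma-separated suffixes in $1, starting from the URL in $2.
fetch_types() {
    wget -r -A "$1" "$2"
}
```

For example, fetch_types jpg,jpeg,png,gif WEBSITE-URL would restrict a recursive download to those image types.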
15. You can use wget to perform FTP downloads.

wget ftp-url
For an FTP download with username and password authentication:

wget --ftp-user=USERNAME --ftp-password=PASSWORD DOWNLOAD-URL
Note: a username and password can also be supplied for HTTP and HTTPS downloads, using --http-user=USER and --http-password=PASS respectively.

REFERENCES
http://www.powercram.com/2010/01/how-to-use-wget-includes-several.html