Web Mirroring

In many situations, you can recursively download the contents of a website using the wget utility.

To do a straightforward mirror of a site:

  • wget -m http:\\

long version:

  • wget --mirror http:\\

This is actually the same as specifying:

  • wget -r -l inf -nr -N http:\\

long version:

  • wget --recursive --level=inf --dont-remove-listing --timestamping http:\\

To convert the links for off-line reading and give all HTML files an .html extension:

  • wget --recursive --level=1 --dont-remove-listing --timestamping --convert-links --html-extension
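
The off-line reading command can be wrapped in a small shell function so that the depth and the site are easy to change. This is only a sketch: the function name mirror_offline, the WGET override variable and the www.example.com address are illustrative assumptions, not part of wget. It uses a subset of the options above and leaves out the listing option. Newer wget releases also accept --adjust-extension as the current name for --html-extension.

```shell
# mirror_offline DEPTH URL
# Hypothetical helper: fetch URL recursively to DEPTH levels and
# rewrite links so the copy can be read off-line.
# Set WGET=echo to preview the arguments instead of downloading.
mirror_offline() {
    "${WGET:-wget}" --recursive --level="$1" --timestamping \
        --convert-links --html-extension "$2"
}
```

For example, WGET=echo mirror_offline 2 http://www.example.com/ prints the arguments that would be passed to wget without fetching anything.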

Other common options or variants

-l depth
  Specifies the maximum depth level for recursion.

-L
  Follows relative links only.

man wget will tell you more.

Proxy servers

Set the http_proxy environment variable to the URL of your proxy server.

E.g. in the bash shell:

  • export http_proxy
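
As a fuller sketch of the same step, the variable can be given a value and then exported. The proxy host and port here (proxy.example.com:8080) are placeholders assumed for illustration; substitute the address of your own proxy.

```shell
# Hypothetical proxy address; replace with your own host and port.
http_proxy="http://proxy.example.com:8080/"
export http_proxy

# Show the value that wget will pick up from the environment.
echo "http_proxy is set to: $http_proxy"
```

Any wget command started from the same shell afterwards should then go through the proxy.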

If your proxy server requires a username and password, use the --proxy-user and --proxy-passwd options; see the manual for details.


  • wget --verbose --timeout=30 --mirror --proxy-user=myuser --proxy-passwd=mypassword

More info at


  • wget --mirror

-- Frank Dean 6 Dec 2002