<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
<title>Snipplr - noah</title>
<link>http://snipplr.com/users/noah/tags/download</link>
<description>Recent snippets posted on Snipplr.com</description>
<language>en-us</language>
<pubDate>Wed, 15 Feb 2012 13:14:57 GMT</pubDate>
<item>
<title>(Bash) Download an entire site with wget -pkr</title>
<link>http://snipplr.com/view/5094/download-an-entire-site-with-wget-pkr/</link>
<description><![CDATA[ <p>## Where to Get Even More WGet Hacks
See also these killer `wget` [hacks by Jeff Veen.](http://www.veen.com/jeff/archives/000573.html)

## The WGet Hacks
Here are a couple of recipes to **download and archive an entire Web site,** starting with the given page and recursing down 1 level. To change how many levels deep the download goes, change the number given after `-l`.
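
As a rough sketch of that kind of recipe (not necessarily the exact command from the snippet), the function below wraps the `-p`, `-k`, `-r` and `-l` flags named in the title and the text. The function name `archive_site`, the `DRY_RUN` switch and the example.com URL are placeholders I made up for illustration.

```shell
# Sketch only: archive_site, DRY_RUN and the URL are illustrative placeholders.

# Archive a page plus the pages it links to.
#   $1  start URL
#   $2  recursion depth (defaults to 1)
archive_site() {
    url="$1"
    depth="${2:-1}"
    # -p  also fetch page requisites (images, stylesheets)
    # -k  convert links in the saved pages so they work locally
    # -r  recurse into linked pages
    # -l  stop recursing after this many levels
    set -- -p -k -r -l "$depth" "$url"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "wget $*"    # show the command without touching the network
    else
        wget "$@"
    fi
}

# Preview the command; drop DRY_RUN=1 to run it for real.
DRY_RUN=1 archive_site "http://example.com/" 1
```

Previewing first is just a convenience: a recursive download can pull in a lot of files, so it helps to check the depth before letting it loose.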


## Pitfalls
As of 2008, WGet doesn't follow @import links in CSS.</p> ]]></description>
<pubDate>Sat, 16 Feb 2008 21:42:46 GMT</pubDate>
<guid>http://snipplr.com/view/5094/download-an-entire-site-with-wget-pkr/</guid>
</item>
<item>
<title>(Bash) Download linked JPEGs from a Web page, on the command line</title>
<link>http://snipplr.com/view/4063/download-linked-jpegs-from-a-web-page-on-the-command-line/</link>
<description><![CDATA[ <p>The following command will download all the files with a JPG extension that are linked from http://flickr.com.
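
A minimal sketch of how such a pipeline could look, assuming the link list comes from `lwp-request -o links` and is handed to `wget -i -`. The helper name `filter_jpegs` and the sample links are made up for illustration, and this is not necessarily the command from the snippet.

```shell
# Sketch only: filter_jpegs and the sample links are illustrative placeholders.

# Print each distinct URL-like string on stdin that ends in .jpg or .jpeg
# (any letter case).
filter_jpegs() {
    grep -Eio 'https?://[^[:space:]"]+\.jpe?g' | sort -u
}

# Offline demonstration with made-up links:
printf '%s\n' \
    'A http://example.com/photos/cat.jpg' \
    'A http://example.com/about.html' \
    'IMG http://example.com/photos/dog.JPEG' | filter_jpegs

# Against a live page (assumes lwp-request's "-o links" output and wget's "-i -"):
#   lwp-request -o links http://flickr.com | filter_jpegs | wget -i -
```

The pattern is deliberately loose: it keeps anything that looks like a URL up to a `.jpg` or `.jpeg`, so it may need tightening for pages with unusual links.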

_Requires the LWP and HTML::Tree Perl modules.  You must also have Wget installed on your system for this to work._</p> ]]></description>
<pubDate>Fri, 02 Nov 2007 22:57:45 GMT</pubDate>
<guid>http://snipplr.com/view/4063/download-linked-jpegs-from-a-web-page-on-the-command-line/</guid>
</item>
<item>
<title>(Perl) Grab linked files from a list of web pages</title>
<link>http://snipplr.com/view/3126/grab-linked-files-from-a-list-of-web-pages/</link>
<description><![CDATA[ <p>## how to use

`perl grabit.pl urls_for_download.txt`

Expects as argument the name of a file containing a newline-delimited list of URLs:

    http://example.com/coolstuff
    http://example.com/coolstuff/fun
    http://example.com/videos/explosions

When invoked, launches an interactive shell that asks what type of file should be downloaded.  Then downloads all the files that are linked from each of the listed Web pages.

Note that the location of the download folder is hard-coded to `c:/windows/desktop/grabit/` so you may want to change that before trying.

This script is also [available on Github](http://github.com/textarcana/scrapers/blob/643e6e7cb349fa94cbc3fc88e1d55c7b6a262d11/grabit.pl).

## Wait! Do you know about WGet and Curl?

This script is legacy.  People seem to like it (hey, I still use it) but today I would probably not write my own tool to download multiple files off remote sites.

Instead I would likely just use a command-line Web browser like [WGet](http://lifehacker.com/software/top/geek-to-live--mastering-wget-161202.php 'Gina Trapani of Lifehacker, on the way of the WGet ninja') or Curl. [LWP-Request would also do the trick.](http://snipplr.com/view/4063/download-linked-jpegs-from-a-web-page-on-the-command-line/)
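
For instance, a rough WGet equivalent of what the script does might look like the sketch below. The function name `grab_linked`, the `DRY_RUN` switch, the `jpg` suffix and the `./grabit` folder are placeholders for illustration, not anything taken from grabit.pl.

```shell
# Sketch only: grab_linked, DRY_RUN, the suffix and the folder are placeholders.

# Download every file with the given suffix linked from the pages in a URL list.
#   $1  file with one URL per line
#   $2  file suffix to keep, e.g. jpg
#   $3  folder to save into
grab_linked() {
    list="$1"
    suffix="$2"
    dest="$3"
    # -i       read the start URLs from the list file
    # -r -l 1  follow links one level from each listed page
    # -nd      save files flat instead of recreating the site's directories
    # -A       keep only files whose names end in the suffix
    # -P       save under the destination folder
    set -- -i "$list" -r -l 1 -nd -A "$suffix" -P "$dest"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "wget $*"    # show the command without touching the network
    else
        wget "$@"
    fi
}

# Preview the command; drop DRY_RUN=1 to run it for real.
DRY_RUN=1 grab_linked urls_for_download.txt jpg ./grabit
```

By default WGet stays on the host of each listed page, so files served from a different host would probably need `-H` added to the flags.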

## do not comment your code like this!

For a great explanation of the rather baroque commenting style I was using circa 2001, see [Steve Yegge's excellent article on code style: *Portrait of a n00b.*](http://steve-yegge.blogspot.com/2008/02/portrait-of-n00b.html)

Of course, when I sit down to write a Perl script today, I [use POD](http://snipplr.com/view/18611/perl-pod-embedded-documentation-example/) to format and publish my comments.</p> ]]></description>
<pubDate>Tue, 03 Jul 2007 22:31:30 GMT</pubDate>
<guid>http://snipplr.com/view/3126/grab-linked-files-from-a-list-of-web-pages/</guid>
</item>
<item>
<title>(Bash) wget with username and password</title>
<link>http://snipplr.com/view/2687/wget-with-username-and-password/</link>
<description><![CDATA[ <p>This is how to connect to a secure site with wget.  The Cygwin manpage is quite confusing on this issue.</p> ]]></description>
<pubDate>Tue, 22 May 2007 11:08:00 GMT</pubDate>
<guid>http://snipplr.com/view/2687/wget-with-username-and-password/</guid>
</item>
</channel>
</rss>
