Full Version: Downloading HiRISE pictures with wget
Borek
Hello,

I would like to download HiRISE pictures with wget, but somehow I cannot:

CODE
borek@nms1:/media/usbdisk/Images/HiRISE$ wget http://hiroc.lpl.arizona.edu/images/download.php?PSP_001336_1560_1.jp2
--17:03:24--  http://hiroc.lpl.arizona.edu/images/download.php?PSP_001336_1560_1.jp2
           => `download.php?PSP_001336_1560_1.jp2'
Resolving hiroc.lpl.arizona.edu... 128.196.250.179
Connecting to hiroc.lpl.arizona.edu|128.196.250.179|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/force-download]
download.php?PSP_001336_1560_1.jp2: Invalid argument

Cannot write to `download.php?PSP_001336_1560_1.jp2' (Invalid argument).


Does anyone understand what wget is complaining about (see the last line)?

Borek
um3k
I don't know anything about wget, but I would try using direct links to the images, such as the following:
CODE
http://hiroc.lpl.arizona.edu/images/PSP/PSP_001336_1560/PSP_001336_1560_1.jp2
http://hiroc.lpl.arizona.edu/images/PSP/PSP_001336_1560/PSP_001336_1560_2.jp2

Although I'm not completely sure this is allowed.
djellison
I would imagine that, for the purposes of maintaining server sanity, there's something in place to limit downloading by tools like wget; you may notice that a lot of the large images are served through a PHP link rather than a direct file link.

Doug
akuo
Wget generally cannot handle PHP, so you'd have to use the direct links.

Wget usage can be disallowed with the robots.txt on the server; it looks like there are no such limits on hiroc.
DEChengst
QUOTE (akuo @ Dec 1 2006, 07:50 PM) *
Wget usage can be disallowed with the robots.txt on the server.


Not really, as wget can be instructed to ignore robots.txt:

$ wget -erobots=off http://your.site.here
MarkL
It's a bit rude to ignore robots.txt with that wget switch, though you can. It's the kind of thing that will make the web managers take steps to prevent wget users from accessing the site. But in this case, as there is no robots.txt, there should be no harm in using wget to download the images via their direct links. Fetching them one at a time rather defeats the purpose of using wget for batch downloads; however, you could accumulate the URLs in a text file and use the -i switch, as in the sketch below.
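For example (just a sketch; the urls.txt file name is arbitrary, and the two direct links are the ones um3k posted above):
CODE
$ cat urls.txt
http://hiroc.lpl.arizona.edu/images/PSP/PSP_001336_1560/PSP_001336_1560_1.jp2
http://hiroc.lpl.arizona.edu/images/PSP/PSP_001336_1560/PSP_001336_1560_2.jp2
$ wget -i urls.txt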
GuyMac
The issue is very simple... it is the question mark (query string). Just enclose the URL in quotes, e.g.

wget "http://hiroc.lpl.arizona.edu/images/download.php?PSP_001513_1655_RED.jp2"
This is a "lo-fi" version of our main content. To view the full version with more information, formatting and images, please click here.
Invision Power Board © 2001-2024 Invision Power Services, Inc.