Looking for some picture downloaders

Weresmilodon
Fel is my Dark Lord ;)
Posts: 464
Joined: Mon Sep 22, 2003 7:19 pm

Looking for some picture downloaders

Post by Weresmilodon »

Ok, I'm looking for some picture downloaders that can download pictures either from a page with multiple links, following all the links from that page, or from subfolders. It should also be able to determine whether a picture has been downloaded before, and if so, have an option not to download it again. (Sorting through hundreds of pictures again and again is not my idea of fun.)

Additionally, it should be able to 'check' the page/subfolders at certain times to see if there is new material, preferably on some form of schedule, or with a 'check every * minutes' option. Also, it should keep to roughly the same speed as a browser, so as not to overload the site.

Anyone have any ideas for one (or several)? None of the ones I have found can check by themselves, or determine if there is new material. They just download everything when you tell them to, regardless of whether it has already been downloaded before, overwriting the earlier pictures.
"I'm a male. Males are supposed to act tough."
Tarrin, Chapter 29, The Questing Game.
Scuttle
Talent
Posts: 5
Joined: Sat Nov 13, 2004 9:31 pm

Re: Looking for some picture downloaders

Post by Scuttle »

Hm, httrack is good. I also use wget when mirroring sites; wget checks the timestamps when it retrieves files and only gets new ones.
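A minimal mirroring call might look something like this (just a sketch; the URL is a placeholder and the accepted file types are up to you):

Code: Select all

# --mirror implies recursion plus timestamping, so files already on
# disk are skipped unless the copy on the server is newer
wget --mirror --no-parent --accept=jpg,jpeg,gif,png http://example.com/gallery/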
Weresmilodon
Fel is my Dark Lord ;)
Posts: 464
Joined: Mon Sep 22, 2003 7:19 pm

Re: Looking for some picture downloaders

Post by Weresmilodon »

Thanks, I'll try looking them up.
"I'm a male. Males are supposed to act tough."
Tarrin, Chapter 29, The Questing Game.
Shadowhawk
Child of Niami
Posts: 776
Joined: Thu Jan 22, 2004 12:17 am
Location: Poland

Re: Looking for some picture downloaders

Post by Shadowhawk »

Scuttle wrote:(...) I use wget when mirroring sites; wget checks the timestamps when it retrieves files and only gets new ones
If the existing files don't change and only new files appear, wget in recursive mode (or with page requisites), with a proper depth limit, type limit, directory limit etc., run in continue mode, can probably do what you want. If the files do change, you might use time-stamping. I have even used wget successfully on sites with simple leech protection, like checking the referer or the time between successive downloads.
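For example, something along these lines (a rough sketch; the depth, file types and URL are placeholders to adapt to the site):

Code: Select all

# recurse two levels deep, keep only images, resume partial files
# instead of re-fetching, and pause a second between requests
wget --recursive --level=2 --accept=jpg,jpeg,gif,png \
     --no-parent --continue --wait=1 http://example.com/pictures/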

You might also (if you use MS Windows) try Nici (not tested; it's trial shareware)
I AM DEATH, NOT TAXES. *I* TURN UP ONLY ONCE.
(Terry Pratchett, "Feet of Clay")
afrigeek
Da'Shar
Posts: 233
Joined: Mon Nov 14, 2005 1:10 pm
Location: Uganda

Re: Looking for some picture downloaders

Post by afrigeek »

I use sitecopy and it works beautifully for me with regard to all the aforementioned features.
"Bachelors should be heavily taxed. It is not fair that some men should be happier than others." - Oscar Wilde

What actually happened was that George Dubya Bush saw an Iraqi maths teacher carrying a geometry set, accused him of being a member of the notorious Al-gebra movement, and charged him with possessing weapons of maths instruction.
Weresmilodon
Fel is my Dark Lord ;)
Posts: 464
Joined: Mon Sep 22, 2003 7:19 pm

Re: Looking for some picture downloaders

Post by Weresmilodon »

Tried several of them, but I’m having trouble. Perhaps I should explain, and see if there's been a misunderstanding, or if some of you can help me get it right.

Let's pull up 4chan as an example. Let's say I wanted all the pictures in the /wg (Wallpapers/General) board. The link there is http://cgi.4chan.org/wg/imgboard.html . Now, I want the program to get all the pictures in /wg, keep checking if there are new ones, and download those as well. This causes several problems for me; the biggest one is actually getting all the pictures. It needs to go into the threads, check all the replies and all the pages, and get everything that way. At the same time, I don't want it to download everything again and again just because I deleted the junk. That's why I mentioned checking the subfolders. If it monitors http://cgi.4chan.org/wg/ and downloads the files that are added, but doesn't touch the files already downloaded, then that's all I need.

It should also be able to monitor several addresses at the same time, and if possible, download them into different folders.

Now, I tried wget, got wgetGUI, and worked on that for some time, but I couldn't get anything. And I mean nothing with that; I couldn't even connect to anything. Then I tried Nici. It seems closer to what I want, but I can't really get anything with that one either. It seems like I have no real control over what it gets, and if I want something, I have to monitor it all the time and add everything it should get manually. Then it's easier to just go to the page and use Linky to open all images in tabs, then Magpie to save all the images to a folder. That wastes a lot of time, which is why I'm looking for this kind of program in the first place, so I don't have to do that.

Anyway, I appreciate any help you can give me, and thanks for the help that’s already been given.
"I'm a male. Males are supposed to act tough."
Tarrin, Chapter 29, The Questing Game.
Wiz
Novice
Posts: 13
Joined: Sat Oct 09, 2004 12:56 am
Location: Columbus, OH

Re: Looking for some picture downloaders

Post by Wiz »

I think that Quadsucker Web from Scott Baker may be what you need.
http://www.quadsucker.com/quadweb/
It is multithreaded and rejects duplicates.
Weresmilodon
Fel is my Dark Lord ;)
Posts: 464
Joined: Mon Sep 22, 2003 7:19 pm

Re: Looking for some picture downloaders

Post by Weresmilodon »

Ok, trying Quadsucker and httrack right now. Both seem good, but neither seems to have a scheduler. Do they automatically keep checking, or am I required to activate them every time to check for new content?
"I'm a male. Males are supposed to act tough."
Tarrin, Chapter 29, The Questing Game.
Shadowhawk
Child of Niami
Posts: 776
Joined: Thu Jan 22, 2004 12:17 am
Location: Poland

Re: Looking for some picture downloaders

Post by Shadowhawk »

Weresmilodon wrote:Now, I tried wget, got wgetGUI, and worked on that for some time, but I couldn't get anything. And I mean nothing with that; I couldn't even connect to anything.
Well, that is because this site (http://cgi.4chan.org/wg/imgboard.html) has some protection against leechers, a.k.a. web robots. I haven't tried to download the images with the page (using either the '<tt>--page-requisites</tt>' option, or limited recursive downloading with the '<tt>--recursive --level=1 --accept="html,php,gif,jpeg,jpg"</tt>' options, of course combined with '<tt>--timestamping</tt>' (download a file only if there is a newer version) and/or '<tt>--continue</tt>' (resume partial downloads instead of re-fetching them)), but the main page itself can be downloaded using wget... by pretending to be a web browser, e.g. with '<tt>--user-agent="Mozilla/5.0"</tt>'. For checking the site every so often, use a command scheduler; for example, put an appropriate wget script in cron on Linux.

It is possible that you might have to set the referer to the main page to download images from it.
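Putting it together, the whole call might look like this (untested against that particular site, so treat it as a sketch; the script path in the cron line is a placeholder):

Code: Select all

# fetch the board page plus whatever it links to one level down,
# pretending to be a browser and sending the board as the referer
wget --recursive --level=1 --accept="html,php,gif,jpeg,jpg" \
     --timestamping \
     --user-agent="Mozilla/5.0" \
     --referer="http://cgi.4chan.org/wg/imgboard.html" \
     http://cgi.4chan.org/wg/imgboard.html

# example crontab entry: run the above (saved as a script) hourly
0 * * * * /home/user/bin/grab-wg.sh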
I AM DEATH, NOT TAXES. *I* TURN UP ONLY ONCE.
(Terry Pratchett, "Feet of Clay")
Weresmilodon
Fel is my Dark Lord ;)
Posts: 464
Joined: Mon Sep 22, 2003 7:19 pm

Re: Looking for some picture downloaders

Post by Weresmilodon »

Yeah, well, there's a reason for me getting wgetGUI: I can't really work with command lines, so I don't have much idea of what you said there. I did try the pretend-to-be-a-browser option, and also the ignore-robots.txt option, but neither one helped. I didn't see any way to set a referer in the GUI either, so I have no idea how to do that.

Moreover, is it possible for wget to get everything, and then keep checking for new stuff? That's the function I'm really looking for, because a few of the places I want to monitor delete everything about 1-2 times per day. I don't have a chance of catching it all myself, so I really need the program to manage itself.
"I'm a male. Males are supposed to act tough."
Tarrin, Chapter 29, The Questing Game.
Wiz
Novice
Posts: 13
Joined: Sat Oct 09, 2004 12:56 am
Location: Columbus, OH

Re: Looking for some picture downloaders

Post by Wiz »

I read through the user manual (http://www.quadsucker.com/quadweb/quadweb.html) and it looks like Quadsucker will not automatically check for new content. You may be able to use a tool like the Windows Task Scheduler, but I have never tried it.
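If you want to experiment, Windows XP ships with a schtasks command that can create such a task from the command prompt (untested with Quadsucker; the install path is just a guess):

Code: Select all

schtasks /create /tn "Quadsucker" /sc hourly /tr "C:\Program Files\QuadSucker\quadweb.exe"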
Weresmilodon
Fel is my Dark Lord ;)
Posts: 464
Joined: Mon Sep 22, 2003 7:19 pm

Re: Looking for some picture downloaders

Post by Weresmilodon »

Yeah, I came to the same conclusion once it had run. I doubt the Task Scheduler will work, as you have to press a button to start it all, provided it even saves the settings. I couldn't find any command-line options either, so it's pretty much a bust. I read something about a more advanced version (Supersucker or somesuch) that I will check into next. It might be the same thing except bigger, but it's worth a try.

Edit: Never mind. Ultrasucker is the same, except it shows no thumbnails and instead opens 12 connections rather than 4.
"I'm a male. Males are supposed to act tough."
Tarrin, Chapter 29, The Questing Game.
Shadowhawk
Child of Niami
Posts: 776
Joined: Thu Jan 22, 2004 12:17 am
Location: Poland

Re: Looking for some picture downloaders

Post by Shadowhawk »

Weresmilodon wrote:Yeah, well, there's a reason for me getting wgetGUI: I can't really work with command lines, so I don't have much idea of what you said there. I did try the pretend-to-be-a-browser option, and also the ignore-robots.txt option, but neither one helped. I didn't see any way to set a referer in the GUI either, so I have no idea how to do that.

Moreover, is it possible for wget to get everything, and then keep checking for new stuff? That's the function I'm really looking for, because a few of the places I want to monitor delete everything about 1-2 times per day. I don't have a chance of catching it all myself, so I really need the program to manage itself.
First, you can put any command-line option in the Additional Parameters field in wgetGUI (or use the I am a pro button to modify the parameters passed to wget). Second, I don't know why the Identify as browser option doesn't work for you... Do you have a problem with downloading the main page, or only with the links (images)?

I haven't used wget for mirroring, so the answers might be wrong, but I think that between Timestamping (download only if the file is newer or does not exist), No clobber, and Continue file download, you should get what you want.
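For reference, those checkboxes correspond to wget's '<tt>--timestamping</tt>', '<tt>--no-clobber</tt>' and '<tt>--continue</tt>' switches. One caveat: wget refuses to combine '<tt>--timestamping</tt>' with '<tt>--no-clobber</tt>', so pick one or the other (a sketch with a placeholder URL):

Code: Select all

# re-download only files that are new or changed on the server
wget --recursive --level=1 --timestamping http://example.com/pictures/

# or: never touch files that already exist locally
wget --recursive --level=1 --no-clobber http://example.com/pictures/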
I AM DEATH, NOT TAXES. *I* TURN UP ONLY ONCE.
(Terry Pratchett, "Feet of Clay")
Weresmilodon
Fel is my Dark Lord ;)
Posts: 464
Joined: Mon Sep 22, 2003 7:19 pm

Re: Looking for some picture downloaders

Post by Weresmilodon »

Shadowhawk wrote:I haven't used wget for mirroring, so the answers might be wrong, but I think that between Timestamping (download only if the file is newer or does not exist), No clobber, and Continue file download, you should get what you want.
Except that I have to start it myself. And that's the problem. I need something that checks for new stuff at least every hour, without me having to tell it to.
"I'm a male. Males are supposed to act tough."
Tarrin, Chapter 29, The Questing Game.
Shadowhawk
Child of Niami
Posts: 776
Joined: Thu Jan 22, 2004 12:17 am
Location: Poland

Re: Looking for some picture downloaders

Post by Shadowhawk »

Weresmilodon wrote:Except that I have to start it myself. And that's the problem. I need something that checks for new stuff at least every hour, without me having to tell it to.
Can't you use the *.bat file generated by wgetGUI in Task Scheduler?

Or, equivalently, make a batch file/script/program which calls wget (or the batch file generated by wgetGUI) every hour, e.g. in the bash shell:

Code: Select all

while ["1" = "1"]; do
   wget [parameters]
   sleep 3600 # 3600 seconds = 1 hour
done
There is probably an equivalent in MS Windows batch file syntax... if there is an equivalent of the <tt>sleep</tt> command (<tt>delay</tt>, perhaps?).
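If there is no real sleep, a common workaround is to abuse <tt>ping</tt> as a delay (a rough sketch; each extra ping burns roughly one second):

Code: Select all

:: loop.bat -- call wget every hour; ping supplies the ~3600 s pause
:loop
wget [parameters]
ping -n 3601 127.0.0.1 > nul
goto loop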
I AM DEATH, NOT TAXES. *I* TURN UP ONLY ONCE.
(Terry Pratchett, "Feet of Clay")