RE: Wget: Adding a prefix to downloaded files?
From: michel . kempeneers
Subject: RE: Wget: Adding a prefix to downloaded files?
Date: Tue, 17 Dec 2019 12:56:18 +0100 (CET)
Hi Tim,
It seems completely logical that Wget --- or any application for that matter
--- works through an input list sequentially.
But the resulting order might depend upon whether Wget only handles a single
file at a time, or whether it is capable of processing several files in
parallel.
I suppose the answer is a single file only, as I cannot find anything about
parallel processing in the Manual.
But I wouldn't put money on it.
On the other hand, I may have been tricked by the settings of Windows Explorer
when wondering if the file size had an impact.
Indeed, when I try to double-check this behavior, the downloads turn out to
finish too quickly to confirm anything of the kind visually!
Also, you are right in pointing out that in fact the target directory is
"ruled" by local settings (e.g. a folder in Windows Explorer), including the
sort order, which can have a confusing effect.
Some further testing taught me that in this particular case I also needed to
change the time switch for DOS' DIR command. Indeed,
DIR /O:D /T:C
sorts files by D(ate), and uses the C(reation date) to do so.
(The defaults are W (= last written) for the date, and
"sort-of-alphabetically" if no O(rdering) switch is applied. See:
https://ss64.com/nt/dir.html
https://devblogs.microsoft.com/oldnewthing/20140304-00/?p=1603 )
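For anyone who wants to check the timestamps independently of Explorer, the DIR listing above can be approximated in a few lines of Python. This is only a sketch: list_sorted is a made-up helper name, and the "created" field relies on st_birthtime where Python exposes it, falling back to st_ctime (which is the creation time on Windows but only the metadata-change time on most Unix systems).

```python
import os
from pathlib import Path


def list_sorted(directory, field="created"):
    """Return the file names in `directory`, oldest first, by the chosen timestamp.

    field="created" roughly mirrors DIR /O:D /T:C,
    field="written" roughly mirrors DIR /O:D /T:W.
    """
    def timestamp(path):
        st = path.stat()
        if field == "written":
            return st.st_mtime  # last-written time
        # Assumption: st_birthtime if available, otherwise st_ctime
        # (creation time on Windows, metadata-change time on most Unix systems).
        return getattr(st, "st_birthtime", st.st_ctime)

    files = [p for p in Path(directory).iterdir() if p.is_file()]
    # Ties on the timestamp are broken by name so the result is deterministic.
    return [p.name for p in sorted(files, key=lambda p: (timestamp(p), p.name))]
```

Calling list_sorted(r".\Images", "created") and list_sorted(r".\Images", "written") side by side shows quickly whether the two timestamps give different orders for the same folder.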
Windows Explorer offers many more possibilities apart from its default values
("Date Created" and/or "Date Modified", I'm not really sure).
See the following screen shot (if that's of any use; I'm not sure if this forum
persists them):
The problem being that Windows Explorer itself does not explain what they
mean... So in a sense they are useless.
That's not an idle remark: the default "Date created" in Windows Explorer
does NOT give the same output as the (apparent) DOS equivalent!
The same goes for the other date types proposed by Windows Explorer: none of
them matches the output of the above DIR command...
("Date acquired", "Date archived", "Date completed", "Date received", "Date
released", and "Date sent" are even empty)
Typical MS clumsiness, I guess.
If you want an example of the mess MS keeps making of Date/Time fields, have a
look here:
https://superuser.com/questions/147525/what-is-the-date-column-in-windows-7-explorer-it-matches-no-date-column-from
Apparently, their meaning changes between versions (Win7 or Win10), and even
among Win10 releases... Go figure!
Nevertheless, thx to your feedback I've been able to confirm that indeed, this
is not a Wget issue.
I suppose I can use this info to work around Wget's missing option for a
prefix/counter (which remains the bottom line and triggered this question in
the first place).
PS:
The workaround you suggest is of the same type as the other ones mentioned
before.
Yes, it could be done by calling Wget as often as there are images to
download, and (externally) adding a prefix (counter) for every single download.
But any such workaround would miss out on the efficiency of feeding Wget with a
plain input txt file.
And I can only repeat that such a feature could add some power to Wget, as it
would avoid cumbersome workarounds.
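For reference, the "one Wget call per URL plus an external counter" workaround could look roughly like the Python sketch below. The helper names (numbered_targets, read_urls, download_all) and the three-digit counter width are assumptions made for illustration; only -nv, -a and -O are actual Wget options. It also shows the cost mentioned above: Wget is started once per line of input.txt.

```python
import os
import posixpath
import subprocess
from urllib.parse import unquote, urlsplit


def numbered_targets(urls, width=3):
    """Pair each URL with a counter-prefixed local file name, in list order."""
    targets = []
    for index, url in enumerate(urls, start=1):
        # Last path component of the URL; the query string is ignored.
        name = posixpath.basename(unquote(urlsplit(url).path)) or "index.html"
        targets.append((url, f"{index:0{width}d}_{name}"))
    return targets


def read_urls(list_file):
    """Read the non-empty lines of a plain URL list such as input.txt."""
    with open(list_file, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def download_all(list_file, out_dir, log_file="log.txt"):
    """Call wget once per URL, writing each file under its prefixed name."""
    os.makedirs(out_dir, exist_ok=True)
    for url, name in numbered_targets(read_urls(list_file)):
        subprocess.run(
            ["wget", "-nv", "-a", log_file, "-O", os.path.join(out_dir, name), url],
            check=True,  # stop at the first failed download
        )
```

With this, download_all("input.txt", "Images") would leave files named 001_..., 002_... and so on, so that a plain sort by name reproduces the order of the input file.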
Thx again for all the feedback received,
MK
From: "Tim Rühsen" <address@hidden>
To: "Michel Kempeneers" <address@hidden>, "bug-wget" <address@hidden>
Sent: Friday 13 December 2019 15:39:24
Subject: Re: Wget: Adding a prefix to downloaded files?
On 12/12/19 1:25 PM, address@hidden wrote:
> Hi,
> I run into a particular problem when I'm trying to download a bunch of URLs I
> grouped together in file "input.txt" like this:
> wget -nv -a log.txt -P .\Images\ -i input.txt
> Some of these files are huge, hence take a long time to download.
> As a consequence, they will not appear in the same sorting order in the
> download folder as in the input file, and that's a problem, as this order has
> its importance.
Since wget works sequentially, why do you think the order of downloads
has something to do with the file size?
If 'Images' is a fresh and empty directory *and* all files download OK,
the order in the directory is the same as the order in input.txt. At
least a sane file system should keep the order (is NTFS sane?).
Then, what is irritating: 'dir' or 'ls' tools like to use a certain sort
order by default. E.g. here on GNU/Linux 'ls' orders the output files
alphabetically by name. 'ls -rc' prints in reverse order of change time
(ctime; oldest first, then newer files), which seems to be what you want.
In short, wget likely is not your problem. Find out what it really is
and you can find a mitigation.
As a 'dumb' work-around, save your files into a temp directory, then
move them to Images\ in the order of occurrence in input.txt.
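A rough Python sketch of that work-around follows. The function name move_in_list_order is made up for illustration, and it assumes each file was saved under the last path component of its URL. Whether the destination listing then reflects the move order still depends on the file system and on the sort order of the listing tool.

```python
import posixpath
import shutil
from pathlib import Path
from urllib.parse import unquote, urlsplit


def move_in_list_order(urls, temp_dir, dest_dir):
    """Move downloaded files from temp_dir to dest_dir, following the URL order.

    Returns the names that were moved, in the order they were moved.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    for url in urls:
        # Assumption: the local name is the last path component of the URL.
        name = posixpath.basename(unquote(urlsplit(url).path))
        source = Path(temp_dir) / name
        if name and source.is_file():  # skip URLs whose download is missing
            shutil.move(str(source), str(dest / name))
            moved.append(name)
    return moved
```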
Regards, Tim