coreutils

Re: 'date' enhancement: flush in format string


From: Eric Blake
Subject: Re: 'date' enhancement: flush in format string
Date: Fri, 12 Aug 2011 11:53:32 -0600
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.18) Gecko/20110621 Fedora/3.1.11-1.fc14 Lightning/1.0b3pre Mnenhy/0.8.3 Thunderbird/3.1.11

Re-adding the list, for closure.

On 08/12/2011 11:04 AM, Darko Veberic wrote:
dear eric,

thanks for the prompt answer!

On 08/12/2011 12:20 AM, Eric Blake wrote:
No. Format symbols are scarce, and %f is already taken.

ok. at some point i guess one-letter ones will run out... what about
%{flush} ...? ;)

Adding a new % has to be done at the strftime() wrapper level, so it won't happen (there is no single character 'flush' that you can output from strftime).


One idea is to change date to always flush after every line. But I don't
know if this would hurt performance when the flushing is not needed, and

probably. i am not advocating this. it should not be automatic. user
should request it either with a format symbol or command-line switch.

If it has to be conditional, then it has to be a separate command line option, and not part of the format string.

nevertheless, your "stdbuf" suggestion below is more of a clean solution...

you'd have the same problem as with your %f proposal - namely, it would
take a while for the code to propagate onto your machine.

Another idea is to use something that already exists. date uses stdio,
so the stdbuf utility (also part of coreutils) can be used to force date
into line-buffered mode. So, try "stdbuf -o L date -u -f- +%s" as your
command instead of raw date.

elegant solution, indeed. now awk runs much faster with a co-process.

Glad to hear it!  And best of all, no new coreutils code needed.
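
For anyone who wants to see the line-buffering effect outside of awk, here is a minimal sketch of the same round trip in plain bash. It assumes bash 4 or later (for coproc) and GNU coreutils; the names DATE and to_epoch_coproc are made up for illustration. Without stdbuf, the read would likely just sit there until date's output buffer fills or date exits.

```shell
# Round-trip one line at a time through a single long-lived date process.
# Assumes bash 4+ (coproc) and GNU coreutils (date -f, stdbuf).
coproc DATE { stdbuf -o L date -u -f - +%s; }

to_epoch_coproc() {
    local secs
    # Send one timestamp; line buffering makes date answer right away.
    printf '%s\n' "$1" >&"${DATE[1]}"
    # Give up after 5 seconds rather than hang if the output stays buffered.
    read -r -t 5 secs <&"${DATE[0]}" && printf '%s\n' "$secs"
}

to_epoch_coproc "Fri Aug 12 16:24:30 2011"
```

On a GNU system this should print 1313166270. The date process sees EOF on its stdin and exits when the shell does.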


spawning date is slow:

time (for ((i=0; i<10000; ++i)) ; do date -u | sed 's/UTC //'; done >foo)
real 0m28.325s

awk with date spawned for each line:

time (awk '{d="date -u -d \""$0"\" +%s"; d | getline t; close(d); print t}' foo >/dev/null)
real 0m11.316s

awk with co-process "date" (~80x faster on larger sample):

time (awk 'BEGIN{d="stdbuf -o L date -u -f- +%s"}{print $0 |& d; d |& getline t; print t}END{close(d)}' foo >/dev/null)
real 0m0.198s

date's file transform (flush overhead?):

Yes, line buffering is more expensive than full buffering; it's the price you pay to avoid deadlock.


time (stdbuf -o L date -u -f foo +%s >/dev/null)
real 0m0.062s

time (date -u -f foo +%s >/dev/null)
real 0m0.055s

Still a nice speedup compared to multiple processes.

Also, how uniform is the date input already present in the input file?

it is basically the output string of "date -u | sed 's/UTC //'"...

If it is sufficiently uniform, can you just use awk's own date parsing
utilities? It comes with a few builtins that may be able to already
handle your needs without spawning external processes.

can it convert say "Fri Aug 12 16:24:30 2011" to unix second?

Yes, it should be possible with gawk (mktime() is a gawk extension and %s is not POSIX, so good luck with any other vendor's awk). Parse the date into fields, set up an array to convert 3-letter month names into month numbers, then pass the rearranged fields to strftime("%s", mktime("YYYY MM DD HH MM SS")).

It might take a multi-line function to do all that work, but it shouldn't be too hard to get working:

$ awk 'END { print strftime("%s", mktime("2011 08 12 11 50 00")) }' \
    /dev/null
1313171400
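
A rough sketch of that multi-line version follows, assuming awk is gawk (or another awk that provides mktime()) and that the input lines look like the output of "date -u | sed 's/UTC //'". The function name to_epoch is made up for illustration, and TZ=UTC is set so that mktime() treats the fields as UTC rather than local time. Since mktime() already returns seconds since the epoch, the sketch prints its result directly instead of going through strftime("%s", ...).

```shell
# Convert lines like "Fri Aug 12 16:24:30 2011" to Unix seconds inside awk,
# without spawning date. Assumes an awk with mktime(), such as gawk, and
# that the timestamps are in UTC.
to_epoch() {
    TZ=UTC awk '
    BEGIN {
        # Map 3-letter month names to month numbers.
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++) month[names[i]] = i
    }
    {
        # Fields: weekday, month, day, HH:MM:SS, year.
        split($4, hms, ":")
        # mktime() takes "YYYY MM DD HH MM SS" and returns epoch seconds.
        print mktime($5 " " month[$2] " " $3 " " hms[1] " " hms[2] " " hms[3])
    }'
}

echo "Fri Aug 12 16:24:30 2011" | to_epoch
```

With gawk this prints 1313166270, and "to_epoch <foo" would handle a whole file in one awk process.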

--
Eric Blake   address@hidden    +1-801-349-2682
Libvirt virtualization library http://libvirt.org


