Re: use of TZ by mktime()/strftime()
From: Ed Morton
Subject: Re: use of TZ by mktime()/strftime()
Date: Wed, 10 Aug 2022 14:03:42 -0500
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.1.0

On 8/10/2022 12:48 PM, Neil R. Ormos wrote:
> Ed Morton wrote:
>> Eli Zaretskii wrote:
>>> [arnold@skeeve.com wrote:]
>>>> Ed Morton wrote:
>>>>> So in the above setting TZ to EST or UTC
>>>>> worked and specifying IST at the end of the
>>>>> timestamp worked, but setting TZ to IST
>>>>> failed just like it does in gawk. Clearly I'm
>>>>> missing something...
>>>>
>>>> All of this depends on the underlying C
>>>> library. As far as I know there aren't
>>>> standardized time zone names that work the
>>>> same everywhere.
>>>
>>> Actually, there are, at least in most practical
>>> cases. But they are very few, and you cannot
>>> rely on their DST rules to be up to date with
>>> the current practices; they might on some
>>> systems still reflect the DST rules of many
>>> years ago, or even work according to the rules
>>> of another country.
>>
>> FWIW I found some information on "standard" time
>> zones: [...]
>>
>> Thanks for the feedback all, looks like gawk
>> behaves the same as date wrt TZ environment
>> values so there's no gawk issue.
>
> As you've seen, date(1) is pretty good at recognizing dates[*], including time
> zones, in arguments supplied via the -d option.
>
> I make an external call to date(1), instead of mktime(), when I can't be sure
> that the input is well-behaved. I'm sure it's more expensive than mktime(),
> but the overhead seems a tolerable price to pay when compared to the
> alternatives of parsing the date string, trying to maintain a table of time
> zone offsets, or explicitly consulting the system's time zone database.
>
> Something like this:
>
>     returncode = ( ( "date -d " datearg " +%s" ) | getline dateresult )
>
> (Simplified to show the concept. I have a wrapper function that escapes the
> arguments to date(1), checks the returncode, and calls close().)
>
> [*] At least on systems that have the GNU core utilities date.
Thanks Neil, yeah I've done the same at times for small input files but
it is an order of magnitude slower than using builtin time functions so
if/when I don't NEED to do that then I avoid it. In this case the input
looks like:
2020-12-03T12:23:34 UTC
2020-12-03T12:23:34 Z
2020-12-03T12:23:34 EST
2020-12-03T12:23:34 EDT
2020-12-03T12:23:34 BST
2020-12-03T12:23:34 IST
2020-12-03T12:23:34 +00:00
2020-12-03T12:23:34 -0400
2020-12-03T12:23:34 -0800
2020-12-03T12:23:34 +06:00
where the numeric values at the end of the last 4 lines are UTC offsets
(as `date` would interpret them) rather than timezones so I was hoping
this script would be all I needed:
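    gawk '{
        # Split the record into a mktime() datespec ("YYYY MM DD HH MM SS") and a zone.
        dt = gensub(/\s+\S+$/,"",1); gsub(/[-:T]/," ",dt)
        tz = $NF
        # POSIX TZ offsets have the opposite sign to UTC offsets, so flip it.
        if ( match(tz,/^([-+]?)([0-9]{2}):?([0-9]{2})$/,a) ) {
            tz = (a[1] == "-" ? "+" : "-") a[2] ":" a[3]
        }
        ENVIRON["TZ"] = tz          # interpret the timestamp in its own zone
        epochSecs = mktime(dt)
        ENVIRON["TZ"] = "UTC"       # format the result in UTC
        printf "%-30s -> %10s -> %s UTC\n", $0, epochSecs,
               strftime("%F %T",epochSecs)
    }' file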
    2020-12-03T12:23:34 UTC        -> 1606998214 -> 2020-12-03 12:23:34 UTC
    2020-12-03T12:23:34 Z          -> 1606998214 -> 2020-12-03 12:23:34 UTC
    2020-12-03T12:23:34 EST        -> 1607016214 -> 2020-12-03 17:23:34 UTC
    2020-12-03T12:23:34 EDT        -> 1606998214 -> 2020-12-03 12:23:34 UTC
    2020-12-03T12:23:34 BST        -> 1606998214 -> 2020-12-03 12:23:34 UTC
    2020-12-03T12:23:34 IST        -> 1606998214 -> 2020-12-03 12:23:34 UTC
    2020-12-03T12:23:34 +00:00     -> 1606998214 -> 2020-12-03 12:23:34 UTC
    2020-12-03T12:23:34 -0400      -> 1607012614 -> 2020-12-03 16:23:34 UTC
    2020-12-03T12:23:34 -0800      -> 1607027014 -> 2020-12-03 20:23:34 UTC
    2020-12-03T12:23:34 +06:00     -> 1606976614 -> 2020-12-03 06:23:34 UTC
but as you can see from the output above it doesn't recognize EDT (US
Eastern Daylight), BST (British Summer), or IST (Indian Standard) so I
settled on this instead:
    gawk 'BEGIN {
        # Map abbreviations that TZ does not resolve to a zone name or UTC offset.
        tzmap["EST"] = "US/Eastern"
        tzmap["EDT"] = "-04:00"
        tzmap["BST"] = "+01:00"
        tzmap["IST"] = "Asia/Calcutta"
    }
    {
        dt = gensub(/\s+\S+$/,"",1); gsub(/[-:T]/," ",dt)
        tz = ( $NF in tzmap ? tzmap[$NF] : $NF )
        # POSIX TZ offsets have the opposite sign to UTC offsets, so flip it.
        if ( match(tz,/^([-+]?)([0-9]{2}):?([0-9]{2})$/,a) ) {
            tz = (a[1] == "-" ? "+" : "-") a[2] ":" a[3]
        }
        ENVIRON["TZ"] = tz          # interpret the timestamp in its own zone
        epochSecs = mktime(dt)
        ENVIRON["TZ"] = "UTC"       # format the result in UTC
        printf "%-30s -> %10s -> %s UTC\n", $0, epochSecs,
               strftime("%F %T",epochSecs)
    }' file
    2020-12-03T12:23:34 UTC        -> 1606998214 -> 2020-12-03 12:23:34 UTC
    2020-12-03T12:23:34 Z          -> 1606998214 -> 2020-12-03 12:23:34 UTC
    2020-12-03T12:23:34 EST        -> 1607016214 -> 2020-12-03 17:23:34 UTC
    2020-12-03T12:23:34 EDT        -> 1607012614 -> 2020-12-03 16:23:34 UTC
    2020-12-03T12:23:34 BST        -> 1606994614 -> 2020-12-03 11:23:34 UTC
    2020-12-03T12:23:34 IST        -> 1606978414 -> 2020-12-03 06:53:34 UTC
    2020-12-03T12:23:34 +00:00     -> 1606998214 -> 2020-12-03 12:23:34 UTC
    2020-12-03T12:23:34 -0400      -> 1607012614 -> 2020-12-03 16:23:34 UTC
    2020-12-03T12:23:34 -0800      -> 1607027014 -> 2020-12-03 20:23:34 UTC
    2020-12-03T12:23:34 +06:00     -> 1606976614 -> 2020-12-03 06:23:34 UTC
which is fine for my purposes.
Thanks all who responded.
Ed.