Re: Getting GPS data into stream
From: Marcus Müller
Subject: Re: Getting GPS data into stream
Date: Thu, 4 May 2023 09:39:53 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.10.0
Hey Marcus,
as you say, for a lot of science you don't get high rates – so I'm really less worried
about that. I'm more worried about Excel interpreting some single data point as a date; or,
as soon as we involve textual data, all the fun with encodings, quoting/delimiting/escaping…
(not to mention that an Excel set to German might interpret different things as numbers
than a North American one).
I wish there was just one good CSV standard that tools adhered to. Alas, that's not the
case, and especially Excel has a habit of autoconverting input and losing data at that point.
So, I'm looking for an alternative that has well-defined constraints and that isn't
focused on hierarchical data (JSON, YAML, XML), isn't far too verbose, even if excellent to
query with command line tools (XML), and isn't completely impossible to parse correctly in
its full beauty, for human or parser (YAML)… Just some tabular data notation that's textual,
appendable, and not a guessing game for the reading tool.
We could just canonicalize calling all our files
marcusdata.utf8.textalwaysquoted.iso8601.headerspecifies_fieldname_parentheses_type.csv
but even that wouldn't solve the issue of Excel seeing an unquoted 12.2021 and deciding
the field is about Christmases past.
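To make that naming convention concrete, here is a minimal sketch of what a writer following it might do with Python's standard `csv` module: quote every field, emit timestamps as ISO 8601, and put `fieldname(type)` in the header row. The function name `write_quoted_csv` and the type labels are made up for illustration.

```python
import csv
import io
from datetime import datetime, timezone

def write_quoted_csv(fp, columns, rows):
    """Write rows as CSV with every field quoted.

    columns: list of (name, type) pairs, rendered as "name(type)" headers.
    The type labels are a hypothetical convention, not a standard.
    """
    writer = csv.writer(fp, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow([f"{name}({kind})" for name, kind in columns])
    for row in rows:
        # Timestamps go out as ISO 8601 strings; everything else as-is.
        writer.writerow(
            [v.isoformat() if isinstance(v, datetime) else v for v in row]
        )

buf = io.StringIO()
write_quoted_csv(
    buf,
    [("time", "iso8601"), ("value", "float"), ("name", "str")],
    [[datetime(2023, 5, 4, 7, 39, 53, tzinfo=timezone.utc), 12.2021, 'say "hi"']],
)
print(buf.getvalue(), end="")
```

This only pins down what the writer emits. It does not control how the importing tool interprets the fields, which is the actual problem.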
So, maybe we just do some rootless JSON format that starts with a SigMF object describing
the file and its columns, and then basically is just a sequence of JSON arrays
[ 1.212e-1, 0, "Müller", 24712388823 ]
[ 1.444e-2, 1, "📡🔭 \"👽\"!", 11111111111 ]
[ 2.0115e-1, 0, "Cygnus-B", 0 ]
(I'm not even sure that's not valid JSON; gut feeling tells me we should be putting []
around the whole document, but we don't want that for streaming purposes. ECMA-404 doesn't
seem to *forbid* it.)
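A rough sketch of how such a file could be written and read back in Python, assuming one JSON value per line (a layout commonly known as JSON Lines or NDJSON, where each line is a valid JSON text on its own). The header below is a made-up stand-in, not actual SigMF, and `write_stream` / `read_stream` are hypothetical names.

```python
import io
import json

def write_stream(fp, header, rows):
    """Write one header object, then one JSON array per line."""
    # json.dumps without indent never emits a raw newline, so each
    # value stays on exactly one line.
    fp.write(json.dumps(header, ensure_ascii=False) + "\n")
    for row in rows:
        fp.write(json.dumps(list(row), ensure_ascii=False) + "\n")

def read_stream(fp):
    """Return (header, iterator over rows) from a line-delimited stream."""
    header = json.loads(fp.readline())
    if not isinstance(header, dict):
        raise ValueError("first line must be a JSON object")
    rows = (json.loads(line) for line in fp if line.strip())
    return header, rows

# Hypothetical header; a real one would follow whatever schema is agreed on.
header = {"columns": [
    {"name": "power", "type": "float"},
    {"name": "flag", "type": "int"},
    {"name": "name", "type": "str"},
    {"name": "count", "type": "int"},
]}
data = [
    [1.212e-1, 0, "Müller", 24712388823],
    [1.444e-2, 1, "📡🔭 \"👽\"!", 11111111111],
]

buf = io.StringIO()
write_stream(buf, header, data)
buf.seek(0)
got_header, got_rows = read_stream(buf)
got_rows = list(got_rows)
```

Appending more data later would just mean opening the file in append mode and writing further array lines; nothing earlier in the file has to be touched.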
That way, we get the metadata in a format that's easy to skip by simpler tools, but
trivial to parse with the right tools (I've grown to like `jq`), and the data into a
well-defined format. Sure, you still can't dump that into Excel, but you know what, if it
comes down to it, we can have a Python script that takes these files and actually converts
them to valid XLSX without the misconversion footguns, and that same tool could also be
run in a browser for those having a hard time executing Python on their machines.
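On the reading side, "well-defined" could mean that a tool checks each row against the column types declared in the header instead of guessing. A minimal sketch, assuming a hypothetical header layout with `name` and `type` per column and a made-up set of type names:

```python
import json

# Hypothetical type names mapped to the Python types accepted for them.
TYPES = {"float": (int, float), "int": (int,), "str": (str,)}

def check_row(columns, row):
    """Return row unchanged if it matches the declared columns, else raise."""
    if len(row) != len(columns):
        raise ValueError(f"expected {len(columns)} fields, got {len(row)}")
    for col, value in zip(columns, row):
        allowed = TYPES[col["type"]]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, allowed):
            raise TypeError(
                f"column {col['name']!r}: expected {col['type']}, got {value!r}"
            )
    return row

columns = [
    {"name": "power", "type": "float"},
    {"name": "flag", "type": "int"},
    {"name": "name", "type": "str"},
    {"name": "count", "type": "int"},
]

good = check_row(columns, json.loads('[0.1212, 0, "Müller", 24712388823]'))

try:
    # A number smuggled in as text is rejected rather than reinterpreted.
    check_row(columns, json.loads('["12.2021", 0, "Müller", 0]'))
    rejected = False
except TypeError:
    rejected = True
```

A converter to XLSX could run the same check before writing each cell with an explicit type.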
Cheers,
Marcus
On 03.05.23 23:05, Marcus D. Leech wrote:
> On 03/05/2023 16:51, Marcus Müller wrote:
>> Do agree, but really don't like CSV, too underspecified a format, too many ways that
>> comes back to bite you (aside from a thousand SDR users writing emails that their PC
>> can't keep up with writing a few MS/s of CSV…)
>
> I like CSV because you can hand your data files to someone who doesn't have a complete
> suite of astrophysics tools, and they can slurp it into Excel and play with it.
>
>> How important is plain-textness in your applications?
>
> I (and many others in my community) tend to throw ad-hoc tools at data from ad-hoc
> experiments. In the past, I used a lot of AWK to post-process data, and these days, I
> use a lot of Python. Text-based formats lend themselves well to this kind of processing.
> Rates are quite low, typically. Like logging an integrated power spectrum a few times a
> minute, for example.
>
> There are other observing modes where text-based formats aren't quite so obvious--like
> pulsar observations, where filterbank outputs might be recorded at 10s of kHz, and then
> post-processed with any of a number of pulsar tools.
>
> In all of this, part of the "science" is extracted in "real-time" and part in
> post-processing.
>
>> Best,
>> Marcus