help-gsl


Re: Checking GSL for Spectroscopy


From: Fritz Sonnichsen
Subject: Re: Checking GSL for Spectroscopy
Date: Thu, 18 Mar 2021 08:33:05 -0400

Thanks Mike
  as you mention the code that I am starting with (written by a colleague)
is more "model based" in that it deliberately does not invoke the
underpinning properties of Raman optical spectra. This code is designed to
run really fast on GPUs and minimizes any calculations. When I start
working with it I will figure out if some preprocessing to categorize the
spectra for example would be of value.
   Your comment on CRAN R is of interest. I looked at it a few days ago and
it may be an approach--I think I will know more once I get the original
code and see how sophisticated the statistics usage is. But at least at
this point I think C with a little GSL will do.
  I am interested to find that Python is losing at least some ground these
days. I started on UNIVACs using assembler and FORTRAN on punch cards so I
was always a little distressed by adherence to column position issues
necessary back then. So when I had  to write in Python a few years ago I
actually had to call a computer colleague to make sure I wasn't
missing something when the column requirements came up. Jeeesh. More
disturbing is the inability to write comments at the end of lines. I am a
big supporter of in-line doc and I expect this. Putting documentation before
the line takes a lot of space and more words--I have never rated code well
on how many pages you have to flip. But the real problems with Python
started when I needed to do serial port programming--some concoction of
shared routines that ran differently  on ver 2 and ver 3 were needed and it
was a disaster. A lot of other stories but I think people here know them.
It is sad to see a large body of scientists here thinking that somehow
Python was necessary to accommodate its rich set of routines. They have
never seen another language and don't understand that those routines could
have been written in something more stable and simpler.
  Your story about writing a citation program hits home for me. I worked on
a system for formatting NOAA data that one of the institutions uses here.
It requires a representative from each department and tons of ill-formatted
data must be made to comply with a generic format--all this done with unique
code for each case. And then the code never worked of course. An enormous
cost of rare research funds that could have easily been saved by simply
making a few requirements on CSV data. The computer tower of babble that
has been built in recent years shows to me at least that science has lost
sight of what it is supposed to do and how it should get there. The tool is
no longer a tool--it is becoming the prime cost.
   The GPL issues have been discussed here and I appreciated that input. I
have our business person looking at this. The Spectra library that we call
must remain proprietary but the operations on it are--to my mind--rather
generic (I have written them more than once in C over the years) and we
certainly don't need to keep these close. And of course our small company
is willing to pay reasonable amounts for a licence. The big problem these
days seems to be finding out what you need to do regarding licensing--and
knowing it will stick once it is done. Sometimes it is worse than reading
the US tax code--which grew much like kudzu, as computer languages and
architectures do these days!

cheers
Fritz

On Thu, Mar 18, 2021 at 6:21 AM Mike Marchywka <marchywka@hotmail.com>
wrote:

> Thanks. I've looked briefly at a lot of different kinds of "spectra" -
> audio, solar, image fft, distributions,  xps, even Raman that may evolve
> with time - and as you suggest you may not be interested so much in some
> abstract comparison as in extracting some model information. Comparing
> spectra may be with the intent of resolving a given one into component
> pieces - how much of each basis element is in the measured thing.
> Generally you have lines with some profile- gauss and lorentz would
> be well known - and then a continuum which could be anything
> with blackbody and I guess fluorescence as examples.  Then you have
> instrument issues to resolve- baseline and maybe broadening could
> be factors for a library.
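
[Editor's sketch] The decomposition described above can be written down
compactly. The notation is an assumption made for illustration and does not
appear in the thread: b_k(x) are the basis (library) spectra, a_k their
unknown amounts, c(x) the continuum, sigma and gamma the Gaussian and
Lorentzian width parameters.

```latex
% Unit-area line profiles centred at x_0
G(x; x_0, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}
                    \exp\!\left(-\frac{(x - x_0)^2}{2\sigma^2}\right)
L(x; x_0, \gamma) = \frac{1}{\pi}\,\frac{\gamma}{(x - x_0)^2 + \gamma^2}

% Measured spectrum as a weighted sum of basis spectra plus a continuum
y(x) \approx \sum_{k=1}^{K} a_k\, b_k(x) + c(x)
```

The full width at half maximum is 2*sqrt(2 ln 2)*sigma for the Gaussian and
2*gamma for the Lorentzian. If the b_k and the form of c(x) are known, the
a_k enter linearly, so in principle they could be estimated with an ordinary
linear least-squares fit.
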
>
> You could imagine developing a language around common things-
> consider maybe writing "R" packages that use GSL.  CRAN's R
> may be a good open replacement for MATLAB.
>
> I played with python briefly and any language that enforces white space,
> and IIRC earlier distinguished space and tab lol, is a bit
> of a suspect ...
>
> I've also run into various language-vs-library issues and thinking about
> business issues. I've got one "program" to make downloading citation
> information less distracting from diverse sources targeted at academics
> or anyone doing internet research ( this could be companies writing white
> papers or technical reports for their own products compared to
> competitors, political hacks writing position or policy papers if the
> internet sites supply Bibtex for their works ). The code itself is almost
> the opposite of science- it is a collection of hacks tried in the order
> in which I discovered they may be useful to try to download citation
> information without bothering the user much. After looking at maybe 100's
> of hacks, some patterns emerged and in the conversion from an awful bash
> script to c/c++ it looked like you could come up with a mini-language
> based on "subroutine" or method calls. The dev version uses readline for
> interaction which appears to have some licensing issues but since I almost
> always just write for myself I don't usually notice stuff like that.
>
> btw, as there are likely academics here, if you have your own horror or
> success stories getting citation information for your publication efforts
> please share as appropriate here or on the texhax list. Thanks.
>
> note new address
>  Mike Marchywka 306 Charles Cox Drive Canton, GA 30115
>  2295 Collinworth  Drive Marietta GA 30062.  formerly 487 Salem Woods
> Drive Marietta GA 30067 404-788-1216 (C)<- leave message 989-348-4796 (P)<-
> emergency
>
>
> ________________________________________
> From: Fritz Sonnichsen <sonnichs@gmail.com>
> Sent: Tuesday, March 16, 2021 9:53 AM
> To: Mike Marchywka
> Cc: help-gsl@gnu.org
> Subject: Re: Checking GSL for Spectroscopy
>
> Mark
>   I am converting someone's MATLAB code so I am not sure what he is doing
> yet--but several years ago I did spectral analysis in MATLAB and probably
> very similar. This is for Raman and LIBS spectra.
> 1) "Usually" I apply a high pass filter to the spectrum. This gets rid of
> the noise. I need control over this since, as you would expect, the signal
> and noise can get pretty close! Intuition comes into play here.
> 2) Next I baseline the spectra. This removes any constant bias.  For LIBS
> I was usually able to further filter "spikes" and then take a mean of the
> remaining line, subtracting this from the overall spectrum. Raman can get a
> bit more difficult--I am, at least, subtracting the fluorescent line which
> can have a lot of features (e.g. spikes). At times, if you know this
> background you can subtract it first but you get all types of complications
> from normalization. Again--intuition comes into play.
> 3) The resulting spectrum needs to be compared to a database. For LIBS the
> latter is quite small--mostly atomic/elemental data such as NIST. I could
> generally do a discrete comparison of the spike locations using a
> peak-finder, align them with the known examples and get a pretty high hit
> rate. This was for qualitative data.
> Raman is, again, much more complex. The data I was using was constrained
> and simpler but the case in hand here is much more complex. We are doing
> mixed plastics at the moment. My colleague found the best matches by taking
> a stats correlation with 44000 entries and pulling out the values closest
> to "one". It works remarkably well.
>
> I don't think there is much above that cannot be written in C in a
> reasonable amount of time. But we are looking ahead and would like to draw
> on the collective experience of the science community. This type of
> analysis is quite common and there are enough new wheels out there that we
> don't want to re-invent old ones!
>     Very important is that "intuition" part. I would think a lot of this
> issue has been better solved since I was doing this. There are a lot of
> adjustments that could be made--for example iterating trial baselines,
> rejecting noise at varied levels etc. Processors are faster now and the AI
> movement has brought in PCA and a lot of other techniques that begin to
> transcend my current state of knowledge (I work more on the physics end of
> things and would prefer to use routines from the communities if possible to
> save time).
>
> Thanks for your interest Mark!
> Fritz
>
>
> On Tue, Mar 16, 2021 at 9:25 AM Mike Marchywka <marchywka@hotmail.com> wrote:
> Can you comment on how you compare spectra? Just for my own
> personal interest, not sure if it will further the thread here however..
> Not sure a "dot product" in the conventional sense would help much.
> You could imagine comparing peak positions and relative heights
> or a fit to a continuum for example.  Peaks plus black body in some
> vector comparison?
>
>
>
> ________________________________________
> From: Help-gsl <help-gsl-bounces+marchywka=hotmail.com@gnu.org> on behalf of
> Fritz Sonnichsen <sonnichs@gmail.com>
> Sent: Tuesday, March 16, 2021 9:15 AM
> To: help-gsl@gnu.org
> Subject: Checking GSL for Spectroscopy
>
> I am preparing to convert MATLAB code to something more general. The new
> code will run on Linux and ARM processors.
>    For a lot of reasons I am not going to use Python. We also want to
> keep this project "close" to scientists and do not want to turn it into a
> full time computer programming job. So the final word is that I am looking
> for something that can be called from (and hopefully is written in) C. Worst
> case I will just write the code myself but would prefer to start
> integrating our systems into something with a lot of pre-written and vetted
> routines.
>
> GSL looks like a good choice. Maybe R comes next. We have a mix of needs
> but I will point out a few:
> 1) Baselining a spectrum
> 2) Finding peaks in that spectrum
> 3) using Pearson correlation to compare the spectrum QUICKLY to
> about 50,000 recorded examples.
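
[Editor's sketch] Items 1 and 2 can be done in a few lines of plain C in
their simplest form. The code below is only an illustration under stated
assumptions: the baseline is taken to be a constant offset, estimated as the
mean of the points at or below a caller-chosen `spike_level`, and a peak is
any local maximum above `min_height`. The function names and both thresholds
are hypothetical, and a real Raman background would usually need something
more elaborate than a constant.

```c
#include <stddef.h>

/* Subtract a constant baseline estimated as the mean of all points at or
   below `spike_level`, so that tall peaks do not bias the estimate.
   Returns the baseline that was removed (0.0 if no point qualified). */
double subtract_baseline(double *y, size_t n, double spike_level)
{
    double sum = 0.0;
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (y[i] <= spike_level) {
            sum += y[i];
            count++;
        }
    }
    if (count == 0)
        return 0.0;

    double baseline = sum / (double)count;
    for (size_t i = 0; i < n; i++)
        y[i] -= baseline;
    return baseline;
}

/* Record the indices of local maxima that exceed `min_height`.
   Stores at most max_peaks indices in `peaks` and returns how many
   were stored. On a flat-topped peak the first point is reported. */
size_t find_peaks(const double *y, size_t n, double min_height,
                  size_t *peaks, size_t max_peaks)
{
    size_t found = 0;
    for (size_t i = 1; i + 1 < n && found < max_peaks; i++) {
        if (y[i] > min_height && y[i] > y[i - 1] && y[i] >= y[i + 1])
            peaks[found++] = i;
    }
    return found;
}
```

A typical call sequence would be `subtract_baseline` first, then `find_peaks`
on the corrected spectrum, with both thresholds tuned by eye for the
instrument at hand.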
>
> We also have some uses with basic statistics and we do some image
> processing.
>
> So my question is--does GSL position itself in these areas? MATLAB (with
> packages) does them all.
>      I am not sure how active GSL is, or whether it is keeping up with AI,
> imaging and spectroscopy--or is it fading or giving way to popular
> languages for
> example. I was surprised that the 600+ page manual did not seem to show
> anything relating to the simple spectral analysis described above for
> example. Certainly I can search the web for others' code but at some point
> if I cannot attach to a well established product I will just write it
> myself.
>
> Any comments appreciated
> thanks
> Fritz
>

