help-octave

Re: Loading of non structured data


From: Sergei Steshenko
Subject: Re: Loading of non structured data
Date: Tue, 13 Nov 2018 17:59:00 +0000 (UTC)



On Tuesday, November 13, 2018, 2:27:37 PM GMT+2, Kristina Klemenčić <address@hidden> wrote:


Dear Ian,

Thank you for your response. It is a long file of more than 51000 lines.

What I wanted to do is write some kind of for loop that goes through each line and dissects it. What I do not know is how to make the program read each string up to the comma (I don't know the command), how to manipulate the strings, and how to turn parts of them into numbers/integers. Putting the results into an array is the simplest part; everything up to that point is the problem. I am not familiar with the commands. I tried using
to see the programming vocabulary, but it is too confusing for me. Maybe it is clear to someone who is more skilled. Could you please point me to a user manual that is easier to use? Something like Octave for dummies? :)

Thank you





On Mon, 12 Nov 2018 at 21:12, Ian McCallion <address@hidden> wrote:
You don't say how big your dataset is, but if it is not too huge I would
suggest reading it line by line (fgetl does this). Each line would then
be a string on which you can do further processing once you have decided
on its format. If this turns out to be too slow, there may be ways to
reduce the amount of looping.
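
A minimal sketch of such a reading loop, using fopen, fgetl and fclose. The temporary file and its three sample lines are only there so that the sketch runs on its own; with the real data you would open the actual file instead (its name is not given in this thread). The data line is shortened, and the variable names are illustrative.

```octave
% Write three sample lines to a temporary file so the sketch is self-contained.
fname = [tempname() '.txt'];
fid = fopen (fname, 'w');
fprintf (fid, 'AL021851,            UNNAMED,      1,\n');
fprintf (fid, '18510705, 1200,  , HU, 22.2N,  97.6W,  80, -999, -999,\n');
fprintf (fid, 'AL031851,            UNNAMED,      1,\n');
fclose (fid);

% Open the file for reading and stop early if that fails.
fid = fopen (fname, 'r');
if (fid < 0)
  error ('could not open %s', fname);
end

lines = {};                 % one string per line of the file
while (true)
  txt = fgetl (fid);        % next line without its newline, or -1 at end of file
  if (~ischar (txt))
    break;
  end
  lines{end+1} = txt;       % append the line to the cell array
end
fclose (fid);
delete (fname);             % remove the temporary file again

printf ('read %d lines\n', numel (lines));
```

Each element of lines is then an ordinary string that can be taken apart in a second step.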

I'm assuming you intend to read the whole dataset in before beginning
any serious processing. In this case you need to decide what the data
will be like once it is entirely read in. Use vectors for the
numerical data, and probably multiple vectors, one for each line of
the form
    AL041851,            UNNAMED,     49,
which I assume is a sort of identifier for the data below it.

For multiple vectors, if you know the names ahead of time you can put
each array in a separately named variable. If not, use a cell array of
vectors and parallel cell arrays of strings for the non-numerical
data. This console log (I typed in the lines beginning >>) may give
you enough to get you started:

>> a{1}='AL041851'
a =
{
  [1,1] = AL041851
}

>> a{2} = 'UNNAMED'
a =
{
  [1,1] = AL041851
  [1,2] = UNNAMED
}

>> a{3} = 49
a =
{
  [1,1] = AL041851
  [1,2] = UNNAMED
  [1,3] =  49
}

>> a{4} = [ -999, -999, -999 ]
a =
{
  [1,1] = AL041851
  [1,2] = UNNAMED
  [1,3] =  49
  [1,4] =

    -999  -999  -999

}
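
The same idea applied to several blocks could look like the sketch below: parallel cell arrays hold the identifier and name of each block, and a cell array of vectors holds one numeric column. It assumes that header lines start with a letter, that data lines start with a digit, and that each data record sits on one line in the real file (the wrapped lines in the sample look like mail wrapping). The data lines are shortened, and the names ids, names and values are illustrative. The meaning of the seventh field is not stated in this thread, so it is only stored, not interpreted.

```octave
% A few lines from the sample, as a line-by-line reading loop would deliver them.
lines = { ...
  'AL021851,            UNNAMED,      1,', ...
  '18510705, 1200,  , HU, 22.2N,  97.6W,  80, -999, -999,', ...
  'AL041851,            UNNAMED,     49,', ...
  '18510816, 0000,  , TS, 13.4N,  48.0W,  40, -999, -999,', ...
  '18510816, 0600,  , TS, 13.7N,  49.5W,  40, -999, -999,'};

ids    = {};   % identifier of each block, e.g. 'AL041851'
names  = {};   % name of each block, e.g. 'UNNAMED'
values = {};   % values{k}: 7th field of every data line in block k

for i = 1:numel (lines)
  % Split at the commas (keeping empty fields) and strip the padding blanks.
  f = strtrim (strsplit (lines{i}, ',', 'CollapseDelimiters', false));
  if (isletter (f{1}(1)))
    % Header line: start a new block.
    ids{end+1}    = f{1};
    names{end+1}  = f{2};
    values{end+1} = [];
  else
    % Data line: append its 7th field to the current (last) block.
    k = numel (ids);
    values{k}(end+1) = str2double (f{7});
  end
end

for k = 1:numel (ids)
  printf ('%s %s: %d data line(s)\n', ids{k}, names{k}, numel (values{k}));
end
```

Keeping the arrays parallel means that ids{k}, names{k} and values{k} always describe the same block.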

Without knowing more about the data and the processing needed it is
impossible to give more than general pointers.

Ian
On Mon, 12 Nov 2018 at 14:35, kristina <address@hidden> wrote:
>
> Hello,
>
> I am completely new to Octave so I need help. I looked through the forums
> but did not find exactly the same problem. The data I have to load does
> not have a fixed form and contains different kinds of data.
>
> Thank you very much for your help!
>
> This is an example of the content:
>
> AL011851,            UNNAMED,     14,
> 18510625, 0000,  , HU, 28.0N,  94.8W,  80, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510625, 0600,  , HU, 28.0N,  95.4W,  80, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510625, 1200,  , HU, 28.0N,  96.0W,  80, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510625, 1800,  , HU, 28.1N,  96.5W,  80, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510625, 2100, L, HU, 28.2N,  96.8W,  80, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510626, 0000,  , HU, 28.2N,  97.0W,  70, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510626, 0600,  , TS, 28.3N,  97.6W,  60, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510626, 1200,  , TS, 28.4N,  98.3W,  60, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510626, 1800,  , TS, 28.6N,  98.9W,  50, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510627, 0000,  , TS, 29.0N,  99.4W,  50, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510627, 0600,  , TS, 29.5N,  99.8W,  40, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510627, 1200,  , TS, 30.0N, 100.0W,  40, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510627, 1800,  , TS, 30.5N, 100.1W,  40, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510628, 0000,  , TS, 31.0N, 100.2W,  40, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> AL021851,            UNNAMED,      1,
> 18510705, 1200,  , HU, 22.2N,  97.6W,  80, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> AL031851,            UNNAMED,      1,
> 18510710, 1200,  , TS, 12.0N,  60.0W,  50, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> AL041851,            UNNAMED,     49,
> 18510816, 0000,  , TS, 13.4N,  48.0W,  40, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510816, 0600,  , TS, 13.7N,  49.5W,  40, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510816, 1200,  , TS, 14.0N,  51.0W,  50, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510816, 1800,  , TS, 14.4N,  52.8W,  50, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510817, 0000,  , TS, 14.9N,  54.6W,  60, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510817, 0600,  , TS, 15.4N,  56.5W,  60, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510817, 1200,  , HU, 15.9N,  58.5W,  70, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510817, 1800,  , HU, 16.1N,  60.4W,  70, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510818, 0000,  , HU, 16.6N,  62.5W,  80, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510818, 0600,  , HU, 16.9N,  64.1W,  80, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510818, 1200,  , HU, 17.2N,  66.0W,  90, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510818, 1800,  , HU, 17.6N,  67.6W,  90, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510819, 0000,  , HU, 18.0N,  69.3W,  90, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510819, 0600,  , HU, 18.4N,  71.1W,  70, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510819, 1200,  , TS, 18.9N,  72.6W,  60, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510819, 1800,  , TS, 19.4N,  74.3W,  60, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
> 18510820, 0000,  , HU, 19.9N,  75.9W,  70, -999, -999, -999, -999, -999,
> -999, -999, -999, -999, -999, -999, -999, -999,
>
>
>
> --
> Sent from: http://octave.1599824.n4.nabble.com/Octave-General-f1599825.html
>
>


====================================================


"What I do not know is how to make the program read each string until the comma (don't know the command), how to manipulate the strings and turn parts into numbers/integers." - you might find https://octave.sourceforge.io/octave/function/fscanf.html , https://octave.sourceforge.io/octave/function/fgets.html , https://octave.sourceforge.io/octave/function/index.html , https://octave.sourceforge.io/octave/function/regexp.html to be of interest.
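
For the "read until the comma and turn parts into numbers" step, one possible approach (not one of the functions linked above) is to split the whole line at its commas with strsplit, trim the pieces with strtrim, and convert them with str2double. The sketch below uses one data line from the sample, joined onto a single line and shortened. It assumes that fields 5 and 6 are a latitude and a longitude with a hemisphere letter, and the sign convention (S and W negative) is likewise an assumption.

```octave
% One data line from the sample in this thread, shortened.
txt = '18510625, 0000,  , HU, 28.0N,  94.8W,  80, -999, -999,';

% Split at every comma; keep empty fields so the positions stay fixed.
fields = strsplit (txt, ',', 'CollapseDelimiters', false);
fields = strtrim (fields);           % remove the padding blanks

% Purely numeric fields convert directly (str2double gives NaN on failure).
date_num = str2double (fields{1});   % 18510625
value7   = str2double (fields{7});   % 80

% Fields 5 and 6 end in a letter, so convert the digits and set the sign
% from the letter (S and W negative is a convention assumed here).
lat = str2double (fields{5}(1:end-1));
if (fields{5}(end) == 'S')
  lat = -lat;
end
lon = str2double (fields{6}(1:end-1));
if (fields{6}(end) == 'W')
  lon = -lon;
end

printf ('%d  %d  %.1f  %.1f\n', date_num, value7, lat, lon);
```

The trailing comma on each line yields one empty field at the end, which can simply be ignored.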

I suggest learning regular expressions (the last link), as they are a powerful parsing tool.
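
As a small illustration, here is a sketch that uses regexp to recognise a header line from the sample and pull out its three parts. The pattern is an assumption based only on the lines shown in this thread (two capital letters and six digits, a name, a count); the variable names are illustrative.

```octave
header = 'AL041851,            UNNAMED,     49,';

% ([A-Z]{2}\d{6}) : two capital letters followed by six digits
% ([^,]*?)        : the name, without the surrounding blanks
% (\d+)           : the integer in the third field
pat = '^([A-Z]{2}\d{6}),\s*([^,]*?)\s*,\s*(\d+)\s*,';

% With 'tokens' and 'once', regexp returns the captured pieces as a cell
% array of strings, or an empty result if the line does not match.
tok = regexp (header, pat, 'tokens', 'once');

if (isempty (tok))
  disp ('not a header line');
else
  id    = tok{1};
  name  = tok{2};
  count = str2double (tok{3});
  printf ('%s %s %d\n', id, name, count);
end
```

A data line such as "18510816, 0000,  , TS, ..." does not match this pattern, so the same test can be used to tell header lines from data lines.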

--Sergei.
