[Orgmode] Re: [babel] R questions
From: Sébastien Vauban
Subject: [Orgmode] Re: [babel] R questions
Date: Tue, 08 Dec 2009 10:50:15 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1.50 (gnu/linux)
Hi Dan,
Dan Davison wrote:
> Sébastien Vauban <address@hidden> writes:
>>
>> I have this table generated by a script:
>>
>> #+results: abc2008
>> | "2008/1" | -78.59 | 1627.24 |
>> | "2008/2" | -80.17 | 700.33 |
>> | "2008/3" | -80.17 | 879.8 |
>> | "2008/4" | -80.17 | -25823.17 |
>> | "2008/5" | -80.17 | 3570.75 |
>> | "2008/6" | -81.77 | 2377.8 |
>> | "2008/7" | -81.77 | 2889.4 |
>> | "2008/8" | -81.77 | 2612.92 |
>> | "2008/9" | -81.77 | 1585.21 |
>> | "2008/10" | -83.4 | 1561.42 |
>> | "2008/11" | -83.4 | 2189.17 |
>> | "2008/12" | "" | "" |
>>
>> I want to draw the 12 months with the values side by side.
>>
>> Problem #1: the "" in the last line hinder the generation of the graph.
>> Format error.
>
> Missing values in R are represented by the value NA. If you change the last
> line of your table to
>
> | "2008/12" | NA | NA |
>
> then it works [1], [2], [3].
>
> [1] Note no quotes around NA here. You asked a good question about quoting
> in org-babel; it will be answered.
OK.
> [2] I guess one could potentially think about dealing with missing values
> more explicitly in org-babel. E.g. there could be a header arg
> specifying what values are to be treated as missing. Nothing like that
> exists currently.
I guess such a feature will be needed in the long term. Of course, even
specifying the desired behavior is already difficult, I think. One must have
good knowledge of the many languages and environments involved, and try to
abstract the best common behavior from them.
Side note -- I know, for example, that there is an option in Access to let it
consider the empty string ('') as the NULL value, or not. Clear.
But what's a "NA" value in general? Is 0 always a meaningful value as
numeric? Context-sensitive...
Side question -- You mentioned some way of keeping track of the bugs or
features to be added to Org. Same question here: where will these little
things be recorded so that they are not forgotten? In one of the Worg
documents itself?
> [3] You might think that an alternative would be to do something like this
> in R
>
> abc[abc == "\"\""] <- NA
>
> but the trouble is that with those double quotes present, R will interpret
> the column as containing character data rather than numeric, and things will
> not be pretty.
I believe you...
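For the record, a minimal sketch of that point. It assumes the empty cells reach R as the literal two-character string "" (which is what the comparison in [3] suggests); the two-row input and the names quoted, clean and fixed are made up for illustration, and quote = "" is used here only to keep the quote characters in the data.

```r
# Assumption: the empty cell arrives as the literal string "" (two quote characters).
# quote = "" tells read.table to keep those characters instead of stripping them.
quoted <- read.table(text = "2008/11\t-83.4\n2008/12\t\"\"",
                     sep = "\t", quote = "", as.is = TRUE)

# Same rows, but with the missing value written as an unquoted NA.
clean <- read.table(text = "2008/11\t-83.4\n2008/12\tNA",
                    sep = "\t", as.is = TRUE)

class(quoted$V2)  # "character": one non-numeric cell turns the whole column to text
class(clean$V2)   # "numeric": NA is read as a missing number

# Replacing the quoted cells afterwards does not change the column type ...
quoted[quoted == "\"\""] <- NA
class(quoted$V2)  # still "character"

# ... so an explicit conversion would be needed before plotting.
fixed <- as.numeric(quoted$V2)
```

So writing NA directly in the table avoids the extra conversion step.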
>> #+srcname: expenses-bar-plot(abc = abc2008)
>> #+begin_src R :results file :file abc2008.pdf
>> barplot(abc[,3], col = "red", main = "Profit and Loss 2008", las = 1,
>>         xlab = "Months", ylab = "EUR")
>> #+end_src
>>
>> Problem #2: I don't know how to ask for drawing the 2 columns. I've tried
>
> OK, so one point that is arguably relevant to this mailing list is that when
> org tables are read into R, the object that is created in R is a *data
> frame*. Not a matrix. (A data frame can have columns of different types;
> matrices are all one type). [4]
>
> [4] org-babel uses orgtbl-to-tsv followed by read.table() to convert the
> org table into a data.frame in R. A source of much confusion with
> R-beginners is that by default, read.table converts character columns into
> the *factor* data type. Note that org-babel currently uses 'as.is=TRUE' when
> calling read.table and therefore does *not* convert to factor. This may
> avoid some confusion among users but is memory-inefficient and misses out on
> other advantages of factors.
>
> So to solve your problem, you'd need to read the description of the height
> argument in the help page for barplot (?barplot), noting that it says
> "either a vector or matrix", and also noting that it says that bars
> correspond to columns (not rows), thus realising that you need to explicitly
> convert the relevant columns of the data frame to a matrix and then
> transpose.
>
> However, your two columns have rather different magnitude values and so are
> not very well suited for plotting on the same scale. Below I rescaled the
> first column by a factor of 20 so you can at least see the bars.
>
> #+srcname: expenses-bar-plot-two-columns(abc = abc2008)
> #+begin_src R :file abc2008.png
> ## select the two columns, convert to matrix, transpose and rescale top
> ## row.
> x <- t(as.matrix(abc[,2:3])) * c(20,1)
> barplot(x, col = rep(c("red","blue"), ncol(x)),
>         main = "Profit and Loss 2008", las = 1,
>         xlab = "Months", ylab = "EUR", beside = TRUE)
> #+end_src
Thanks a lot for the enlightening explanation, and for the corrected R code.
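To check my understanding of the data frame -> matrix -> transpose steps, here is a small standalone sketch outside org-babel. The three-row data frame and its column names (month, a, b) are made up for illustration; inside a babel block, abc would come from the table instead.

```r
# Hypothetical excerpt of the table; the last month is missing (NA).
abc <- data.frame(month = c("2008/1", "2008/2", "2008/12"),
                  a = c(-78.59, -80.17, NA),
                  b = c(1627.24, 700.33, NA))

# barplot() wants a vector or matrix, and draws one group of bars per column,
# so take the numeric columns as a matrix (3 x 2) and transpose it (2 x 3).
m <- as.matrix(abc[, 2:3])

# c(20, 1) is recycled down each column, so row 1 is scaled by 20 and row 2 by 1.
x <- t(m) * c(20, 1)
colnames(x) <- abc$month  # used as the group labels under the bars

barplot(x, beside = TRUE, col = c("red", "blue"), las = 1,
        main = "Profit and Loss 2008", xlab = "Months", ylab = "EUR")
```

With beside = TRUE each month gets a red and a blue bar side by side, and the month holding NA simply shows no bars.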
Best regards,
Seb
--
Sébastien Vauban