[R] RODBC, missing values, and Excel

Gabor Grothendieck ggrothendieck at gmail.com
Thu Jul 13 13:50:47 CEST 2006


Perhaps the Excel API requires an absolute path name.
Try:

chartr("/", "\\", file.path(getwd(), "myfile.xls"))

where we make sure its using backslashes in case that's also
required.

On 7/13/06, Sundar Dorai-Raj <sundar.dorai-raj at pdf.com> wrote:
> Hi, Gabor,
>
> Thanks for the code. When I tried this I get an error when trying to use
> a relative path name:
>
> read.excel <- function(file, sheet, ...) {
>   require(rcom)
>   require(gdata)
>   oxl <- comCreateObject('Excel.Application')
>   comSetProperty(oxl, "Visible", TRUE)  # this line optional
>   owb <- comGetProperty(oxl, "Workbooks")
>   ob <- comInvoke(owb, "Open", file)
>   osheets <- comGetProperty(ob, "Worksheets")
>   n <- comGetProperty(osheets, "Count")
>   ithSheetName <- function(i)
>     comGetProperty(comGetProperty(osheets, "Item", i), "Name")
>   sheetNames <- sapply(1:n, ithSheetName)
>   comInvoke(oxl, "Quit")
>   read.xls(file, match(sheet, sheetNames), ...)
> }
>
>  > read.excel("tmp.xls", "Sheet2", na.strings = "na")
> Error in 1:n : NA/NaN argument
>  > read.excel("D:/Users/sundard/frm/config/R/tmp.xls",
> +            "Sheet2", na.strings = "na")
>       x
> 1 0.11
> 2 0.11
> 3   NA
> 4   NA
> 5   NA
> 6   NA
> 7 0.11
>
> Any reason I need an absolute path?
>
> Thanks again,
>
> --sundar
>
> Gabor Grothendieck wrote:
> > In thinking about this some more I have a better idea.  Use rcom (or
> > RDCOMClient)
> > to get a list of the sheet names and then use that to determine which sheet you
> > need.  Then use read.xls to get it like this assuming that the Excel
> > file and path are C:\test.xls and that one of the sheets in that spreadsheet
> > is xyz.  In my version the na.strings had a space at the end so you may
> > need to change the na.strings= setting:
> >
> > library(rcom)
> > xls <- "C:\\test.xls"
> > oxl <- comCreateObject('Excel.Application')
> > comSetProperty(oxl, "Visible", TRUE)  # this line optional
> > owb <- comGetProperty(oxl, "Workbooks")
> > ob <- comInvoke(owb, "Open", xls)
> > osheets <- comGetProperty(ob, "Worksheets")
> > n <- comGetProperty(osheets, "Count")
> > ithSheetName <- function(i)
> >       comGetProperty(comGetProperty(osheets, "Item", i), "Name")
> > sheetNames <- sapply(1:n, ithSheetName)
> > comInvoke(oxl, "Quit")
> >
> > library(gdata)
> > read.xls(xls, match("xyz", sheetNames), na.strings = "na ")
> >
> >
> > On 7/12/06, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
> >
> >>Would it be good enough to just read all the sheets in?
> >>
> >>The perl program can do that and although the read.xls R function does not
> >>interface to that aspect of its functionality its not that difficult to access
> >>it yourself.  Assume your excel file is in \test.xls .  Just
> >>switch to that folder.  paste together a command to run the perl
> >>program, run it, get a list of the file names it produced and read them in:
> >>
> >>library(gdata)
> >>setwd("/")
> >>cmd <- paste("perl", system.file("perl/xls2csv.pl", package = "gdata"), "test")
> >>system(cmd)
> >>ff <- list.files(patt = "test_Sheet.*.csv")
> >>sapply(ff, read.csv, na.strings = "na ", simplify = FALSE)
> >>
> >>
> >>On 7/12/06, Sundar Dorai-Raj <sundar.dorai-raj at pdf.com> wrote:
> >>
> >>>Hi, Gabor,
> >>>
> >>>Thanks for the reply. Perhaps Prof. Ripley will enlighten us as he is
> >>>the RODBC maintainer.
> >>>
> >>>Unfortunately, gdata::read.xls will not work for me (at least I don't
> >>>think it will) because I need to refer to each worksheet by name and not
> >>>by number. For example, I need extract data from "Sheet1" and not simply
> >>>the first sheet.
> >>>
> >>>Thanks,
> >>>
> >>>--sundar
> >>>
> >>>Gabor Grothendieck wrote:
> >>>
> >>>>I also got a strange result too (I renamed it sdr.read.xls
> >>>>to distinguish it from read.xls in gdata and noticed that a
> >>>>space got into my na's somehow so I used "na " for my
> >>>>na.strings:
> >>>>
> >>>>
> >>>>
> >>>>>sdr.read.xls("/test.xls", "Sheet2", na.strings = "na ")
> >>>>
> >>>>     x
> >>>>1 <NA>
> >>>>2 <NA>
> >>>>3   na
> >>>>4   na
> >>>>5   na
> >>>>6   na
> >>>>7 <NA>
> >>>>
> >>>>I had more success using read.xls in the gdata package.
> >>>>Note that we need to install perl first if not already present:
> >>>>
> >>>>
> >>>>
> >>>>>library(gdata)  # for read.xls
> >>>>>read.xls("/test.xls", 2, na.strings = "na ")
> >>>>
> >>>>     x
> >>>>1 0.11
> >>>>2 0.11
> >>>>3   NA
> >>>>4   NA
> >>>>5   NA
> >>>>6   NA
> >>>>7 0.11
> >>>>
> >>>>
> >>>>
> >>>>>R.version.string # XP
> >>>>
> >>>>[1] "Version 2.3.1 Patched (2006-06-04 r38279)"
> >>>>
> >>>>
> >>>>>packageDescription("gdata")$Version
> >>>>
> >>>>[1] "2.1.2"
> >>>>
> >>>>
> >>>>>packageDescription("RODBC")$Version
> >>>>
> >>>>[1] "1.1-7"
> >>>>
> >>>>
> >>>>On 7/12/06, Sundar Dorai-Raj <sundar.dorai-raj at pdf.com> wrote:
> >>>>
> >>>>
> >>>>>Hi, all,
> >>>>>
> >>>>>I'm trying to use RODBC to read data from Excel. However, I'm having
> >>>>>trouble converting missing values to NA and rather perplexed by the
> >>>>>output. Below illustrates my problem:
> >>>>>
> >>>>>## DATA - copy to Excel and save as "tmp.xls"
> >>>>>## tmp.xls!Sheet1
> >>>>>x
> >>>>>0.11
> >>>>>0.11
> >>>>>na
> >>>>>na
> >>>>>na
> >>>>>0.11
> >>>>>
> >>>>>## tmp.xls!Sheet2
> >>>>>x
> >>>>>0.11
> >>>>>0.11
> >>>>>na
> >>>>>na
> >>>>>na
> >>>>>na
> >>>>>0.11
> >>>>>
> >>>>>## R Code
> >>>>>read.xls <- function(file, sheet = "Sheet1", ...) {
> >>>>> require(RODBC)
> >>>>> channel <- odbcConnectExcel(file)
> >>>>> sheet <- sprintf("select * from `%s$`", sheet)
> >>>>> x <- sqlQuery(channel, sheet, ...)
> >>>>> odbcClose(channel)
> >>>>> x
> >>>>>}
> >>>>>
> >>>>>read.xls("./tmp.xls", "Sheet1", na.strings = "na")
> >>>>>## works as expected
> >>>>>#     x
> >>>>>#1 0.11
> >>>>>#2 0.11
> >>>>>#3   NA
> >>>>>#4   NA
> >>>>>#5   NA
> >>>>>#6 0.11
> >>>>>
> >>>>>read.xls("./tmp.xls", "Sheet2", na.strings = "na")
> >>>>>## Huh? What happened?
> >>>>>#   x
> >>>>>#1 NA
> >>>>>#2 NA
> >>>>>#3 NA
> >>>>>#4 NA
> >>>>>#5 NA
> >>>>>#6 NA
> >>>>>#7 NA
> >>>>>
> >>>>>
> >>>>>>sessionInfo()
> >>>>>
> >>>>>Version 2.3.1 (2006-06-01)
> >>>>>i386-pc-mingw32
> >>>>>
> >>>>>attached base packages:
> >>>>>[1] "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
> >>>>>[7] "base"
> >>>>>
> >>>>>other attached packages:
> >>>>> RODBC
> >>>>>"1.1-7"
> >>>>>
> >>>>>______________________________________________
> >>>>>R-help at stat.math.ethz.ch mailing list
> >>>>>https://stat.ethz.ch/mailman/listinfo/r-help
> >>>>>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> >>>>>
> >>>>
> >>>>
> >>>>______________________________________________
> >>>>R-help at stat.math.ethz.ch mailing list
> >>>>https://stat.ethz.ch/mailman/listinfo/r-help
> >>>>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> >>>
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>



More information about the R-help mailing list