[R] RODBC and NULL values
Mark Wardle
mark at wardle.org
Tue Nov 14 18:10:39 CET 2006
Dear all,
I'm afraid I'm still having trouble with RODBC and NULL values on Mac OS
X 10.4.8. (I would add that otherwise, RODBC is running perfectly, and
is doing an excellent job!). R 2.4.
As before, all my data is stored in Postgresql 8.1.4. I'm using Actual's
ODBC drivers (now updated to 2.5). I've removed RODBC and reinstalled,
compiling from source rather than using the binary package. Installed
using the line (all one line):
LIBS='-framework iODBC'
PKG_CFLAGS='-I/Library/Frameworks/iODBC.framework/Headers' R CMD INSTALL
RODBC_1.1-7.tar.gz
NULL values *are* correctly returned to a number of different client
applications, including "iodbctest", Filemaker, Excel and R/RdbiPgSQL,
but not RODBC.
SQL>select calc_survival_unilateral_support as unilateral,
has_family_history_ataxia as familial from clinical, patient where
clinical.patient_fk=patient_id and excluded=0 and
calc_walking_disability_valid=1;
unilateral |familial
----------------+----------------
6 |1
6 |0
4 |0
2 |0
5 |1
****************|0
****************|1
8 |0
****************|0
****************|0
Running the same query from Excel, Filemaker or RdbiPgSQL correctly
imports the data with missing data as NULL or empty. This is not the
case using R/RODBC:
> unilateral = sqlQuery(channel, "select
calc_survival_unilateral_support as unilateral,
has_family_history_ataxia as familial from clinical, patient where
clinical.patient_fk=patient_id and excluded=0 and
calc_walking_disability_valid=1")
> unilateral[1:10,]
unilateral familial
1 6 1
2 6 0
3 4 0
4 2 0
5 5 1
6 0 0
7 0 1
8 8 0
9 0 0
10 0 0
>
These fields are both defined as "numeric". There is no difference if I
use 'ALTER TABLE' to change to int4. Using "as.is" makes no difference.
I cannot explain why NULL values are not being transferred correctly
using RODBC. Are there any other diagnostic strategies I could try?
Since I last posted, I have installed RdbiPgSQL, and this seems to work.
> conn = dbConnect(PgSQL(), host="localhost", dbname="ataxia")
> test.df = dbGetQuery(conn, "select calc_survival_unilateral_support as
unilateral, has_family_history_ataxia as familial from clinical, patient
where clinical.patient_fk=patient_id and excluded=0 and
calc_walking_disability_valid=1")
test.df[1:10,]
unilateral familial
1 6 1
2 6 0
3 4 0
4 2 0
5 5 1
6 NA 0
7 NA 1
8 8 0
9 NA 0
10 NA 0
In my dataset, there is a considerable interpretative difference between
NA and zero. I don't know how to take this further, and perhaps it only
applies to my peculiar set-up. If there are any further diagnostic tests
that others can suggest, or some debug mode (I can't see this in the
source code), then let me know. Obviously, the fact that RdbiPgSQL
successfully manages to transfer NULL values limits the problem to
either the ODBC driver or RODBC itself. The success with iodbc,
Filemaker and Excel as ODBC client presumably suggests this problem is
limited to RODBC, and is not an ODBC driver problem? Any ideas? I'll be
switching to RdbiPgSQL from now, but I thought it appropriate to flag
this up as an unsolved problem.
Many thanks,
Best wishes,
Mark
--
Dr. Mark Wardle
Clinical research fellow and Specialist Registrar in Neurology,
C2-B2 link, Cardiff University, Heath Park, CARDIFF, CF14 4XN. UK
More information about the R-help
mailing list