[R] how to merge two tables by taking the average
jim holtman
jholtman at gmail.com
Mon Nov 12 05:02:28 CET 2007
Here is one way to read the data and convert it. Your data was a
data frame with the first column being the id, so that column has to be
dropped and the rest converted to a matrix before calling mapply:
> x <- read.table(textConnection("id b1 b2 b3
+ a1 2 4 6
+ a2 1 2 NA
+ a3 4 6 NA"), header=TRUE)
> y <- read.table(textConnection("id b1 b2 b3
+ a1 NA 4 4
+ a2 2 2 NA
+ a3 1 2 2"), header=TRUE)
> # look at what x & y are:
> str(x)
'data.frame': 3 obs. of 4 variables:
$ id: Factor w/ 3 levels "a1","a2","a3": 1 2 3
$ b1: int 2 1 4
$ b2: int 4 2 6
$ b3: int 6 NA NA
> str(y)
'data.frame': 3 obs. of 4 variables:
$ id: Factor w/ 3 levels "a1","a2","a3": 1 2 3
$ b1: int NA 2 1
$ b2: int 4 2 2
$ b3: int 4 NA 2
> # to convert to matrix, get rid of first column
> x <- as.matrix(x[,-1])
> y <- as.matrix(y[,-1])
> z <- mapply(function(a,b)mean(c(a,b), na.rm=TRUE), x, y)
> dim(z) <- dim(x)
> z
[,1] [,2] [,3]
[1,] 2.0 4 5
[2,] 1.5 2 NaN
[3,] 2.5 4 2
> is.na(z) <- is.nan(z)
> z
[,1] [,2] [,3]
[1,] 2.0 4 5
[2,] 1.5 2 NA
[3,] 2.5 4 2
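
The three-value result in the quoted message below appears to have a simple cause, sketched here on the assumption that both tables were read as data frames. A data frame is a list of columns, so mapply pairs whole columns rather than single cells and returns one pooled mean per column. The names xdf, ydf, avg, by_column and by_cell are illustrative and not from the original session.

```r
# Rebuild the example as data frames, as read.table(..., row.names = 1) returns them.
xdf <- data.frame(b1 = c(2, 1, 4), b2 = c(4, 2, 6), b3 = c(6, NA, NA),
                  row.names = c("a1", "a2", "a3"))
ydf <- data.frame(b1 = c(NA, 2, 1), b2 = c(4, 2, 2), b3 = c(4, NA, 2),
                  row.names = c("a1", "a2", "a3"))

avg <- function(a, b) mean(c(a, b), na.rm = TRUE)

# Data frames are lists of columns: mapply pairs column with column,
# so this gives one pooled mean per column (3 values).
by_column <- mapply(avg, xdf, ydf)

# Matrices are plain vectors with a dim attribute: mapply pairs cell with
# cell, so this gives one mean per cell (9 values).
xm <- as.matrix(xdf)
ym <- as.matrix(ydf)
by_cell <- mapply(avg, xm, ym)

# Restore the shape and copy the row and column names from xm.
dim(by_cell) <- dim(xm)
dimnames(by_cell) <- dimnames(xm)

# Cells where both inputs were NA come out as NaN; turn them into NA.
is.na(by_cell) <- is.nan(by_cell)
```

Mixing a data frame with a matrix, as in the earliest quoted attempt, recycles the three columns of x against the nine cells of y, which would explain the nine unrelated values in that run. Copying dimnames as well as dim keeps the row and column names that the original question asked for.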
>
>
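
For tables of the size mentioned in the original question (238000 rows by 196 columns), calling an R function once per cell may be slow. One vectorized alternative, offered as a sketch rather than as the method used in this thread, sums the non-missing values and divides by the number of non-missing values in each cell. The helper name average_tables and the output file name z.txt are made up for illustration, and read.table(text = ...) assumes a reasonably recent version of R.

```r
# Hypothetical helper: cell-wise mean of two equally shaped numeric matrices,
# using the single available value when one input is NA and returning NA
# when both inputs are NA.
average_tables <- function(x, y) {
  stopifnot(identical(dim(x), dim(y)))
  n_ok  <- (!is.na(x)) + (!is.na(y))            # 0, 1 or 2 usable values per cell
  total <- ifelse(is.na(x), 0, x) + ifelse(is.na(y), 0, y)
  z <- total / n_ok                             # 0/0 gives NaN
  z[n_ok == 0] <- NA                            # both missing -> NA
  z
}

x <- as.matrix(read.table(text = "id b1 b2 b3
a1 2 4 6
a2 1 2 NA
a3 4 6 NA", header = TRUE, row.names = 1))
y <- as.matrix(read.table(text = "id b1 b2 b3
a1 NA 4 4
a2 2 2 NA
a3 1 2 2", header = TRUE, row.names = 1))

z <- average_tables(x, y)

# Write the result with row and column names. col.names = NA adds an empty
# header cell above the row names so the columns line up.
out_file <- file.path(tempdir(), "z.txt")       # illustrative output path
write.table(z, file = out_file, sep = "\t", quote = FALSE, col.names = NA)
```

Reading with row.names = 1 keeps the ids as row names, so as.matrix needs no column dropped and the names carry through to the written file.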
On Nov 11, 2007 10:47 PM, affy snp <affysnp at gmail.com> wrote:
> Hi, Jim. I created two txt files as follows:
>
> x.txt
>
> id b1 b2 b3
> a1 2 4 6
> a2 1 2 NA
> a3 4 6 NA
>
> y.txt
> id b1 b2 b3
> a1 NA 4 4
> a2 2 2 NA
> a3 1 2 2
>
>
> I tried it one more time but got a different z:
>
> > x<-read.table(file="x.txt",header=TRUE,row.names=1,na.strings = "NA")
> Warning message:
> In read.table(file = "x.txt", header = TRUE, row.names = 1, na.strings = "NA") :
> incomplete final line found by readTableHeader on 'x.txt'
> > x
> b1 b2 b3
> a1 2 4 6
> a2 1 2 NA
> a3 4 6 NA
> > y<-read.table(file="y.txt",header=TRUE,row.names=1,na.strings = "NA")
> Warning message:
> In read.table(file = "y.txt", header = TRUE, row.names = 1, na.strings = "NA") :
> incomplete final line found by readTableHeader on 'y.txt'
> > y
> b1 b2 b3
> a1 NA 4 4
> a2 2 2 NA
> a3 1 2 2
> > z <- mapply(function(a,b)mean(c(a,b), na.rm=TRUE), x, y)
> > dim(z) <- dim(x)
> Error in dim(z) <- dim(x) :
> dims [product 9] do not match the length of object [3]
> > z <- mapply(function(a,b)mean(c(a,b), na.rm=TRUE), x, y)
> > z
> b1 b2 b3
> 2.000000 3.333333 4.000000
> >
>
>
> Allen
>
> On Nov 11, 2007 10:41 PM, jim holtman <jholtman at gmail.com> wrote:
> > What did your text files look like? It would appear that there was
> > not a line feed on the last line of the file. Also what does 'str' of
> > x and y show? It appears that one is a data frame and one is a
> > matrix. That might be causing some of the problems.
> >
> >
> > On Nov 11, 2007 10:30 PM, affy snp <affysnp at gmail.com> wrote:
> > > Hi Jim,
> > >
> > > Thanks a lot! I am wondering why I ended up with the following result:
> > >
> > > > x<-read.table(file="x.txt",header=TRUE,row.names=1,na.strings = "NA")
> > > Warning message:
> > > In read.table(file = "x.txt", header = TRUE, row.names = 1, na.strings = "NA") :
> > > incomplete final line found by readTableHeader on 'x.txt'
> > > > x
> > > b1 b2 b3
> > > a1 2 4 6
> > > a2 1 2 NA
> > > a3 4 6 NA
> > > > y<-as.matrix(read.table(file="y.txt",header=TRUE,row.names=1,na.strings = "NA"))
> > > Warning message:
> > > In read.table(file = "y.txt", header = TRUE, row.names = 1, na.strings = "NA") :
> > > incomplete final line found by readTableHeader on 'y.txt'
> > > > y
> > > b1 b2 b3
> > > a1 NA 4 4
> > > a2 2 2 NA
> > > a3 1 2 2
> > > > z <- mapply(function(a,b)mean(c(a,b), na.rm=TRUE), x, y)
> > > > z
> > > b1 b2 b3 <NA> <NA> <NA> <NA> <NA>
> > > 2.333333 3.500000 3.500000 2.750000 3.500000 4.000000 2.750000 4.000000
> > > <NA>
> > > 4.000000
> > > > dim(z) <- dim(x)
> > > > z
> > > [,1] [,2] [,3]
> > > [1,] 2.333333 2.75 2.75
> > > [2,] 3.500000 3.50 4.00
> > > [3,] 3.500000 4.00 4.00
> > > > is.na(z) <- is.nan(z)
> > > > z
> > > [,1] [,2] [,3]
> > > [1,] 2.333333 2.75 2.75
> > > [2,] 3.500000 3.50 4.00
> > > [3,] 3.500000 4.00 4.00
> > > >
> > >
> > >
> > > Allen
> > >
> > >
> > > On Nov 11, 2007 5:27 PM, jim holtman <jholtman at gmail.com> wrote:
> > > > Here is one way of doing it:
> > > >
> > > > > x
> > > > [,1] [,2] [,3]
> > > > [1,] 2 4 6
> > > > [2,] 1 2 NA
> > > > [3,] 4 6 NA
> > > > > y
> > > > [,1] [,2] [,3]
> > > > [1,] NA 4 4
> > > > [2,] 2 2 NA
> > > > [3,] 1 2 2
> > > > > z <- mapply(function(a,b)mean(c(a,b), na.rm=TRUE), x, y)
> > > > > dim(z) <- dim(x)
> > > > > z
> > > > [,1] [,2] [,3]
> > > > [1,] 2.0 4 5
> > > > [2,] 1.5 2 NaN
> > > > [3,] 2.5 4 2
> > > > > # to change it to NA
> > > > > is.na(z) <- is.nan(z)
> > > > > z
> > > > [,1] [,2] [,3]
> > > > [1,] 2.0 4 5
> > > > [2,] 1.5 2 NA
> > > > [3,] 2.5 4 2
> > > >
> > > > >
> > > > >
> > > >
> > > >
> > > > On Nov 11, 2007 4:52 PM, affy snp <affysnp at gmail.com> wrote:
> > > > > Dear list,
> > > > >
> > > > > I am new to R and very inexperienced. Sorry for the trouble.
> > > > > I have two txt files and want to merge them by taking the average.
> > > > > For example, txt file 1, which has row names and column names,
> > > > > consists of 238000 rows and 196 columns. Each column corresponds
> > > > > to a sample. The data are a mix of numeric values and NA. What I plan
> > > > > to do is:
> > > > >
> > > > > (1) Take the 1st column from txt file 1 and txt file 2 and calculate the
> > > > > average where both values are numeric. If one is numeric and the other
> > > > > is NA, just use the numeric value. If both are NA, use NA. Do this
> > > > > for all columns.
> > > > > (2) Create txt file 3 from the resulting numbers and add the row names and
> > > > > column names.
> > > > >
> > > > > So an illustrative example could be:
> > > > >
> > > > > txt file 1
> > > > >
> > > > > A B C
> > > > > row1 2 4 6
> > > > > row2 1 2 NA
> > > > > row3 4 6 NA
> > > > >
> > > > > txt file 2
> > > > >
> > > > > A B C
> > > > > row1 NA 4 4
> > > > > row2 2 2 NA
> > > > > row3 1 2 2
> > > > >
> > > > > then txt file 3 will be created as:
> > > > >
> > > > > A B C
> > > > > row1 2 4 5
> > > > > row2 1.5 2 NA
> > > > > row3 2.5 4 2
> > > > >
> > > > > Any help will be appreciated.
> > > > >
> > > > > Thanks!
> > > > > Allen
> > > > >
> > > > > ______________________________________________
> > > > > R-help at r-project.org mailing list
> > > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > > > and provide commented, minimal, self-contained, reproducible code.
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Jim Holtman
> > > > Cincinnati, OH
> > > > +1 513 646 9390
> > > >
> > > > What is the problem you are trying to solve?
> > > >
> > >
> >
> >
> >
> >
>
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?