[R] replacing value in column of data frame

Daniel Malter daniel at umd.edu
Wed Jul 9 11:19:15 CEST 2008




cuncta stricte discussurus

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Booman, M
Sent: Wednesday, July 09, 2008 5:14 AM
To: r-help at r-project.org
Subject: [R] replacing value in column of data frame

Dear all,
Probably a very basic question but I need some help.
I have a data frame (made by read.table from a text file) of microarray
data, of which the first column is a factor and the rest of the columns are
numeric.
The factor column contains chromosome names, so values 1 through 22 plus X,
Y and XY. The numeric columns contain positions or intensity measurements.
What I need to do is change the X's in the first column to a value of 23. 
This is what I thought I would do:
BAF_temp <- read.table("BAF_all.txt", sep="\t", header=T)  #to read in the
BAF_temp[,1][BAF_temp[,1]=="X"] <- 23
# "in rows where the first column of BAF_temp is X, change the first column of
# BAF_temp to 23"
However with this last line I get an error: "Invalid factor level, NAs
generated in '[<-.factor'('*tmp*', BAF_temp[,1]=="X", value=23)"
(I tested if my syntax for selecting the rows of chromosome X was correct by
trying BAF_X <- BAF_temp[BAF_temp[,1]=="X",] which worked to give me a data
frame with only the rows of the X chromosome.)
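The behaviour described above can be reproduced on a small made-up data frame. This is only a sketch: the data frame `toy`, its column names `Chr` and `Pos`, and its values are hypothetical stand-ins for BAF_temp, and the factor is built explicitly because newer versions of read.table no longer create factors by default.

```r
# Hypothetical stand-in for BAF_temp; names and values are made up.
toy <- data.frame(
  Chr = factor(c("1", "2", "10", "X", "Y", "X")),
  Pos = c(100, 200, 300, 400, 500, 600)
)

# Selecting rows by level works, as in the BAF_X test above.
toy_X <- toy[toy$Chr == "X", ]
print(nrow(toy_X))

# "23" is not one of the existing levels of the factor ...
print(levels(toy$Chr))

# ... so assigning it raises the "invalid factor level" warning.
# tryCatch() stops at the warning, so toy itself is left unchanged here.
res <- tryCatch({
  toy$Chr[toy$Chr == "X"] <- 23
  "no warning"
}, warning = function(w) conditionMessage(w))
print(res)
```

The row selection succeeds because comparing a factor with "X" only reads the levels, while the assignment fails because a factor can only hold values that are already among its levels.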
I then thought it might work better if I changed the data frame to a matrix.
When I change the BAF_temp data frame into a matrix (by BAF_matrix <-
as.matrix(BAF_temp)), then the same command as above, applied to the matrix:
BAF_matrix[,1][BAF_matrix[,1]=="X"] <- 23
works fine and the end result is as I meant it to be, with all the X's
changed into 23's.
However, by using as.matrix all columns are changed to 'character' including
the numeric measurements (I understand this is because one of the columns of
the data frame is 'factor')
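The coercion to 'character' can be seen on a small example. Again a sketch only: `toy`, `Chr` and `Pos` are hypothetical names and values, not the real BAF data.

```r
# Hypothetical stand-in for BAF_temp; names and values are made up.
toy <- data.frame(
  Chr = factor(c("1", "2", "X")),
  Pos = c(100, 200, 300)
)

# A matrix holds a single type, so a non-numeric column forces
# every column, including the measurements, to character.
m <- as.matrix(toy)
print(typeof(m))

# On the character matrix the replacement goes through,
# but 23 is stored as the string "23".
m[, 1][m[, 1] == "X"] <- 23
print(m[, "Chr"])

# The measurement column is now text and would need converting back.
print(is.character(m[, "Pos"]))
```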
I would like some help on what is the best option to solve this. I have
thought of a few options myself and would like your comment/help:
1. Is there another syntax I can use on the data frame to change the X's to
23's, so I don't have to change the data frame into a matrix first?
2. I could change the data frame into a matrix and run the syntax as I
described, resulting in all columns becoming 'character'; is there then an
easy way to turn the columns with measurements (columns 2 and further) back
into 'numeric' while leaving the first column with the chromosome numbers as
3. I thought of using data.matrix(BAF_temp) and making use of the fact that
the first column of factors would be changed to the underlying numbers
(because X being the 23rd level in the list would automatically be changed to
23). However because the levels (chromosome names) of the factor column are
ordered as "1", "10", "11", "12",....,"19", "2", "20", "21", "3", "4", etc.
(I see this when using str(BAF_temp)), this results in chromosome 10 being
changed into a value of 2, chromosome 11 into 3, chromosome 2 into 12 etc.
For info: the chromosome names in the text file that is imported are ordered
just 1, 2, 3, etc.
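Options 1 and 3 might be sketched as follows. This is not a tested recipe for the real file: `toy`, `Chr`, `Pos`, `opt1`, `opt3` and `chrom_levels` are hypothetical names, and the codes 24 for Y and 25 for XY in the second variant are an assumption that simply follows from the chosen level order.

```r
# Hypothetical stand-in for BAF_temp; names and values are made up.
toy <- data.frame(
  Chr = factor(c("1", "2", "10", "X", "Y", "XY")),
  Pos = c(100, 200, 300, 400, 500, 600)
)

# Option 1: stay with the data frame and rename the level "X" to "23".
# The column remains a factor; Y and XY are left as they are.
opt1 <- toy
levels(opt1$Chr)[levels(opt1$Chr) == "X"] <- "23"
print(as.character(opt1$Chr))

# Option 3, adjusted: give the factor an explicit level order so that
# the underlying integer codes match the chromosome numbers.
# Assumption: Y and XY follow X, so they get codes 24 and 25.
chrom_levels <- c(as.character(1:22), "X", "Y", "XY")
opt3 <- toy
opt3$Chr <- factor(as.character(toy$Chr), levels = chrom_levels)

# data.matrix() replaces the factor by its codes and keeps the rest numeric.
num <- data.matrix(opt3)
print(num)
```

With the explicit levels, chromosome 10 keeps the code 10 instead of 2, because the codes now follow the order given in `chrom_levels` rather than the alphabetical order of the strings.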
If anyone has some tips for me I would greatly appreciate it.
Best wishes,

R-help at r-project.org mailing list
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
