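x=c(1:25)
x[23]="X"                    # assigning a string turns the whole vector into character
x
x.new=ifelse(x=="X",23,x)    # swap "X" for 23; the result is still character
x.new=as.numeric(x.new)      # convert back to numeric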
Dear all,
Probably a very basic question but I need some help.
I have a data frame (made by read.table from a text file) of microarray
data, of which the first column is a factor and the rest of the columns are
numeric.
The factor column contains chromosome names, so values 1 through 22 plus X,
Y and XY. The numeric columns contain positions or intensity measurements.
What I need to do is change the X's in the first column to a value of 23.
This is what I thought I would do:
# to read in the table
BAF_temp <- read.table("BAF_all.txt", sep="\t", header=TRUE)
# "in rows where the first column of BAF_temp is X, change the first column
# of BAF_temp to 23"
BAF_temp[,1][BAF_temp[,1]=="X"] <- 23
However with this last line I get an error: "Invalid factor level, NAs
generated in '[<-.factor'('*tmp*', BAF_temp[,1]=="X", value=23)"
(I tested if my syntax for selecting the rows of chromosome X was correct by
trying BAF_X <- BAF_temp[BAF_temp[,1]=="X",] which worked to give me a data
frame with only the rows of the X chromosome.)
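For illustration, the behaviour described above can be reproduced on a small made-up data frame. This is only a sketch: BAF_toy, its column names and its values are invented stand-ins for the real BAF_temp. It shows that the comparison with "X" succeeds because "X" is an existing level, while the assignment of 23 is rejected because "23" is not among the levels of the factor.

```r
# Toy stand-in for BAF_temp; column names and values are invented.
BAF_toy <- data.frame(
  Chr = factor(c("1", "2", "10", "X", "Y", "XY")),
  Position = c(100, 200, 300, 400, 500, 600)
)

# The comparison works because "X" is one of the existing levels.
n_x <- sum(BAF_toy[, 1] == "X")

# "23" is not a level of the factor, so it cannot be stored in it.
has_23 <- "23" %in% levels(BAF_toy[, 1])

# Capture the warning text; the assignment is abandoned when the warning fires.
msg <- tryCatch({
  BAF_toy[, 1][BAF_toy[, 1] == "X"] <- 23
  "no warning"
}, warning = function(w) conditionMessage(w))
print(msg)
```

A factor only accepts values that are already in levels(), which is presumably why selecting rows works while assigning a new value does not.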
I then thought it might work better if I changed the data frame to a matrix.
When I change the BAF_temp data frame into a matrix (by BAF_matrix <-
as.matrix(BAF_temp)), then the same command as above, applied to the matrix:
BAF_matrix[,1][BAF_matrix[,1]=="X"] <- 23
works fine and the end result is as I meant it to be, with all the X's
changed into 23's.
However, by using as.matrix all columns are changed to 'character' including
the numeric measurements (I understand this is because one of the columns of
the data frame is 'factor').
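A small sketch of this coercion, again on invented data (BAF_toy and its columns are hypothetical): a matrix can hold only one type, so the numeric column becomes character, whereas a data frame can keep a character column next to numeric ones if only the first column is converted.

```r
# Toy stand-in for BAF_temp; column names and values are invented.
BAF_toy <- data.frame(
  Chr = factor(c("1", "2", "10", "X")),
  Intensity = c(0.51, 0.48, 0.97, 0.02)
)

# A matrix holds a single type, so everything ends up as character.
BAF_matrix <- as.matrix(BAF_toy)
print(typeof(BAF_matrix))

# Converting only the factor column keeps the measurements numeric.
BAF_toy$Chr <- as.character(BAF_toy$Chr)
BAF_toy$Chr[BAF_toy$Chr == "X"] <- "23"
print(sapply(BAF_toy, class))
```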
I would like some help on what is the best option to solve this. I have
thought of a few options myself and would like your comment/help:
1. Is there another syntax I can use on the data frame to change the X's to
23's, so I don't have to change the data frame into a matrix first?
2. I could change the data frame into a matrix and run the syntax as I
described, resulting in all columns becoming 'character'; is there then an
easy way to turn the columns with measurements (columns 2 and further) back
into 'numeric' while leaving the first column with the chromosome numbers as
'character'?
3. I thought of using data.matrix(BAF_temp) and making use of the fact that
the first column of factors would be changed to the underlying numbers
(because X being the 23rd level in the list would automatically be changed to
23). However because the levels (chromosome names) of the factor column are
ordered as "1", "10", "11", "12",....,"19", "2", "20", "21", "3", "4", etc.
(I see this when using str(BAF_temp)), this results in chromosome 10 being
changed into a value of 2, chromosome 11 into 3, chromosome 2 into 12 etc.
For info: the chromosome names in the text file that is imported are ordered
just 1, 2, 3, etc.
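The three options above could be sketched roughly as follows. This is a hedged illustration on invented data, not a tested solution for the real file: BAF_toy, its column names and values are hypothetical, and coding Y as 24 and XY as 25 in option 3 is an assumption made only for this example.

```r
# Toy stand-in for BAF_temp; column names and values are invented.
BAF_toy <- data.frame(
  Chr = factor(c("1", "2", "10", "11", "X", "Y", "XY")),
  Intensity = c(0.5, 0.4, 0.9, 0.1, 0.3, 0.7, 0.6)
)

# Option 1: rename the level itself, staying within the data frame.
opt1 <- BAF_toy
levels(opt1$Chr)[levels(opt1$Chr) == "X"] <- "23"

# Option 2: go through a character matrix, then turn columns 2 and
# further back into numeric while the first column stays character.
m <- as.matrix(BAF_toy)
m[, 1][m[, 1] == "X"] <- "23"
opt2 <- as.data.frame(m, stringsAsFactors = FALSE)
opt2[-1] <- lapply(opt2[-1], as.numeric)

# Option 3: declare the levels in chromosome order, so that the
# underlying integer codes match the chromosome numbers.
# Assumption for this sketch: Y is coded as 24 and XY as 25.
chr_order <- c(1:22, "X", "Y", "XY")
opt3 <- BAF_toy
opt3$Chr <- as.integer(factor(as.character(opt3$Chr), levels = chr_order))

print(opt1)
print(opt2)
print(opt3)
```

In option 1 the first column remains a factor, in option 2 it is character, and in option 3 it is integer, so which one fits depends on what the later analysis expects.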
If anyone has some tips for me I would greatly appreciate it.
Best wishes,
Marije
