[R] paste first row string onto every string in column
Patrick Connolly
p_connolly at slingshot.co.nz
Thu Aug 13 08:13:11 CEST 2009
On Wed, 12-Aug-2009 at 09:06AM -0700, Jill Hollenbach wrote:
|>
|> Thanks so much everybody, this has been incredibly helpful--not only is my
|> immediate issue solved but I've learned a lot in the process. The lapply
|> solution is best for me, as I need flexibility to edit df's with varying
|> numbers of columns.
|>
|> Now, one more question: after appending the string from the first line, I am
|> manipulating the df further (recoding the original contents; this I have
|> working fine), and afterwards I will need to strip back off that string. It
|> seems relatively straightforward, except that, as shown in the example above
|> (df2), there is an asterisk involved (I need to remove all characters up to
|> and including the asterisk) which seems problematic.
|> Any suggestions?
Check out strsplit(). You'll probably first need to convert the columns
to character, since they will otherwise be factors.
HTH
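
A minimal sketch of that idea, assuming a small made-up data frame
(dfnew below is illustrative, and strip_prefix is a hypothetical helper
name, not something from this thread). Because "*" is a regular
expression metacharacter, strsplit() needs fixed = TRUE to treat it
literally; with sub() the asterisk has to be escaped instead.

```r
# Illustrative data: columns are factors, as they would be after
# as.data.frame() with stringsAsFactors = TRUE
dfnew <- data.frame(V1 = c("DPA1*0103", "DPA1*0103"),
                    V2 = c("DPA1*0104", "DPA1*0103"),
                    stringsAsFactors = TRUE)

# Step 1: convert every column from factor to character
dfnew[] <- lapply(dfnew, as.character)

# Step 2a: split on a literal "*" and keep the piece after it
# (strip_prefix is a hypothetical helper)
strip_prefix <- function(x) {
  vapply(strsplit(x, "*", fixed = TRUE), function(p) p[2], character(1))
}
stripped <- dfnew
stripped[] <- lapply(dfnew, strip_prefix)

# Step 2b: alternative with sub(), removing everything up to and
# including the asterisk (note the escaped "\\*")
stripped2 <- dfnew
stripped2[] <- lapply(dfnew, function(x) sub("^.*\\*", "", x))
```

One difference worth knowing: for an element with nothing after the
asterisk (such as "DPA1*"), the strsplit() version gives NA while the
sub() version gives an empty string.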
|> Many thanks,
|> Jill
|>
|>
|>
|> Don MacQueen wrote:
|> >
|> > Let's start with something simple and relatively easy to understand,
|> > since you're new to this.
|> >
|> > First, here's an example of the core of the idea:
|> >> paste('a',1:4)
|> > [1] "a 1" "a 2" "a 3" "a 4"
|> >
|> > Make it a little closer to your situation:
|> >> paste('a*',1:4, sep='')
|> > [1] "a*1" "a*2" "a*3" "a*4"
|> >
|> > Sometimes it helps to save the number of rows in your dataframe in a
|> > new variable
|> >
|> > nr <- nrow(df)
|> >
|> > Then, for your first column, the "a*" in the above example is df$V1[1]
|> > For the 1:4 in the example, you use df$V1[ 2:nr]
|> > Put it together and you have:
|> >
|> > dfnew <- df
|> > dfnew$V1[ 2:nr] <- paste( dfnew$V1[1], dfnew$V1[ 2:nr], sep='' )
|> >
|> > But you can use "-1" instead of "2:nr", and you get
|> >
|> > dfnew$V1[ -1 ] <- paste( dfnew$V1[1], dfnew$V1[ -1], sep='' )
|> >
|> > That's how you can do it one column at a time.
|> > Since you have only four columns, just do the same thing to V2, V3, and
|> > V4.
|> >
|> > But if you want a more general method, one that works no matter how
|> > many columns you have, and no matter what they are named, then you
|> > can use lapply() to loop over the columns. This is what Patrick
|> > Connolly suggested, which is
|> >
|> > as.data.frame(lapply(df, function(x) paste(x[1], x[-1], sep = "")))
|> >
|> > Note, though, that this will do it to all columns, so if you ever
|> > happen to have a dataframe where you don't want to do all columns,
|> > you'll have to be a little trickier with the lapply() solution.
|> >
|> > -Don
|> >
|> > At 6:48 PM -0700 8/11/09, Jill Hollenbach wrote:
|> >>Hi,
|> >>I am trying to edit a data frame such that the string in the first line is
|> >>appended onto the beginning of each element in the subsequent rows. The data
|> >>looks like this:
|> >>
|> >>> df
|> >>     V1    V2    V3    V4
|> >>1 DPA1* DPA1* DPB1* DPB1*
|> >>2  0103  0104  0401  0601
|> >>3  0103  0103  0301  0402
|> >>.
|> >>.
|> >> and what I want is this:
|> >>
|> >>>dfnew
|> >>         V1        V2        V3        V4
|> >>1     DPA1*     DPA1*     DPB1*     DPB1*
|> >>2 DPA1*0103 DPA1*0104 DPB1*0401 DPB1*0601
|> >>3 DPA1*0103 DPA1*0103 DPB1*0301 DPB1*0402
|> >>
|> >>any help is much appreciated, I am new to this and struggling.
|> >>Jill
|> >>
|> >>___
|> >> Jill Hollenbach, PhD, MPH
|> >> Assistant Staff Scientist
|> >> Center for Genetics
|> >> Children's Hospital Oakland Research Institute
|> >> jhollenbach at chori.org
|> >>
|> >>--
|> >>View this message in context:
|> >>http://www.nabble.com/paste-first-row-string-onto-every-string-in-column-tp24928720p24928720.html
|> >>Sent from the R help mailing list archive at Nabble.com.
|> >>
|> >>______________________________________________
|> >>R-help at r-project.org mailing list
|> >>https://stat.ethz.ch/mailman/listinfo/r-help
|> >>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
|> >>and provide commented, minimal, self-contained, reproducible code.
|> >
|> >
|> > --
|> > --------------------------------------
|> > Don MacQueen
|> > Environmental Protection Department
|> > Lawrence Livermore National Laboratory
|> > Livermore, CA, USA
|> > 925-423-1062
|> >
--
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
___ Patrick Connolly
{~._.~} Great minds discuss ideas
_( Y )_ Average minds discuss events
(:_~*~_:) Small minds discuss people
(_)-(_) ..... Eleanor Roosevelt
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.