[R] How to assign scores to rows based on column values
David Winsemius
dwinsemius at comcast.net
Sun Apr 25 16:27:53 CEST 2010
On Apr 25, 2010, at 1:08 AM, burgundy wrote:
>
> Hi,
>
> I'm trying to assign a score to each row that allows me to identify
> which rows differ. In the example file below, I've used "," to indicate
> column separators. In this example, I'd like to identify that row 1 and
> row 5 are the same, and that row 2 and row 4 are the same.
> Any help much appreciated. Also, any comments on what the command
> lines do would be fantastic.
> Thanks!!
>
> example file:
> 0,0,1,0,1,0,0
> 0,1,0,0,0,0,1
> 0,0,0,0,0,0,0
> 0,1,0,0,0,0,1
> 0,0,1,0,1,0,0
> 0,0,0,1,0,0,0
>
> example requested output:
> 1
> 2
> 3
> 2
> 1
> 4
If you use apply by rows with paste and a collapse argument, you get a
text column holding one string per row, so identical rows produce
identical strings. Calling factor on that text column, fac, with
levels=unique(fac) numbers the levels in order of first appearance, and
as.numeric on the result extracts those numbers as the scores.
On a data frame, rrr, with those elements and such a factor, fac:
> as.numeric(factor(rrr$fac, levels=unique(rrr$fac)))
[1] 1 2 3 2 1 4
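One needs to call factor a second time because the first call set the
levels to an alpha-sorted version of the row strings, so as.numeric on
fac alone would number the rows in sorted order rather than in order of
first appearance.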
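For reference, here is a minimal end-to-end sketch of the steps described above, run on the example data from the question. The names key and score, and reading the data with read.table(text = ...), are my own assumptions for illustration; only rrr and fac come from the reply.

```r
# Example rows from the question, read into a data frame named rrr
rrr <- read.table(text = "0,0,1,0,1,0,0
0,1,0,0,0,0,1
0,0,0,0,0,0,0
0,1,0,0,0,0,1
0,0,1,0,1,0,0
0,0,0,1,0,0,0", sep = ",")

# Paste each row into a single string; identical rows give identical strings
key <- apply(rrr, 1, paste, collapse = "")

# First call to factor: levels default to the sorted unique strings
rrr$fac <- factor(key)

# Second call: levels follow order of first appearance, so the integer
# codes number the distinct rows in the order they show up
score <- as.numeric(factor(rrr$fac, levels = unique(rrr$fac)))
score
# [1] 1 2 3 2 1 4
```

If the factor column is not needed for anything else, match(key, unique(key)) should give the same numbering in one step.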
--
David Winsemius, MD
West Hartford, CT