[R] subsetting like in SAS
    Denis Chabot 
    chabotd at globetrotter.net
       
    Thu Jan 13 11:52:04 CET 2005
    
    
  
Hi,
Being in the process of translating some of my SAS programs to R, I 
encountered one difficulty. I have a solution, but it is not elegant 
(and not pleasant to implement).
I have a large dataset with many variables needed to identify the 
origin of a sample, many to describe sample characteristics, others to 
describe site characteristics.
I want only a (shorter) list of sites and their characteristics.
If "origin", "ship_cat", "ship_nb", "trip" and "set" are needed to 
identify a site, in SAS you'd sort on those variables, then read the 
data with:
data sites;
	set alldata;
	by origin ship_cat ship_nb trip set;
	if first.set;
	keep list-of-variables-detailing-sites;
run;
In R I did this with the Lag function of Hmisc, and the original data 
set also needs to be sorted first:
oL <- Lag(origin)
scL <- Lag(ship_cat)
snL <- Lag(ship_nb)
tL <- Lag(trip)
sL <- Lag(set)
same <- origin==oL & ship_cat==scL & ship_nb==snL & trip==tL & set==sL
sites <- subset(alldata, !same,
                select=c(list-of-variables-detailing-sites))
Could I do better than this?
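One candidate might be base R's duplicated() applied to the key columns, since it flags every row whose key combination has already appeared, which is roughly what "if first.set;" does on sorted data. Below is only a minimal sketch on a made-up toy data frame: the data values and the columns "depth" and "weight" are hypothetical stand-ins, not my real variables.

```r
# Toy stand-in for alldata; all values, and the columns `depth` (a site
# characteristic) and `weight` (a sample characteristic), are invented.
alldata <- data.frame(
  origin   = c("A", "A", "A", "B", "B"),
  ship_cat = c(1, 1, 1, 2, 2),
  ship_nb  = c(10, 10, 10, 20, 20),
  trip     = c(1, 1, 2, 1, 1),
  set      = c(1, 1, 1, 3, 3),
  depth    = c(50, 50, 75, 120, 120),
  weight   = c(1.2, 0.8, 2.5, 0.4, 0.9)
)

# Variables that together identify a site
keys <- c("origin", "ship_cat", "ship_nb", "trip", "set")

# TRUE for the first row of each distinct key combination
first_in_group <- !duplicated(alldata[, keys])

# Keep one row per site, and only the site-level columns
sites <- alldata[first_in_group, c(keys, "depth")]
print(sites)
```

With this approach the data would not need to be sorted first, because duplicated() compares whole key combinations rather than adjacent rows; the rows kept are the first occurrences in the current row order.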
Thanks in advance,
Denis Chabot
    
    