[R] Subsampling data
Eik Vettorazzi
E.Vettorazzi at uke.uni-hamburg.de
Thu Aug 11 19:06:37 CEST 2011
Hi Stefán
you might not be seeing the wood for the trees, but ?subset is an R function
as well.
MalesData <- subset(Datatemp,Datatemp$sex==1)
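A minimal sketch of the subset() call on made-up data, in case it helps. It assumes Datatemp is a data frame; read.spss() returns a plain list unless it is called with to.data.frame = TRUE. The values below are invented, and only the column names follow the post.

```r
# Assumption: the real data would be read as a data frame, e.g.
# Datatemp <- read.spss("H:/Skjol/Data/HL/t1and2b.sav",
#                       use.value.labels = FALSE, to.data.frame = TRUE)
# Here a toy stand-in with invented values is used instead.
Datatemp <- data.frame(
  sex     = c(1, 2, 1, 2, 2),
  agemean = c(34, 51, 28, 45, 60)
)

# Inside subset(), columns can be named directly, without Datatemp$.
MalesData   <- subset(Datatemp, sex == 1)
FemalesData <- subset(Datatemp, sex == 2)

# Row counts of the two subsamples
nrow(MalesData)
nrow(FemalesData)
```

Note that subset() also drops rows where the condition is NA, which plain logical indexing does not.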
btw. your selection
> MalesData <- Datatemp[Datatemp $sex==1]
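did not do what you wanted:
(a) the extra space before $ is accepted by R, but it is better left out;
(b) the indexing is incorrect. Datatemp should be a data.frame, which has
2 dimensions; note that read.spss returns a plain list unless it is called
with to.data.frame = TRUE. To my surprise, indexing a data.frame on one
dimension only returns the respective columns, which is different from
matrix indexing, so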
MalesData <- Datatemp[Datatemp$sex==1,]
should work as well.
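To illustrate the difference between one-index and two-index selection, here is a small sketch on a made-up data frame d. The values are invented and the object names (d, males, cols, cells) are only for illustration.

```r
# Toy data frame with invented values
d <- data.frame(sex = c(1, 2, 1), age = c(30, 40, 50), income = c(10, 20, 30))

# Two indices: the first selects rows, the empty second keeps all columns.
males <- d[d$sex == 1, ]

# One index: a data frame behaves like a list of columns, so the logical
# vector c(TRUE, FALSE, TRUE) picks columns 1 and 3, not rows 1 and 3.
cols <- d[d$sex == 1]
names(cols)

# A matrix with one index is treated as a plain vector of cells; the
# logical vector is recycled over all 9 cells.
m <- as.matrix(d)
cells <- m[d$sex == 1]
length(cells)
```

The trailing comma in d[d$sex == 1, ] is therefore what turns the condition into a row filter.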
On 11.08.2011 16:16, Stefán Hrafn Jónsson wrote:
> Dear R community,
>
> I have two questions on data subsample manipulation. I am starting to use R
> again after a long break and feel a bit rusty.
>
> I want to select a subsample of data for males and females separately.
>
> library(foreign)
> Datatemp <- read.spss("H:/Skjol/Data/HL/t1and2b.sav", use.value.labels = F)
>
>> table(Datatemp$sex)
>
>    1    2
> 3049 3702
>
>> attributes(Datatemp)
> …
> $names
>  [1] "nomiss" "Bin" "rad09" "year" "sex" "income" "adults"
>  [8] "children" "student" "retired" "disabled" "homemaker" "unemployed" "employed"
> [15] "occupation" "residencysize" "educ" "agemean" "age" "marital"
>
> $codepage
> [1] 1252
>
>> MalesData <- Datatemp[Datatemp $sex==1]
>> MalesData
> named list()
>> attributes(MalesData)
> $names
> character(0)
>
> Females.Data <- Datatemp[Datatemp $sex==2]
>
> This subset extraction is not working. Can anyone tell me what
> I did wrong?
>
> A different but related question is about the function paste, or whether I
> need another function, to do the following.
>
> Rather than this:
>
>> m2 <- gee( Bin ~ agemean + year, id = rad09 , data = datause ,
>            subset=kyn== 1 ,
>            family = binomial, corstr ="exchangeable" )
>
> I want to do this (modified in a loop):
>
>> subsampl <- "kyn== 1 "
>> m2 <- gee( Bin ~ agemean + year, id = rad09 , data = datause ,
>            subset=paste(subsampl) ,
>            family = binomial, corstr ="exchangeable" )
>
> Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
> Error in gee(Bin ~ agemean + year, id = rad09, data = datause, subset =
> paste(subsampl), :
>   rank-deficient model matrix
>
> I hope you can see what I want to do, but I think I may need a function
> other than paste().
>
> I would greatly appreciate any help.
>
> Stefan Hrafn
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Eik Vettorazzi
Department of Medical Biometry and Epidemiology
University Medical Center Hamburg-Eppendorf
Martinistr. 52
20246 Hamburg
T ++49/40/7410-58243
F ++49/40/7410-57790