[R] Reading hierarchical data
jim holtman
jholtman at gmail.com
Sun Feb 7 18:05:07 CET 2010
Will this do it for you:
> input <- readLines(textConnection("06470 1 1
+     1 232 0
+     2 230 1
+ 07470 1 0
+     1 240 1
+ 08470 1 0
+     1 227 0
+ 09470 1 0
+     1 213 1
+     2 222 0
+     3 224 1
+ 10470 1 1
+     1 220 0
+     2 211 1
+ 11470 1 0
+     1 217 0
+     2 210 1
+     3 226 1"))
> closeAllConnections()
> fid <- NULL
> dwell <- NULL
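> result <- do.call(rbind, lapply(input, function(.line){
+     # columns 1-5: id, column 7: record type, column 9: dwelling code
+     values <- as.integer(substring(.line, c(1, 7, 9), c(5, 7, 9)))
+     # record type 1 is a family record: remember its fields for the
+     # personal records that follow and emit no row
+     if (values[2] == 1L){
+         fid <<- values[1]
+         dwell <<- values[3]
+         return(NULL)
+     } else {
+         # personal record: id in columns 1-5, age in 8-9, sex code in 11
+         values <- as.integer(substring(.line, c(1, 7, 8, 11), c(5, 7, 9, 11)))
+         return(c(fid=fid, dwell=dwell, pid=values[1], age=values[3],
+                  sex=values[4]))
+     }
+ }))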
>
> result
        fid dwell pid age sex
 [1,]  6470     1   1  32   0
 [2,]  6470     1   2  30   1
 [3,]  7470     0   1  40   1
 [4,]  8470     0   1  27   0
 [5,]  9470     0   1  13   1
 [6,]  9470     0   2  22   0
 [7,]  9470     0   3  24   1
 [8,] 10470     1   1  20   0
 [9,] 10470     1   2  11   1
[10,] 11470     0   1  17   0
[11,] 11470     0   2  10   1
[12,] 11470     0   3  26   1
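
If you would rather avoid the global assignments, here is a rough vectorized sketch of the same idea. It assumes the layout described in the original question (record type in column 7, personal ids right-aligned in columns 1-5) and that every personal record follows its family record. The names is_family, family_index, owner and result2 are just illustrative, not part of the code above.

```r
# Same sample records, typed as a character vector so the fixed-width
# column positions are explicit.
input <- c("06470 1 1", "    1 232 0", "    2 230 1",
           "07470 1 0", "    1 240 1",
           "08470 1 0", "    1 227 0",
           "09470 1 0", "    1 213 1", "    2 222 0", "    3 224 1",
           "10470 1 1", "    1 220 0", "    2 211 1",
           "11470 1 0", "    1 217 0", "    2 210 1", "    3 226 1")

# Column 7 holds the record type: "1" = family, "2" = person.
is_family <- substr(input, 7, 7) == "1"

# Running count of family records; it maps every line to its family.
family_index <- cumsum(is_family)

# Family-level fields, one element per family record.
fid   <- as.integer(substr(input[is_family], 1, 5))
dwell <- as.integer(substr(input[is_family], 9, 9))

# Person-level fields, joined to their family through family_index.
person <- input[!is_family]
owner  <- family_index[!is_family]
result2 <- data.frame(
  fid   = fid[owner],
  dwell = dwell[owner],
  pid   = as.integer(substr(person, 1, 5)),
  age   = as.integer(substr(person, 8, 9)),
  sex   = as.integer(substr(person, 11, 11))
)
```

This gives a data frame rather than a matrix, with one row per person and the family fields repeated. It will misassign people if a file ever starts with a personal record, so check that first on real data.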
On Sun, Feb 7, 2010 at 10:57 AM, Saba(Home) <sabaric at charter.net> wrote:
>
> I would like to read the following hierarchical data set. There is a family
> record followed by one or more personal records.
> If col. 7 is "1" it is a family record. If it is "2" it is a personal
> record.
> The family record is formatted as follows:
> col. 1-5 family id
> col. 7 "1"
> col. 9 dwelling type code
> The personal record is formatted as follows:
> col. 1-5 personal id
> col. 7 "2"
> col. 8-9 age
> col. 11 sex code
>
> The first six family and accompanying personal records look like this:
> 06470 1 1
>     1 232 0
>     2 230 1
> 07470 1 0
>     1 240 1
> 08470 1 0
>     1 227 0
> 09470 1 0
>     1 213 1
>     2 222 0
>     3 224 1
> 10470 1 1
>     1 220 0
>     2 211 1
> 11470 1 0
>     1 217 0
>     2 210 1
>     3 226 1
>
> I want to create a dataset containing
> . family ID
> . dwelling code
> . person ID
> . age
> . sex code
> The dataset will contain one observation per person, with the family
> information repeated for people in the same family.
> Can anyone help?
> Thanks,
> Richard Saba
>
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?