[R] Expanding dataset on the values of one of its variables
arun
smartpink111 at yahoo.com
Sun Jul 6 18:14:22 CEST 2014
Hi,
Not sure about the expected output.
If `dat` is the dataset:
# repeat each row index dat$score times, then subset the rows
res <- dat[rep(seq_len(nrow(dat)), dat$score), ]
head(res, 7)
    team year time score out top goals host format formed culture wcups cholder
1    ARG 1986    1     6   0   1     4    0      0   1893      93     8       0
1.1  ARG 1986    1     6   0   1     4    0      0   1893      93     8       0
1.2  ARG 1986    1     6   0   1     4    0      0   1893      93     8       0
1.3  ARG 1986    1     6   0   1     4    0      0   1893      93     8       0
1.4  ARG 1986    1     6   0   1     4    0      0   1893      93     8       0
1.5  ARG 1986    1     6   0   1     4    0      0   1893      93     8       0
2    ARG 1990    2     5   1   0     1    0      0   1893      97     9       0
    times cards      gdp
1       6    12 6146.155
1.1     6    12 6146.155
1.2     6    12 6146.155
1.3     6    12 6146.155
1.4     6    12 6146.155
1.5     6    12 6146.155
2       6    25 5800.057
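To try the idea without the original file, here is a minimal sketch on a made-up three-row data frame. The object `toy` and its values are hypothetical and are not taken from `wc`; only the column names echo the post. It also resets the row names, which is optional.

```r
# Hypothetical toy data; only the column names echo the original post.
toy <- data.frame(team  = c("ARG", "AUS", "BUL"),
                  year  = c(1986, 2006, 1994),
                  score = c(3, 1, 2),
                  stringsAsFactors = FALSE)

# Repeat each row index as many times as its score says.
idx <- rep(seq_len(nrow(toy)), times = toy$score)
expanded <- toy[idx, ]

# Optional: replace the 1, 1.1, 1.2, ... row names with plain 1..n.
rownames(expanded) <- NULL

nrow(expanded)        # equals sum(toy$score)
table(expanded$team)  # one count per team, matching its score
```

Note that `rep()` will stop with an error if `score` contains `NA` or negative values, so those would need handling first.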
A.K.
On Sunday, July 6, 2014 11:38 AM, Clive Nicholas <clivelists at googlemail.com> wrote:
Hello!
I have a dataset which is perhaps rather topical at about this time:
> wc=read.delim("/home/openclive/Documents/worldcup.csv",header=T,sep="\t",fill=T)
> head(wc,n=20)
   team year time score out top goals host format formed culture wcups cholder times
1   ARG 1986    1     6   0   1     4    0      0   1893      93     8       0     6
2   ARG 1990    2     5   1   0     1    0      0   1893      97     9       0     6
3   ARG 1994    3     1   1   0     3    0      0   1893     101    10       1     6
4   ARG 1998    4     2   1   1     7    0      1   1893     105    11       0     6
5   ARG 2006    6     2   1   1     7    0      1   1893     113    13       0     6
6   ARG 2010    7     2   1   1     6    0      1   1893     117    14       0     6
7   AUS 2006    6     1   1   0     0    0      1   1961      45     1       1     1
8   BEL 1986    1     3   1   0     0    0      0   1895      91     6       0     4
9   BEL 1990    2     1   1   0     3    0      0   1895      95     7       0     4
10  BEL 1994    3     1   1   0     1    0      0   1895      99     8       0     4
11  BEL 2002    5     1   1   0     1    0      1   1895     107     9       0     4
12  BRA 1986    1     2   1   1     5    0      0   1914      72    12       0     7
13  BRA 1990    2     1   1   1     3    0      0   1914      76    13       1     7
14  BRA 1994    3     6   0   1     5    0      0   1914      80    14       0     7
15  BRA 1998    4     5   1   1     3    0      1   1914      84    15       1     7
16  BRA 2002    5     6   0   1     8    0      1   1914      88    16       0     7
17  BRA 2006    6     2   1   1     6    0      1   1914      92    17       1     7
18  BRA 2010    7     2   1   1     3    0      1   1914      96    18       0     7
19  BUL 1986    1     1   1   0    -2    0      0   1923      63     4       0     2
20  BUL 1994    3     3   1   0     3    0      0   1923      71     5       0     2
   cards       gdp
1     12  6146.155
2     25  5800.057
3      9  7162.093
4     14  7994.116
5     15  8107.975
6      7  9933.229
7     12  9933.229
8      8 16273.539
9      3 18222.221
10     6 18964.370
11     6 22801.777
12     3  3334.000
13     7  3564.636
14    11  3380.128
15    12  3693.276
16    10  3692.840
17    11  3976.619
18    13  4424.759
19     3  1508.592
20    25  1438.153
Don't worry about what they all denote; I merely show it to you for the purposes
of demonstration. Basically, I want to expand the dataset on the value of
the -score- variable in order to prepare the data for survival analysis.
Thus, where Argentina achieved a score of 6 in 1986, I want R to create
five additional records. Where Bulgaria achieved a score of 3 in 1994, I
want it to create two additional records. Rows containing scores of 1
shouldn't create any extra records at all.
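The rule described above (a score of n gives n records in total, so n - 1 additional ones) can be sketched in base R as follows. This is only an illustration under stated assumptions: `wc_small` is a hypothetical three-row stand-in for the real data, and the `spell` counter is an added, hypothetical column that numbers the copies within each original row.

```r
# Hypothetical stand-in for the real data: three team/year/score
# combinations mentioned in the post, nothing else.
wc_small <- data.frame(team  = c("ARG", "BUL", "AUS"),
                       year  = c(1986, 1994, 2006),
                       score = c(6, 3, 1),
                       stringsAsFactors = FALSE)

# One copy of each row per unit of score: 6 + 3 + 1 = 10 rows.
long <- wc_small[rep(seq_len(nrow(wc_small)), times = wc_small$score), ]

# Hypothetical helper column: counts 1..score within each original row.
long$spell <- sequence(wc_small$score)

rownames(long) <- NULL
long
```

Argentina 1986 ends up with six rows, Bulgaria 1994 with three, and Australia 2006 keeps its single row.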
Doing this should be straightforward, but the -expand- command in the
-reshape- package doesn't appear to do this ... unless I've missed
something or there is another command from another package that does what I
need.
I'd be most grateful if anybody has a solution to this.
--
Clive Nicholas
"My colleagues in the social sciences talk a great deal about methodology.
I prefer to call it style." -- Freeman J. Dyson