[R] Data transformation & cleaning
pip56789
pde3p at virginia.edu
Wed Sep 28 05:13:42 CEST 2011
Hi,
I have a few methodological and implementation questions for y'all. Thank
you in advance for your help. I have a dataset that reflects people's
preference choices. I want to see whether certain preference choices tend
to cluster together (e.g. do people who pick choice A also pick choice D?).
I have a data set that has one record per user ID, per preference choice.
It's a "long" form of a data set that looks like this:
ID | Page
123 | Choice A
123 | Choice B
456 | Choice A
456 | Choice B
...
I thought that I should do the following:
1. Make the data set "wide", counting the observations so the data looks
like this:
ID | Count of Preference A | Count of Preference B
123 | 1 | 1
...
Using
table1 <- dcast(data, ID ~ Page, fun.aggregate = length, value.var = 'Page')
2. Create a correlation matrix of preferences
cor(table1[, -1])
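
For comparison, steps 1 and 2 can be sketched in base R alone. This is only an illustration: the toy data frame `prefs` and its values are made up, and `table()` stands in for `dcast()` as an assumed equivalent way of counting records per ID and choice.

```r
# Toy long-format data (hypothetical values, same ID | Page layout as above)
prefs <- data.frame(
  ID   = c(123, 123, 456, 456, 789, 789, 321, 321),
  Page = c("Choice A", "Choice B", "Choice A", "Choice D",
           "Choice B", "Choice D", "Choice A", "Choice C"),
  stringsAsFactors = FALSE
)

# Step 1: one row per ID, one column per choice, cells hold record counts
wide <- unclass(table(prefs$ID, prefs$Page))

# Step 2: pairwise correlations between the choice columns
pref_cor <- cor(wide)
print(round(pref_cor, 2))
```

Because `wide` is a plain matrix with IDs as row names, there is no ID column to drop before calling `cor()`.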
How would I restrict my correlation to show only preferences that meet a
minimum sample threshold? Can you confirm whether the following two commands
do the same thing? What would I do from here (or am I taking the wrong approach)?
table1 <- dcast(data, Page ~ Page, fun.aggregate = length, value.var = 'Page')
table2 <- with(data, table(Page,Page))
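
One possible way to look at both questions, sketched in base R on made-up data: the data frame `prefs`, the cutoff `min_n`, and the helper names are all hypothetical. The sketch shows what crossing `Page` with itself in `table()` produces, then uses the per-choice totals to drop rarely picked choices before correlating. It does not check the `dcast()` call.

```r
# Toy long-format data (hypothetical values)
prefs <- data.frame(
  ID   = c(123, 123, 456, 456, 789, 789, 321, 321),
  Page = c("Choice A", "Choice B", "Choice A", "Choice D",
           "Choice B", "Choice D", "Choice A", "Choice C"),
  stringsAsFactors = FALSE
)

# Crossing Page with itself: every record lands on the diagonal,
# so the result holds per-choice totals, not co-occurrence between choices
self_tab <- with(prefs, table(Page, Page))
print(self_tab)

# The same totals as a one-way table
page_n <- table(prefs$Page)

# Keep only choices picked at least min_n times (min_n is an arbitrary cutoff)
min_n <- 2
keep <- names(page_n)[as.vector(page_n) >= min_n]

# Correlate only the retained choice columns of the ID-by-choice count matrix
wide <- unclass(table(prefs$ID, prefs$Page))
pref_cor <- cor(wide[, keep, drop = FALSE])
print(round(pref_cor, 2))
```

Filtering on the number of records per choice is only one reading of "minimum sample threshold"; counting distinct IDs per choice would be another.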
many thanks,
Peter