[R-sig-ME] Managing person identifier variable

MACDOUGALL Margaret Margaret.MacDougall at ed.ac.uk
Wed Oct 5 22:33:08 CEST 2016


Thanks to all those who kindly provided a swift reply to my query at the foot of this email. I am sharing Steve's reply below, as I found it particularly helpful and trust that others may also benefit. My query has now been addressed. 
Best wishes
Margaret


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


-----Original Message-----
From: Steve Pierce [mailto:Steve.Pierce at cstat.msu.edu] 
Sent: 05 October 2016 20:28
To: MACDOUGALL Margaret <Margaret.MacDougall at ed.ac.uk>
Subject: RE: [R-sig-ME] Managing person identifier variable

Margaret,

Convert that variable to a factor, so that R treats the codes as category labels rather than as quantities. Suppose your data frame is (cleverly) called mydata, and the variable is called PID. The first line of the following code coerces your numerical PID variable into a categorical factor variable called PIDf. Alternatively, the second line simply overwrites the original PID variable.

mydata$PIDf <- factor(mydata$PID)    # Creates a new variable 
mydata$PID  <- factor(mydata$PID)    # Overwrites the original variable
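
To check that the conversion did what you want, something like the following rough sketch may help. The data are simulated purely for illustration (200 hypothetical persons with 3 observations each, and a made-up outcome y). The model-fitting step assumes the lme4 package, which is only one of several options and is not something you mentioned using; it is skipped if lme4 is not installed.

```r
# Simulated, hypothetical data: 200 persons, 3 observations each,
# identified by arbitrary numeric codes (not real data).
set.seed(1)
codes         <- sample(1000:9999, 200)      # unique random numeric IDs
person_effect <- rnorm(200)                  # made-up person-level deviations
mydata <- data.frame(
  PID = rep(codes, each = 3),
  y   = rep(person_effect, each = 3) + rnorm(600)
)

is.numeric(mydata$PID)                       # TRUE: R reads the codes as numbers

mydata$PIDf <- factor(mydata$PID)            # one factor level per person
is.factor(mydata$PIDf)                       # TRUE
nlevels(mydata$PIDf)                         # 200, the number of distinct persons

# Assumption: lme4 is the modelling package. A random intercept per
# person is then requested with (1 | PIDf).
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::lmer(y ~ 1 + (1 | PIDf), data = mydata)
  print(lme4::ngrps(fit))                    # number of grouping levels used
}
```

The number of levels should match the number of persons in your study; if it does not, some identifiers are probably duplicated or mistyped.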


Steven J. Pierce, Ph.D.
Associate Director
Center for Statistical Training & Consulting (CSTAT)
Michigan State University
Giltner Hall
293 Farm Lane, Room 178
East Lansing, MI 48824

Office Phone: (517) 353-9288
Office Fax: (517) 353-9307
E-mail: Steve.Pierce at cstat.msu.edu
Web: http://www.cstat.msu.edu 


-----Original Message-----
From: MACDOUGALL Margaret [mailto:Margaret.MacDougall at ed.ac.uk]
Sent: Wednesday, October 05, 2016 3:00 PM
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] Managing person identifier variable

Hello



I would be most grateful for some advice on how R interprets a person identifier variable (persID, say). I would like to represent persons, as an independent variable, by a random effect. However, there are over 200 such persons. Each person is allocated a random numerical code as a unique identifier. Currently, R is reading the identifier variable as a numeric variable. Is there a quick way of addressing this problem by recoding the variable? (I do not wish to bin the values into category ranges; rather, I wish to avoid the numerical codes being interpreted literally.)



Many thanks



Margaret


