[R] spss, string factors, selecting

Katherine Jones kajones at connect.carleton.ca
Tue Nov 27 02:31:00 CET 2007


Hi,

This is probably a case where someone has to see what is happening on  
my computer and it is complicated by my data being from SPSS (not my  
choice). It is quite hard to give my data, because it is such a large  
dataset. I have analysed 9 other datasets that work fine, but this
particular dataset was entered incorrectly, so it requires merging two
datasets. This may be the problem.

Example of data:-
File 1.
[1] Individual [2] Habitat type [3] Weight
File 2.
[1] Individual [2] Fat [3] Fat method.

I merge the two files to create:-
[1] Individual [2] Habitat type [3] Weight [4] Fat [5] Fat method
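
For reference, a minimal sketch of such a merge, assuming both files share an Individual key column. The data frames, column names and values here are hypothetical stand-ins, not the real data. It also shows one way a merged file can fill with NAs: with all = TRUE, any Individual present in only one file is kept and padded with NA.

```r
# Hypothetical stand-ins for File 1 and File 2 (names and values are assumptions).
file1 <- data.frame(Individual = c(120, 121, 132),
                    Habitat    = c(3, 4, 3),
                    Weight     = c(20.2, 20.0, 21.2))
file2 <- data.frame(Individual = c(120, 121, 140),
                    Fat        = c(4, 5, 4),
                    Fatmethod  = c(" E", " B", ""))

# IDs that occur in only one of the two files.
only1 <- setdiff(file1$Individual, file2$Individual)
only2 <- setdiff(file2$Individual, file1$Individual)

# Rows are matched on the key, not on their position in the file.
inner <- merge(file1, file2, by = "Individual")              # matched rows only
outer <- merge(file1, file2, by = "Individual", all = TRUE)  # unmatched rows kept, padded with NA

print(outer)
```

If only1 or only2 is long for the real files, the keys are probably not matching as intended (for example because of stray spaces or different types in the SPSS export).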

My merging appears to work in the sense that I can plot Weight versus
Fat and I get data, but if I ask to see the data file I see a sea of
"NAs". So I'm not sure how there can be data there to plot, list
levels for and tabulate when I can't see it in the data frame. I
do get the plot I want.

Fat method contains either blank cells, " B" or " E".
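
A small sketch of how the padded values could be inspected and cleaned, assuming the column is a factor. The vector below is a hypothetical stand-in built from the three values described above.

```r
# Hypothetical values mimicking the description: blank, " B" and " E".
Fatmethod <- factor(c(" E", " B", "", " E"))

# Printing the levels shows them quoted, so leading spaces are visible.
print(levels(Fatmethod))

# trimws() strips leading and trailing whitespace from character data.
clean <- trimws(as.character(Fatmethod))
print(table(clean))
```

After trimming, the comparison can be written as clean == "E" without having to remember the leading space.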

I wish to select all the rows in columns 1-4 which contain an " E" in  
Fat method.

e.g.
120, 3, 20.2, 4, E
121, 4, 20.0, 5, B
132, 3, 21.2, 4,

I want to select only the row containing " E", so I can plot Fat vs  
Habitat and Weight vs. Fat.

I have been doing this by using

selectE<-Data[Fatmethod==" E",]

However, this does not work. It turns all of my data in the other
columns into "NA" and I am left only with fatmethod and fat scores.
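
One possible cause, sketched under assumptions: if the condition contains NA (for instance where Fat method is missing after the merge), logical indexing returns a row of NAs for each NA in the condition. The example also refers to the column as Data$Fatmethod, so the comparison uses the column of the merged data frame rather than some other object called Fatmethod. The data frame below is a hypothetical stand-in and may not match the real situation.

```r
# Hypothetical merged data; the third Fatmethod value is missing (NA).
Data <- data.frame(Individual = c(120, 121, 132),
                   Habitat    = c(3, 4, 3),
                   Weight     = c(20.2, 20.0, 21.2),
                   Fat        = c(4, 5, 4),
                   Fatmethod  = c(" E", " B", NA))

# The condition is TRUE, FALSE, NA; the NA produces a row consisting only of NAs.
with_na <- Data[Data$Fatmethod == " E", ]

# which() keeps only the positions that are TRUE, so NA rows disappear.
selectE <- Data[which(Data$Fatmethod == " E"), ]

# How many values are missing in each column of the merged data.
na_counts <- colSums(is.na(Data))

print(with_na)
print(selectE)
```

If na_counts is already large for Habitat type and Weight before any selection, the NAs come from the merge itself rather than from the selection step.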

It is odd it works with other datasets but not this one. Although  
with my other datasets when I ask to select " E", I can still see "  
B" using levels(Fat method) but there is no data there, so my plots  
are correct.
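
The leftover " B" level is expected behaviour for factors: subsetting keeps the full set of levels even when no values use them. A brief sketch with a hypothetical factor:

```r
# Hypothetical factor with the three values described in this message.
Fatmethod <- factor(c(" E", " B", "", " E"))

# Subsetting keeps every original level, used or not.
selected <- Fatmethod[Fatmethod == " E"]
all_levels <- levels(selected)

# Re-creating the factor, or calling droplevels(), removes the unused levels.
refactored <- factor(selected)
dropped    <- droplevels(selected)

print(all_levels)
print(levels(dropped))
```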

Sorry this is long. I'm having difficulty explaining it.

Katherine


On 26-Nov-07, at 5:09 PM, jim holtman wrote:

> That should give you back a subset of 'data' (with all its columns),
> for those with " E" in 'column'.  Can you show an example of your data
> and what the desired output would be.  The posting guide asks "provide
> commented, minimal, self-contained, reproducible code" so we don't
> have to speculate on what you want.
>
> On Nov 26, 2007 5:04 PM, Katherine Jones  
> <kajones at connect.carleton.ca> wrote:
>> This sort of works. It does select the E data, but unfortunately  
>> it doesn't
>> select the data from the other columns; I want to select data  
>> across about 5
>> columns by the factor " E" in one of the columns. It should be  
>> easy, but for
>> some reason it is not working. The spaces being added don't help.
>>
>> It seems to work on my non-merged data files, although the merged  
>> file
>> contains all the data I need.
>>
>> Thanks for the subset command though. Hadn't thought of using that.
>>
>>
>>
>> On 26-Nov-07, at 4:46 PM, jim holtman wrote:
>> ?subset
>>
>>
>> subset(data, column == " E")
>>
>
>
>
> -- 
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem you are trying to solve?


