[R] spss, string factors, selecting
Katherine Jones
kajones at connect.carleton.ca
Tue Nov 27 02:31:00 CET 2007
Hi,
This is probably a case where someone has to see what is happening on
my computer, and it is complicated by my data being from SPSS (not my
choice). It is quite hard to share my data because it is such a large
dataset. I have analysed 9 other datasets that work fine, but this
particular dataset was entered incorrectly, so it requires merging two
datasets. This may be the problem.
Example of data:-
File 1.
[1] Individual [2] Habitat type [3] Weight
File 2.
[1] Individual [2] Fat [3] Fat method.
I merge the two files to create:-
[1] Individual [2] Habitat type [3] Weight [4] Fat [5] Fat method
My merging appears to work in the sense that I can plot Weight versus
Fat and I get data, but if I ask to see the data file I see a sea of
"NAs". So I'm not sure how there can be data there to plot, to see
levels for and to create tables from, when I can't see it as a data
frame. I do get the plot I want.
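
A minimal sketch of how a merge like the one described could be built and inspected. The data frames file1 and file2, their column names and their values are hypothetical stand-ins, not the real SPSS files. It assumes the two files share the Individual column as a key and that some individuals appear in only one file, which is one common way NAs end up in a merged data frame.

```r
# Hypothetical stand-ins for the two files; names and values are assumptions.
file1 <- data.frame(Individual = c(120, 121, 132, 140),
                    Habitat    = c(3, 4, 3, 2),
                    Weight     = c(20.2, 20.0, 21.2, 19.8))
file2 <- data.frame(Individual = c(120, 121, 132),
                    Fat        = c(4, 5, 4),
                    Fatmethod  = c(" E", " B", "  "),
                    stringsAsFactors = FALSE)

# Join on the shared key; all = TRUE keeps individuals found in only one file,
# filling the columns from the other file with NA.
merged <- merge(file1, file2, by = "Individual", all = TRUE)

str(merged)              # column types and the first few values
colSums(is.na(merged))   # number of NAs in each column
```

Checking str() and the per-column NA counts on the real merged object may show whether the NAs are confined to a few columns or rows rather than filling the whole data frame.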
Fat method contains either blank cells, " B" or " E".
I wish to select all the rows in columns 1-4 which contain an " E" in
Fat method.
e.g.
120, 3, 20.2, 4, E
121, 4, 20.0, 5, B
132, 3, 21.2, 4,
I want to select only the row containing " E", so I can plot Fat vs
Habitat and Weight vs. Fat.
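
Since the values carry a leading space (" E", " B") and some cells are blank, one option is to clean the column before comparing. The sketch below uses a hypothetical character vector named fatmethod; trimws() is in base R from version 3.2.0 onward, and a factor column would be converted with as.character() first.

```r
# Hypothetical padded values, similar to what an SPSS import might produce.
fatmethod <- c(" E", " B", "  ", " E")

cleaned <- trimws(fatmethod)      # strip leading and trailing whitespace
cleaned[cleaned == ""] <- NA      # treat blank cells as missing

# TRUE only where the cleaned value is exactly "E"; NA counts as FALSE.
is_E <- !is.na(cleaned) & cleaned == "E"
```

After cleaning, the comparison no longer depends on how many spaces the import added.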
I have been doing this by using
selectE <- Data[Fatmethod == " E", ]
However, this does not work. It turns all of my data in the other
columns into "NA" and I am left only with the Fat method and Fat scores.
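
One possible explanation, offered as an assumption rather than a diagnosis: when the logical index inside [ ] contains NA, R returns a row filled with NA for each such position. The sketch below uses a hypothetical Data frame to show this, and two ways of dropping those positions. It also refers to the column as Data$Fatmethod, because a bare Fatmethod would be looked up in the workspace or in an attached data frame and might not be the column of the merged data.

```r
# Hypothetical merged data with one row that has no fat record.
Data <- data.frame(Individual = c(120, 121, 132, 140),
                   Habitat    = c(3, 4, 3, 2),
                   Weight     = c(20.2, 20.0, 21.2, 19.8),
                   Fat        = c(4, 5, 4, NA),
                   Fatmethod  = c(" E", " B", "  ", NA),
                   stringsAsFactors = FALSE)

# An NA in the logical index produces a row in which every column is NA.
with_na_rows <- Data[Data$Fatmethod == " E", ]

# which() returns only the positions where the test is TRUE, so NA is dropped.
selectE <- Data[which(Data$Fatmethod == " E"), ]

# subset() also treats NA in the condition as FALSE.
selectE2 <- subset(Data, Fatmethod == " E")
```

Both selectE and selectE2 keep all five columns for the matching rows only.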
It is odd that it works with other datasets but not this one. With my
other datasets, though, when I select " E" I can still see " B" using
levels(Fat method), but there is no data there, so my plots are
correct.
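
The leftover " B" in levels() is expected behaviour for factors: subsetting keeps the full set of levels even when no values use them. A small sketch with a hypothetical factor; droplevels() is available in R 2.12.0 and later, and calling factor() on the subset has the same effect in older versions.

```r
# Hypothetical factor with the three kinds of values described above.
fatmethod <- factor(c(" E", " B", "  ", " E"))

only_E <- fatmethod[fatmethod == " E"]
levels_before <- levels(only_E)    # unused levels are still listed

only_E <- droplevels(only_E)       # equivalently: factor(only_E)
levels_after <- levels(only_E)     # only the level that is actually used
```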
Sorry this is long. I'm having difficulty explaining it.
Katherine
On 26-Nov-07, at 5:09 PM, jim holtman wrote:
> That should give you back a subset of 'data' (with all its columns),
> for those with " E" in 'column'. Can you show an example of your data
> and what the desired output would be. The posting guide asks "provide
> commented, minimal, self-contained, reproducible code" so we don't
> have to speculate on what you want.
>
> On Nov 26, 2007 5:04 PM, Katherine Jones
> <kajones at connect.carleton.ca> wrote:
>> This sort of works. It does select the E data, but unfortunately it
>> doesn't select the data from the other columns; I want to select data
>> across about 5 columns by the factor " E" in one of the columns. It
>> should be easy, but for some reason it is not working. The spaces
>> being added don't help.
>>
>> It seems to work on my non-merged data files, although the merged
>> file contains all the data I need.
>>
>> Thanks for the subset command though. Hadn't thought of using that.
>>
>>
>>
>> On 26-Nov-07, at 4:46 PM, jim holtman wrote:
>> ?subset
>>
>>
>> subset(data, column == " E")
>>
>
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem you are trying to solve?