[R] "haven" - read_spss: How to avoid extracting value labels instead of long labels?

Dimitri Liakhovitski dimitri.liakhovitski at gmail.com
Fri Nov 13 02:37:15 CET 2015


I have to rephrase my question again - it's clearly a small bug in
haven. Here is what it is about:

If I have a column in SPSS that has BOTH a long label and value
labels, then everything works fine - I access one with 'label' and
another with 'labels':

> attr(spss1$MYVAR, "label")
[1] "LONG LABEL"
> attr(spss1$MYVAR, "labels")
    DEFINITELY CONSIDER       PROBABLY CONSIDER   PROBABLY NOT CONSIDER DEFINITELY NOT CONSIDER
                      1                       2                       3                       4

However, if I have a column that has no long label and ONLY value
labels, then it's not working properly:

> attr(spss1$MYVAR, "label")
VERY/SOMEWHAT FAMILIAR    NOT AT ALL FAMILIAR
                     1                      2
> attr(spss1$MYVAR, "labels")
VERY/SOMEWHAT FAMILIAR    NOT AT ALL FAMILIAR
                     1                      2

And I actually need to be able to identify whether the long label is empty.
Thank you for looking into it!
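
One possible explanation, offered as an assumption rather than a confirmed
diagnosis of haven: base R's attr() matches attribute names partially unless
exact = TRUE is given, so asking for "label" can fall through to "labels"
when no attribute named exactly "label" exists. Below is a minimal sketch
that reproduces the symptom without haven. The vectors x and y, the list
cols, and the helper get_long_label are made up for illustration.

```r
# Hypothetical stand-ins for two columns as read_spss() might return them:
# x has both a long label and value labels, y has value labels only.
value_labels <- c("VERY/SOMEWHAT FAMILIAR" = 1, "NOT AT ALL FAMILIAR" = 2)
x <- structure(c(1, 2, 1), label = "LONG LABEL", labels = value_labels)
y <- structure(c(1, 2, 1), labels = value_labels)

# Default (exact = FALSE): "label" partially matches "labels" on y,
# so the value labels come back instead of NULL.
partial <- attr(y, "label")

# exact = TRUE: only an attribute named exactly "label" is returned,
# so y yields NULL here.
strict <- attr(y, "label", exact = TRUE)

# Hypothetical helper: returns the long label or a placeholder string.
get_long_label <- function(col, text_if_missing = "NO LABEL IN SPSS") {
  val <- attr(col, "label", exact = TRUE)
  if (is.null(val)) text_if_missing else val
}

# A plain list stands in for the data frame; vapply() insists on exactly
# one string per column, so it cannot silently fall back to a list.
cols <- list(a = x, b = y)
long_labels <- vapply(cols, get_long_label, character(1))
print(long_labels)
```

If that assumption holds, is.null(attr(col, "label", exact = TRUE)) would
tell apart the columns that have no long label.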

Dimitri


On Thu, Nov 12, 2015 at 5:55 PM, Dimitri Liakhovitski
<dimitri.liakhovitski at gmail.com> wrote:
> Looks like a little bug in 'haven':
>
> When I actually look at the attributes of one variable that has no
> long label in SPSS but has Value Labels, I am getting:
> attr(spss1$WAVE, "label")
> NULL
>
> But when I sapply my function longlabels to my data frame and ask it
> to print the long labels for each column, for the same column "WAVE" I
> am getting - instead of NULL:
> NULL
> VERY/SOMEWHAT FAMILIAR    NOT AT ALL FAMILIAR
>                      1                      2
>
> This is, of course, incorrect, because it grabs the next attribute
> (which one?) and replaces NULL with it.
> Any suggestions?
> Thanks!
>
>
>
>
> On Thu, Nov 12, 2015 at 11:56 AM, Dimitri Liakhovitski
> <dimitri.liakhovitski at gmail.com> wrote:
>> Hello!
>>
>> I don't have an example file, but I think my question should be clear
>> without it.
>> I have an SPSS file. I read it in using 'haven':
>>
>> library(haven)
>> spss1 <- read_spss("SPSS_Example.sav")
>>
>> I created a function that extracts the long labels (in SPSS - "Label"):
>>
>> fix_labels <- function(x, TextIfMissing) {
>>       val <- attr(x, "label")
>>       if (is.null(val)) TextIfMissing else val
>> }
>> longlabels <- sapply(spss1, fix_labels, TextIfMissing = "NO LABEL IN SPSS")
>>
>> This function is supposed to create a vector of long labels and
>> usually it does, e.g.:
>>
>> str(longlabels)
>>  Named chr [1:64] "Serial number" ...
>>  - attr(*, "names")= chr [1:64] "Respondent_Serial" "weight" "r7_1" "r7_2" ...
>>
>> However, I just got an SPSS file with 92 columns and ran exactly the
>> same function on it. Now, I am getting not a vector, but a list:
>>
>> str(longlabels)
>> List of 92
>>  $ VEHRATED      : chr "VEHICLE RATED"
>>  $ RESPID        : chr "RESPONDENT ID"
>>  $ RESPID8       : chr "8 DIGIT RESPONDENT NUMBER"
>>
>> An observation about the structure of longlabels here: for those columns
>> that do NOT have a long label in SPSS but DO have Values (value
>> labels), my function grabs their value labels, so that now
>> my long label is recorded as a numeric vector with names, e.g.:
>>
>>  $ AWARE2        : Named num [1:2] 1 2
>>   ..- attr(*, "names")= chr [1:2] "VERY/SOMEWHAT FAMILIAR" "NOT AT ALL FAMILIAR"
>>
>> Question: How could I avoid the extraction of the Value Labels for the
>> columns that have no long labels?
>>
>> Thank you very much!
>> --
>> Dimitri Liakhovitski
>
>
>
> --
> Dimitri Liakhovitski



-- 
Dimitri Liakhovitski


