[R] bug(?) in str() with strict.width = "cut" when applied to data frame with numeric component AND factor or character component with longer levels/strings
Gerrit Eichner
Gerrit.Eichner at math.uni-giessen.de
Wed Oct 16 10:07:03 CEST 2013
Thanks, Duncan,
for the good (indirect) hint: after a restart of R the problem is --
fortunately :-) -- not reproducible anymore for me either. The R session
had been running for a long time and I recall doing some
(system-related) things outside of R that may have interfered with it; I
just forgot to take that possibility into consideration. :(
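
For anyone who runs into something similar, here is a minimal sketch of the two checks that Duncan's hint suggests: whether a stray copy of str() sits in the workspace, and whether str() lists each component of the example data frame exactly once. It is not taken from the original exchange; the helper name check_str_masking and the variable components_ok are made up for illustration, and the sketch assumes it is run in a freshly started session.

```r
# Hypothetical helper (not part of the original thread): report where `str`
# is found on the search path and whether a copy lives in the global workspace.
check_str_masking <- function() {
  list(
    found_in     = find("str"),
    in_workspace = exists("str", envir = globalenv(), inherits = FALSE)
  )
}

# Rebuild the example data frame from the original post.
k <- 43     # length of levels or character strings
n <- 11     # number of rows of data frame
M <- 10000  # order of magnitude of numerical values
set.seed(47)
longer.char.string <- paste(sample(letters, k, replace = TRUE), collapse = "")
X <- data.frame(A = 1:n * M, B = rep(longer.char.string, n))

# Capture what str() prints and count the lines that describe each component.
out <- capture.output(str(X, strict.width = "cut"))
components_ok <- sum(grepl("\\$ A:", out)) == 1 &&
  sum(grepl("\\$ B:", out)) == 1

print(check_str_masking())
print(components_ok)
```

If in_workspace is TRUE, or components_ok is FALSE, restarting R (for example with R --vanilla) and rerunning the sketch is a cheap way to tell a session-specific glitch from a reproducible problem.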
Regards -- Gerrit
On Tue, 15 Oct 2013, Duncan Murdoch wrote:
> On 15/10/2013 7:53 AM, Gerrit Eichner wrote:
>> Dear list subscribers,
>>
>> here is a small artificial example to demonstrate the problem that I
>> encountered when looking at the structure of a (larger) data frame that
>> comprised (among other components)
>>
>> a numeric component with elements of order of magnitude 10000 or more, and
>>
>> a factor or character component with longer levels/strings:
>>
>>
>> k <- 43 # length of levels or character strings
>> n <- 11 # number of rows of data frame
>> M <- 10000 # order of magnitude of numerical values
>>
>> set.seed( 47) # to reproduce the following artificial character string
>> longer.char.string <- paste( sample( letters, k, replace = TRUE),
>>                              collapse = "")
>>
>> X <- data.frame( A = 1:n * M,
>>                  B = rep( longer.char.string, n))
>>
>>
>> The following call to str() apparently gives a wrong result:
>>
>> str( X, strict.width = "cut")
>>
>> 'data.frame': 11 obs. of 2 variables:
>> $ A: num 1e+04 2e+04 3e+04 4e+04 5e+04 6e+04 7e+04 8e+04 9e+04 1e+..
>> $ A: num 1e+04 2e+04 3e+04 4e+04 5e+04 6e+04 7e+04 8e+04 9e+04 1e+..
>>
>>
>> whereas the correct result appears for str( X) or if you decrease k to 42
>> (isn't that "the answer"? ;-) ) or n to 10 or M to 1000 (or smaller,
>> respectively).
>>
>>
>> I tried to dig into the entrails of str.default(), where the cause may
>> lie, but got lost pretty soon. So, I am hoping that someone may already
>> have a work-around or patch (or dares to dig further)? Thank you for any
>> feedback!
>
> I can't reproduce this. I don't have a 64 bit copy of 3.0.2 handy, but I
> don't see it in 64 bit 3.0.1, or 64 bit 3.0.2-patched, or various 32 bit
> versions.
>
> Is it reproducible for you? It looks to me as though (if it isn't just
> something weird on your system, e.g. an old copy of str() in your workspace),
> it might be a memory protection problem: something needed to be duplicated
> but wasn't. But unless I can see it happen, I can't start to fix it.
>
> Duncan Murdoch
>
>>
>> Best regards -- Gerrit
>>
>> PS:
>>
>> > sessionInfo()
>>
>> R version 3.0.2 (2013-09-25)
>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>>
>> locale:
>> [1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
>> [3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
>> [5] LC_TIME=German_Germany.1252
>>
>> attached base packages:
>> [1] splines stats graphics grDevices utils datasets
>> [7] methods base
>>
>> other attached packages:
>> [1] nparcomp_2.0 multcomp_1.2-21 mvtnorm_0.9-9996
>> [4] car_2.0-19 Hmisc_3.12-2 Formula_1.1-1
>> [7] survival_2.37-4 fortunes_1.5-0
>>
>> loaded via a namespace (and not attached):
>> [1] cluster_1.14.4 grid_3.0.2 lattice_0.20-23 MASS_7.3-29
>> [5] nnet_7.3-7 rpart_4.1-3 stats4_3.0.2 tools_3.0.2
>>
>> ---------------------------------------------------------------------
>> Dr. Gerrit Eichner Mathematical Institute, Room 212
>> gerrit.eichner at math.uni-giessen.de Justus-Liebig-University Giessen
>> Tel: +49-(0)641-99-32104 Arndtstr. 2, 35392 Giessen, Germany
>> Fax: +49-(0)641-99-32109 http://www.uni-giessen.de/cms/eichner
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.