[R] How can I improve an ugly, dumb hack

Bert Gunter gunter.berton at gene.com
Thu Sep 6 21:58:39 CEST 2012


As Linus would say:

AGHHH...!

1. Rui's solution clearly violates my conditions 1) and 2) and so does not work.

2. David's violates my UNSTATED condition 4: The order of the columns
cannot be changed. Any matrix "columns" must be expanded "in place"
between their flanking columns of the data frame. Note that the
solution below moves the matrix columns to the end, which changes the
column order whenever a matrix is not already the last column.
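
For illustration only, here is a minimal sketch of what "expanded in
place" could look like. The helper name expand_matrix_cols and the
example data (with the matrix column deliberately placed in the middle)
are assumptions, not code from this thread. stringsAsFactors = TRUE is
passed explicitly because the default changed to FALSE in R 4.0.0.

```r
# Hypothetical helper (not from the thread): expand each matrix column of a
# data frame into ordinary columns, keeping the original column order.
expand_matrix_cols <- function(d) {
  pieces <- lapply(names(d), function(nm) {
    col <- d[[nm]]
    if (is.matrix(col)) {
      # One data frame column per matrix column, named by the matrix colnames
      as.data.frame(col, stringsAsFactors = TRUE)
    } else {
      # Wrap a plain vector as a one-column data frame under its own name
      setNames(data.frame(col, stringsAsFactors = TRUE), nm)
    }
  })
  # Bind the pieces in their original sequence; cbind on data frames does
  # not prefix the inner column names
  do.call(cbind, pieces)
}

# Assumed example data: the matrix column sits between two vector columns
d <- data.frame(a = 1:3)
d$c <- matrix(letters[1:6], nrow = 3, dimnames = list(NULL, c("x", "y")))
d$b <- 4:6

dd <- expand_matrix_cols(d)
names(dd)  # "a" "x" "y" "b"
```

Because each column is converted on its own and the pieces are bound in
their original sequence, no name vector has to be patched afterwards. A
matrix without column names would come out as V1, V2, ..., which may or
may not be acceptable.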

... and I apologize for my continuing -- and embarrassing -- lack of clarity.

However, even so, I take some comfort that David needed to resort to
sapply(...) -- so my original "solution" may not be as dumb as I
thought.

In any case, if anyone wants to bother with this further, please
respond offline, as this has continued here long enough.

Thanks.

-- Bert

On Thu, Sep 6, 2012 at 11:06 AM, David Winsemius <dwinsemius at comcast.net> wrote:
>
> On Sep 6, 2012, at 10:28 AM, Bert Gunter wrote:
>
>> On Thu, Sep 6, 2012 at 10:20 AM, David Winsemius <dwinsemius at comcast.net> wrote:
>> <snipped>
>>
>>> I guess this means you are not the one performing the d$c <- m step? If you were in control of that step, you could get different (and more to your liking) behavior with 'cbind.data.frame':
>>
>> Correct. d is given to me already, as described. I constructed it in
>> my post only to provide an example of what it might look like. I
>> apologize for evidently being unclear about this (and I tried real
>> hard ... sigh....).
>>
>
> OK, then:
>
> > cbind(d[, !sapply(d, is.matrix)], d[, sapply(d, is.matrix)])
>   a b x y
> 1 1 4 a d
> 2 2 5 b e
> 3 3 6 c f
>
> HTH;
>
> --
> DW
>
>
>> -- Bert
>>
>>
>>>
>>> > cbind(d, m)
>>>   a b x y
>>> 1 1 4 a d
>>> 2 2 5 b e
>>> 3 3 6 c f
>>> > ncol(cbind(d, m))
>>> [1] 4
>>>
>>>
>>>>
>>>> Now what I wish to do is programmatically convert d to a 4 column
>>>> frame with names c("a","b","x","y"). Of course:
>>>>
>>>> 1. The column classes/modes must be preserved (character going to
>>>> factor and numeric remaining numeric).
>>>>
>>>> 2. I assume that I do not know a priori which of d's
>>>> components/columns are matrices and which are vectors.
>>>>
>>>> 3. There may be many more columns, vectors or matrices, than
>>>> just the three in this little example.
>>>>
>>>> I can easily and sensibly accomplish these 3 tasks, but the problem is
>>>> that I run afoul of data frame column naming procedures in doing so,
>>>> about which the data.frame Help page says rather enigmatically:
>>>>
>>>> "How the names of the data frame are created is complex, and the rest
>>>> of this paragraph is only the basic story." Indeed!
>>>> (This, of course, is shorthand for "Go look at the source if you want
>>>> to know!" )
>>>>
>>>> Anyway, AFAICT from the Help, any "simple" approach to conversion
>>>> using data.frame results in "c.x" and "c.y" for the names of the last
>>>> two columns. I **can** get what I want by explicitly constructing the
>>>> vector of names via the following ugly hack; my question is, can it be
>>>> improved?
>>>>
>>>> > dd <- do.call(data.frame, d)
>>>>
>>>> > dd
>>>>   a b c.x c.y
>>>> 1 1 4   a   d
>>>> 2 2 5   b   e
>>>> 3 3 6   c   f
>>>>
>>>> > ncol(dd)
>>>> [1] 4
>>>>
>>>> > cnames <- sapply(d, colnames)
>>>> > cnames
>>>> $a
>>>> NULL
>>>>
>>>> $b
>>>> NULL
>>>>
>>>> $c
>>>> [1] "x" "y"
>>>>
>>>>
>>>> > names(dd) <- unlist(ifelse(sapply(cnames, is.null), names(d), cnames))
>>>> ## Yuck!
>>>>
>>>> > dd
>>>>   a b x y
>>>> 1 1 4 a d
>>>> 2 2 5 b e
>>>> 3 3 6 c f
>>>>
>>>> Cheers to all,
>>>> Bert
>>>>
>>>>
>>>> --
>>>>
>>>> Bert Gunter
>>>> Genentech Nonclinical Biostatistics
>>>>
>>>> Internal Contact Info:
>>>> Phone: 467-7374
>>>> Website:
>>>> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
>>>>
>>>
>>> David Winsemius, MD
>>> Alameda, CA, USA
>>>
>>
>
>






