[R] Data frame with 3 columns to matrix

Tue Apr 19 17:17:21 CEST 2011

David Winsemius <dwinsemius at comcast.net> writes:

> On Apr 19, 2011, at 8:16 AM, Michael Bach wrote:
>
>> David Winsemius <dwinsemius at comcast.net> writes:
>>
>>> Perhaps, but only if the third row of your example was incorrectly constructed:
>>>> dta <- rd.txt("   x y   z
>>> 1 1.00 5 0.5
>>> 2 1.02 5 0.7
>>> 3 1.04 7 0.1
>>> 4 1.06 9 0.4")
>>> #rd.txt() is a combo fn of read.table and textConnection
>>>
>>>> mat <- matrix(NA, ncol=NROW(dta)+1, nrow=NROW(dta)+1)
>>>> mat[2:NROW(mat),1] <- dta[["x"]]
>>>> mat[1,2:NROW(mat)] <- dta[["y"]]
>>>> diag(mat) <- c(NA, dta[["z"]])
>>>> mat
>>>     [,1] [,2] [,3] [,4] [,5]
>>> [1,]   NA  5.0  5.0  7.0  9.0
>>> [2,] 1.00  0.5   NA   NA   NA
>>> [3,] 1.02   NA  0.7   NA   NA
>>> [4,] 1.04   NA   NA  0.1   NA
>>> [5,] 1.06   NA   NA   NA  0.4
>>>
>>>
>>
>> Thanks for your answer David,
>>
>> but this yields a diagonal matrix only.  I think I did not make myself
>> clear enough.  In the original 3 column data frame, there could have
>> been a pair of x and y with identical y's but different x's and z's.
>> The way my data source is derived, there is a guarantee that there
>> are no two rows with identical x and y in the original data frame.  In
>> the end, x and y serve as a grid, with z values at each point in the
>> grid or NA's if there is no z value for an x and y pair.  The number of
>> rows in the data frame is then equal to the number of non-NA values in
>> the resulting matrix.
>>
>> Another try, let's assume this original data frame:
>>
>>  x  y z
>> 1 2  5 1
>> 2 2  6 1
>> 3 3  7 1
>> 4 3  8 1
>> 5 3  9 1
>> 6 5 10 2
>> 7 5 11 2
>> 8 5 12 2
>>
>> Then I would like to get
>>
>>     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
>> [1,]   NA    5    6    7    8    9   10   11   12
>> [2,]    2    1    1
>> [3,]    2
>> [4,]    3              1    1    1
>> [5,]    3
>> [6,]    3
>> [7,]    5                             2    2    2
>> [8,]    5
>> [9,]    5
>>
>> I left out all the NA's except the first; they go wherever there is no
>> z value, e.g. at x=5 and y=8.
>>
>> Do you see what I mean?
>
> I do, ... now anyway. Your earlier data example had non-integer x and y values which made
> what I will now offer infeasible (or at the very least ambiguous). Indexing with decimal
> numbers does not provoke an error; the truncated value is silently used. With integer
> indices you can use a two-column matrix as an argument to "[":
>
>> mat <- matrix(NA, nrow=max(dta[[1]])+1, ncol=max(dta[[2]])+1 )
>> mat[data.matrix(dta[,1:2])] <- dta[,3]
>> mat
>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
> [1,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA
> [2,]   NA   NA   NA   NA    1    1   NA   NA   NA    NA    NA    NA
> [3,]   NA   NA   NA   NA   NA   NA    1    1    1    NA    NA    NA
> [4,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA
> [5,]   NA   NA   NA   NA   NA   NA   NA   NA   NA     2     2     2
> [6,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA
>      [,13]
> [1,]    NA
> [2,]    NA
> [3,]    NA
> [4,]    NA
> [5,]    NA
> [6,]    NA
>
> I leave the insertion of the first row and column, and the removal of the extra columns
> induced by the mismatch of the values and row numbers, to you, since .....
>> mat[, 4:12]
>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
> [1,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
> [2,]   NA    1    1   NA   NA   NA   NA   NA   NA
> [3,]   NA   NA   NA    1    1    1   NA   NA   NA
> [4,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
> [5,]   NA   NA   NA   NA   NA   NA    2    2    2
> [6,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
>
> --
>
> David Winsemius, MD
> West Hartford, CT

Thanks for your tips and advice!

I will see what I can work out alone from here on...
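
For later readers, here is a minimal sketch of one way to finish this off, not necessarily what either poster ended up using. It keeps the two-column matrix indexing suggested above, but indexes by the position of each x and y among the sorted unique values (via match()) instead of by the values themselves. That avoids the extra empty rows and columns, and it does not depend on x and y being integers, although match() on doubles needs the values to be exactly equal. The names xs, ys and grid are made up for this illustration; the x and y values go into the dimnames rather than into a first row and column.

```r
# Example data frame from the thread: one row per (x, y) pair that has a z value
dta <- data.frame(
  x = c(2, 2, 3, 3, 3, 5, 5, 5),
  y = 5:12,
  z = c(1, 1, 1, 1, 1, 2, 2, 2)
)

# Sorted unique grid coordinates (hypothetical helper names)
xs <- sort(unique(dta$x))
ys <- sort(unique(dta$y))

# Empty grid: one row per unique x, one column per unique y
grid <- matrix(NA_real_, nrow = length(xs), ncol = length(ys),
               dimnames = list(x = as.character(xs), y = as.character(ys)))

# Two-column (row, column) index matrix built from positions, not raw values
idx <- cbind(match(dta$x, xs), match(dta$y, ys))
grid[idx] <- dta$z

grid
```

Cells with no (x, y) pair in the data frame, such as x=5 and y=8, stay NA, so the number of non-NA cells equals the number of rows in dta.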