[R] Data frame with 3 columns to matrix
Michael Bach
phaebz at gmail.com
Tue Apr 19 17:17:21 CEST 2011
David Winsemius <dwinsemius at comcast.net> writes:
> On Apr 19, 2011, at 8:16 AM, Michael Bach wrote:
>
>> David Winsemius <dwinsemius at comcast.net> writes:
>>
>>> Perhaps but only if the third row of your example was incorrectly constructed:
>>>> dta <- rd.txt(" x y z
>>> 1 1.00 5 0.5
>>> 2 1.02 5 0.7
>>> 3 1.04 7 0.1
>>> 4 1.06 9 0.4")
>>> #rd.txt() is a combo fn of read.table and textConnection
>>>
>>>> mat <- matrix(NA, ncol=NROW(dta)+1, nrow=NROW(dta)+1)
>>>> mat[2:NROW(mat),1] <- dta[["x"]]
>>>> mat[1,2:NROW(mat)] <- dta[["y"]]
>>>> diag(mat) <- c(NA, dta[["z"]])
>>>> mat
>>>      [,1] [,2] [,3] [,4] [,5]
>>> [1,]   NA  5.0  5.0  7.0  9.0
>>> [2,] 1.00  0.5   NA   NA   NA
>>> [3,] 1.02   NA  0.7   NA   NA
>>> [4,] 1.04   NA   NA  0.1   NA
>>> [5,] 1.06   NA   NA   NA  0.4
>>>
>>>
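The rd.txt() helper itself is not shown in the thread. A minimal sketch of what such a wrapper might look like, assuming it simply passes the string through textConnection() to read.table() with header = TRUE (the function body and its arguments are a guess, not David's actual code):

```r
# Hypothetical reconstruction; the real rd.txt() was not posted.
rd.txt <- function(txt, header = TRUE, ...) {
  con <- textConnection(txt)   # treat the string as a file-like input
  on.exit(close(con))          # always release the connection
  read.table(con, header = header, ...)
}

# The header line has one field fewer than the data lines,
# so read.table() uses the first column as row names.
dta <- rd.txt(" x y z
1 1.00 5 0.5
2 1.02 5 0.7
3 1.04 7 0.1
4 1.06 9 0.4")
dta
```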
>>
>> Thanks for your answer David,
>>
>> but this yields a diagonal matrix only. I think I did not make myself
>> clear enough. In the original 3 column data frame, there could have
>> been a pair of x and y with identical y's but different x's and z's.
>> The way my data source is derived, there is a guarantee that there
>> are no two rows with identical x and y in the original data frame. In
>> the end, x and y serve as a grid, with z values at each point in the
>> grid or NA's if there is no z value for a x and y pair. The number of
>> rows in the data frame is then equal to the number of non-NA values in
>> the resulting matrix.
>>
>> Another try: let's assume this original data frame:
>>
>> x y z
>> 1 2 5 1
>> 2 2 6 1
>> 3 3 7 1
>> 4 3 8 1
>> 5 3 9 1
>> 6 5 10 2
>> 7 5 11 2
>> 8 5 12 2
>>
>> Then I would like to get
>>
>>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
>> [1,]   NA    5    6    7    8    9   10   11   12
>> [2,]    2    1    1
>> [3,]    2
>> [4,]    3              1    1    1
>> [5,]    3
>> [6,]    3
>> [7,]    5                             2    2    2
>> [8,]    5
>> [9,]    5
>>
>> I left out all the NA's except the first. A blank cell means there is
>> no z value for that pair, e.g. x=5 and y=8.
>>
>> Do you see what I mean?
>
> I do, ... now anyway. Your earlier data example had non-integer x and y values, which made
> what I will now offer infeasible (or at the very least ambiguous). Indexing with decimal
> numbers does not provoke an error; instead, the truncated value is used. With integer
> indices you can use a two-column matrix as an argument to "[":
>
>> mat <- matrix(NA, nrow=max(dta[[1]])+1, ncol=max(dta[[2]])+1 )
>> mat[data.matrix(dta[,1:2])] <- dta[,3]
>> mat
>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
> [1,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA
> [2,]   NA   NA   NA   NA    1    1   NA   NA   NA    NA    NA    NA
> [3,]   NA   NA   NA   NA   NA   NA    1    1    1    NA    NA    NA
> [4,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA
> [5,]   NA   NA   NA   NA   NA   NA   NA   NA   NA     2     2     2
> [6,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA    NA    NA
>      [,13]
> [1,]    NA
> [2,]    NA
> [3,]    NA
> [4,]    NA
> [5,]    NA
> [6,]    NA
>
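To illustrate the two points above, here is a small sketch on made-up data (the names m, idx and v are only for this example): each row of a two-column matrix is read as one (row, column) pair, and fractional indices are truncated towards zero rather than rejected.

```r
# Two-column matrix indexing: each row of idx is one (row, column) pair.
m <- matrix(0, nrow = 3, ncol = 4)
idx <- cbind(c(1, 2, 3), c(2, 3, 4))
m[idx] <- c(10, 20, 30)   # fills m[1,2], m[2,3] and m[3,4]
m

# Fractional indices are silently truncated, not rounded.
v <- c("a", "b", "c")
v[2.9]                    # same element as v[2]
```

This is why the approach only makes sense when x and y are (small) positive integers: values such as 1.02 and 1.04 would all collapse onto index 1.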
> I leave the insertion of the first row and columns and removal of the extra columns
> induced by the mismatch of the values and row numbers to you, since .....
>> mat[, 4:12]
>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
> [1,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
> [2,]   NA    1    1   NA   NA   NA   NA   NA   NA
> [3,]   NA   NA   NA    1    1    1   NA   NA   NA
> [4,]   NA   NA   NA   NA   NA   NA    2    2    2
> [5,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
> [6,]   NA   NA   NA   NA   NA   NA   NA   NA   NA
>
> --
>
> David Winsemius, MD
> West Hartford, CT
Thanks for your tips and advice!
I will see what I can work out alone from here on...
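
For reference, a minimal sketch of one way the remaining steps might be done. It is not David's method: instead of using x and y directly as indices, it looks up their positions among the distinct values with match(), so it should also work for non-integer coordinates. It assumes one matrix row per distinct x (rather than one per data frame row) and that no (x, y) pair occurs twice; the names xs, ys and grid are only for illustration.

```r
# Example data from earlier in the thread
dta <- data.frame(x = c(2, 2, 3, 3, 3, 5, 5, 5),
                  y = 5:12,
                  z = c(1, 1, 1, 1, 1, 2, 2, 2))

xs <- sort(unique(dta$x))   # grid coordinates along the rows
ys <- sort(unique(dta$y))   # grid coordinates along the columns

# Body of the grid: NA wherever no (x, y) pair exists
grid <- matrix(NA_real_, nrow = length(xs), ncol = length(ys))
grid[cbind(match(dta$x, xs), match(dta$y, ys))] <- dta$z

# Put the y values in the first row and the x values in the first column
mat <- rbind(c(NA, ys), cbind(xs, grid))
dimnames(mat) <- NULL
mat
```

With this layout the number of non-NA cells in grid equals the number of rows in dta, which matches the property described above.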