[R] applying math/stat functions to rows in data frame

Marc Schwartz marc_schwartz at comcast.net
Sat Sep 15 18:32:11 CEST 2007


On Sat, 2007-09-15 at 09:02 -0700, Gerard Smits wrote:
> Hi All,
> 
> There are a variety of functions that can be applied to a variable 
> (column) in a data frame: mean, min, max, sd, range, IQR, etc.
> 
> I am aware of only two that work on the rows, using q1-q3 as example 
> variables:
> 
> rowMeans(cbind(q1,q2,q3),na.rm=T)   #mean of multiple variables
> rowSums (cbind(q1,q2,q3),na.rm=T)   #sum of multiple variables
> 
> Can the standard column functions (listed in the first sentence) be 
> applied to rows, with the use of correct indexes to reference the 
> columns of interest?  Or, must these summary functions be programmed 
> separately to work on a row?
> 
> Thanks,
> 
> Gerard

The answer is: it depends.

If the row can be coerced to a numeric vector, then yes. This presumes
that the data frame contains a single data type or the subset of columns
you need contains a single data type.

If the row contains multiple data types, then the row becomes a single
row data frame or a list and you would have to consider other possible
approaches.

For example:

Taking the first row of the 'iris' dataset yields a single-row data
frame:

> str(iris[1, ])
'data.frame':   1 obs. of  5 variables:
 $ Sepal.Length: num 5.1
 $ Sepal.Width : num 3.5
 $ Petal.Length: num 1.4
 $ Petal.Width : num 0.2
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1

or if you set 'drop = TRUE', a list:

> str(iris[1, , drop = TRUE])
List of 5
 $ Sepal.Length: num 5.1
 $ Sepal.Width : num 3.5
 $ Petal.Length: num 1.4
 $ Petal.Width : num 0.2
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1


If, however, you remove the last column, Species, which is a factor, you
can coerce the remaining columns to a numeric matrix:

> str(as.matrix(iris[, -5]))
 num [1:150, 1:4] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
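
As a rough sketch of what that buys you (the object names 'm' and 'row1'
are just placeholders for illustration), once the numeric columns are in
a matrix, a single row is an ordinary numeric vector and the usual
column functions accept it:

```r
# Numeric columns only; Species (column 5) is dropped
m <- as.matrix(iris[, -5])

# A single row of a numeric matrix is a named numeric vector
row1 <- m[1, ]

# The standard summary functions now work on that row
row1_mean  <- mean(row1)
row1_sd    <- sd(row1)
row1_range <- range(row1)
row1_iqr   <- IQR(row1)
```

This assumes all of the columns you keep are numeric; a leftover factor
or character column would turn the whole matrix into character.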



Some functions, such as rowSums(), will attempt this coercion
internally, but they still fail if the result is not numeric.

For example:

> rowSums(iris)
Error in rowSums(x, prod(dn), p, na.rm) : 'x' must be numeric


However, with the factor column removed:

> head(rowSums(iris[, -5]))
[1] 10.2  9.5  9.4  9.4 10.2 11.4
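
For the other functions in the question (sd, max, range, and so on), one
common approach, not shown in the session above, is apply() with
MARGIN = 1 on the numeric matrix. The following is only a sketch under
the same assumption that the selected columns are all numeric; the
object names are placeholders:

```r
# Numeric columns only; Species (column 5) is dropped
m <- as.matrix(iris[, -5])

# MARGIN = 1 applies the function to each row
row_sd  <- apply(m, 1, sd)
row_max <- apply(m, 1, max)

# Extra arguments such as na.rm are passed through to the function
row_min <- apply(m, 1, min, na.rm = TRUE)

# range() returns two values per row, so apply() gives a 2 x 150
# matrix; transpose it to get one row of output per input row
row_range <- t(apply(m, 1, range))
```

With q1-q3 from the original question, the same pattern would be
apply(cbind(q1, q2, q3), 1, sd, na.rm = TRUE).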


HTH,

Marc Schwartz