[R] Can't compute row means of two columns of a dataframe.

@vi@e@gross m@iii@g oii gm@ii@com
Sat Jun 8 20:15:46 CEST 2024


John,

Maybe you can clarify what you want the output to look like. It took me a
while to realize what you may want as it is NOT properly described as
wanting rowsums.

There is a standard function called rowMeans() that probably does what you
want if you want the mean of each row, taken across all columns, as in:

> rowMeans(xxxz)
 [1]  84.33333  87.00000  89.66667  92.33333  95.00000  97.66667 100.33333
 [8] 103.66667 106.33333 109.00000 112.33333 115.00000 118.00000 121.33333
[15] 124.00000 127.33333 130.66667 134.00000 137.00000

It does not add the means to the original data.frame, if you wanted them
there, but that is easy enough to do.

> xxxz$Average20 <-rowMeans(xxxz)
> head(xxxz)
  TotalInches Low20 High20 Average20
1          58    84    111  84.33333
2          59    87    115  87.00000
3          60    90    119  89.66667
4          61    93    123  92.33333
5          62    96    127  95.00000
6          63    99    131  97.66667

Your construct is more complex and it looks like you want to do this to a
subset of two columns. Again, straightforward:

xxxz$Average20 <-rowMeans(xxxz[, c("Low20", "High20")])
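
A minimal sketch to check that the two-column version gives the per-row mean of Low20 and High20. It uses only the first three rows of the posted data; the names `demo` and `by_hand` are made up for illustration.

```r
# First three rows of the posted data (illustrative subset)
demo <- data.frame(TotalInches = c(58, 59, 60),
                   Low20       = c(84, 87, 90),
                   High20      = c(111, 115, 119))

# Row means over just the two columns of interest
demo$Average20 <- rowMeans(demo[, c("Low20", "High20")])

# The same quantity computed by hand, as a cross-check
by_hand <- (demo$Low20 + demo$High20) / 2
stopifnot(isTRUE(all.equal(demo$Average20, by_hand)))

demo$Average20   # 97.5 101.0 104.5
```

Selecting the columns by name also keeps TotalInches out of the average, which a plain rowMeans() over the whole data.frame would include.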

And I probably would do this using a dplyr mutate but that is outside the
scope.

This does not help explain your error, so let me look at what you are trying
to do.


What did you expect the second argument of by() to do? You seem to be
passing the TotalInches column as INDICES. What is that for?

by(xxxz[,c("Low20","High20")],
   xxxz[,"TotalInches"],
   mean)

The documentation suggests this is for splitting by factors. I do not see
multiple instances of any TotalInches value, so why is some kind of
grouping needed?

My guess is that you are using the wrong function, or using it the wrong
way, for your needs. The warnings may relate to that.
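
A rough sketch of where the warnings likely come from. The warning text shows `mean.default(data[x, , drop = FALSE], ...)`, which suggests by() hands each group to the function as a data.frame. mean() does not average a data.frame, so it warns and returns NA. The toy data and the names `demo`, `na_result` and `grouped` are illustrative only.

```r
demo <- data.frame(TotalInches = c(58, 59, 60),
                   Low20       = c(84, 87, 90),
                   High20      = c(111, 115, 119))

# mean() of a one-row data.frame: warns and returns NA
na_result <- suppressWarnings(mean(demo[1, c("Low20", "High20")]))
is.na(na_result)   # TRUE

# If by() is really wanted, flatten each piece to a numeric vector first
grouped <- by(demo[, c("Low20", "High20")],
              demo$TotalInches,
              function(d) mean(unlist(d)))

as.numeric(grouped)   # 97.5 101.0 104.5
```

The groups come back ordered by the sorted values of the grouping variable. That matches the row order here only because TotalInches is unique and already sorted, so assigning the result back as a column is fragile. rowMeans() on the two columns avoids the issue.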


-----Original Message-----
From: R-help <r-help-bounces using r-project.org> On Behalf Of Sorkin, John
Sent: Saturday, June 8, 2024 1:38 PM
To: r-help using r-project.org (r-help using r-project.org) <r-help using r-project.org>
Subject: [R] Can't compute row means of two columns of a dataframe.

I have a data frame with three columns, TotalInches, Low20, High20. For each
row of the dataset, I am trying to compute the mean of Low20 and High20. 

xxxz <- structure(list(
  TotalInches = c(58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70,
                  71, 72, 73, 74, 75, 76),
  Low20 = c(84, 87, 90, 93, 96, 99, 102, 106, 109, 112, 116, 119, 122,
            126, 129, 133, 137, 141, 144),
  High20 = c(111, 115, 119, 123, 127, 131, 135, 140, 144, 148, 153, 157,
             162, 167, 171, 176, 181, 186, 191)
), class = "data.frame", row.names = c(NA, -19L))
xxxz
str(xxxz)
xxxz$Average20 <- by(xxxz[,c("Low20","High20")],xxxz[,"TotalInches"],mean)
warnings()

When I run the code above, I don't get the means by row. I get the following
warning messages, one for each row of the dataframe.

Warning messages:
1: In mean.default(data[x, , drop = FALSE], ...) :
  argument is not numeric or logical: returning NA
2: In mean.default(data[x, , drop = FALSE], ...) :
  argument is not numeric or logical: returning NA

Can someone tell me what I am doing wrong, and how I can compute the row
means?

Thank you,
John

John David Sorkin M.D., Ph.D.
Professor of Medicine, University of Maryland School of Medicine;
Associate Director for Biostatistics and Informatics, Baltimore VA Medical
Center Geriatrics Research, Education, and Clinical Center; 
PI Biostatistics and Informatics Core, University of Maryland School of
Medicine Claude D. Pepper Older Americans Independence Center;
Senior Statistician University of Maryland Center for Vascular Research;

Division of Gerontology and Palliative Care,
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
Cell phone 443-418-5382


