[R] help needed using t.test with factors
Thomas Adams
Thomas.Adams at noaa.gov
Thu Feb 4 20:45:15 CET 2010
Dennis,
Thank you for the suggestion, but I get this error:
> t.test(MAE ~ type,data=data)
Error in t.test.formula(MAE ~ type, data = data) :
grouping factor must have exactly 2 levels
Tom
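
A note on the error above: the formula method of t.test needs a grouping
factor with exactly two levels, and type has five here (avn, eta, hpc, ngm,
rfc). A minimal sketch of one way around it, assuming the data frame is
called data as in the call above: subset to the two types of interest and
drop the unused levels before testing. The small data frame below is only a
stand-in built from the 2001-06-15 MAE values in the quoted table, and the
name two is illustrative.

```r
# Toy stand-in for the full data: MAE values copied from the quoted table
# (2001-06-15, six INTERVAL bins each for hpc, rfc and ngm).
data <- data.frame(
  type = factor(rep(c("hpc", "rfc", "ngm"), each = 6)),
  MAE  = c(1.172, 0.014, 0.085, 0.137, 0.237, 0.480,   # hpc
           1.168, 0.014, 0.084, 0.134, 0.232, 0.476,   # rfc
           1.345, 0.017, 0.073, 0.137, 0.263, 0.551)   # ngm
)

# Keep only the two groups to compare and drop the now-empty levels,
# so the grouping factor has exactly 2 levels.
two <- droplevels(subset(data, type %in% c("hpc", "rfc")))
nlevels(two$type)

# Welch two-sample t-test (var.equal = FALSE is the default).
res <- t.test(MAE ~ type, data = two)
print(res)
```

Without droplevels the subset would still carry all of the original factor
levels, which is what triggers the "exactly 2 levels" error.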
Dennis Murphy wrote:
> Hi:
>
> On Thu, Feb 4, 2010 at 11:07 AM, Thomas Adams <Thomas.Adams at noaa.gov> wrote:
>
> I am trying to use t.test on the following data:
>
> date       type INTERVAL   nCASES   MTF   SDF   MTO   SDO   nFST    MF   nOBS    MO    MB BIASCV BIASEV     ME   MAE  RMSE   CRCF
> 2001-06-15 avn  GE1.00       4385 0.246 0.300 1.502 0.556   1367 1.373   4385 1.502 1.471  0.285  0.164 -1.256 1.266 1.399  0.056
> 2001-06-15 avn  0.00LT0.01 852225 0.018 0.066 0.000 0.001 708406 0.001 852225 0.000 0.000  1.663 71.664  0.018 0.018 0.068  0.176
> 2001-06-15 avn  0.01LT0.10  77643 0.097 0.151 0.039 0.025 176129 0.040  77643 0.039 0.040  2.331  2.486  0.058 0.086 0.162  0.096
> 2001-06-15 avn  0.10LT0.25  29388 0.145 0.186 0.162 0.043  74164 0.160  29388 0.162 0.160  2.493  0.897 -0.017 0.129 0.189  0.056
> 2001-06-15 avn  0.25LT0.50  17592 0.177 0.208 0.353 0.070  25189 0.336  17592 0.353 0.343  1.365  0.503 -0.175 0.238 0.279  0.033
> 2001-06-15 avn  0.50LT1.00  10503 0.208 0.245 0.693 0.138   6481 0.666  10503 0.693 0.683  0.593  0.300 -0.485 0.517 0.560  0.017
> 2001-06-15 avn  GE1.00       4385 0.246 0.300 1.502 0.556   1367 1.373   4385 1.502 1.471  0.285  0.164 -1.256 1.266 1.399  0.056
> 2001-06-15 eta  GE1.00       4385 0.242 0.308 1.502 0.556    577 1.338   4385 1.502 1.483  0.117  0.161 -1.261 1.272 1.398  0.111
> 2001-06-15 eta  0.00LT0.01 852225 0.013 0.055 0.000 0.001 799424 0.000 852225 0.000 0.000  1.368 50.193  0.013 0.013 0.057  0.175
> 2001-06-15 eta  0.01LT0.10  77643 0.079 0.139 0.039 0.025 113987 0.043  77643 0.039 0.041  1.617  2.013  0.040 0.079 0.144  0.083
> 2001-06-15 eta  0.10LT0.25  29388 0.116 0.169 0.162 0.043  47461 0.160  29388 0.162 0.161  1.596  0.719 -0.045 0.139 0.178  0.055
> 2001-06-15 eta  0.25LT0.50  17592 0.147 0.197 0.353 0.070  23284 0.345  17592 0.353 0.348  1.296  0.417 -0.205 0.258 0.291  0.040
> 2001-06-15 eta  0.50LT1.00  10503 0.180 0.230 0.693 0.138   7003 0.643  10503 0.693 0.673  0.619  0.260 -0.513 0.532 0.576  0.041
> 2001-06-15 eta  GE1.00       4385 0.242 0.308 1.502 0.556    577 1.338   4385 1.502 1.483  0.117  0.161 -1.261 1.272 1.398  0.111
> 2001-06-15 hpc  GE1.00       4385 0.339 0.345 1.502 0.556   1326 1.265   4385 1.502 1.447  0.255  0.225 -1.163 1.172 1.314  0.144
> 2001-06-15 hpc  0.00LT0.01 852225 0.014 0.057 0.000 0.001 777147 0.000 852225 0.000 0.000  0.823 54.824  0.014 0.014 0.059  0.195
> 2001-06-15 hpc  0.01LT0.10  77643 0.092 0.148 0.039 0.025 123342 0.048  77643 0.039 0.045  1.967  2.346  0.053 0.085 0.156  0.109
> 2001-06-15 hpc  0.10LT0.25  29388 0.147 0.190 0.162 0.043  56107 0.161  29388 0.162 0.161  1.896  0.908 -0.015 0.137 0.192  0.077
> 2001-06-15 hpc  0.25LT0.50  17592 0.195 0.219 0.353 0.070  25677 0.344  17592 0.353 0.348  1.424  0.552 -0.158 0.237 0.276  0.057
> 2001-06-15 hpc  0.50LT1.00  10503 0.251 0.265 0.693 0.138   8137 0.659  10503 0.693 0.678  0.737  0.362 -0.442 0.480 0.529  0.066
> 2001-06-15 hpc  GE1.00       4385 0.339 0.345 1.502 0.556   1326 1.265   4385 1.502 1.447  0.255  0.225 -1.163 1.172 1.314  0.144
> 2001-06-15 ngm  GE1.00       4385 0.157 0.199 1.502 0.556    297 1.119   4385 1.502 1.478  0.050  0.105 -1.345 1.345 1.474 -0.062
> 2001-06-15 ngm  0.00LT0.01 852225 0.017 0.063 0.000 0.001 771901 0.000 852225 0.000 0.000  0.703 65.457  0.017 0.017 0.065  0.132
> 2001-06-15 ngm  0.01LT0.10  77643 0.070 0.127 0.039 0.025 133779 0.041  77643 0.039 0.040  1.803  1.784  0.031 0.073 0.131  0.073
> 2001-06-15 ngm  0.10LT0.25  29388 0.100 0.152 0.162 0.043  54850 0.161  29388 0.162 0.161  1.859  0.620 -0.061 0.137 0.168  0.050
> 2001-06-15 ngm  0.25LT0.50  17592 0.130 0.177 0.353 0.070  24526 0.344  17592 0.353 0.348  1.360  0.369 -0.222 0.263 0.291  0.047
> 2001-06-15 ngm  0.50LT1.00  10503 0.152 0.196 0.693 0.138   6383 0.643  10503 0.693 0.674  0.564  0.219 -0.541 0.551 0.591  0.025
> 2001-06-15 ngm  GE1.00       4385 0.157 0.199 1.502 0.556    297 1.119   4385 1.502 1.478  0.050  0.105 -1.345 1.345 1.474 -0.062
> 2001-06-15 rfc  GE1.00       4385 0.343 0.349 1.502 0.556   1192 1.239   4385 1.502 1.446  0.224  0.228 -1.159 1.168 1.310  0.157
> 2001-06-15 rfc  0.00LT0.01 852225 0.014 0.055 0.000 0.001 773777 0.000 852225 0.000 0.000  0.719 53.984  0.014 0.014 0.056  0.200
> 2001-06-15 rfc  0.01LT0.10  77643 0.091 0.141 0.039 0.025 123689 0.047  77643 0.039 0.044  1.899  2.333  0.052 0.084 0.150  0.114
> 2001-06-15 rfc  0.10LT0.25  29388 0.148 0.184 0.162 0.043  58569 0.159  29388 0.162 0.160  1.957  0.913 -0.014 0.134 0.186  0.081
> 2001-06-15 rfc  0.25LT0.50  17592 0.197 0.214 0.353 0.070  26386 0.340  17592 0.353 0.345  1.448  0.558 -0.156 0.232 0.271  0.055
> 2001-06-15 rfc  0.50LT1.00  10503 0.253 0.262 0.693 0.138   8123 0.643  10503 0.693 0.671  0.718  0.365 -0.440 0.476 0.525  0.074
> 2001-06-15 rfc  GE1.00       4385 0.343 0.349 1.502 0.556   1192 1.239   4385 1.502 1.446  0.224  0.228 -1.159 1.168 1.310  0.157
> 2001-07-15 avn  GE1.00       3258 0.194 0.233 1.399 0.400   1323 1.440   3258 1.399 1.410  0.418  0.139 -1.204 1.209 1.287  0.039
> 2001-07-15 avn  0.00LT0.01 879285 0.021 0.073 0.000 0.001 736915 0.001 879285 0.000 0.000  1.541 73.048  0.020 0.020 0.075  0.137
> 2001-07-15 avn  0.01LT0.10  84628 0.081 0.139 0.039 0.025 179228 0.040  84628 0.039 0.040  2.200  2.104  0.043 0.078 0.146  0.079
>
>
> This wouldn't read for me:
>
> Error: unexpected string constant in:
> "79285 0.000 0.000 1.541 73.048 0.020 0.020
> 0.075 0.137
> 2001-07-15 avn 0.01LT0.10 84628 0.081 0.139 0.039
> 0.025
>
>
>
> of which this is just a small portion of the data. What I want to
> do is to test the difference between the MAE values for those that
> are, for example, 'hpc' vs those that are 'rfc', that is, by
> 'type' in the header.
>
>
> t.test(MAE ~ type, data = yourdf, ...)
>
> By default, t.test uses var.equal = FALSE and paired = FALSE. If you want to
> assume equal population variances, set var.equal = TRUE. Since the sample
> sizes are going to be large, this is essentially a Z-test.
>
>
> I have looked for many examples and have tried to construct the
> correct syntax, but no luck so far. If possible, I would further
> like to break down the test, not only by type, but type and INTERVAL.
>
>
> If you want this type of breakdown, you're going to be doing a two-way ANOVA.
> Individual t-tests in this case would be an extremely inefficient use of the
> data.
>
> HTH,
> Dennis
>
>
> --
> Thomas E Adams
> National Weather Service
> Ohio River Forecast Center
> 1901 South State Route 134
> Wilmington, OH 45177
>
> EMAIL: thomas.adams at noaa.gov
>
> VOICE: 937-383-0528
> FAX: 937-383-0033
>
>
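
For the breakdown by type and INTERVAL discussed in the quoted reply, a
two-way ANOVA might be sketched as follows. This is only an illustration
under stated assumptions: df and bins are hypothetical names, and the data
frame is a toy stand-in holding the 2001-06-15 MAE values for three types
from the quoted table, not the full data set.

```r
# Toy stand-in: 2001-06-15 MAE values from the quoted table for three types.
bins <- c("GE1.00", "0.00LT0.01", "0.01LT0.10",
          "0.10LT0.25", "0.25LT0.50", "0.50LT1.00")
df <- data.frame(
  type     = factor(rep(c("hpc", "rfc", "ngm"), each = 6)),
  INTERVAL = factor(rep(bins, times = 3)),
  MAE      = c(1.172, 0.014, 0.085, 0.137, 0.237, 0.480,   # hpc
               1.168, 0.014, 0.084, 0.134, 0.232, 0.476,   # rfc
               1.345, 0.017, 0.073, 0.137, 0.263, 0.551)   # ngm
)

# Additive two-way ANOVA. With a single row per type x INTERVAL cell there
# are no degrees of freedom left for an interaction term; with several
# dates per cell, MAE ~ type * INTERVAL could be fitted instead.
fit <- aov(MAE ~ type + INTERVAL, data = df)
summary(fit)

# Pairwise comparisons between types, adjusted for multiple testing.
TukeyHSD(fit, which = "type")
```

On the full data set, date would probably also need to enter the model (for
example as a blocking factor), since each date contributes one row per type
and INTERVAL.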