[R] help needed using t.test with factors
Peter Ehlers
ehlers at ucalgary.ca
Thu Feb 4 20:52:22 CET 2010
Tom,
t.test(MAE ~ type, data=data, subset=type %in% c('hpc','rfc'))
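
A minimal sketch of why the subset helps, using made-up numbers: the data frame `toy` and its MAE values are invented for illustration and are not Tom's data. The formula method of t.test needs a grouping factor with exactly two levels, and `subset` keeps only the rows for the two types being compared.

```r
# Toy data: three forecast types with invented MAE values (hypothetical,
# not taken from the data posted in this thread).
set.seed(1)
toy <- data.frame(
  type = factor(rep(c("avn", "hpc", "rfc"), each = 6)),
  MAE  = c(rnorm(6, 0.40, 0.1), rnorm(6, 0.35, 0.1), rnorm(6, 0.34, 0.1))
)

# With all three types present, the formula method stops with
# "grouping factor must have exactly 2 levels"; try() captures the error.
res_all <- try(t.test(MAE ~ type, data = toy), silent = TRUE)
inherits(res_all, "try-error")

# Restricting the rows to two types gives an ordinary two-sample test
# (Welch by default, since var.equal = FALSE).
res <- t.test(MAE ~ type, data = toy, subset = type %in% c("hpc", "rfc"))
res$estimate
```

The unused level ('avn' here) does not need to be dropped by hand, because only the levels still present after subsetting are counted.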
-Peter Ehlers
Thomas Adams wrote:
> Dennis,
>
> Thank you for the suggestion, but I get this error:
>
> > t.test(MAE ~ type,data=data)
> Error in t.test.formula(MAE ~ type, data = data) :
> grouping factor must have exactly 2 levels
>
> Tom
>
>
>
> Dennis Murphy wrote:
>> Hi:
>>
>> On Thu, Feb 4, 2010 at 11:07 AM, Thomas Adams <Thomas.Adams at noaa.gov> wrote:
>>
>> I am trying to use t.test on the following data:
>>
>> date type INTERVAL nCASES MTF SDF MTO SDO
>> nFST MF nOBS MO MB BIASCV BIASEV ME MAE
>> RMSE CRCF
>> 2001-06-15 avn GE1.00 4385 0.246 0.300 1.502
>> 0.556 1367 1.373 4385 1.502 1.471 0.285
>> 0.164 -1.256 1.266 1.399 0.056
>> 2001-06-15 avn 0.00LT0.01 852225 0.018 0.066
>> 0.000 0.001 708406 0.001 852225 0.000 0.000
>> 1.663 71.664 0.018 0.018 0.068 0.176
>> 2001-06-15 avn 0.01LT0.10 77643 0.097 0.151
>> 0.039 0.025 176129 0.040 77643 0.039 0.040
>> 2.331 2.486 0.058 0.086 0.162 0.096
>> 2001-06-15 avn 0.10LT0.25 29388 0.145 0.186
>> 0.162 0.043 74164 0.160 29388 0.162 0.160
>> 2.493 0.897 -0.017 0.129 0.189 0.056
>> 2001-06-15 avn 0.25LT0.50 17592 0.177 0.208
>> 0.353 0.070 25189 0.336 17592 0.353 0.343
>> 1.365 0.503 -0.175 0.238 0.279 0.033
>> 2001-06-15 avn 0.50LT1.00 10503 0.208 0.245
>> 0.693 0.138 6481 0.666 10503 0.693 0.683
>> 0.593 0.300 -0.485 0.517 0.560 0.017
>> 2001-06-15 avn GE1.00 4385 0.246 0.300 1.502
>> 0.556 1367 1.373 4385 1.502 1.471 0.285
>> 0.164 -1.256 1.266 1.399 0.056
>> 2001-06-15 eta GE1.00 4385 0.242 0.308 1.502
>> 0.556 577 1.338 4385 1.502 1.483 0.117 0.161
>> -1.261 1.272 1.398 0.111
>> 2001-06-15 eta 0.00LT0.01 852225 0.013 0.055
>> 0.000 0.001 799424 0.000 852225 0.000 0.000
>> 1.368 50.193 0.013 0.013 0.057 0.175
>> 2001-06-15 eta 0.01LT0.10 77643 0.079 0.139
>> 0.039 0.025 113987 0.043 77643 0.039 0.041
>> 1.617 2.013 0.040 0.079 0.144 0.083
>> 2001-06-15 eta 0.10LT0.25 29388 0.116 0.169
>> 0.162 0.043 47461 0.160 29388 0.162 0.161
>> 1.596 0.719 -0.045 0.139 0.178 0.055
>> 2001-06-15 eta 0.25LT0.50 17592 0.147 0.197
>> 0.353 0.070 23284 0.345 17592 0.353 0.348
>> 1.296 0.417 -0.205 0.258 0.291 0.040
>> 2001-06-15 eta 0.50LT1.00 10503 0.180 0.230
>> 0.693 0.138 7003 0.643 10503 0.693 0.673
>> 0.619 0.260 -0.513 0.532 0.576 0.041
>> 2001-06-15 eta GE1.00 4385 0.242 0.308 1.502
>> 0.556 577 1.338 4385 1.502 1.483 0.117 0.161
>> -1.261 1.272 1.398 0.111
>> 2001-06-15 hpc GE1.00 4385 0.339 0.345 1.502
>> 0.556 1326 1.265 4385 1.502 1.447 0.255
>> 0.225 -1.163 1.172 1.314 0.144
>> 2001-06-15 hpc 0.00LT0.01 852225 0.014 0.057
>> 0.000 0.001 777147 0.000 852225 0.000 0.000
>> 0.823 54.824 0.014 0.014 0.059 0.195
>> 2001-06-15 hpc 0.01LT0.10 77643 0.092 0.148
>> 0.039 0.025 123342 0.048 77643 0.039 0.045
>> 1.967 2.346 0.053 0.085 0.156 0.109
>> 2001-06-15 hpc 0.10LT0.25 29388 0.147 0.190
>> 0.162 0.043 56107 0.161 29388 0.162 0.161
>> 1.896 0.908 -0.015 0.137 0.192 0.077
>> 2001-06-15 hpc 0.25LT0.50 17592 0.195 0.219
>> 0.353 0.070 25677 0.344 17592 0.353 0.348
>> 1.424 0.552 -0.158 0.237 0.276 0.057
>> 2001-06-15 hpc 0.50LT1.00 10503 0.251 0.265
>> 0.693 0.138 8137 0.659 10503 0.693 0.678
>> 0.737 0.362 -0.442 0.480 0.529 0.066
>> 2001-06-15 hpc GE1.00 4385 0.339 0.345 1.502
>> 0.556 1326 1.265 4385 1.502 1.447 0.255
>> 0.225 -1.163 1.172 1.314 0.144
>> 2001-06-15 ngm GE1.00 4385 0.157 0.199 1.502
>> 0.556 297 1.119 4385 1.502 1.478 0.050 0.105
>> -1.345 1.345 1.474 -0.062
>> 2001-06-15 ngm 0.00LT0.01 852225 0.017 0.063
>> 0.000 0.001 771901 0.000 852225 0.000 0.000
>> 0.703 65.457 0.017 0.017 0.065 0.132
>> 2001-06-15 ngm 0.01LT0.10 77643 0.070 0.127
>> 0.039 0.025 133779 0.041 77643 0.039 0.040
>> 1.803 1.784 0.031 0.073 0.131 0.073
>> 2001-06-15 ngm 0.10LT0.25 29388 0.100 0.152
>> 0.162 0.043 54850 0.161 29388 0.162 0.161
>> 1.859 0.620 -0.061 0.137 0.168 0.050
>> 2001-06-15 ngm 0.25LT0.50 17592 0.130 0.177
>> 0.353 0.070 24526 0.344 17592 0.353 0.348
>> 1.360 0.369 -0.222 0.263 0.291 0.047
>> 2001-06-15 ngm 0.50LT1.00 10503 0.152 0.196
>> 0.693 0.138 6383 0.643 10503 0.693 0.674
>> 0.564 0.219 -0.541 0.551 0.591 0.025
>> 2001-06-15 ngm GE1.00 4385 0.157 0.199 1.502
>> 0.556 297 1.119 4385 1.502 1.478 0.050 0.105
>> -1.345 1.345 1.474 -0.062
>> 2001-06-15 rfc GE1.00 4385 0.343 0.349 1.502
>> 0.556 1192 1.239 4385 1.502 1.446 0.224
>> 0.228 -1.159 1.168 1.310 0.157
>> 2001-06-15 rfc 0.00LT0.01 852225 0.014 0.055
>> 0.000 0.001 773777 0.000 852225 0.000 0.000
>> 0.719 53.984 0.014 0.014 0.056 0.200
>> 2001-06-15 rfc 0.01LT0.10 77643 0.091 0.141
>> 0.039 0.025 123689 0.047 77643 0.039 0.044
>> 1.899 2.333 0.052 0.084 0.150 0.114
>> 2001-06-15 rfc 0.10LT0.25 29388 0.148 0.184
>> 0.162 0.043 58569 0.159 29388 0.162 0.160
>> 1.957 0.913 -0.014 0.134 0.186 0.081
>> 2001-06-15 rfc 0.25LT0.50 17592 0.197 0.214
>> 0.353 0.070 26386 0.340 17592 0.353 0.345
>> 1.448 0.558 -0.156 0.232 0.271 0.055
>> 2001-06-15 rfc 0.50LT1.00 10503 0.253 0.262
>> 0.693 0.138 8123 0.643 10503 0.693 0.671
>> 0.718 0.365 -0.440 0.476 0.525 0.074
>> 2001-06-15 rfc GE1.00 4385 0.343 0.349 1.502
>> 0.556 1192 1.239 4385 1.502 1.446 0.224
>> 0.228 -1.159 1.168 1.310 0.157
>> 2001-07-15 avn GE1.00 3258 0.194 0.233 1.399
>> 0.400 1323 1.440 3258 1.399 1.410 0.418
>> 0.139 -1.204 1.209 1.287 0.039
>> 2001-07-15 avn 0.00LT0.01 879285 0.021 0.073
>> 0.000 0.001 736915 0.001 879285 0.000 0.000
>> 1.541 73.048 0.020 0.020 0.075 0.137
>> 2001-07-15 avn 0.01LT0.10 84628 0.081 0.139
>> 0.039 0.025 179228 0.040 84628 0.039 0.040
>> 2.200 2.104 0.043 0.078 0.146 0.079
>>
>>
>> This wouldn't read for me:
>>
>> Error: unexpected string constant in:
>> "79285 0.000 0.000 1.541 73.048 0.020 0.020
>> 0.075 0.137
>> 2001-07-15 avn 0.01LT0.10 84628 0.081 0.139 0.039
>> 0.025
>>
>>
>>
>> of which this is just a small portion of the data. What I want to
>> do is to test the difference between the MAE values for those that
>> are, for example, 'hpc' vs those that are 'rfc', that is, by
>> 'type' in the header.
>>
>> t.test(MAE ~ type, data = yourdf, ...)
>>
>> By default, t.test uses var.equal = FALSE and paired = FALSE. If you want to
>> assume equal population variances, set var.equal = TRUE. Since the sample sizes
>> are going to be large, this is essentially a Z-test.
>>
>>
>> I have looked for many examples and have tried to construct the
>> correct syntax, but no luck so far. If possible, I would further
>> like to break down the test, not only by type, but type and INTERVAL.
>>
>>
>> If you want this type of breakdown, you're going to be doing a two-way ANOVA.
>> Individual t-tests in this case would be an extremely inefficient use of the data.
>>
>> HTH,
>> Dennis
>>
>>
>> -- Thomas E Adams
>> National Weather Service
>> Ohio River Forecast Center
>> 1901 South State Route 134
>> Wilmington, OH 45177
>>
>> EMAIL: thomas.adams at noaa.gov
>>
>> VOICE: 937-383-0528
>> FAX: 937-383-0033
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
>
--
Peter Ehlers
University of Calgary