[R] OT: A philosophical question about statistics

Chris Ryan cwr @end|ng |rom @gency@t@t|@t|c@|@com
Mon May 5 23:02:42 CEST 2025


I've often wondered how the field of statistics, and statistical
education, would have evolved if modern-day computers, software, and
programming had been available in the early years. Would the "traditional"
methods, which require simplifying assumptions, have been developed at all?

--Chris Ryan

avi.e.gross using gmail.com wrote:
> A brief answer to this OT question is that many disciplines do the same
> thing and teach multiple methods, including some that are historical and are
> no longer really used.
> 
> But since you say this was an intro course, teaching only one approach would
> not prepare you well for later courses and real-world work that rely on the
> other, such as being asked to maintain or extend older applications that use
> one method, the other, or a combination.
> 
> As others have noted, this is not really a case of either/or. It is both.
> Would you make US students choose between knowing the metric system and the
> one more commonly used now? I see many things labeled with both kinds of
> measures, including car speedometers.
> 
> 
> -----Original Message-----
> From: R-help <r-help-bounces using r-project.org> On Behalf Of Bert Gunter
> Sent: Monday, May 5, 2025 3:09 PM
> To: Ebert,Timothy Aaron <tebert using ufl.edu>
> Cc: R-help email list <r-help using r-project.org>; Kevin Zembower
> <kevin using zembower.org>
> Subject: Re: [R] OT: A philosophical question about statistics
> 
> Heh. I suspect you'll get some interesting responses, but I won't try to
> answer your questions. Instead, I'll just say:
> 
> (All just imo, so caveat emptor)
> 
> 1. What you have been taught is mostly useless for addressing "real"
> statistical issues;
> 
> 2. Most of my 40 or so years of statistical practice involved trying to
> define the questions of interest and determining whether there existed or
> how to best obtain relevant data to answer those questions. Once/if that
> was done, how to obtain answers from the data was usually straightforward.
> 
> Cheers,
> 
> Bert
> "An educated person is one who can entertain new ideas, entertain others,
> and entertain herself."
> 
> 
> On Mon, May 5, 2025, 18:12 Ebert,Timothy Aaron <tebert using ufl.edu> wrote:
> 
>> (adding slightly to Gregg's answer)
>> Why do professionals use both? Computer intensive methods (bootstrap,
>> randomization, jackknife) are data hungry. They do not work well if I have
>> a sample size of 4. One could argue that the traditional methods also have
>> trouble, but one could also think of the traditional approach as assuming
>> unobserved values. Assuming that the true distribution is represented by my
>> 4 observations then ...
>>    Computer intensive approaches have not been readily available until the
>> invention of widely available faster computers. There is a large body of
>> information and long experience with the traditional methods in all
>> scientific disciplines. If you are unfamiliar with these approaches, then
>> you may not fully understand that key paper published 30 years ago.
>>    We like to think we have "the answer" but there are times where the
>> answer we get depends on how we ask the question. The different tests ask
>> the same question in different ways. Does the answer for your data change
>> depending on what approach is used? If so, then what assumption or which
>> test is problematic and why?
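
One way to see why a sample of 4 is thin for the bootstrap is to enumerate every resample it can produce. The sketch below does that for four made-up observations (the values in `sample` are hypothetical, not from this thread). With n = 4 there are only C(2n-1, n) = 35 distinct resamples, and no resampled mean can fall outside the observed minimum and maximum.

```python
from itertools import combinations_with_replacement
from math import comb
from statistics import mean

# Hypothetical sample of size 4 (illustrative values only).
sample = [2.1, 3.4, 4.0, 5.7]
n = len(sample)

# Every distinct bootstrap resample, ignoring order: multisets of size n.
resamples = list(combinations_with_replacement(sample, n))

# The bootstrap distribution of the mean lives on these values only.
boot_means = sorted({round(mean(r), 10) for r in resamples})

print("distinct resamples:", len(resamples))
print("formula C(2n-1, n):", comb(2 * n - 1, n))
print("smallest and largest possible bootstrap mean:",
      boot_means[0], boot_means[-1])
```

The bootstrap here can only reshuffle what was observed, which is the sense in which it is data hungry. A t-based interval instead fills in the unobserved values by assumption.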
>>
>> Tim
>>
>>
>> -----Original Message-----
>> From: R-help <r-help-bounces using r-project.org> On Behalf Of Gregg Powell via
>> R-help
>> Sent: Monday, May 5, 2025 12:06 PM
>> To: Kevin Zembower <kevin using zembower.org>
>> Cc: R-help email list <r-help using r-project.org>
>> Subject: [R] OT: A philosophical question about statistics
>>
>> Hi Kevin,
>> It might seem like simulation methods (bootstrapping and randomization)
>> and traditional formulas (Normal or t-distributions) are just two ways to
>> do the same job. So why learn both? Each approach has its own strengths,
>> and statisticians use both in practice.
>>
>> Why do professionals use both?
>> Each method offers something the other can't. In practice, both
>> simulation-based and theoretical techniques have unique strengths and
>> weaknesses, and the better choice depends on the problem and its
>> assumptions (check out - biopharmaservices.com). Simulation methods are
>> very flexible. They don't need strict formulas and still work even if
>> classical conditions (like "data must be Normal") aren't true. Theoretical
>> methods are quicker and widely understood. When their assumptions hold,
>> they give fast, exact results (a simple formula can yield a confidence
>> interval, again, check out - biopharmaservices.com).
>>
>> Advantages of each approach
>> * Simulation-based methods: Intuitive and flexible. They require fewer
>> assumptions, so they work well even for odd datasets.
>> * Theoretical methods: Quick to calculate and convenient. Based on
>> well-known formulas and widely trusted (when standard assumptions hold).
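
As a rough illustration of the "fewer assumptions" point, here is a minimal sketch of an exact randomization test for a difference in group means. The two groups are hypothetical scores invented for this example. The only assumption used is that, under the null hypothesis, the group labels are exchangeable, so every way of splitting the pooled values into two groups of five is equally likely.

```python
from itertools import combinations

# Hypothetical scores for two groups of equal size (illustrative only).
group_a = [12, 15, 14, 18, 17]
group_b = [9, 11, 10, 13, 12]

pooled = group_a + group_b
k = len(group_a)
total = sum(pooled)


def sum_gap(sum_a):
    """Absolute gap between group sums; with equal group sizes this is
    proportional to the absolute difference in means."""
    return abs(2 * sum_a - total)


observed = sum_gap(sum(group_a))

# Enumerate every relabelling: choose which k positions form "group A".
splits = list(combinations(range(len(pooled)), k))
extreme = sum(
    1 for idx in splits
    if sum_gap(sum(pooled[i] for i in idx)) >= observed
)

# Two-sided p-value: share of relabellings at least as extreme as observed.
p_value = extreme / len(splits)
print("relabellings:", len(splits))
print("two-sided p-value:", p_value)
```

With ten observations the enumeration is exact (252 relabellings). For larger samples one would normally draw random relabellings instead of listing them all.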
>>
>> Why learn both?
>> Knowing both makes you versatile. Simulations give you a feel for what's
>> happening behind the scenes, while theory provides quick shortcuts and
>> deeper insight. A statistician might use a t-test formula for a simple case
>> but switch to bootstrapping for a complex one. Each method can cross-check
>> the other. Mastering both approaches gives you confidence in your results.
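
The cross-check idea can be sketched by computing a 95% interval for a mean both ways on the same data. Everything below is an assumption made for illustration: the data are simulated from a Normal distribution, the sample size is 30, and the t critical value 2.045 is the standard table value for 29 degrees of freedom. The bootstrap interval is the simple percentile version, which is only one of several bootstrap intervals in use.

```python
import random
import statistics

random.seed(1)

# Hypothetical data: 30 simulated draws (illustrative only).
data = [random.gauss(50, 10) for _ in range(30)]
n = len(data)

xbar = statistics.mean(data)
se = statistics.stdev(data) / n ** 0.5

# Theoretical route: t interval, critical value taken from a t table (df = 29).
T_CRIT = 2.045
t_interval = (xbar - T_CRIT * se, xbar + T_CRIT * se)

# Simulation route: percentile bootstrap with B resamples of size n.
B = 5000
boot_means = sorted(
    sum(random.choices(data, k=n)) / n for _ in range(B)
)
boot_interval = (boot_means[int(0.025 * B)], boot_means[int(0.975 * B) - 1])

print("t interval:        ", t_interval)
print("bootstrap interval:", boot_interval)
```

On well-behaved data like this the two intervals should come out close to each other. When they disagree noticeably, that is a prompt to look at which assumption is failing.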
>>
>> Will future students learn both?
>> Probably yes. Computers now make simulation methods easy to use, so
>> they're more common in teaching. Meanwhile, classic Normal and t methods
>> aren't going away - they're fundamental and still useful. Future students
>> will continue to learn both, getting the best of both worlds.
>>
>> Good luck in your studies!
>> gregg
>>
>>
>>
>> On Monday, May 5th, 2025 at 8:17 AM, Kevin Zembower via R-help <
>> r-help using r-project.org> wrote:
>>
>>>
>>>
>>> I marked this posting as Off Topic because it doesn't specifically
>>> apply to R and Statistics, but is rather a general question about
>>> statistics and the teaching of statistics. If this is annoying to you,
>>> I apologize.
>>>
>>> As I wrap up my work in my beginning statistics course, I'd like to
>>> ask a philosophical question regarding statistics.
>>>
>>> In my course, we've learned two different ways to solve statistical
>>> problems: simulations, using bootstraps and randomized distributions,
>>> and theoretical methods, using Normal (z) and t-distributions. We've
>>> learned that both systems solve all the questions we've asked of them,
>>> and that both give comparable answers. Out of six chapters that we've
>>> studied in our textbook, the first four only used simulation methods.
>>> Only the last two used theoretical methods.
>>>
>>> My questions are:
>>>
>>> 1) Why don't professional statisticians settle on one or the other,
>>> and just apply that system to their problems and work? What advantage
>>> does one system have over the other?
>>>
>>> 2) As beginning statistics students, why is it important for us to
>>> learn both systems? Do you think that beginning statistics students
>>> will still be learning both systems in the future?
>>>
>>> Thank you very much for your time and effort in answering my questions.
>>> I really appreciate the thoughts of the members of this group.
>>>
>>> -Kevin
>>>
>>>
>>>
>>> ______________________________________________
>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> https://www.r-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>


