[R] OT: A philosophical question about statistics

Wed May 7 01:43:32 CEST 2025

Bert,

I have noted similar things, which indicate the limits that a statistician or an R programmer can run into regarding what can be meaningfully measured.

But, you can get numbers out of a system that can be helpful even if some confounding factors are not accounted for.

I also have sleep apnea and have access to the data on a chip in my CPAP machine. I have used software to produce graphs from that data and have done some analysis in R for myself. In one sense, you are comparing two versions of yourself if you look at statistics over months when your machine settings are one way and then, months later, another way. It would likely be more useful to switch settings every once in a while during the night, as they do in a sleep study accompanied by other forms of monitoring, and keep adjusting to find a near-optimal level.

But circumstances change. If a person loses a lot of weight, their form of apnea may even go away, or a lower setting may work for them. Effectiveness may also change with the temperature and humidity in the room as you sleep, so comparing winter to summer conditions may not work well either. Sleep is affected by many things, including dogs barking, babies crying, lighting conditions, medications, fatigue and so much more.

But Kevin is not doing a scientific study, nor claiming his analysis is highly valid. He is just curious and doing some experiments on himself. His goal is to make his own adjustments in hopes of lowering (or raising) some numbers, and one question is whether the changes he sees after an adjustment amount to a statistically significant difference. Presumably, after some adjusting, he may leave the settings alone and, since the measurements happen automatically, check once in a while to see whether his reaction has changed and it is time for a tune-up.
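As a rough illustration of the "statistically significant difference" question, here is a minimal sketch of a two-sided permutation test, the kind of simulation method Kevin's course covered. It is written in Python with only the standard library. The nightly readings are made-up numbers for illustration only, and the names `permutation_test`, `setting_a` and `setting_b` are my own, not anything from Kevin's data or software.

```python
import random
from statistics import mean


def permutation_test(a, b, n_perm=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the observed difference mean(a) - mean(b) and an
    approximate p-value from n_perm random relabelings.
    """
    rng = random.Random(seed)
    observed = mean(a) - mean(b)
    pooled = list(a) + list(b)
    n_a = len(a)
    extreme = 0
    for _ in range(n_perm):
        # Shuffle the labels: under the null hypothesis the setting
        # makes no difference, so any split of the nights is equally likely.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # The add-one correction keeps the estimate strictly above zero.
    return observed, (extreme + 1) / (n_perm + 1)


# Hypothetical nightly readings (made up for illustration only).
setting_a = [6.1, 5.4, 7.2, 6.8, 5.9, 6.5, 7.0, 6.3]
setting_b = [4.9, 5.2, 4.4, 5.8, 4.7, 5.1, 4.6, 5.5]

observed, p_value = permutation_test(setting_a, setting_b)
print(f"observed difference in means: {observed:.3f}")
print(f"approximate two-sided p-value: {p_value:.4f}")
```

The test treats nights as exchangeable, which is exactly the assumption that weight change, season, or a barking dog can break, so a small p-value here says nothing about those confounders.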

If he were a doctor trying to do this for a patient, then indeed it might be wise to also consider other factors and measures. Examples might include measuring the pulse and oxygen saturation, seeing how much time is spent in REM sleep or in restless leg movements, and even measuring how alert and with-it they are the next day to see if they seem rested. But, for the use he is considering, it seems like a reasonable start. The reality is most people on CPAP are not at an optimal level, or not for long.

His other example is even more subjective, for the reasons you mention and many more. But it seems reasonable enough to ask whether the system he was sold really does pay for itself. Clearly there can be factors large enough that comparisons are not useful. If I converted a room in my house to hold dozens of servers running around the clock and ran my air conditioning to keep them cool, while also installing a dozen freezers to store the food for a catering business and even adding a few rooms to the house to rent out, then the electricity used would indeed go up way more than any expected savings. If, on the other hand, I went away for a 6-month sabbatical and disconnected all appliances, ...

In such analyses, the signal can be drowned by the noise. A serious analysis works best if many such variables are controlled or held within a range where their effects are smaller than the expected results. If someone suggests you may see a 1% improvement, the noise may prevail. If they claim your costs will drop by half, and you believe you have not changed your habits substantially, then seeing a 2% improvement suggests you were misled. Seeing an improvement of 40% or 60% might be compatible with the claim with enough "confidence" in some loose statistical sense. But perhaps only enough to convince yourself, not something you can take to court.
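To make the signal-versus-noise point concrete, here is a small simulation sketch in Python (standard library only). Everything in it is an assumption for illustration: the baseline usage of 1000 units, the 15% month-to-month noise, twelve months before and after, and the helper names `simulate_bills` and `bootstrap_pct_change`. It generates bills with a known true saving and then computes a bootstrap percentile interval for the percent change in the mean.

```python
import random
from statistics import mean


def simulate_bills(rng, base, true_saving, noise_sd, months=12):
    """Simulate monthly usage before and after a change.

    noise_sd is the month-to-month noise as a fraction of the level.
    """
    before = [base * (1 + rng.gauss(0, noise_sd)) for _ in range(months)]
    after = [base * (1 - true_saving) * (1 + rng.gauss(0, noise_sd))
             for _ in range(months)]
    return before, after


def bootstrap_pct_change(before, after, rng, n_boot=5000):
    """95% bootstrap percentile interval for the percent change in the mean."""
    changes = []
    for _ in range(n_boot):
        # Resample each period with replacement and recompute the change.
        b = mean(rng.choices(before, k=len(before)))
        a = mean(rng.choices(after, k=len(after)))
        changes.append(100 * (a - b) / b)
    changes.sort()
    return changes[int(0.025 * n_boot)], changes[int(0.975 * n_boot) - 1]


rng = random.Random(42)
intervals = {}
for true_saving in (0.01, 0.40):  # a claimed 1% saving versus a 40% saving
    before, after = simulate_bills(rng, base=1000.0,
                                   true_saving=true_saving, noise_sd=0.15)
    intervals[true_saving] = bootstrap_pct_change(before, after, rng)
    lo, hi = intervals[true_saving]
    print(f"true saving {true_saving:.0%}: "
          f"95% interval for change = [{lo:.1f}%, {hi:.1f}%]")
```

With noise of this size, the interval in the 1% case is many percentage points wide, so a 1% saving cannot be told apart from no saving, while the interval in the 40% case sits well below zero. The sketch ignores weather, occupancy and season, which is the kind of information Bert says is needed for a meaningful comparison.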

I think a great way to learn is along the lines of what Kevin is doing. Don't just learn. Try to use it and see what happens and get feedback from others.

-----Original Message-----
From: R-help <r-help-bounces using r-project.org> On Behalf Of Bert Gunter
Sent: Tuesday, May 6, 2025 5:27 PM
To: Kevin Zembower <kevin using zembower.org>
Cc: R-help <r-help using r-project.org>
Subject: Re: [R] OT: A philosophical question about statistics

I am out of the country and will reply more fully to you (privately) when I
return. But briefly, and subject to my possible
misunderstanding/misinterpretation of your specification, I would say both
examples demonstrate my points. In the first, the clear question is how
exactly will you objectively and unbiasedly measure your health. Note that
your day to day subjective ratings or whatever are subject to a host of
outside influences that you will need to randomize against or somehow
include as covariates. You will also need to decide exactly how to make
whatever changes you want to make. These are all issues of experimental
design, about which you were taught nothing I expect. Anything you come up
with on the basis of your stats 101 course is likely to be pretty
worthless. Except as a placebo, of course (which actually can be effective).
As for your AC example, clearly how much electricity you use depends on
temperature, humidity, how much you were around, etc. Without this info
over several years before and after the change, there is no way that you
can make a meaningful comparison. In other words, you don't have the data
to answer the question.

Bert

On Tue, May 6, 2025, 21:58 Bert Gunter <bgunter.4567 using gmail.com> wrote:

> I am out of the country and will reply more fully to you (privately) when
> I return. But briefly, and subject to my possible
> misunderstanding/misinterpretation of your specification, I would say both
> your examples illustrate exactly what I said. In the first, the clea
>
> On Tue, May 6, 2025, 14:23 Kevin Zembower via R-help <r-help using r-project.org>
> wrote:
>
>> Thank you to everyone who responded. I gained a lot of insight into
>> statistical methods and the nature of statistical thinking. I replied
>> to some people privately, to limit the traffic on this OT question.
>>
>> And thank you for the patience of all who were annoyed by this
>> off-topic question, and who didn't write to complain. I promise to limit
>> off-topic questions in the future.
>>
>> -Kevin
>>
>> On Mon, 2025-05-05 at 15:17 +0000, Kevin Zembower wrote:
>> > I marked this posting as Off Topic because it doesn’t specifically
>> > apply to R and Statistics, but is rather a general question about
>> > statistics and the teaching of statistics. If this is annoying to you,
>> > I apologize.
>> >
>> > As I wrap up my work in my beginning statistics course, I’d like to ask
>> > a philosophical question regarding statistics.
>> >
>> > In my course, we’ve learned two different ways to solve statistical
>> > problems: simulations, using bootstraps and randomized distributions,
>> > and theoretical methods, using Normal (z) and t-distributions. We’ve
>> > learned that both systems solve all the questions we’ve asked of them,
>> > and that both give comparable answers. Out of six chapters that we’ve
>> > studied in our textbook, the first four only used simulation methods.
>> > Only the last two used theoretical methods.
>> >
>> > My questions are:
>> >
>> > 1) Why don’t professional statisticians settle on one or the other, and
>> > just apply that system to their problems and work? What advantage does
>> > one system have over the other?
>> >
>> > 2) As beginning statistics students, why is it important for us to
>> > learn both systems? Do you think that beginning statistics students
>> > will still be learning both systems in the future?
>> >
>> > Thank you very much for your time and effort in answering my questions.
>> > I really appreciate the thoughts of the members of this group.
>> >
>> > -Kevin
>> >
>>
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> https://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
