[R] [EXT] Re: Creating NA equivalent
David K Stevens
d@v|d@@teven@ @end|ng |rom u@u@edu
Wed Dec 22 01:50:54 CET 2021
Hello all,
My two cents. We use the term "below the detection limit" for any
physical measurement that cannot be distinguished from noise in the
measurement system. The limit may be instance specific (a detection
limit determined for each instance) or a "reporting limit", which
is usually set at the maximum of the detection limits found across
instances as an administrative simplification. Either way the
interpretation is "the value is between 0 and the limit", which carries
information just as the "at least this much" limit does in survival
analysis. Some data sets have both lower and upper censoring, and
survival analysis appears to be the most appropriate approach. This is
discussed in detail by Dennis Helsel of Practical Stats and is
implemented in the NADA package in R for environmental and hydrological
data.
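
As a rough illustration only (this is not taken from Helsel's material or
from NADA itself), here is a minimal sketch of the censoring idea using the
survival package that ships with R. The concentrations, the detection limit
of 0.5, and the lognormal assumption are all made up for the example.

```r
library(survival)  # recommended package, installed with R

# Hypothetical concentrations. A non-detect is recorded at the
# (assumed) detection limit and flagged as not detected.
limit    <- 0.5
conc     <- c(0.5, 1.2, 0.5, 3.4, 0.5, 2.1, 0.8, 5.0)
detected <- c(FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE)

# With type = "left", event = TRUE means the value was observed and
# event = FALSE means "somewhere below the recorded value".
y <- Surv(conc, detected, type = "left")

# Intercept-only lognormal model; the non-detects contribute
# P(value < limit) to the likelihood rather than being dropped.
fit <- survreg(y ~ 1, dist = "lognormal")

# exp(intercept) is the estimated median under the lognormal assumption.
est_median <- exp(coef(fit)[[1]])
est_median
```

The point of the sketch is that the three non-detects stay in the fit as
"less than 0.5" instead of being set to NA or replaced by an arbitrary value.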
regards
David Stevens
On 12/21/2021 4:35 PM, Jim Lemon wrote:
> Hi Bert,
> What troubles me about this is that something like detectable level(s)
> is determined at a particular time and may change. Censoring in
> survival tells us that the case lasted "at least this long". While a
> less than detectable value doesn't give any useful information apart
> from perhaps "non-zero", an over limit value gives something like
> censoring with "at least this much". However, it is more difficult to
> conceptualize and I suspect, to quantify. To me, the important
> information is that we think there _may be_ a value but we don't
> (yet?) know it.
>
> Jim
>
> On Wed, Dec 22, 2021 at 9:56 AM Bert Gunter <bgunter.4567 using gmail.com> wrote:
>> But you appear to be missing something, Jim -- see inline below (and
>> the original post):
>>
>> Bert
>>
>>
>> On Tue, Dec 21, 2021 at 2:00 PM Jim Lemon <drjimlemon using gmail.com> wrote:
>>> Please pardon a comment that may be off-target as well as off-topic.
>>> This appears similar to a number of things like fuzzy logic, where an
>>> instance can take incompatible truth values.
>>>
>>> It is known that an instance may have an attribute with a numeric
>>> value, but that value cannot be determined.
>> Yes, but **something** about the value is known: that it is > an upper
>> value or < a lower value. Such information should be used
>> (censoring!), not characterized as completely unknown. Think about it
>> in terms of survival time: saying that a person lasted longer than k
>> months is much more informative than saying that how long they lasted
>> is completely unknown!
>>
>>> It seems to me that an appropriate designation for the value is Unk,
>>> perhaps with an associated probability of determination to distinguish
>>> it from NA (it is definitely not known).
>>>
>>> Jim
>>>
>>> On Wed, Dec 22, 2021 at 6:55 AM Avi Gross via R-help
>>> <r-help using r-project.org> wrote:
>>>> I wonder if the package Adrian Dușa created might be helpful or point you along the way.
>>>>
>>>> It was eventually named "declared"
>>>>
>>>> https://cran.r-project.org/web/packages/declared/index.html
>>>>
>>>> With a vignette here:
>>>>
>>>> https://cran.r-project.org/web/packages/declared/vignettes/declared.pdf
>>>>
>>>> I do not know if it would easily satisfy your needs, but it may be a step along the way. A package called haven was part of the motivation: Adrian wanted a way to import data from external sources that had more than one category of NA, which sounds a bit like what you want. His functions should allow the creation of such data within R as well. I am including him in this email in case you want to contact him or he has something to say.
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: R-help <r-help-bounces using r-project.org> On Behalf Of Duncan Murdoch
>>>> Sent: Tuesday, December 21, 2021 5:26 AM
>>>> To: Marc Girondot <marc_grt using yahoo.fr>; r-help using r-project.org
>>>> Subject: Re: [R] Creating NA equivalent
>>>>
>>>> On 20/12/2021 11:41 p.m., Marc Girondot via R-help wrote:
>>>>> Dear members,
>>>>>
>>>>> I work on dosage measurements, and some values are below the detection
>>>>> limit. I would like to create new "numbers" like LDL (lower than the
>>>>> detection limit) and UDL (above the detection limit) that behave like
>>>>> NA, with the possibility of testing for them using, for example,
>>>>> is.LDL() or is.UDL().
>>>>>
>>>>> Note that NA is not the same as LDL or UDL: NA represents missing data.
>>>>> Here the data are available as LDL or UDL.
>>>>>
>>>>> NA is built very deep into the R language... is there any option to
>>>>> create a new NA equivalent?
>>>>>
>>>> There was a discussion of this back in May. Here's a link to one approach that I suggested:
>>>>
>>>> https://stat.ethz.ch/pipermail/r-devel/2021-May/080776.html
>>>>
>>>> Read the followup messages, I made at least one suggested improvement.
>>>> I don't know if anyone has packaged this, but there's a later version of the code here:
>>>>
>>>> https://stackoverflow.com/a/69179441/2554330
>>>>
>>>> Duncan Murdoch
>>>>
>>>> ______________________________________________
>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
--
David K Stevens, PhD,PE
Professor
Civil and Environmental Engineering
Utah State University
Logan, UT 84322-8200
david.stevens using usu.edu
014357973229