[R] extract parts of a list before symbol

avi.e.gross using gmail.com
Fri May 26 06:02:50 CEST 2023


All true Jeff, but why do things the easy way! LOL!

My point was that various data structures, besides the list we started with,
store the names as an attribute. Yes, names(listname) works fine to extract
whatever parts they want. My original idea of using a data.frame was because
it creates names when they are absent.  

And you are correct that if the original list were not like the one shown,
where every item has length 1, converting to a data.frame can fail.

From what you say, it is a harder thing to write a function that returns a
"name" for item N of a given list. As you note, you get NULL when there are
no names. You get empty strings for the unnamed items when one or more (but
not all) have no names. But it can be done.
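
A minimal sketch of such a function, in case it helps. The name
element_name and the fallback label (the prefix "V" followed by the
position) are my own assumptions for illustration, not anything from base R
or from this thread:

```r
# Hypothetical helper: return a usable name for element n of a list.
# names() gives NULL when no element is named and "" for individual
# unnamed elements, so both cases fall back to a made-up label.
element_name <- function(x, n, prefix = "V") {
  stopifnot(is.list(x), length(n) == 1, n >= 1, n <= length(x))
  nm <- names(x)[n]          # if names(x) is NULL, this is NULL too
  if (is.null(nm) || is.na(nm) || !nzchar(nm)) {
    nm <- paste0(prefix, n)  # invented fallback label, e.g. "V2"
  }
  nm
}

element_name(list(a = 1, 2, c = 3), 1)  # named element
element_name(list(a = 1, 2, c = 3), 2)  # unnamed among named ones
element_name(list(1, 2, 3), 3)          # no names attribute at all
```

The fallback is only there so the caller always gets a non-empty string;
whether an invented label is wanted at all depends on the use.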

The OP initially was looking for a way to get a text version of a variable
that they could then parse, perhaps with regular expressions. Of course that
is not as easy as just looking at the names attribute in one of several
ways. But it may help in a sense to deal with the cases mentioned above.
The problem is that str() only prints to stdout and returns nothing useful
(an invisible NULL), so its output must be captured to do silly things.

> test <- list(a=3,b=5,c=11)

> str(test)
List of 3
 $ a: num 3
 $ b: num 5
 $ c: num 11

> str(test[1])
List of 1
 $ a: num 3

> str(test[2])
List of 1
 $ b: num 5

> str(list(a=1, 2, c=3))
List of 3
 $ a: num 1
 $  : num 2
 $ c: num 3

> str(list(1, 2, 3))
List of 3
 $ : num 1
 $ : num 2
 $ : num 3

> text <- str(list(a=1, 2, c=3)[1])
List of 1
 $ a: num 1

> text <- capture.output(str(list(a=1, 2, c=3)))
> text
[1] "List of 3"   " $ a: num 1" " $  : num 2" " $ c: num 3"

So you could use some imaginative code that extracts what you want. I
repeat, this is not a suggested way nor the best, just something that seems
to work:

> sub("(^[\\$ ]*)(\\w+|)(:.*$)", "\\2", text[2:length(text)])
[1] "a" ""  "c"

Obviously the first line of output needs to be removed as it does not fit
the pattern. 

Perhaps in this case a far less complex way is to use summary() rather than
str(), as it does return its result as an object instead of only printing it.

> summary(list(a=1, 2, c=3)) -> text
> text
  Length Class  Mode   
a 1      -none- numeric
  1      -none- numeric
c 1      -none- numeric

This puts the variable name, if any, at the start but parsing that is not
trivial as it is not plain text. 
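
For what it is worth, the object summary() returns for a list seems to be a
character table whose row names are just the list names, so they can be read
off without parsing any printed text. This is a sketch based on my
understanding of summary.default(), not a recommendation, and the variable
names are arbitrary:

```r
# summary() of a list returns a table object; the printed form is only
# a display of it. The row names carry the list names.
s <- summary(list(a = 1, 2, c = 3))
row_labels <- rownames(s)
print(row_labels)

# The same information, obtained directly.
direct <- names(list(a = 1, 2, c = 3))
print(identical(row_labels, direct))
```

Since the row names come from names() in the first place, this only
confirms that the detour through summary() gains nothing.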

Bottom line, try not to do things the hard way. Just carefully use names()
...

-----Original Message-----
From: R-help <r-help-bounces using r-project.org> On Behalf Of Jeff Newmiller
Sent: Thursday, May 25, 2023 10:32 PM
To: r-help using r-project.org
Subject: Re: [R] extract parts of a list before symbol

What a remarkable set of detours, Avi, all deriving apparently from a few
gaps in your understanding of R.

As Rolf said, "names(test)" is the answer.

a) Lists are vectors. They are not atomic vectors, but they are vectors, so
as.vector(test) is a no-op.

test <- list( a = 1, b = 2, c=3 )
attributes(test)
attributes(as.vector(test))

(Were you thinking of the unlist function? If so, there is no reason to
convert the value of the list to an atomic vector in order to look at the
value of an attribute of that list.)

b) Data frames are lists, with the additional constraint that all elements
have the same length, and that a names attribute and a row.names attribute
are both required. Converting a list to a data frame to get the names is
expensive in CPU cycles and breaks as soon as the list elements have a
variety of lengths.
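
A small illustration of point (b), assuming a hypothetical list called
ragged whose elements differ in length. names() only reads the attribute,
while as.data.frame() has to build the whole data frame and stops with an
error when the lengths are incompatible:

```r
# Hypothetical example list with elements of lengths 2, 3 and 1.
ragged <- list(a = 1:2, b = 1:3, c = "x")

# Reading the attribute never looks at the element lengths.
ragged_names <- names(ragged)
print(ragged_names)

# The data frame route fails because the lengths cannot be recycled
# to a common number of rows; capture the error message instead.
df_result <- tryCatch(
  colnames(as.data.frame(ragged)),
  error = function(e) conditionMessage(e)
)
print(df_result)
```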

c) All data in R is stored as vectors. Worrying about whether a data value
is a vector is pointless.

d) All objects can have attributes, including the names attribute. However,
not all objects must have a names attribute... including lists. Omitting a
name for any of the elements of a list in the constructor will lead to
zero-length character values (empty strings) in the names attribute where
the names were omitted. Omitting all names in the list constructor will
cause no names attribute to be created for that list.

test2 <- list( 1, 2, 3 )
attributes(test2)

e) The names() function returns the value of the names attribute. If that
attribute is missing, it returns NULL. For dataframes, the colnames function
is equivalent to the names function (I rarely use the colnames function).
For lists, colnames returns NULL... there are no "columns" in a list,
because there is no constraint on the (lengths of the) contents of a list.

names(test2)

f) The names attribute, if it exists, is just a character vector. It is
never necessary to convert the output of names() to a character vector. If
the names attribute doesn't exist, then it is up to the user to write code
that creates it.

names(test2) <- c( "A", "B", "C" )
attributes(test2)
names(test2)
# or use the argument names in the list function

names(test2) <- 1:3 # integer
names(test2) # character
attributes(test2)$names <- 1:3 # integer
attributes(test2) # character
test2[[ "2" ]] == 2  # TRUE
test2$`2`  == 2 # TRUE



On May 25, 2023 6:17:37 PM PDT, avi.e.gross using gmail.com wrote:
>Evan,
>
>List names are less easy than data.frame column names so try this:
>
>> test <- list(a=3,b=5,c=11)
>> colnames(test)
>NULL
>> colnames(as.data.frame(test))
>[1] "a" "b" "c"
>
>But note an entry with no name has one made up for it.
>
>
>> test2 <- list(a=3,b=5, 666, c=11)
>> colnames(data.frame(test2))
>[1] "a"    "b"    "X666" "c"   
>
>But that may be overkill as simply converting to a vector if ALL parts are
>of the same type will work too:
>
>> names(as.vector(test))
>[1] "a" "b" "c"
>
>To get one at a time:
>
>> names(as.vector(test))[1]
>[1] "a"
>
>You can do it even more simply by looking at the attributes of your list:
>
>> attributes(test)
>$names
>[1] "a" "b" "c"
>
>> attributes(test)$names
>[1] "a" "b" "c"
>> attributes(test)$names[3]
>[1] "c"
>
>
>-----Original Message-----
>From: R-help <r-help-bounces using r-project.org> On Behalf Of Evan Cooch
>Sent: Thursday, May 25, 2023 1:30 PM
>To: r-help using r-project.org
>Subject: [R] extract parts of a list before symbol
>
>Suppose I have the following list:
>
>test <- list(a=3,b=5,c=11)
>
>I'm trying to figure out how to extract the characters to the left of 
>the equal sign (i.e., I want to extract a list of the variable names, a, 
>b and c).
>
>I've tried the permutations I know of involving sub - things like 
>sub("\\=.*", "", test), but no matter what I try, sub keeps returning 
>(3, 5, 11). In other words, even though I'm trying to extract the 
>'stuff' before the = sign, I seem to be successful only at grabbing the 
>stuff after the equal sign.
>
>Pointers to the obvious fix? Thanks...
>
>______________________________________________
>R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.
