[R] Dataframe columns are accessible by incomplete column names, is this a bug?

Thu Jul 18 19:28:36 CEST 2019

But it's also a convenience feature. Note that $E returned NULL
because it was ambiguous: both Examination and Education start with
"E". By the time you got to $Ex, the column you were referencing was
unambiguous, so you didn't have to type out the whole name. That is
useful if you have very long column names, for example ones imported
from a spreadsheet.
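
A minimal sketch of the matching rules, using only the built-in swiss
data from the original post. Each expression should evaluate to TRUE
if partial matching works as described in ?'$':

```r
# "E" is a prefix of both Examination and Education, so the
# partial match is ambiguous and $ returns NULL
is.null(swiss$E)

# "Ex" is a prefix of Examination only, so $ returns that column
identical(swiss$Ex, swiss$Examination)

# [[ ]] matches exactly by default, so a prefix finds nothing
is.null(swiss[["Ex"]])

# Partial matching with [[ ]] is opt-in via the exact argument
identical(swiss[["Ex", exact = FALSE]], swiss$Examination)
```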

That said, I agree that relying on it can be risky.
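
One way to reduce that risk, sketched here on the assumption that your
R version honors the warnPartialMatchDollar option for data frames
(see ?options), is to have R warn whenever $ resolves a name by
partial matching. The variable names old, msg and ok are only
illustrative:

```r
# Ask R to warn whenever $ resolves a name by partial matching
old <- options(warnPartialMatchDollar = TRUE)

# Capture the warning text instead of letting it print
msg <- tryCatch(swiss$Ex, warning = function(w) conditionMessage(w))
print(msg)

# A fully spelled-out name should not trigger the warning
ok <- tryCatch({ swiss$Examination; "no warning" },
               warning = function(w) conditionMessage(w))
print(ok)

# Restore the previous setting
options(old)
```

Setting the option in your .Rprofile would apply it to every session,
at the cost of warnings from any existing code that relies on
abbreviated names.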

Also, please post to this list in plain text. Your table would
have been much easier to read.

Pat

On Thu, Jul 18, 2019 at 11:56 AM Sarah Goslee <sarah.goslee using gmail.com> wrote:
>
> Hello Yannick,
>
> That behavior is documented in the help for subsetting ( ?'$' ):
>
>      Both ‘[[’ and ‘$’ select a single element of the list.  The main
>      difference is that ‘$’ does not allow computed indices, whereas
>      ‘[[’ does.  ‘x$name’ is equivalent to ‘x[["name", exact =
>      FALSE]]’.  Also, the partial matching behavior of ‘[[’ can be
>      controlled using the ‘exact’ argument.
>
> You can avoid it by using [[]] instead:
>
> > swiss[['Ex']]
> NULL
> > head(swiss[['Examination']])
> [1] 15  6  5 12 17  9
>
> That's one of the major reasons using $ is sometimes discouraged.
>
> Sarah
>
> On Thu, Jul 18, 2019 at 11:38 AM <Yannick.Suter using coop.ch> wrote:
> >
> > Hello all
> > I noticed today that you can access dataframe columns by using incomplete names. This behavior was really unexpected and led to some errors, and I was wondering whether it's a bug and whether it should be changed in the future.
> > Here's a working example using the preinstalled "swiss" dataset:
> >
> > > head(swiss)
> >              Fertility Agriculture Examination Education Catholic
> > Courtelary        80.2        17.0          15        12     9.96
> > Delemont          83.1        45.1           6         9    84.84
> > Franches-Mnt      92.5        39.7           5         5    93.40
> > Moutier           85.8        36.5          12         7    33.77
> > Neuveville        76.9        43.5          17        15     5.16
> > Porrentruy        76.1        35.3           9         7    90.57
> >              Infant.Mortality
> > Courtelary               22.2
> > Delemont                 22.2
> > Franches-Mnt             20.2
> > Moutier                  20.3
> > Neuveville               20.6
> > Porrentruy               26.6
> > > swiss$E
> > NULL
> > > swiss$Ex
> > [1] 15  6  5 12 17  9 16 14 12 16 14 21 14 19 22 18 17 26 31 19 22 14 22 20 12
> > [26] 14  6 16 25 15  3  7  5 12  7  9  3 13 26 29 22 35 15 25 37 16 22
> > > swiss$Ed
> > [1] 12  9  5  7 15  7  7  8  7 13  6 12  7 12  5  2  8 28 20  9 10  3 12  6  1
> > [26]  8  3 10 19  8  2  6  2  6  3  9  3 13 12 11 13 32  7  7 53 29 29
> >
> > So in order to access the column "Examination", I can type any prefix from "Ex" to "Examination" and will always get the column swiss$Examination.
> >
> > Thanks for reading and Greetings
> > Yannick Suter
> >
> --
> Sarah Goslee (she/her)
> http://www.numberwright.com
>