[R] Row exclude
Avi Gross
@v|gro@@ @end|ng |rom ver|zon@net
Sat Jan 29 04:25:09 CET 2022
You may need a few more steps than that, Val.
I commend you for stating your need clearly and showing a reasonable set of test data and spelling out the expected result.
If your data is polluted the way you describe, then read.table() will likely treat those columns as character rather than numeric. In your example you want to recognize that "13X" contains an X and, similarly, that "3BC" contains a B. Those two columns can be handled by the same technique and later made numeric. You also seem to want any digit in the first column to disqualify that row.
So consider what techniques you have learned and thus what you are allowed to do for such an assignment. Unless we know otherwise, we may assume this is homework of some sort.
We, reading this, have no idea what parts of basic R you can use and I hope nobody jumps in offering tidyverse packages.
So ask yourself how to use one of the dozens of ways to make a copy of your data that includes only rows where column 1 contains no digits from 0 to 9. You can, for example, count the characters of some kind and compare that count to the length of the item. You might use regular expressions. Whatever you do should remove the sixth row in your example and nothing else.
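As a minimal sketch of that first step, assuming base R only and assuming the columns came in as character: the data frame below is typed in by hand as a stand-in for what read.table() would return, and the names has_digit, same_length and step1 are mine, not anything from the original post. It shows both the regular-expression route and the count-and-compare route.

```r
# Hand-built stand-in for the sample data; all columns are character.
dat1 <- data.frame(
  Name   = c("Alex", "Bob", "Carol", "John", "Katy", "Jack3"),
  Age    = c("20", "25", "24", "3BC", "35", "34"),
  Weight = c("13X", "142", "120", "175", "160", "140"),
  stringsAsFactors = FALSE
)

# Regular-expression route: TRUE where the name contains any digit 0-9.
has_digit <- grepl("[0-9]", dat1$Name)

# Count-and-compare route: delete the digits and see whether the length changed.
same_length <- nchar(gsub("[0-9]", "", dat1$Name)) == nchar(dat1$Name)

# Both routes agree, so either one can be used to keep the digit-free rows.
step1 <- dat1[!has_digit, ]
step1$Name
# [1] "Alex"  "Bob"   "Carol" "John"  "Katy"
```

Only the sixth row (Jack3) is dropped at this stage; the rows with a bad Age or Weight are still there.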
Can you now take the result and shorten it by removing anything in column 2 using some new technique that shows if there are one or more letters? An example might be to try converting the value to an integer and back to character and seeing if they match. Again, lots of possibilities but you need only one that works.
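One possible way to write the convert-and-compare idea, again as a sketch under assumptions: step1 is typed in by hand to stand for the data after the Name filter, and survives_round_trip is a hypothetical helper name of my own. The trimws() call is there because read.table() with sep="," may keep the space that follows each comma.

```r
# Hand-built stand-in for the data after the Name filter; columns are character.
step1 <- data.frame(
  Name   = c("Alex", "Bob", "Carol", "John", "Katy"),
  Age    = c("20", "25", "24", "3BC", "35"),
  Weight = c("13X", "142", "120", "175", "160"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE where a string is unchanged after going
# character -> integer -> character.
survives_round_trip <- function(x) {
  x <- trimws(x)                                         # drop stray spaces
  back <- as.character(suppressWarnings(as.integer(x)))  # "3BC" becomes NA
  !is.na(back) & back == x                               # NA counts as a mismatch
}

# Keep only the rows whose Age is a clean integer.
step2 <- step1[survives_round_trip(step1$Age), ]
step2$Name
# [1] "Alex"  "Bob"   "Carol" "Katy"
```

Note that this particular test is strict: a value such as "007" would also be rejected because it comes back as "7". That may or may not be what you want.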
Can you take that shorter version and repeat pretty much the same filter on column 3?
That should work and, if you are ambitious, you can even find a way to create a compound filter that does all three columns at once.
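For the ambitious version, here is one hedged sketch of a compound filter in base R. The data frame is once more a hand-typed stand-in for the read.table() result, digits_only is a hypothetical helper of my own, and a regular expression is just one of the many techniques that could fill each slot.

```r
# Hand-built stand-in for the sample data; all columns are character.
dat1 <- data.frame(
  Name   = c("Alex", "Bob", "Carol", "John", "Katy", "Jack3"),
  Age    = c("20", "25", "24", "3BC", "35", "34"),
  Weight = c("13X", "142", "120", "175", "160", "140"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE where the trimmed string consists of digits only.
digits_only <- function(x) grepl("^[0-9]+$", trimws(x))

# One logical vector combining all three conditions.
keep <- !grepl("[0-9]", dat1$Name) &
  digits_only(dat1$Age) &
  digits_only(dat1$Weight)

clean <- dat1[keep, ]

# Now that the bad rows are gone, the two columns can be made numeric.
clean$Age    <- as.integer(clean$Age)
clean$Weight <- as.integer(clean$Weight)
rownames(clean) <- NULL   # renumber the rows 1, 2, 3

clean$Name
# [1] "Bob"   "Carol" "Katy"
```

Building the logical vector first and subsetting once makes it easy to inspect which rows fail which condition before anything is thrown away.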
-----Original Message-----
From: Val <valkremk using gmail.com>
To: r-help using R-project.org (r-help using r-project.org) <r-help using r-project.org>
Sent: Fri, Jan 28, 2022 10:08 pm
Subject: [R] Row exclude
Hi All,
I want to remove rows that contain a character string in an integer
column or a digit in a character column.
Sample data
dat1 <- read.table(text="Name, Age, Weight
Alex, 20, 13X
Bob, 25, 142
Carol, 24, 120
John, 3BC, 175
Katy, 35, 160
Jack3, 34, 140", sep=",", header=TRUE, stringsAsFactors=FALSE)
If the Age/Weight column contains any non-numeric character(s), then remove that row.
If the Name column contains a digit, then remove that row.
Desired output
   Name Age Weight
1   Bob  25    142
2 Carol  24    120
3  Katy  35    160
Thank you,