[R] HOW TO FILTER DATA
Rui Barradas
ruipbarradas at sapo.pt
Wed Jan 3 22:45:51 CET 2018
Hello,
To keep only the rows with one particular IPC, compare with `==`.
To keep the rows that match any of several IPCs, use `%in%`.
The code below shows both ways.
oecd <- read.table(text = "
Appln_id|Prio_Year|App_year|IPC
1|1999|2000|H04Q007/32
1|1999|2000|G06K019/077
1|1999|2000|H01R012/18
1|1999|2000|G06K017/00
1|1999|2000|H04M001/2745
1|1999|2000|G06K007/00
1|1999|2000|H04M001/02
1|1999|2000|H04M001/275
2|1991|1992|C12N015/62
2|1991|1992|C12N015/09
2|1991|1992|C07K019/00
2|1991|1992|C07K016/26
", header = TRUE, sep = "|", stringsAsFactors = FALSE)  # keep IPC as character strings
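# The single code, and the set of codes, to keep
select_one <- "H04Q007/32"
select_many <- c("H04Q007/32", "H04M001/275")
# Rows whose IPC equals the single code
oecd2 <- subset(oecd, IPC == select_one)
# Rows whose IPC is any of the listed codes
oecd3 <- subset(oecd, IPC %in% select_many)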
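As a side note on why `%in%` and not `==` is the one to use with several codes, here is a minimal sketch. The names `oecd_demo`, `wanted`, `eq_hits`, `in_hits` and `oecd_kept` are made up for this illustration; the IPC values are taken from the sample data. With a vector on the right-hand side, `==` compares element by element and recycles the shorter vector, so it can silently miss rows.

```r
# Small stand-in for the OECD table (hypothetical name, values from the sample).
oecd_demo <- data.frame(
  Appln_id = c(1, 1, 1, 1),
  IPC = c("H04Q007/32", "G06K019/077", "H04M001/275", "H04M001/02"),
  stringsAsFactors = FALSE
)
wanted <- c("H04Q007/32", "H04M001/275")

# `==` recycles `wanted` along the column: row 3 is compared with
# "H04Q007/32" instead of "H04M001/275", so that row is missed.
eq_hits <- oecd_demo$IPC == wanted

# `%in%` tests every row against the whole set of wanted codes.
in_hits <- oecd_demo$IPC %in% wanted

# Logical indexing with brackets keeps the same rows as
# subset(oecd_demo, IPC %in% wanted).
oecd_kept <- oecd_demo[in_hits, ]
```

Here `eq_hits` flags only the first row, while `in_hits` flags rows 1 and 3, which is what is wanted.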
Hope this helps,
Rui Barradas
On 1/3/2018 7:53 PM, Saptorshee Kanto Chakraborty wrote:
> Hello,
>
> I have patent data from the OECD in delimited text format, with IPC being
> one column. I want to filter the data by keeping only certain IPCs in that
> column and deleting the rows which do not have my required IPCs. Please,
> can anybody guide me in doing this? The IPC codes are string variables.
>
> The data is somewhat like the sample below, but it's a huge dataset containing more
> than 11 million rows.
>
>
> Appln_id|Prio_Year|App_year|IPC
> 1|1999|2000|H04Q007/32
> 1|1999|2000|G06K019/077
> 1|1999|2000|H01R012/18
> 1|1999|2000|G06K017/00
> 1|1999|2000|H04M001/2745
> 1|1999|2000|G06K007/00
> 1|1999|2000|H04M001/02
> 1|1999|2000|H04M001/275
> 2|1991|1992|C12N015/62
> 2|1991|1992|C12N015/09
> 2|1991|1992|C07K019/00
> 2|1991|1992|C07K016/26
>
>
>
> Thanking You
>