[R] Split Strings
Miluji Sb
milujisb at gmail.com
Mon Jan 18 09:46:10 CET 2016
Thank you everyone for the code and the link. They work well!
Mr. Lemon, thank you for the detailed code and the explanations. I
appreciate it. One thing though, in the last line
sapply(split_strings,fill_strings,list(max_length,element_sets))
should it be unlist instead of list? I get this error: "Error in
FUN(X[[i]], ...) : (list) object cannot be coerced to type 'integer'".
Thanks again!
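
On the list()/unlist() question: sapply() forwards any arguments that follow
FUN to the function one by one, so neither list() nor unlist() should be
needed. Below is a minimal sketch under that reading, using the sample strings
from the thread. The name fill_by_position is illustrative and does not appear
in the thread, and the sketch uses %in% in place of grep() so that a non-match
gives FALSE rather than a zero-length condition.

```r
strings <- c("pc_m2_45_ssp3_wheat", "pc_m2_45_ssp3_wheat",
             "ssp3_maize", "m2_wheat", "pc_m2_45_ssp3_maize")
split_strings <- strsplit(strings, "_")
max_length <- max(lengths(split_strings))
complete_sets <- split_strings[lengths(split_strings) == max_length]

# unique values seen at each position of the complete strings
element_sets <- lapply(seq_len(max_length),
                       function(i) unique(sapply(complete_sets, "[", i)))

# put each element of a partial string at every position whose known
# values contain it; positions with no match stay NA
fill_by_position <- function(split_string, max_length, element_sets) {
  if (length(split_string) == max_length) return(split_string)
  filled <- rep(NA_character_, max_length)
  for (element in split_string) {
    for (j in seq_len(max_length)) {
      if (element %in% element_sets[[j]]) filled[j] <- element
    }
  }
  filled
}

# extra arguments follow FUN one by one; they are not wrapped in list()
result <- t(sapply(split_strings, fill_by_position, max_length, element_sets))
print(result)
```

With the second, simpler fill_strings (which replaces the first one because it
has the same name), the call would reduce to
sapply(split_strings,fill_strings,max_length).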
On Mon, Jan 18, 2016 at 9:19 AM, Jim Lemon <drjimlemon at gmail.com> wrote:
> Hi Miluji,
> While the other answers are correct in general, I noticed that your
> request was for the elements of an incomplete string to be placed in the
> same positions as in the complete strings. Perhaps this will help:
>
> strings<-list("pc_m2_45_ssp3_wheat","pc_m2_45_ssp3_wheat",
> "ssp3_maize","m2_wheat","pc_m2_45_ssp3_maize")
> split_strings<-strsplit(unlist(strings),"_")
> max_length <- max(sapply(split_strings,length))
> complete_sets<-split_strings[sapply(split_strings,length)==max_length]
> element_sets<-list()
>
> # build a list with the unique elements of each complete string
> for(i in 1:max_length)
>  element_sets[[i]]<-unique(sapply(complete_sets,"[",i))
>
> # function to guess the position of the elements in a partial string
> # and return them in the hopefully correct positions
> fill_strings<-function(split_string,max_length,element_sets) {
>  if(length(split_string) < max_length) {
>   new_split_string<-rep(NA,max_length)
>   for(i in seq_along(split_string)) {
>    # check each position's set of known elements for a match
>    for(j in seq_len(max_length)) {
>     if(split_string[i] %in% element_sets[[j]])
>      new_split_string[j]<-split_string[i]
>    }
>   }
>   return(new_split_string)
>  }
>  return(split_string)
> }
>
> # however, if you know that the incomplete strings will always
> # be composed of the last elements in the complete strings
> fill_strings<-function(split_string,max_length) {
>  lenstring<-length(split_string)
>  if(lenstring < max_length)
>   split_string<-c(rep(NA,max_length-lenstring),split_string)
>  return(split_string)
> }
>
> sapply(split_strings,fill_strings,list(max_length,element_sets))
>
> Jim
>
> On Mon, Jan 18, 2016 at 7:56 AM, Miluji Sb <milujisb at gmail.com> wrote:
>
>> I have a list of strings of different lengths and would like to split each
>> string by underscore "_"
>>
>> pc_m2_45_ssp3_wheat
>> pc_m2_45_ssp3_wheat
>> ssp3_maize
>> m2_wheat
>>
>> I would like to separate each part of the string into different columns
>> such as
>>
>> pc m2 45 ssp3 wheat
>>
>> But because of the different lengths, I would like NA in the columns for
>> the variables that have fewer parts, such as
>>
>> NA NA NA m2 wheat
>>
>> I have tried unlist(strsplit(x, "_")) to split, it works for one variable
>> but not for the list - gives me "non-character argument" error. I would
>> highly appreciate any help. Thank you!
>>
>> [[alternative HTML version deleted]]
>>
>>
>
>
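
For the original question at the bottom of the thread, the "non-character
argument" error arises because strsplit() expects a character vector rather
than a list. A minimal sketch of the left-padded layout asked for there
(NA NA NA m2 wheat), assuming short strings always hold the trailing parts;
pad_left and columns are illustrative names, not from the thread.

```r
strings <- list("pc_m2_45_ssp3_wheat", "pc_m2_45_ssp3_wheat",
                "ssp3_maize", "m2_wheat")

# strsplit() needs a character vector, so flatten the list first
parts <- strsplit(unlist(strings), "_", fixed = TRUE)
width <- max(lengths(parts))

# left-pad a split string with NA up to the longest length
pad_left <- function(x, width) c(rep(NA_character_, width - length(x)), x)

# vapply() returns one column per string; transpose to get one row per string
padded <- t(vapply(parts, pad_left, character(width), width = width))
columns <- as.data.frame(padded, stringsAsFactors = FALSE)
print(columns)
```

Each row of columns then holds one original string, with the parts in columns
V1 to V5 and NA in the leading columns of the shorter strings.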