[R] regular expression
Uwe Ligges
ligges at statistik.uni-dortmund.de
Sat Apr 7 13:18:44 CEST 2007
Laurent Rhelp wrote:
> Uwe Ligges wrote:
>
>>
>>
>> Laurent Rhelp wrote:
>>
>>> Dear R-List,
>>>
>>> I have a great many files in a directory and I would like to
>>> replace in every file the character " by the character ' and, at the
>>> same time, change ' to '' (i.e. the character ' twice, not the
>>> single character ") when the character ' is embedded in "....."
>>> So, "....." becomes '.....' and ".....'......" becomes '.....''......'
>>> Certainly, regular expressions could help me, but I do not know how to use them.
>>>
>>> How can I do that with R ?
>>
>>
>>
>> In fact, you do not need to know anything about regular expressions in
>> this case, since you are simply going to replace certain characters by
>> others without any fuzzy restrictions:
>>
>> x <- "\".....'......\""
>> cat(x, "\n")
>> xn <- gsub('"', "'", gsub("'", "''", x))
>> cat(xn, "\n")
>>
>>
>> Uwe Ligges
>>
>>
>>> Thank you very much
>>>
>>
>>
>>
>
> Yes, you are right. So I wrote the code below (which I find a little
> awkward, but it works).
>
> ##-----
>
> dirdata <- getwd()
> fichnames <- list.files(path=paste(dirdata,"\\initial\\",sep=""))
See ?file.path to improve the above.
> for( i in 1:length(fichnames)){
See ?seq to improve the above: seq(along = fichnames).
Or even better, just loop over the names themselves (see below).
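
A small sketch of why 1:length(x) is fragile, assuming a directory that happens to contain no files (the empty vector below stands in for the list.files() result; seq_along() is the modern shorthand for seq(along = )):

```r
# An empty character vector, as list.files() returns for an empty directory
fichnames <- character(0)

# 1:length(x) counts DOWN from 1 to 0 when x is empty,
# so a for loop over it would run twice with invalid indices
bad <- 1:length(fichnames)

# seq_along(x) yields an empty sequence, so the loop body is skipped
good <- seq_along(fichnames)

print(bad)
print(good)
```

Looping directly over the names, as in for(i in fichnames), avoids the index altogether.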
> filein <- paste(dirdata,"\\initial\\",fichnames[i],sep="")
Again, file.path() is your friend.
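
For comparison, a minimal sketch of the two ways to build the path. The file name "a.txt" is only a placeholder for illustration:

```r
dirdata <- getwd()

# Hand-built path: Windows-only separators, each backslash must be escaped
old <- paste(dirdata, "\\initial\\", "a.txt", sep = "")

# file.path() joins the components with "/", which R accepts on every
# platform, Windows included
new <- file.path(dirdata, "initial", "a.txt")

print(old)
print(new)
```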
> conin <- file(filein)
> open(conin)
> nbrows <- length( readLines(conin,n=-1) )
> close(conin)
You can simply call readLines() with the file name; it opens the
connection to the file itself. Also, I do not see why you want to read the
file at this point. Since your code is becoming rather complicated, let me
suggest the following procedure (untested!):
dirdata <- getwd()
fichnames <- list.files(file.path(dirdata, "initial"))
# the "result" directory must already exist
for(i in fichnames){
    temp <- readLines(file.path(dirdata, "initial", i))
    # double every ' first, then replace " by '
    temp <- gsub('"', "'", gsub("'", "''", temp))
    writeLines(temp, con = file.path(dirdata, "result", i))
}
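
One way to try the procedure without touching real data is sketched below. It assumes a throwaway directory under tempdir(); the directory name "quote-demo" and the file "demo.txt" are made up for the test, and fixed = TRUE is an optional addition that makes gsub() match the characters literally:

```r
# Throwaway working directory (hypothetical name), with both subdirectories
dirdata <- file.path(tempdir(), "quote-demo")
dir.create(file.path(dirdata, "initial"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(dirdata, "result"), showWarnings = FALSE)

# A small input file: one plain quoted string, one with an embedded '
writeLines(c('"abc"', '"it\'s"'), file.path(dirdata, "initial", "demo.txt"))

for (f in list.files(file.path(dirdata, "initial"))) {
  temp <- readLines(file.path(dirdata, "initial", f))
  # Inner call doubles every ' first; outer call then turns " into '
  temp <- gsub('"', "'", gsub("'", "''", temp, fixed = TRUE), fixed = TRUE)
  writeLines(temp, file.path(dirdata, "result", f))
}

out <- readLines(file.path(dirdata, "result", "demo.txt"))
print(out)
```

The order of the two gsub() calls matters: replacing " by ' first would cause the second call to double those new quotes as well.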
Uwe Ligges
> fileout <- paste(dirdata,"\\result\\",fichnames[i],sep="")
> conout <- file(fileout,"w")
>
> conin <- file(filein)
> open(conin)
>
>
> for( l in 1:nbrows )
> {
> text <- gsub('"',"'",gsub("'","''",readLines(conin,n=1)))
> writeLines(con=conout,text=text)
> }
>
> close(conin)
> close(conout)
> }
>
> ##------