[R] Reorder file names read by list.files function
Rui Barradas
ru|pb@rr@d@@ @end|ng |rom @@po@pt
Thu Oct 11 06:37:15 CEST 2018
Hello,
> month.names
Error: object 'month.names' not found
> month.name
[1] "January" "February" "March" "April" "May" "June"
[7] "July" "August" "September" "October" "November" "December"
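
Since month.name is built in, there is no need to define the vector yourself. A minimal sketch for the original month-only file names, assuming made-up paths under C:/tmp (in practice they would come from dir(path, full.names = TRUE)):

```r
# Hypothetical full paths standing in for dir(path, full.names = TRUE)
file.names <- c("C:/tmp/February.PDF", "C:/tmp/January.PDF", "C:/tmp/March.PDF")

# Strip the directory and the extension to get the bare month names
month <- sub("\\.PDF$", "", basename(file.names), ignore.case = TRUE)

# match() gives each month's calendar position; order() turns that into a permutation
sorted <- file.names[order(match(month, month.name))]
sorted
```

The full paths are kept intact, so the result can be passed on to a reader function as it is.
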
Hope this helps,
Rui Barradas
At 01:05 on 11/10/2018, William Dunlap via R-help wrote:
> You can paste the directory names, dirname(file.names), back on with
> file.path() after you do the sorting. A better idiom is to use order()
> instead of sort() and use order's output to subscript file.names. E.g.,
> the following sorts by year and month number.
>
>> file.names <- c("C:/tmp/June_2018.PDF", "C:/tmp/May_2018.PDF",
> "C:/tmp/October_2016.PDF")
>> bfile.names <- sub("\\..*$", "", basename(file.names))
>> bfile.names
> [1] "June_2018" "May_2018" "October_2016"
>> month <- sub("^([[:alpha:]]+)_.*$", "\\1", bfile.names)
>> month
> [1] "June" "May" "October"
>> month.names
> Error: object 'month.names' not found
>> month.names <-
> c("January","February","March","April","May","June","July","August","September","October","November","December")
>> month.number <- match(month, month.names)
>> month.number
> [1] 6 5 10
>> year <- as.integer(sub("^.*_", "", bfile.names))
>> year
> [1] 2018 2018 2016
>> file.names[ order(year, month.number) ]
> [1] "C:/tmp/October_2016.PDF" "C:/tmp/May_2018.PDF"
> "C:/tmp/June_2018.PDF"
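
The first suggestion, putting the directory back on with file.path(), could look like the following. This is only a sketch with made-up month-only file names, all assumed to sit in a single directory; the order() idiom avoids the rebuilding step altogether.

```r
# Hypothetical full paths, all in one directory
file.names <- c("C:/tmp/February.PDF", "C:/tmp/January.PDF", "C:/tmp/March.PDF")

# Month numbers from the bare file names, sorted into calendar order
month.number <- sort(match(sub("\\.PDF$", "", basename(file.names)), month.name))

# Rebuild the file names and put the directory back on
rebuilt <- file.path(unique(dirname(file.names)),
                     paste0(month.name[month.number], ".PDF"))
rebuilt
```
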
>
>
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Wed, Oct 10, 2018 at 4:23 PM, Ek Esawi <esawiek using gmail.com> wrote:
>
>> Thank you Bill and Rui. I used month.name with sort and basename, as
>> suggested by Bill. I got the sorted numerical values, then I used
>> month.name to get properly ordered month names. The problem is that I
>> have to paste the extension PDF onto the names to get the correctly
>> ordered file names, but then I get the same error message, which
>> suggests that the code is not reading the files properly.
>>
>> I have not tried Rui's yet, but I will if nothing else works out.
>>
>> Thanks again--EK
>>
>> I had to strip the extension PDF off file.names, but when I paste
>> .PDF onto month.name to get the correct file names, I am getting
>> the same error.
>> On Tue, Oct 9, 2018 at 4:47 PM William Dunlap <wdunlap using tibco.com> wrote:
>>>
>>> Use basename(filename) to remove the leading parts of the full path to the
>>> file. E.g., replace
>>> FNs <- sort(match(sub("\\.PDF", "", file.names), month.name))
>>> with (the untested)
>>> FNs <- sort(match(sub("\\.PDF", "", basename(file.names)), month.name))
>>>
>>> Bill Dunlap
>>> TIBCO Software
>>> wdunlap tibco.com
>>>
>>> On Tue, Oct 9, 2018 at 1:38 PM, Ek Esawi <esawiek using gmail.com> wrote:
>>>>
>>>> Hi again,
>>>>
>>>> I worked with Rui's idea of using the match function with month.name.
>>>> I got numerical values for the months, then I sorted them and pasted on
>>>> the PDF file extension. It gave me the file order I wanted, but now
>>>> statements 8, 9, and 10 don't work and I keep getting the error listed
>>>> below. The dilemma is that if I add full.names=TRUE in statement 6, then
>>>> statements 9 and 10 don't produce what they did earlier. If I put
>>>> full.names=FALSE, then I am back to square one.
>>>> Any ideas are greatly appreciated.
>>>>
>>>> The code
>>>>
>>>> 1. install.packages("tabulizer")
>>>> 2. install.packages("stringr")
>>>> 3. library(stringr)
>>>> 4. library(tabulizer)
>>>> 5. path = "C:/Users/namei/Documents/TextMining/S2017"
>>>> 6. file.names <- dir(path, pattern =".PDF",full.names = TRUE)
>>>> 7. file.names <- str_remove(file.names,"\\s[0-9][0-9]")
>>>> 8. FNs <- sort(match(sub("\\.PDF", "", file.names), month.name))
>>>> 9. FNs1 <- paste0(month.name[FNs],".","PDF")
>>>> 10. A <- lapply(FNs1, function(i) extract_tables(i))
>>>>
>>>> Output and the error message.
>>>>
>>>>> path = "C:/Users/eesawi/Documents/TextMining/S2017"
>>>>> file.names <- dir(path, pattern =".PDF",full.names = TRUE)
>>>>> file.names <- str_remove(file.names,"\\s[0-9][0-9]")
>>>>> FNs <- sort(match(sub("\\.PDF", "", file.names), month.name))
>>>>> FNs1 <- paste0(month.name[FNs],".","PDF")
>>>>> A <- lapply(FNs1, function(i) extract_tables(i))
>>>> Show Traceback
>>>>
>>>> Error in normalizePath(path.expand(path), winslash, mustWork) :
>>>> path[1]=".PDF": The system cannot find the file specified
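
The path ".PDF" in that error message can probably be explained as follows. With full.names = TRUE nothing matches month.name, sort() drops the resulting NA values, and paste0() then treats the zero-length vector as "". A small sketch with made-up paths (not the real directory) reproduces it:

```r
# Hypothetical full paths standing in for dir(path, full.names = TRUE)
file.names <- c("C:/tmp/February.PDF", "C:/tmp/January.PDF", "C:/tmp/March.PDF")

# The directory part prevents any match against month.name
idx <- match(sub("\\.PDF", "", file.names), month.name)
idx    # all NA

# sort() removes NA by default, leaving a zero-length vector
FNs <- sort(idx)

# A zero-length argument is treated as "" by paste0()
FNs1 <- paste0(month.name[FNs], ".", "PDF")
FNs1   # ".PDF", the path reported in the error
```

Applying basename() before match(), and subscripting file.names with order() instead of rebuilding the names, should avoid both problems.
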
>>>> On Tue, Oct 9, 2018 at 9:44 AM Ek Esawi <esawiek using gmail.com> wrote:
>>>>>
>>>>> Hi All--
>>>>>
>>>>> I used the base R list.files function to read files from a directory. The
>>>>> file names are months (April, August, etc.). The system reads
>>>>> them in alphabetical order, but I want to reorder them in calendar
>>>>> order (January, February, ..., December). I thought I might be able to
>>>>> do it via regex or possibly the gtools package, but I am wondering if
>>>>> there is an easier way.
>>>>>
>>>>> Thanks--EK
>>>>>
>>>>> Example
>>>>> path = "C:/Users/name/Downloads/MyFiles"
>>>>> file.names <- dir(path, pattern =".PDF")
>>>>>
>>>>> Current output
>>>>> "February.PDF" "January.PDF" "March.PDF"
>>>>> Desired output
>>>>> "January.PDF" "February.PDF" "March.PDF"
>>>>
>>>> ______________________________________________
>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>