[R] R - Reading a horizontally arranged csv file, replacing values and saving again

Manoranjan Muthusamy r@nj@nm@no167 @end|ng |rom gm@||@com
Thu Oct 4 18:03:59 CEST 2018


That's exactly what I was looking for. Never used readLines before which is
why I was struggling.  Many thanks. I asked the same question in Stack
Overflow as well. But no answers yet. I will post the link to your answer.
Thanks again.

On Wed, Oct 3, 2018 at 5:51 PM William Dunlap <wdunlap using tibco.com> wrote:

> Using read.table to read data with a variable number of entries per row is
> probably a mistake - data.frames (which read.table returns) are not meant
> for this sort of data.  You want to store the data in R as a list with
> names.  E.g.,
>
> readYourFormat <- function(connection) {
>     # connection is a file name or connection object referring
>     # to text of the form
>     # Header A,Some text
>     # Header B,<number>,<number>,<number>
>     # Header C,<number>
>     # Header D,<number>,<number>,<number>,<number>
>     lines <- readLines(connection)
>     lines <- lines[ nchar(lines)>0 ] # omit empty lines
>     fields <- strsplit(lines, ",") # split lines by commas into fields
>     headers <- vapply(fields, function(x)x[1], FUN.VALUE="") # first entry
> is header
>     names(fields) <- headers
>     contents <- lapply(fields, function(x)x[-1]) # remaining entries are
> contents
>     contents[-1] <- lapply(contents[-1], as.numeric) # convert all but
> first line's entries to numeric
>     contents
> }
>
> After reading the data and fiddling with it (but keeping it a named list),
> you can write it out in a similar format with
>
> writeYourFormat <- function (data, connection = stdout())
> {
>     stopifnot(is.list(data), !is.null(names(data)))
>     cat(file = connection, sep = "\n", paste(sep = ",", names(data),
>         vapply(data, paste, collapse = ",", FUN.VALUE = "")))
> }
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Wed, Oct 3, 2018 at 3:05 AM, Manoranjan Muthusamy <
> ranjanmano167 using gmail.com> wrote:
>
>> Thanks for the short but informative answer, Bill. But still, each row
>> has four columns..right? Although the *NA* is replaced by a blank cell,
>> because of the extra comma it still is a four column row. Is there any way
>> to avoid/remove the extra comma when NA is replaced which will make it a
>> three column row?
>>
>>
>>
>> On Mon, Oct 1, 2018 at 4:56 PM William Dunlap <wdunlap using tibco.com> wrote:
>>
>>> Read the help file for write.table
>>>    ?write.table
>>> and look at the descriptions of its arguments.
>>>   > df <- data.frame(Text=c("Abe","Ben",NA,"David"), Age=c(19, NA, 12,
>>> 10))
>>>   > write.table(file=stdout(), t(df), sep=",")
>>>   "V1","V2","V3","V4"
>>>   "Text","Abe","Ben",NA,"David"
>>>   "Age","19",NA,"12","10"
>>>   > write.table(file=stdout(), t(df), quote=FALSE, na="", sep=",")
>>>   V1,V2,V3,V4
>>>   Text,Abe,Ben,,David
>>>   Age,19,,12,10
>>>
>>>
>>>
>>>
>>> Bill Dunlap
>>> TIBCO Software
>>> wdunlap tibco.com
>>>
>>> On Mon, Oct 1, 2018 at 2:24 AM, Manoranjan Muthusamy <
>>> ranjanmano167 using gmail.com> wrote:
>>>
>>>> I have a horizontally arranged CSV file
>>>> <https://www.dropbox.com/s/0ukyuifvpq1olqk/samplefile.csv?dl=0> (
>>>> samplefile.csv) with headers are in the first column. Also, each row
>>>> has a
>>>> different number of columns. I want to read this CSV file, replace one
>>>> of
>>>> the cell value and save again as a CSV file with the same format as the
>>>> original file with exactly same number of columns and rows. It sounds
>>>> like
>>>> a simple task, but I am struggling to find a way. I tried to do this
>>>> with
>>>> the help of this
>>>> <
>>>> https://stackoverflow.com/questions/17288197/reading-a-csv-file-organized-horizontally
>>>> >
>>>>  and this
>>>> <
>>>> https://stackoverflow.com/questions/20308621/how-do-i-write-a-csv-file-in-r-where-my-input-is-written-to-the-file-as-row
>>>> >
>>>> posts,
>>>> but still, I can't get the output the way I want. Can somebody help me
>>>> here?
>>>>
>>>> My attempt using the answer in this post
>>>> <
>>>> https://stackoverflow.com/questions/17288197/reading-a-csv-file-organized-horizontally
>>>> >
>>>> to
>>>> read the CSV file (samplefile.csv) gives me the following output where
>>>> headers are kinda messed up and empty cells are replaced with NA values
>>>> which is not what I want
>>>>
>>>> aaa <- read.tcsv("samplefile.csv")
>>>> aaa
>>>>
>>>>      Header.1 Header.2..units. Header.3..units. Header.3..units..11
>>>> Some text           0.0625                0             2648962
>>>> <NA>           0.0625             1200             6647473        <NA>
>>>>           0.0625             1380                  14        <NA>
>>>>      0.2500             1500                  15        <NA>
>>>> 0.6250             1620                 NA6        <NA>
>>>> 1.3125             1740                 NA7        <NA>
>>>> 2.4375             1860                 NA8        <NA>
>>>> 3.5625             1980                 NA9        <NA>
>>>> 4.6250             2100                 NA10       <NA>
>>>> 5.0000             2220                 NA11       <NA>
>>>> 5.0000             2340                 NA12       <NA>
>>>> 4.6250             2460                 NA13       <NA>
>>>> 3.5625             2580                 NA14       <NA>
>>>> 2.4375             2700                 NA15       <NA>
>>>> 1.3125             2820                 NA16       <NA>
>>>> 0.6250             2940                 NA17       <NA>
>>>> 0.2500             3060                 NA18       <NA>
>>>> 0.0625             3180                 NA19       <NA>
>>>> 0.0000             3300                 NA20       <NA>
>>>> 0.0000            18000                 NA
>>>>
>>>> Also, I am not sure how to go back to original format when I save the
>>>> file
>>>> again after a modification (for example after replacing a cell value)
>>>>
>>>> I tried saving the file again by using t (transpose) as given below
>>>>
>>>> write.csv(t(aaa), file ="samplefile_e.csv", row.names=T)
>>>>
>>>> but still, there are following issues in the saved file
>>>>
>>>>    1. messed up headers
>>>>    2. empty cells are replaced with NA
>>>>    3. when I open the file in a text editor all the values are shown as
>>>>    characters
>>>>
>>>> Thanks,
>>>> Mano
>>>>
>>>>         [[alternative HTML version deleted]]
>>>>
>>>> ______________________________________________
>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>>
>

	[[alternative HTML version deleted]]




More information about the R-help mailing list