[R] how to read this kind of csv in R?
Rui Barradas
ruipbarradas at sapo.pt
Mon Oct 7 18:55:33 CEST 2019
Hello,
OK, I had some spare time. Try
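readCSVFile <- function(filename){
  lns <- readLines(filename)
  # drop empty lines, remove blanks and the trailing separator
  lns <- lns[nzchar(lns)]
  lns <- gsub(" ", "", lns)
  lns <- sub(";$", "", lns)
  # title lines are the ones that contain letters
  i_title <- grep("[[:alpha:]]", lns)
  # data lines of every block except the first
  blocks <- lapply(seq_along(i_title)[-1], function(i){
    j <- i_title[i] + 1
    k <- if(i == length(i_title)) length(lns) else i_title[i + 1] - 1
    lns[j:k]
  })
  # number of columns in the second block
  n <- length(unlist(strsplit(blocks[[1]][1], ";")))
  # the first block has one data line; repeat each of its values n times
  first <- unlist(strsplit(lns[i_title[1] + 1], ";"))
  first <- as.numeric(first)
  first <- rep(first, each = n)
  # flatten each block row by row; values are kept as read (text),
  # wrap in as.numeric() if numeric columns are needed
  blocks <- lapply(blocks, function(x){
    unlist(strsplit(x, ";"))
  })
  res <- do.call(cbind.data.frame, blocks)
  res <- cbind.data.frame(first, res)
  # column names are the titles without the unit in square brackets
  names(res) <- sub("\\[.*\\]$", "", lns[i_title])
  res
}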
df1 <- readCSVFile("strange.csv")
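To see how the title lines split the file into blocks, here is a minimal sketch that works on an in-memory sample instead of the real file. The sample layout (";" separators, a trailing ";", titles with a unit in square brackets such as aa[cm]) is only an assumption inferred from the regular expressions in the function above, not taken from the attachment. It also uses cumsum() and split() for the grouping, which is a different route from the index arithmetic in readCSVFile.

```r
# Hypothetical sample lines; the layout is assumed, not the real attachment
sample_lines <- c(
  "aa[cm]", "1; 2; 3;", "",
  "bb[mm]", "1; 2; 3;", "4; 5; 6;", "7; 8; 9;", "",
  "cc[mm]", "3; 4; 5;", "7; 5; 9;", "6; 5; 8;"
)

# Same cleaning steps as in readCSVFile()
lns <- sample_lines[nzchar(sample_lines)]
lns <- gsub(" ", "", lns)
lns <- sub(";$", "", lns)

# Title lines contain letters; cumsum() over that flag gives every
# line the index of the block it belongs to
is_title <- grepl("[[:alpha:]]", lns)
block_id <- cumsum(is_title)

# Split the data lines by block and flatten each block row by row
values <- lapply(split(lns[!is_title], block_id[!is_title]), function(x) {
  as.numeric(unlist(strsplit(x, ";")))
})
names(values) <- sub("\\[.*\\]$", "", lns[is_title])

# Repeat each value of the first block once per column of the second block
n <- length(values[[2]]) / length(values[[1]])
values[[1]] <- rep(values[[1]], each = n)

demo <- as.data.frame(values)
print(demo)
```

With this sample the result has 9 rows and the columns aa, bb and cc, the same shape as the table asked for in the original question.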
If this function doesn't do it, please try to make an effort on your
own. R-Help is not a code-writing service, it's a mailing list for
*questions* about R code.
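Since the original question mentions hundreds of files, one possible way to reuse a reader function over a whole directory is sketched below. The helper name readAllFiles, its arguments and the "data" directory in the usage comment are hypothetical, made up for illustration only.

```r
# Hypothetical helper: apply a reader function to every matching file
# in a directory and return a list of results named by file name
readAllFiles <- function(dir, reader, pattern = "\\.csv$") {
  files <- list.files(dir, pattern = pattern, full.names = TRUE)
  out <- lapply(files, reader)
  names(out) <- basename(files)
  out
}

# Usage, assuming readCSVFile() has been defined as above and the
# files live in a directory called "data" (an assumed name):
# all_df <- readAllFiles("data", readCSVFile)
```

Keeping the results in a named list makes it easy to see which file, if any, does not follow the expected layout.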
Hope this helps,
Rui Barradas
At 09:18 on 07/10/19, vodvos using zoho.com wrote:
> I am going mad trying to import this strange csv format.
>
> The real csv is attached now. The raw data set is huge.
>
> Many thanks.
>
>
>
>
> ---- On Sunday, 06 October 2019 07:58:37 -0700 Rui Barradas <ruipbarradas using sapo.pt> wrote ----
> > Hello,
> >
> > It is not clear if all files have
> >
> > * a first block with just one data line
> > * all other blocks with as many rows as the numbers in that first data line.
> >
> > If yes, maybe something like this?
> >
> > lns <- readLines("strange.csv")
> > lns <- lns[sapply(lns, nchar) > 0]
> > lns <- sub(",$", "", lns)
> > i_title <- grep("[[:alpha:]]", lns)
> >
> > tmp <- lapply(seq_along(i_title), function(i){
> >   tmp <- if(i < length(i_title)){
> >     lns[(i_title[i] + 1):(i_title[i + 1] - 1)]
> >   }else{
> >     lns[(i_title[i] + 1):length(lns)]
> >   }
> >   list(n = length(tmp), text = unlist(strsplit(tmp, ",")))
> > })
> >
> > n <- max(sapply(tmp, '[[', 'n'))
> > tmp <- lapply(tmp, function(x) as.numeric(x$text))
> > tmp[[1]] <- rep(tmp[[1]], each = n)
> > res <- do.call(cbind.data.frame, tmp)
> > names(res) <- lns[i_title]
> > res
> >
> >
> > If you have hundreds of files, you should make a function out of the
> > code above.
> >
> > Hope this helps,
> >
> > Rui Barradas
> >
> > At 12:29 on 06/10/19, vod vos via R-help wrote:
> > > I have hundreds of csv files. The actual format of each csv file is as follows:
> > >
> > > aa(cm)
> > > 1, 2 , 3,
> > >
> > > bb(mm)
> > > 1, 2, 3,
> > > 4, 5, 6,
> > > 7, 8, 9,
> > >
> > > cc(mm)
> > > 3, 4, 5,
> > > 7, 5, 9,
> > > 6, 5, 8,
> > >
> > > How can I use read.table or read.csv to convert the csv files
> > > to a tidy data frame format as follows:
> > >
> > > aa, bb, cc
> > > 1, 1, 3
> > > 1, 2, 4
> > > 1, 3, 5
> > > 2, 4, 7
> > > 2, 5, 5
> > > 2, 6, 9
> > > 3, 7, 6
> > > 3, 8, 5
> > > 3, 9, 8
> > >
> > > many thanks.
> > >