[R] read.table()
arun
smartpink111 at yahoo.com
Sun Dec 9 00:11:12 CET 2012
Hi Pradip,
Try this:
source("Muhuri.txt")
#Muhuri.txt
Lines<- "race age percent sepercent flag_var
Mexican 12-17 5.7926 0.64195 any-------------------------------------------------------------
--------------------------------------------------------
"
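# Split the text block into one string per line
Lines1<-readLines(textConnection(Lines))
# Race label: keep the non-digit text before the age group ("12-17" or "26+"), then drop all spaces
Col1new<-gsub(" ","",gsub("\\s+(\\D+)[[:digit:]]+\\+.*","\\1",gsub("\\s+(\\D+)[[:digit:]]+\\-.*","\\1",Lines1[-1])))
# Remaining fields: everything from the age group onward (age, percent, sepercent, flag_var)
Col2<-gsub("\\s+\\D+([[:digit:]]+\\+.*)","\\1",gsub("\\s+\\D+([[:digit:]]+\\-.*)","\\1",Lines1[-1]))
# Parse the remaining fields on whitespace and bind them to the race column
dat1<-data.frame(Col1new,read.table(text=Col2,stringsAsFactors=FALSE,sep=""),stringsAsFactors=FALSE)
# Column names come from the header line, ignoring empty pieces left by repeated spaces
heading<-unlist(strsplit(Lines1[1]," "))
colnames(dat1)<-heading[heading!=""]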
head(dat1,6)
# race age percent sepercent flag_var
#1 Mexican 12-17 5.7926 0.64195 any
#2 PuertoRican 12-17 5.1975 0.24929 any
#3 Cuban 12-17 3.7977 1.00487 any
#4 C-SAmerican 12-17 4.3665 0.55329 any
#5 Dominican 12-17 1.8149 0.46677 any
#6 Spanish(Spain) 12-17 6.1971 0.98386 any
str(dat1)
'data.frame': 195 obs. of 5 variables:
$ race : chr "Mexican" "PuertoRican" "Cuban" "C-SAmerican" ...
$ age : chr "12-17" "12-17" "12-17" "12-17" ...
$ percent : num 5.79 5.2 3.8 4.37 1.81 ...
$ sepercent: num 0.642 0.249 1.005 0.553 0.467 ...
$ flag_var : chr "any" "any" "any" "any" ...
A.K.
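A possible alternative to the nested gsub() calls above, offered only as a sketch: it relies on the assumption that the last four fields never contain spaces, so everything before them can be treated as the race label. It uses a small stand-in sample rather than the full data, the object names (lines, pattern, tabbed, dat) are made up for illustration, and unlike the code above it keeps the spaces inside the race labels.

```r
# Stand-in sample in the same layout as the posted data (assumption: fields
# are separated by runs of spaces and only the first field contains spaces).
lines <- c(
  "race    age   percent  sepercent  flag_var",
  " Mexican    12-17   5.7926  0.64195  any",
  " Puerto Rican    12-17   5.1975  0.24929  any",
  " Spanish (Spain)    26+   2.5976  0.86230  any",
  " Multi Hisp Eth    18-25   11.3566  1.79282  any"
)

# Group 1 is everything up to the last four whitespace-free tokens;
# groups 2-5 are those four tokens.
pattern <- "^\\s*(.*\\S)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)\\s*$"

# Rewrite each line (header included) with tabs between the five fields.
tabbed <- sub(pattern, "\\1\t\\2\t\\3\t\\4\t\\5", lines, perl = TRUE)

# Now the text really is tab-delimited, so read.delim() can parse it.
dat <- read.delim(text = tabbed, header = TRUE, sep = "\t",
                  stringsAsFactors = FALSE)
str(dat)
```

Whether to keep or strip the spaces in the race labels is a design choice; keeping them preserves the labels as posted.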
----- Original Message -----
From: "Muhuri, Pradip (SAMHSA/CBHSQ)" <Pradip.Muhuri at samhsa.hhs.gov>
To: 'arun' <smartpink111 at yahoo.com>
Cc: David L Carlson <dcarlson at tamu.edu>; R help <r-help at r-project.org>
Sent: Saturday, December 8, 2012 5:20 PM
Subject: RE: [R] read.table()
Dear Arun,
The issue is that the column names are incorrect. I will also look into the comment by Prof Ripley.
Thanks for your continued support and help.
Pradip
> str(read.delim(textConnection(xd1),header=TRUE,sep="\t"))
'data.frame': 195 obs. of 1 variable:
$ race....age...percent..sepercent..flag_var: Factor w/ 195 levels " Cuban 26+ 0.6653 0.31239 mrj",..: 27 148 13 140 108 193 169 100 85 67 ...
> names(agerace)
[1] "race....age...percent..sepercent..flag_var"
> head(agerace)
race....age...percent..sepercent..flag_var
1 Mexican 12-17 5.7926 0.64195 any
2 Puerto Rican 12-17 5.1975 0.24929 any
3 Cuban 12-17 3.7977 1.00487 any
4 C-S American 12-17 4.3665 0.55329 any
5 Dominican 12-17 1.8149 0.46677 any
6 Spanish (Spain) 12-17 6.1971 0.98386 any
Pradip K. Muhuri, PhD
Statistician
Substance Abuse & Mental Health Services Administration
The Center for Behavioral Health Statistics and Quality
Division of Population Surveys
1 Choke Cherry Road, Room 2-1071
Rockville, MD 20857
Tel: 240-276-1070
Fax: 240-276-1260
e-mail: Pradip.Muhuri at samhsa.hhs.gov
-----Original Message-----
From: arun [mailto:smartpink111 at yahoo.com]
Sent: Saturday, December 08, 2012 5:13 PM
To: Muhuri, Pradip (SAMHSA/CBHSQ)
Cc: David L Carlson; R help
Subject: Re: [R] read.table()
Hi,
You can check the structure of the result with str().
I assume it will look like this:
str(read.delim(textConnection(Lines),header=TRUE,sep="\t"))
#'data.frame': 195 obs. of 1 variable:
# $ race....age...percent..sepercent..flag_var: Factor w/ 195 levels " C-S American 12-17 0.2399 0.15804 coc",..: 50 170 20 5 35 185 65 155 110 80 ...
A.K.
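One way to check this, as a sketch on a made-up three-line sample that uses runs of spaces and no tabs (the object names pasted, has_tab and one_col are hypothetical): when the text contains no tab characters, sep="\t" has nothing to split on, so each line becomes a single field and the whole header line is turned into one syntactic column name.

```r
# Stand-in for text pasted into an email: spaces only, no tab characters.
pasted <- paste(
  "race    age   percent",
  " Mexican    12-17   5.7926",
  " Puerto Rican    12-17   5.1975",
  sep = "\n"
)

# Does the text contain any tab at all?
has_tab <- grepl("\t", pasted, fixed = TRUE)
print(has_tab)

# With sep = "\t" and no tabs present, every line is read as one field.
one_col <- read.delim(text = pasted, header = TRUE, sep = "\t", as.is = TRUE)
print(ncol(one_col))
print(names(one_col))
```

The dots in the resulting column name stand in for the spaces of the header line, which matches the shape of the name shown in the str() output above.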
----- Original Message -----
From: "Muhuri, Pradip (SAMHSA/CBHSQ)" <Pradip.Muhuri at samhsa.hhs.gov>
To: 'Prof Brian Ripley' <ripley at stats.ox.ac.uk>; "r-help at r-project.org" <r-help at r-project.org>
Cc:
Sent: Saturday, December 8, 2012 5:05 PM
Subject: Re: [R] read.table()
Dear Prof Ripley,
Your hint is helpful, and I see considerable improvements in the results.
The only issue is that the column names do not seem to be correct. I did not understand the part of your comment that says "fortunes::fortune(14) applies", although I have read about the double colon operator (ns-dblcolon {base}).
Could you please give me another hint so that I can resolve the issue?
Thanks and regards,
######### new result ########
> agerace <- read.delim(textConnection(xd1), sep="\t", header=TRUE, as.is=TRUE)
> names(agerace)
[1] "race....age...percent..sepercent..flag_var"
> head(agerace)
race....age...percent..sepercent..flag_var
1 Mexican 12-17 5.7926 0.64195 any
2 Puerto Rican 12-17 5.1975 0.24929 any
3 Cuban 12-17 3.7977 1.00487 any
4 C-S American 12-17 4.3665 0.55329 any
5 Dominican 12-17 1.8149 0.46677 any
6 Spanish (Spain) 12-17 6.1971 0.98386 any
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Prof Brian Ripley
Sent: Saturday, December 08, 2012 2:29 PM
To: r-help at r-project.org
Subject: Re: [R] read.table()
On 08/12/2012 19:10, Muhuri, Pradip (SAMHSA/CBHSQ) wrote:
>
> Hi List,
>
> I have spent more than 30 minutes, but failed to read in this file using the read.table() function. I could not figure out how to fix the following error.
Well, we have a whole manual on this, mentioned on ?read.table (see See
Also). Have you read it? fortunes::fortune(14) applies.
The issue is what the separator is. You have specified whitespace, and
that is not correct. The original might have had tabs (see ?read.delim)
but as pasted into this email only a human can disentangle this file.
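To make the separator point concrete, here is a minimal sketch on a three-line stand-in sample, not the posted data. It assumes the original file was tab-delimited, which the thread does not confirm; the helper count_fields() and the other object names are made up for illustration. count.fields() reports how many fields each line has under a given separator.

```r
# Stand-in sample, assumed tab-delimited; one race label contains a space.
txt <- paste(
  "race\tage\tpercent",
  "Mexican\t12-17\t5.7926",
  "Puerto Rican\t12-17\t5.1975",
  sep = "\n"
)

# Hypothetical helper: number of fields per line for a given separator.
count_fields <- function(text, sep) {
  con <- textConnection(text)
  on.exit(close(con))
  count.fields(con, sep = sep)
}

n_ws  <- count_fields(txt, sep = "")    # any whitespace: "Puerto Rican" splits in two
n_tab <- count_fields(txt, sep = "\t")  # tabs only: same count on every line

print(n_ws)
print(n_tab)

# With the separator that matches the file, the columns line up.
agerace_demo <- read.delim(text = txt, header = TRUE, as.is = TRUE)
str(agerace_demo)
```

If the counts differ from line to line under the separator you plan to use, read.table() is likely to stop with a "did not have N elements" error or to misalign the columns.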
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : line 1 did not have 6 elements
>
> Any help would be appreciated.
>
> Thanks,
>
> Pradip Muhuri
>
>
> ####### below is the reproducible example
> xd1 <- "race age percent sepercent flag_var
> Mexican 12-17 5.7926 0.64195 any
> Puerto Rican 12-17 5.1975 0.24929 any
> Cuban 12-17 3.7977 1.00487 any
> C-S American 12-17 4.3665 0.55329 any
> Dominican 12-17 1.8149 0.46677 any
> Spanish (Spain) 12-17 6.1971 0.98386 any
> Multi Hisp Eth 12-17 6.7006 1.12464 any
> NH White 12-17 4.8442 0.08660 any
> NH Black 12-17 3.6943 0.16045 any
> NH AM-AK 12-17 9.6325 1.06100 any
> NH HI-OPI 12-17 3.9189 1.08047 any
> NH Asian 12-17 1.9115 0.28432 any
> NH Multiracial 12-17 6.4255 0.51434 any
> Mexican 18-25 8.9284 0.73022 any
> Puerto Rican 18-25 6.1364 0.28394 any
> Cuban 18-25 8.6782 1.45543 any
> C-S American 18-25 5.9360 0.59899 any
> Dominican 18-25 7.7642 1.64553 any
> Spanish (Spain) 18-25 9.2632 1.15652 any
> Multi Hisp Eth 18-25 11.3566 1.79282 any
> NH White 18-25 8.6484 0.11866 any
> NH Black 18-25 7.5972 0.24926 any
> NH AM-AK 18-25 13.5041 1.57275 any
> NH HI-OPI 18-25 8.0227 1.41348 any
> NH Asian 18-25 3.2701 0.32414 any
> NH Multiracial 18-25 10.6489 0.85105 any
> Mexican 26+ 3.2110 0.51683 any
> Puerto Rican 26+ 1.6273 0.15033 any
> Cuban 26+ 1.4419 0.44118 any
> C-S American 26+ 1.0187 0.26594 any
> Dominican 26+ 0.9554 0.50275 any
> Spanish (Spain) 26+ 2.5976 0.86230 any
> Multi Hisp Eth 26+ 1.1345 0.66375 any
> NH White 26+ 1.5510 0.04156 any
> NH Black 26+ 2.8763 0.15133 any
> NH AM-AK 26+ 3.9674 0.76611 any
> NH HI-OPI 26+ 1.2919 0.66205 any
> NH Asian 26+ 0.7207 0.13870 any
> NH Multiracial 26+ 3.0668 0.52334 any
> Mexican 12-17 4.3152 0.53235 mrj
> Puerto Rican 12-17 3.7237 0.20969 mrj
> Cuban 12-17 2.0616 0.67248 mrj
> C-S American 12-17 3.3282 0.47392 mrj
> Dominican 12-17 1.3797 0.40435 mrj
> Spanish (Spain) 12-17 5.1810 0.93979 mrj
> Multi Hisp Eth 12-17 4.8915 0.94816 mrj
> NH White 12-17 3.6190 0.07379 mrj
> NH Black 12-17 2.8196 0.14042 mrj
> NH AM-AK 12-17 6.5091 0.85124 mrj
> NH HI-OPI 12-17 3.6267 1.06724 mrj
> NH Asian 12-17 1.3162 0.23575 mrj
> NH Multiracial 12-17 5.0657 0.49614 mrj
> Mexican 18-25 7.3802 0.67992 mrj
> Puerto Rican 18-25 4.3260 0.24191 mrj
> Cuban 18-25 6.1433 1.19242 mrj
> C-S American 18-25 3.9166 0.51272 mrj
> Dominican 18-25 5.8000 1.24097 mrj
> Spanish (Spain) 18-25 6.8646 1.01387 mrj
> Multi Hisp Eth 18-25 10.1134 1.75013 mrj
> NH White 18-25 5.8656 0.10100 mrj
> NH Black 18-25 6.6869 0.23643 mrj
> NH AM-AK 18-25 11.2989 1.51687 mrj
> NH HI-OPI 18-25 5.6302 1.14561 mrj
> NH Asian 18-25 2.3418 0.28309 mrj
> NH Multiracial 18-25 8.2696 0.77139 mrj
> Mexican 26+ 1.1658 0.33967 mrj
> Puerto Rican 26+ 0.6757 0.09329 mrj
> Cuban 26+ 0.6653 0.31239 mrj
> C-S American 26+ 0.3177 0.17604 mrj
> Dominican 26+ 0.5616 0.39780 mrj
> Spanish (Spain) 26+ 1.8078 0.82590 mrj
> Multi Hisp Eth 26+ 0.8468 0.63529 mrj
> NH White 26+ 0.6915 0.02791 mrj
> NH Black 26+ 1.5675 0.12031 mrj
> NH AM-AK 26+ 1.7273 0.37673 mrj
> NH HI-OPI 26+ 0.0356 0.03535 mrj
> NH Asian 26+ 0.2687 0.07564 mrj
> NH Multiracial 26+ 1.3419 0.30074 mrj
> Mexican 12-17 1.2074 0.36082 anl
> Puerto Rican 12-17 1.0772 0.11547 anl
> Cuban 12-17 1.2569 0.67109 anl
> C-S American 12-17 0.6213 0.22726 anl
> Dominican 12-17 0.1412 0.08552 anl
> Spanish (Spain) 12-17 0.9625 0.25453 anl
> Multi Hisp Eth 12-17 1.2863 0.43909 anl
> NH White 12-17 1.1490 0.04289 anl
> NH Black 12-17 0.5932 0.06220 anl
> NH AM-AK 12-17 1.9117 0.50122 anl
> NH HI-OPI 12-17 0.3833 0.20240 anl
> NH Asian 12-17 0.4782 0.14706 anl
> NH Multiracial 12-17 1.5369 0.25321 anl
> Mexican 18-25 1.1836 0.24209 anl
> Puerto Rican 18-25 1.0337 0.11015 anl
> Cuban 18-25 1.2738 0.45891 anl
> C-S American 18-25 0.5598 0.15047 anl
> Dominican 18-25 0.4720 0.31559 anl
> Spanish (Spain) 18-25 1.7871 0.64048 anl
> Multi Hisp Eth 18-25 1.2764 0.48779 anl
> NH White 18-25 2.0818 0.05831 anl
> NH Black 18-25 0.7851 0.07803 anl
> NH AM-AK 18-25 1.8964 0.46240 anl
> NH HI-OPI 18-25 1.9397 0.73301 anl
> NH Asian 18-25 0.4858 0.13528 anl
> NH Multiracial 18-25 1.7864 0.30651 anl
> Mexican 26+ 0.4014 0.08306 anl
> Puerto Rican 26+ 0.4536 0.07721 anl
> Cuban 26+ 0.2164 0.17096 anl
> C-S American 26+ 0.2233 0.09101 anl
> Dominican 26+ 0.0000 0.00000 anl
> Spanish (Spain) 26+ 1.1527 0.74125 anl
> Multi Hisp Eth 26+ 0.0303 0.03045 anl
> NH White 26+ 0.4970 0.02275 anl
> NH Black 26+ 0.3748 0.06124 anl
> NH AM-AK 26+ 1.4842 0.52284 anl
> NH HI-OPI 26+ 0.3898 0.34827 anl
> NH Asian 26+ 0.2536 0.07643 anl
> NH Multiracial 26+ 0.5120 0.18326 anl
> Mexican 12-17 0.2453 0.15761 coc
> Puerto Rican 12-17 0.4351 0.06999 coc
> Cuban 12-17 0.2472 0.24698 coc
> C-S American 12-17 0.2399 0.15804 coc
> Dominican 12-17 0.0000 0.00000 coc
> Spanish (Spain) 12-17 0.5315 0.30907 coc
> Multi Hisp Eth 12-17 0.9797 0.53981 coc
> NH White 12-17 0.3559 0.02305 coc
> NH Black 12-17 0.0220 0.01235 coc
> NH AM-AK 12-17 0.3588 0.23956 coc
> NH HI-OPI 12-17 0.0000 0.00000 coc
> NH Asian 12-17 0.1171 0.07887 coc
> NH Multiracial 12-17 0.4702 0.14823 coc
> Mexican 18-25 1.1540 0.26424 coc
> Puerto Rican 18-25 1.3422 0.12707 coc
> Cuban 18-25 1.6312 0.69363 coc
> C-S American 18-25 0.8669 0.23394 coc
> Dominican 18-25 0.6003 0.43959 coc
> Spanish (Spain) 18-25 1.9886 0.59004 coc
> Multi Hisp Eth 18-25 1.8588 0.86984 coc
> NH White 18-25 1.3990 0.04700 coc
> NH Black 18-25 0.3640 0.04961 coc
> NH AM-AK 18-25 2.2718 0.77117 coc
> NH HI-OPI 18-25 0.8386 0.47913 coc
> NH Asian 18-25 0.1947 0.05994 coc
> NH Multiracial 18-25 1.5209 0.30649 coc
> Mexican 26+ 1.4155 0.39542 coc
> Puerto Rican 26+ 0.5618 0.09323 coc
> Cuban 26+ 0.7766 0.31905 coc
> C-S American 26+ 0.3364 0.15414 coc
> Dominican 26+ 0.2632 0.26477 coc
> Spanish (Spain) 26+ 0.1596 0.07740 coc
> Multi Hisp Eth 26+ 0.2521 0.18020 coc
> NH White 26+ 0.3928 0.02073 coc
> NH Black 26+ 1.1867 0.09546 coc
> NH AM-AK 26+ 0.6865 0.24570 coc
> NH HI-OPI 26+ 0.5155 0.49176 coc
> NH Asian 26+ 0.0787 0.04558 coc
> NH Multiracial 26+ 1.1320 0.37928 coc
> Mexican 12-17 0.6556 0.23195 inh
> Puerto Rican 12-17 0.6060 0.08943 inh
> Cuban 12-17 0.4765 0.36661 inh
> C-S American 12-17 0.3629 0.12994 inh
> Dominican 12-17 0.0300 0.03006 inh
> Spanish (Spain) 12-17 0.2020 0.11445 inh
> Multi Hisp Eth 12-17 0.7095 0.32063 inh
> NH White 12-17 0.4161 0.02587 inh
> NH Black 12-17 0.2608 0.04218 inh
> NH AM-AK 12-17 1.3372 0.40763 inh
> NH HI-OPI 12-17 0.1116 0.06566 inh
> NH Asian 12-17 0.1580 0.08034 inh
> NH Multiracial 12-17 0.5472 0.13080 inh
> Mexican 18-25 0.0160 0.01601 inh
> Puerto Rican 18-25 0.2163 0.06270 inh
> Cuban 18-25 0.3252 0.32468 inh
> C-S American 18-25 0.2238 0.12254 inh
> Dominican 18-25 0.9445 0.94734 inh
> Spanish (Spain) 18-25 0.0443 0.03141 inh
> Multi Hisp Eth 18-25 0.6523 0.57082 inh
> NH White 18-25 0.1016 0.01257 inh
> NH Black 18-25 0.0617 0.02371 inh
> NH AM-AK 18-25 0.2387 0.14246 inh
> NH HI-OPI 18-25 0.0000 0.00000 inh
> NH Asian 18-25 0.1894 0.06962 inh
> NH Multiracial 18-25 0.0562 0.03261 inh
> Mexican 26+ 0.0160 0.01600 inh
> Puerto Rican 26+ 0.0185 0.01276 inh
> Cuban 26+ 0.0000 0.00000 inh
> C-S American 26+ 0.0696 0.06954 inh
> Dominican 26+ 0.0000 0.00000 inh
> Spanish (Spain) 26+ 0.1571 0.11467 inh
> Multi Hisp Eth 26+ 0.0000 0.00000 inh
> NH White 26+ 0.0174 0.00456 inh
> NH Black 26+ 0.0131 0.00757 inh
> NH AM-AK 26+ 0.2587 0.24381 inh
> NH HI-OPI 26+ 0.0000 0.00000 inh
> NH Asian 26+ 0.0607 0.03372 inh
> NH Multiracial 26+ 0.0433 0.02960 inh"
>
> agerace <- read.table(textConnection(xd1), header=TRUE, as.is=TRUE)
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.