[R] Error with text analysis data
Neha gupta
neha.bologna90 using gmail.com
Wed Apr 13 11:05:25 CEST 2022
Thank you Jim
So what solution do you suggest? The features are text, so it doesn't look
like CSV format.
Best regards
On Wednesday, April 13, 2022, Jim Lemon <drjimlemon using gmail.com> wrote:
> Hi Neha,
> The error message is about not having _factors_ with two or more
> levels. Apart from using stringsAsFactors=FALSE (meaning that you
> probably won't get any factors in "d"), your sample data doesn't look
> like CSV format. Perhaps the lines have been truncated. You may get
> something with stringsAsFactors=TRUE, but I don't know whether it will
> be more sensible.
>
> Jim
>
> On Wed, Apr 13, 2022 at 8:12 AM Neha gupta <neha.bologna90 using gmail.com>
> wrote:
> >
> > Hello everyone, I have text data whose output variable has three
> > subgroups. I am using the following code but get the error message
> > shown after the code.
> >
> > d=read.csv("SONAR_RULES.csv", stringsAsFactors = FALSE)
> > d$REMEDIATION_FUNCTION=NULL
> > d$DEF_REMEDIATION_GAP_MULT=NULL
> > d$REMEDIATION_BASE_EFFORT=NULL
> >
> > index <- createDataPartition(d$TYPE, p = .70,list = FALSE)
> > tr <- d[index, ]
> > ts <- d[-index, ]
> >
> > ctrl <- trainControl(method = "cv",number=3, index = index, classProbs =
> > TRUE, summaryFunction = multiClassSummary)
> >
> > ran <- train(TYPE ~ ., data = tr,
> > method = "rpart",
> > ## Will create 48 parameter combinations
> > tuneLength = 3,
> > na.action= na.pass,
> > metric = "Accuracy",
> > preProc = c("center", "scale", "nzv"),
> > trControl = ctrl)
> > getTrainPerf(ran)
> >
> > It gives me this error:
> >
> > Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
> >   contrasts can be applied only to factors with 2 or more levels
> >
> > My data is as follows:
> >
> > Rows: 1,819
> > Columns: 14
> > $ PLUGIN_RULE_KEY             <chr> "InsufficientBranchCoverage", "InsufficientLin~
> > $ PLUGIN_CONFIG_KEY           <chr> "", "", "", "", "", "", "", "", "", "", "S1120~
> > $ PLUGIN_NAME                 <chr> "common-java", "common-java", "common-java", "~
> > $ DESCRIPTION                 <chr> "An issue is created on a file as soon as the ~
> > $ SEVERITY                    <chr> "MAJOR", "MAJOR", "MAJOR", "MAJOR", "MAJOR", "~
> > $ NAME                        <chr> "Branches should have sufficient coverage by t~
> > $ DEF_REMEDIATION_FUNCTION    <chr> "LINEAR", "LINEAR", "LINEAR", "LINEAR_OFFSET",~
> > $ REMEDIATION_GAP_MULT        <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
> > $ DEF_REMEDIATION_BASE_EFFORT <chr> "", "", "", "10min", "", "", "5min", "5min", "~
> > $ GAP_DESCRIPTION             <chr> "number of uncovered conditions", "number of l~
> > $ SYSTEM_TAGS                 <chr> "bad-practice", "bad-practice", "convention", ~
> > $ IS_TEMPLATE                 <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
> > $ DESCRIPTION_FORMAT          <chr> "HTML", "HTML", "HTML", "HTML", "HTML", "HTML"~
> > $ TYPE                        <chr> "CODE_SMELL", "CODE_SMELL", "CODE_SMELL", "COD~
>
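Following up on Jim's point about factor levels, here is a minimal sketch in base R. It assumes the error comes from predictors that have only one distinct value. In the data summary quoted in this thread, REMEDIATION_GAP_MULT appears to be all NA, and IS_TEMPLATE and DESCRIPTION_FORMAT show a single value in the visible rows. The toy data frame and its values are made up for illustration; only the column names come from the post.

```r
# Toy data frame that mimics a few columns of the posted data.
# All values are invented for illustration.
d <- data.frame(
  SEVERITY             = c("MAJOR", "MINOR", "MAJOR", "CRITICAL", "MINOR", "MAJOR"),
  REMEDIATION_GAP_MULT = NA,       # all missing, as in the posted summary
  IS_TEMPLATE          = 0L,       # constant column (assumed)
  DESCRIPTION_FORMAT   = "HTML",   # constant column (assumed)
  TYPE                 = c("CODE_SMELL", "BUG", "CODE_SMELL",
                           "VULNERABILITY", "BUG", "CODE_SMELL"),
  stringsAsFactors = FALSE
)

# Count the distinct non-missing values in each column.
n_distinct <- vapply(d, function(x) length(unique(x[!is.na(x)])), integer(1))

# Columns with fewer than two distinct values cannot form a contrast.
constant_cols <- names(n_distinct)[n_distinct < 2]
print(constant_cols)

# Drop those columns and make the outcome a factor.
d_clean <- d[, n_distinct >= 2, drop = FALSE]
d_clean$TYPE <- factor(d_clean$TYPE)

# The formula interface can now build a design matrix without the
# "contrasts can be applied only to factors with 2 or more levels" error.
mm <- model.matrix(TYPE ~ ., data = d_clean)
print(colnames(mm))
```

This only addresses the constant columns. Free-text columns such as DESCRIPTION or NAME probably have close to one distinct value per row, so they would likely need to be dropped or converted into numeric features before a formula-based model is useful; that step is not covered here.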