Wed Apr 13 18:13:07 CEST 2022

What is the goal of having a constant in the model? To me that seems pointless. Also, there is no variability in sexCode regardless of whether you call it an integer or a factor. So the model y ~ sexCode is just a strange way to look at the variability in y, and it would be better to do something like summary(y) or mean(y) if that was the goal.

This looks to me like a bug in stats::model.matrix.default(): a numeric column whose entries are all identical is accepted, but a constant character or factor column is not.

> d <- data.frame(y=1:5, sex=rep("Female",5))
> d$sexFactor <- factor(d$sex, levels=c("Male","Female"))
> d$sexCode <- as.integer(d$sexFactor)
> d
  y    sex sexFactor sexCode
1 1 Female    Female       2
2 2 Female    Female       2
3 3 Female    Female       2
4 4 Female    Female       2
5 5 Female    Female       2
> lm(y~sex, data=d)
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  contrasts can be applied only to factors with 2 or more levels
> lm(y~sexFactor, data=d)
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  contrasts can be applied only to factors with 2 or more levels
> lm(y~sexCode, data=d)

Call:
lm(formula = y ~ sexCode, data = d)

Coefficients:
(Intercept)      sexCode
          3           NA

Calling traceback() after the error would clarify this.
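
One way to spot the offending columns before fitting is to look for character or factor columns that take fewer than two distinct non-missing values. The following is only a minimal sketch under that assumption; single_level_cols is a hypothetical helper name, not a function from base R or caret, and it is shown on the toy data frame from above.

```r
# Hypothetical helper (not part of base R or caret): return the names of
# character or factor columns with fewer than two distinct non-NA values.
single_level_cols <- function(df) {
  is_constant <- vapply(df, function(col) {
    (is.character(col) || is.factor(col)) &&
      length(unique(col[!is.na(col)])) < 2
  }, logical(1))
  names(df)[is_constant]
}

# Same toy data as in the session above.
d <- data.frame(y = 1:5, sex = rep("Female", 5))
d$sexFactor <- factor(d$sex, levels = c("Male", "Female"))
d$sexCode <- as.integer(d$sexFactor)

# sexCode is numeric, so it is not flagged even though it is constant.
single_level_cols(d)
```

The check counts observed values rather than declared levels, because sexFactor has two levels but only one of them occurs in the data, and lm() still stops with the contrasts error for it.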


On Tue, Apr 12, 2022 at 3:12 PM Neha gupta <neha.bologna90 using gmail.com> wrote:

> Hello everyone, I have text data whose output variable has three subgroups.
> I am using the following code but get an error message (see the error
> after the code).
> d=read.csv("SONAR_RULES.csv", stringsAsFactors = FALSE) 
> index <- createDataPartition(d$TYPE, p = .70,list = FALSE)
> tr <- d[index, ]
> ts <- d[-index, ]
> ctrl <- trainControl(method = "cv",number=3, index = index, classProbs = TRUE,
>                      summaryFunction = multiClassSummary)
> ran <- train(TYPE ~ ., data = tr,
>                     method = "rpart",
>                     ## Will create 48 parameter combinations
>                     tuneLength = 3,
>                     na.action= na.pass,
>                     metric = "Accuracy",
>                     preProc = c("center", "scale", "nzv"),
>                     trControl = ctrl)
> getTrainPerf(ran)
> *It gives me error:*
> *Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
> contrasts can be applied only to factors with 2 or more levels*
> *My data is as follows*
> Rows: 1,819
> Columns: 14
> $ PLUGIN_RULE_KEY             <chr> "InsufficientBranchCoverage", "InsufficientLin~
> $ PLUGIN_CONFIG_KEY           <chr> "", "", "", "", "", "", "", "", "", "", "S1120~
> $ PLUGIN_NAME                 <chr> "common-java", "common-java", "common-java", "~
> $ DESCRIPTION                 <chr> "An issue is created on a file as soon as the ~
> $ SEVERITY                    <chr> "MAJOR", "MAJOR", "MAJOR", "MAJOR", "MAJOR", "~
> $ NAME                        <chr> "Branches should have sufficient coverage by t~
> $ REMEDIATION_GAP_MULT        <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
> $ DEF_REMEDIATION_BASE_EFFORT <chr> "", "", "", "10min", "", "", "5min", "5min", "~
> $ GAP_DESCRIPTION             <chr> "number of uncovered conditions", "number of l~
> $ SYSTEM_TAGS                 <chr> "bad-practice", "bad-practice", "convention", ~
> $ IS_TEMPLATE                 <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
> $ DESCRIPTION_FORMAT          <chr> "HTML", "HTML", "HTML", "HTML", "HTML", "HTML"~
> $ TYPE                        <chr> "CODE_SMELL", "CODE_SMELL",
