[R] How to find inverse of glm model?

Thu Apr 24 21:27:07 CEST 2025

Thank you. Following your tips I tried to guess the starting values
using an approached that determined (1) the background level (the
fluorescence before the take over signal), (2) the slope between this
point and the maximum:
```
df <- data.frame(Cycle=1:35,
              Fluo=c(15908.54,  16246.92,  16722.43,  17104.29,
16794.93,  17031.44,  17299.05,  17185.49,
                     17362.28,  17127.43,  17368.96,  17513.19,
17593.95,  17626.37,  18308.29,  18768.12,
                     19955.26,  22493.85,  27088.12,  36539.44,
53694.18,  84663.18, 138835.64, 223331.89,
                     336409.69, 457729.88, 561233.12, 637536.31,
688985.88, 723158.56, 746575.62, 766274.75,
                     776087.75, 785144.81, 791573.81)
              )

# estimate starting values
S = sd(df$Fluo[1:3])*10
for (j in 4:nrow(df)) {
  x = df$Fluo[1:j]
  s = sd(x)
  q = df$Fluo[j]
  cat("SD = ", S, "Fluo = ", q, "\n")
  if(q<S) {
    break
  } else {
    S = sd(df$Fluo[1:j])*10
  }
}
MIN_FLUO = q
MIN_IDX = which (df$Fluo == MIN_FLUO)
MIN_CYC = df$Cycle[MIN_IDX]
MAX_FLUO = max(df$Fluo)
MAX_IDX = which (df$Fluo == MAX_FLUO)
MAX_CYC = df$Cycle[MAX_IDX]
HLF_CYC = MIN_CYC + (MAX_CYC - MIN_CYC)/2
q_mod = lm(Fluo~Cycle, df[MIN_IDX:MAX_IDX,])
SLP = q_mod$coefficients[2]

# plot
plot(Fluo~Cycle, df)
points(MIN_FLUO~MIN_CYC, col="blue", pch=16)
points(MAX_FLUO~MAX_CYC, col="red", pch=16)
abline(h=MAX_FLUO, col="red")
abline(q_mod, col="green")
abline(v=HLF_CYC, col="gold")
abline(h=S, col="purple")
legend("topleft", legend=c("HalfFluoCycle", "MaxFluo", "Slope", "Noise"),
       lty=1, col=c("gold", "red", "green", "purple"),
       title="Estimated parameters", lwd=2)
legend("left", legend=c("Min", "Max"), title="Selected interval",
       pch=16, col=c("blue", "red"))

# regression
mod1 = nls(Fluo ~ (MaxFluo / (1+ exp(-(Cycle - HalfFluoCycle)/Slope)) + Noise),
          data=df, start=list(MaxFluo=MAX_FLUO,
                              HalfFluoCycle=HLF_CYC,
                              Slope=SLP,
                              Noise=S))
mod2 = nls(Fluo ~ (MaxFluo / (1+ exp(-(Cycle - HalfFluoCycle)/Slope)) + Noise),
          data=df, start=list(MaxFluo=MAX_FLUO,
                              HalfFluoCycle=HLF_CYC,
                              Slope=2,
                              Noise=S))
lines(df$Cycle, predict(mod2), col = "orange", lwd = 2)
```
Model 1 failed probably because SLP = 57724.29 , which is a weird
slope value; model 2 instead worked, using your suggested value of 2.
If I could hard-code the slope to 2, unless there is a better way to
determine such a slope, then I think I could try to extend the
approach to other data...
Best regards
Luigi

On Wed, Apr 23, 2025 at 6:43 PM Gregg Powell <g.a.powell using protonmail.com> wrote:
>
> Hello Luigi,
>
> Great follow-up — Looks like you’re on the right track using nls() for nonlinear regression. You're fitting a logistic-like sigmoidal model (as in Rutledge’s paper), I think both errors you’re encountering are signs of bad starting values or a maybe model formulation issue.
>
> Breakdown of Your Model Attempt
> Your model formula:
> > Fluo ~ (MaxFluo / (1 + exp(-(Cycle - (HalfFluoCycle / Slope)))) + 1)
>
> This expression has perhaps two issues:
>
> 1.   Parameter placement: In the paper, the term inside exp() should look like -(Cycle - HalfFluoCycle)/Slope — not Cycle - (HalfFluoCycle / Slope).
>
> 2.   +1 additive constant: You called it backgroundSignal, but it’s hard-coded as +1. That constant should be a parameter (e.g., Bkg), otherwise the model fit has no way to adjust it.
>
> Alternative Model Form:
> Here’s a likely corrected version of the model based on Rutledge’s sigmoidal fit:
>
> > mod1 <- nls(Fluo ~ MaxFluo / (1 + exp(-(Cycle - HalfFluoCycle)/Slope)) > + Bkg,
> >            data = df,
> >            start = list(MaxFluo = max(df$Fluo),
> >                         HalfFluoCycle = 20,
> >                         Slope = 2,
> >                         Bkg = min(df$Fluo)))
>
> Notes on nls() Convergence
> nls() is very sensitive to starting values. Always inspect your plot and roughly estimate parameters:
>
> MaxFluo: Use max(Fluo)
>
> Bkg: Use min(Fluo)
>
> HalfFluoCycle: Approximate where the curve inflects
>
> Slope: Try values like 1, 2, 5 depending on steepness
>
> You can also add trace=TRUE to watch convergence progress:
>
> > mod1 <- nls(..., trace=TRUE)
>
> About SSmicmen
> From what I've gathered - this is meant for fitting the Michaelis-Menten model (enzyme kinetics), not sigmoidal growth. That's why mod2 fails — wrong self-starting model function.
>
>
> Visualizing the Fit
> After fitting:
>
> > plot(Fluo ~ Cycle, data = df)
> > lines(df$Cycle, predict(mod1), col = "red", lwd = 2)
>
> You might also want to compute the inverse (e.g., solving for Cycle given a fluorescence value) — that would involve solving the nonlinear equation manually or using uniroot().
>
> All the best,
> gregg
>
>
>
>
> On Wednesday, April 23rd, 2025 at 8:23 AM, Luigi Marongiu <marongiu.luigi using gmail.com> wrote:
>
> >
>
> >
>
> > Further on this, I have found a formula from a paper from Rutledge
> > (DOI: 10.1093/nar/gnh177), which I rendered as
> > `MaxFluo / (1+ exp(-(Cycle-(HalfFluoCycle/Slope)))) + backgroundSignal`
> > I then see that one can use `nls` to fit non-linear regression, so I tried:
> > `df <- data.frame(Cycle=1:35, Fluo=c(15908.54, 16246.92, 16722.43, 17104.29, 16794.93, 17031.44, 17299.05, 17185.49, 17362.28, 17127.43, 17368.96, 17513.19, 17593.95, 17626.37, 18308.29, 18768.12, 19955.26, 22493.85, 27088.12, 36539.44, 53694.18, 84663.18, 138835.64, 223331.89, 336409.69, 457729.88, 561233.12, 637536.31, 688985.88, 723158.56, 746575.62, 766274.75, 776087.75, 785144.81, 791573.81) ) plot(Fluo~Cycle, df) mod1 = nls(Fluo~(MaxFluo / (1+ exp(-(Cycle-(HalfFluoCycle/Slope)))) + 1), data=df, start=list(MaxFluo=max(df$Fluo), HalfFluoCycle=15, Slope=0.1)) mod2 = nls(Fluo~SSmicmen(Fluo, Cycle), data=df)`
> > but I got errors in both cases:
> > ```
> >
>
> > > mod1
> >
>
> > Error in nlsModel(formula, mf, start, wts, scaleOffset = scOff,
> > nDcentral = nDcntr) :
> > singular gradient matrix at initial parameter estimates
> >
>
> > > mod2
> >
>
> > Error in qr.qty(QR.rhs, .swts * ddot(attr(rhs, "gradient"), lin)) :
> > NA/NaN/Inf in foreign function call (arg 5)
> > ```
> > How can I properly set this regression model?
> > Thank you
> >
>
> >
>
> > On Wed, Apr 16, 2025 at 7:08 AM Luigi Marongiu marongiu.luigi using gmail.com wrote:
> >
>
> > > Thank you. This topic is more complicated than anticipated. Best regards, Luigi
> > >
>
> > > On Tue, Apr 15, 2025 at 11:09 PM Andrew Robinson apro using unimelb.edu.au wrote:
> > >
>
> > > > A statistical (off-topic!) point to consider: when the GLM was fitted, you conditioned on x and let y be the random variable. Therefore the model supports predictions of y conditional on x. You’re now seeking to make predictions of x conditional on y. That’s not the same thing, even in OLS.
> > > >
>
> > > > It might not matter for your application but it’s probably worth thinking about. Simulation experiments might shed some light on that.
> > > >
>
> > > > Cheers, Andrew
> > > >
>
> > > > --
> > > > Andrew Robinson
> > > > Chief Executive Officer, CEBRA and Professor of Biosecurity,
> > > > School/s of BioSciences and Mathematics & Statistics
> > > > University of Melbourne, VIC 3010 Australia
> > > > Tel: (+61) 0403 138 955
> > > > Email: apro using unimelb.edu.au
> > > > Website: https://researchers.ms.unimelb.edu.au/~apro@unimelb/
> > > >
>
> > > > I acknowledge the Traditional Owners of the land I inhabit, and pay my respects to their Elders.
> > > >
>
> > > > On 16 Apr 2025 at 1:01 AM +1000, Gregg Powell via R-help r-help using r-project.org, wrote:
> > > >
>
> > > > Take a look at this Luigi...
> > > >
>
> > > > # The model is: logit(p) = β₀ + β₁*Cycles
> > > > # Where p is the probability (or normalized value in your case)
> > > >
>
> > > > # The inverse function would be: Cycles = (logit⁻¹(p) - β₀)/β₁
> > > > # Where logit⁻¹ is the inverse logit function (also called the expit >function)
> > > >
>
> > > > # Extract coefficients from your model
> > > > intercept <- coef(b_model)[1]
> > > > slope <- coef(b_model)[2]
> > > >
>
> > > > # Define the inverse function
> > > > inverse_predict <- function(p) {
> > > > # p is the probability or normalized value you want to find the >cycles for
> > > > # Inverse logit: log(p/(1-p)) which is the logit function
> > > > logit_p <- log(p/(1-p))
> > > >
>
> > > > # Solve for Cycles: (logit(p) - intercept)/slope
> > > > cycles <- (logit_p - intercept)/slope
> > > >
>
> > > > return(cycles)
> > > > }
> > > >
>
> > > > # Example: What cycle would give a normalized value of 0.5?
> > > > inverse_predict(0.5)
> > > >
>
> > > > This function takes a probability (normalized value between 0 and 1) and returns the cycle value that would produce this probability according to your model.
> > > > Also:
> > > > This works because GLM with binomial family uses the logit link function by default
> > > > The inverse function will return values outside your original data range if needed
> > > > This won't work for p=0 or p=1 exactly (you'd get -Inf or Inf), so you might want to add checks
> > > >
>
> > > > All the best,
> > > > Gregg
> > > >
>
> > > > On Tuesday, April 15th, 2025 at 5:57 AM, Luigi Marongiu marongiu.luigi using gmail.com wrote:
> > > >
>
> > > > I have fitted a glm model to some data; how can I find the inverse
> > > > function of this model? Since I don't know the y=f(x) implemented by
> > > > glm (this works under the hood), I can't define a f⁻¹(y).
> > > > Is there an R function that can find the inverse of a glm model?
> > > > Thank you.
> > > >
>
> > > > The working example is:
> > > > `V = c(120.64, 66.14, 34.87, 27.11, 8.87, -5.8, 4.52, -7.16, -17.39, -14.29, -20.26, -14.99, -21.05, -20.64, -8.03, -21.56, -1.28, 15.01, 75.26, 191.76, 455.09, 985.96, 1825.59, 2908.08, 3993.18, 5059.94, 6071.93, 6986.32, 7796.01, 8502.25, 9111.46, 9638.01, 10077.19, 10452.02, 10751.81, 11017.49, 11240.37, 11427.47, 11570.07, 11684.96, 11781.77, 11863.35, 11927.44, 11980.81, 12021.88, 12058.35, 12100.63, 12133.57, 12148.89, 12137.09) df = data.frame(Cycles = 1:35, Values = V[1:35]) M = max(df$Values) df$Norm = df$Values/M df$Norm[df$Norm<0] = 0 b_model = glm(Norm ~ Cycles, data=df, family=binomial) plot(Norm ~ Cycles, df, main="Normalized view", xlab=expression(bold("Time")), ylab=expression(bold("Signal (normalized)")), type="p", col="cyan") lines(b_model$fitted.values ~ df$Cycles, col="blue", lwd=2)`
> > > >
>
> > > > ______________________________________________
> > > > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
> > > > and provide commented, minimal, self-contained, reproducible code.
> > >
>
> > > --
> > > Best regards,
> > > Luigi
> >
>
> >
>
> >
>
> >
>
> > --
> > Best regards,
> > Luigi


-- 
Best regards,
Luigi