[R] How to represent the effect of one covariate on regression results?
Ana Marija
Wed Sep 16 03:11:07 CEST 2020
Hi David,
Thanks for the useful insight. I did of course write to the PLINK user
group, but got no answer there. I guess they are more concerned with how
to run PLINK commands than with how to interpret the results.
What I can tell you about my cohort is that about 80% of cases had Type 2
diabetes, while about 8% had Type 1 (my TD covariate refers to the type
of diabetes). A description of the data is attached.
Cheers,
Ana
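
For readers following the thread: one way to look at the influence of a
covariate is to compare the effect estimate and standard error of the
term of interest between the model with and the model without that
covariate, rather than correlating P values. Below is a minimal sketch in
R on simulated data. Everything in it is an assumption made for
illustration: the variable names (g, td, y), the effect sizes, the sample
size and the choice of a logistic model are hypothetical and are not taken
from the cohort or from the PLINK output discussed here.

```r
# Simulated illustration only: none of these values come from the real cohort.
set.seed(1)
n  <- 2000
td <- rbinom(n, 1, 0.8)   # hypothetical binary covariate
g  <- rbinom(n, 2, 0.3)   # hypothetical additive genotype coded 0/1/2
# Simulated case/control outcome with assumed effects on the log-odds scale
y  <- rbinom(n, 1, plogis(-0.5 + 0.1 * g - 0.7 * td))

fit_td   <- glm(y ~ g + td, family = binomial)  # adjusted for the covariate
fit_notd <- glm(y ~ g,      family = binomial)  # covariate left out

# Genotype effect estimate and standard error under each model
cols <- c("Estimate", "Std. Error")
cmp  <- rbind(with_TD    = coef(summary(fit_td))["g", cols],
              without_TD = coef(summary(fit_notd))["g", cols])
print(cmp)

# Likelihood-ratio test of the covariate's own contribution (nested models)
lrt <- anova(fit_notd, fit_td, test = "Chisq")
print(lrt)
```

With real results, the analogous comparison would presumably use the BETA
and SE columns of the ADD rows from the two PLINK runs, for example as a
scatter plot of BETA with TD against BETA without TD.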
On Tue, Sep 15, 2020 at 7:59 PM David Winsemius <dwinsemius using comcast.net> wrote:
>
>
> On 9/15/20 8:57 AM, Ana Marija wrote:
> > Hi Abby and David,
> >
> > Thanks for the useful tips! I will check those.
> >
> > I completed the regression analysis in plink (as R would be very slow
> > for my sample size) but as I mentioned I need to determine the
> > influence of a specific covariate in my results and Plink is of no
> > help there.
> >
> > I did Pearson correlation analysis for P values which I got in
> > regression with and without my covariate of interest and I got this:
> >
> >> cor.test(tt$P_TD, tt$P_noTD, method = "pearson", conf.level = 0.95)
> > Pearson's product-moment correlation
> >
> > data: tt$P_TD and tt$P_noTD
> > t = 20.17, df = 283, p-value < 2.2e-16
> > alternative hypothesis: true correlation is not equal to 0
> > 95 percent confidence interval:
> > 0.7156134 0.8117108
> > sample estimates:
> > cor
> > 0.7679493
> >
> > I can see that the P values are highly correlated between those two
> > analyses. Can I conclude that my covariate doesn't have a huge effect,
> > or what kind of conclusion can I draw from that?
>
>
> I do not think it follows from the correlation of p-values that your
> covariate "does not have a huge effect". P-values are not really data,
> although they are random values. A simulation study of this would
> require a much better description of the original dataset. Again, that
> is something that the users of Plink are more likely to be able to
> intuit than are we. I still do not see why this question is not being
> addressed to the users of the software from which you are deriving your
> "data".
>
>
> --
>
> David.
>
> >
> > Thanks for all your help
> > Ana
> >
> >
> >
> > On Tue, Sep 15, 2020 at 1:26 AM David Winsemius <dwinsemius using comcast.net> wrote:
> >> There is a user-group for PLINK, easily found by looking at the page you
> >> cited. This is not the correct place to submit such questions.
> >>
> >>
> >> https://groups.google.com/g/plink2-users?pli=1
> >>
> >>
> >> --
> >>
> >> David.
> >>
> >> On 9/14/20 6:29 AM, Ana Marija wrote:
> >>> Hello,
> >>>
> >>> I was running association analysis using --glm genotypic from:
> >>> https://www.cog-genomics.org/plink/2.0/assoc with these covariates:
> >>> sex,age,PC1,PC2,PC3,PC4,PC5,PC6,PC7,PC8,PC9,PC10,TD,array,HBA1C. The
> >>> result looks like this:
> >>>
> >>> #CHROM  POS        ID          REF  ALT  A1  TEST           OBS_CT  BETA         SE          Z_OR_F_STAT  P            ERRCODE
> >>> 10      135434303  rs11101905  G    A    A   ADD            11863   -0.110733    0.0986981   -1.12193     0.261891     .
> >>> 10      135434303  rs11101905  G    A    A   DOMDEV         11863   0.079797     0.111004    0.718868     0.472222     .
> >>> 10      135434303  rs11101905  G    A    A   sex=Female     11863   -0.120404    0.0536069   -2.24605     0.0247006    .
> >>> 10      135434303  rs11101905  G    A    A   age            11863   0.00524501   0.00391528  1.33963      0.180367     .
> >>> 10      135434303  rs11101905  G    A    A   PC1            11863   -0.0191779   0.0166868   -1.14928     0.25044      .
> >>> 10      135434303  rs11101905  G    A    A   PC2            11863   -0.0269939   0.0173086   -1.55957     0.118863     .
> >>> 10      135434303  rs11101905  G    A    A   PC3            11863   0.0115207    0.0168076   0.685448     0.493061     .
> >>> 10      135434303  rs11101905  G    A    A   PC4            11863   9.57832e-05  0.0124607   0.0076868    0.993867     .
> >>> 10      135434303  rs11101905  G    A    A   PC5            11863   -0.00191047  0.00543937  -0.35123     0.725416     .
> >>> 10      135434303  rs11101905  G    A    A   PC6            11863   -0.0103309   0.0159879   -0.646172    0.518168     .
> >>> 10      135434303  rs11101905  G    A    A   PC7            11863   0.00790997   0.0144025   0.549207     0.582863     .
> >>> 10      135434303  rs11101905  G    A    A   PC8            11863   -0.00205639  0.0142709   -0.144096    0.885424     .
> >>> 10      135434303  rs11101905  G    A    A   PC9            11863   -0.00873771  0.0057239   -1.52653     0.126878     .
> >>> 10      135434303  rs11101905  G    A    A   PC10           11863   0.0116197    0.0123826   0.938388     0.348045     .
> >>> 10      135434303  rs11101905  G    A    A   TD             11863   -0.670026    0.0962216   -6.96337     3.32228e-12  .
> >>> 10      135434303  rs11101905  G    A    A   array=Biobank  11863   0.160666     0.073631    2.18205      0.0291062    .
> >>> 10      135434303  rs11101905  G    A    A   HBA1C          11863   0.0265933    0.00168758  15.7583      6.0236e-56   .
> >>> 10      135434303  rs11101905  G    A    A   GENO_2DF       11863   NA           NA          0.726514     0.483613     .
> >>>
> >>> These results are shown for just one ID (rs11101905); there are about
> >>> 2 million of those in the resulting file.
> >>>
> >>> My question is: how do I present/plot the effect of the covariate "TD"
> >>> (in the example it has "P" equal to 3.32228e-12) for all IDs in the
> >>> resulting file, so that I show how much effect "TD" has on the
> >>> analysis? Should I run another regression without the covariate "TD"
> >>> and then do a scatter plot of P values with and without the "TD"
> >>> covariate, or is there a better way to do this from the data I already
> >>> have?
> >>>
> >>> Thanks
> >>> Ana
-------------- next part --------------
A non-text attachment was scrubbed...
Name: data.png
Type: image/png
Size: 57291 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20200915/2045e2f5/attachment.png>