[R] 2D density plot interpretation and manipulating the data

Ana Marija sokovic.anamarija at gmail.com
Fri Oct 9 03:35:37 CEST 2020


My understanding is that the contours represent a bivariate normal
approximation of the data, which uses a kernel density function to
test for inclusion within a level set. (Please correct me if I am wrong.)

To exclude the outliers that fall outside these ellipses/contours, is it
advisable to do something like this:

> SNP$density <- get_density(SNP$mean, SNP$var)
> summary(SNP$density)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0     383     696     738    1170    1789

where get_density() is the function from here:
https://slowkow.com/notes/ggplot2-color-by-density/
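
For readers who do not want to follow the link, here is a minimal sketch of
what a get_density()-style helper does, as far as I understand it: it runs a
2D kernel density estimate with MASS::kde2d() on a grid and then looks up the
grid cell each point falls into. The body below is my assumption of the
general shape, not a verified copy of the linked function, and the x, y and d
values are made up for illustration.

```r
# Sketch of a get_density()-style helper (assumed form); needs the MASS package.
get_density <- function(x, y, ...) {
  dens <- MASS::kde2d(x, y, ...)   # kernel density estimate on an n-by-n grid
  ix <- findInterval(x, dens$x)    # grid index along x for every point
  iy <- findInterval(y, dens$y)    # grid index along y for every point
  dens$z[cbind(ix, iy)]            # density of the grid cell under each point
}

# Toy data, only to show the call; not the SNP data from this thread.
set.seed(1)
x <- rnorm(200)
y <- rnorm(200)
d <- get_density(x, y, n = 100)
summary(d)
```

The value returned for each point is the density at the nearest lower grid
node, so it is an approximation whose resolution depends on n.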

and then do something like this:

a <- SNP[SNP$density > 400, ]

and plot it again:

p <- ggplot(a, mapping = aes(x = mean, y = var))
p <- p + geom_density_2d() + geom_point() + my.theme + ggtitle("SNPS_red")
p
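
To make the idea easier to try out, here is a self-contained sketch of the
same filtering step on simulated data. Everything in it is an assumption for
illustration: the sim data frame is a made-up stand-in for SNP, the helper is
my guess at the shape of get_density(), and the cutoff is taken as the 10%
quantile of the estimated density instead of a fixed number like 400.

```r
library(MASS)  # provides kde2d()

# get_density()-style helper (assumed form, not copied from the linked page).
get_density <- function(x, y, ...) {
  dens <- kde2d(x, y, ...)
  dens$z[cbind(findInterval(x, dens$x), findInterval(y, dens$y))]
}

# Simulated stand-in for the SNP data frame (hypothetical values).
set.seed(42)
sim <- data.frame(mean = rexp(500, rate = 40))
sim$var <- sim$mean^2 * runif(500, 0.5, 1.5)
sim$density <- get_density(sim$mean, sim$var, n = 100)

# Cutoff chosen as a quantile of the estimated density (arbitrary choice).
cutoff  <- quantile(sim$density, probs = 0.10)
inside  <- sim[sim$density >  cutoff, ]   # points kept for re-plotting
outside <- sim[sim$density <= cutoff, ]   # candidate outliers

nrow(inside)
nrow(outside)
```

Note that this only separates points by estimated density. It matches
"outside the outermost contour" only if the cutoff happens to equal the
density level of that contour, and re-plotting the kept points re-estimates
the density, so the new contours will not be the old ones with points removed.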

On Thu, Oct 8, 2020 at 3:52 PM Ana Marija <sokovic.anamarija using gmail.com> wrote:
>
> Hello,
>
> I have a data frame like this:
>
> > head(SNP)
>                mean      var     sd
> FQC.10090295 0.0327 0.002678 0.0517
> FQC.10119363 0.0220 0.000978 0.0313
> FQC.10132112 0.0275 0.002088 0.0457
> FQC.10201128 0.0169 0.000289 0.0170
> FQC.10208432 0.0443 0.004081 0.0639
> FQC.10218466 0.0116 0.000131 0.0115
> ...
>
> and I am creating a plot like this:
>
> s <- ggplot(SNP, mapping = aes(x = mean, y = var))
> s <- s +  geom_density_2d() + geom_point() + my.theme + ggtitle("SNPs")
> s
>
> I am getting the plot in the attachment.
>
> My questions are:
> 1. How do I interpret inclusion versus exclusion within the ellipses/contours?
>
> 2. How do I extract from my data frame the points that are outside the ellipses?
>
> Thanks
> Ana
