[R] project test data into principal components of training dataset
olsen
o.o.wolf at qmul.ac.uk
Wed Apr 20 19:33:54 CEST 2016
For the record, a slightly hacky answer that modifies the ggbiplot
function is now provided here:
http://stackoverflow.com/questions/36603268/how-to-plot-training-and-test-validation-data-in-r-using-ggbiplot
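
An alternative that avoids patching ggbiplot is to build the plot
directly in ggplot2 from one data frame that holds both the training
scores and the predicted test scores. The following is only a minimal
sketch under assumptions: the built-in iris data stands in for the wine
data, and the train/test split and all object names (train, test,
scores, set) are illustrative and not taken from the linked answer.

```r
# Sketch: overlay projected test data on fixed training PCA scores.
# Assumption: iris stands in for the wine data; all names are illustrative.
set.seed(1)
num <- iris[, 1:4]
test.idx <- sample(nrow(num), 30)
train <- num[-test.idx, ]
test  <- num[test.idx, ]

# Fit the PCA on the training data only
pca <- prcomp(train, center = TRUE, scale. = TRUE)

# Training scores are stored in pca$x; test scores come from predict()
scores <- rbind(
  data.frame(pca$x[, 1:2], set = "train",
             class = iris$Species[-test.idx]),
  data.frame(predict(pca, newdata = test)[, 1:2], set = "test",
             class = iris$Species[test.idx])
)

# Plot both sets in the same coordinate system (needs ggplot2);
# ellipses are drawn for the training data only
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  g <- ggplot(scores, aes(PC1, PC2, colour = class, shape = set)) +
    geom_point() +
    stat_ellipse(data = subset(scores, set == "train"))
  print(g)
}
```

Because both sets of scores are on the raw prcomp scale here, no
rescaling between the two layers is needed; the variable arrows that
ggbiplot draws are not reproduced by this sketch.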
On 18/04/16 17:20, olsen wrote:
> Hi there,
>
> I have a training dataset and a test dataset. My aim is to visually
> locate the test data within the calibrated space spanned by the
> PCs of the training dataset, and to keep the training dataset
> coordinates fixed so they can serve as a ruler for
> additional test datasets coming up.
>
> Please find a minimum working example using the wine dataset below.
> Ideally I would like to use ggbiplot as it comes with elegant
> features, but it only accepts objects of class prcomp, princomp, PCA, or
> lda, a requirement the predicted test data does not fulfil.
>
> I'm still slightly wet behind my R ears and the only solution I can
> think of is to plot the calibrated space in ggbiplot and the test
> data in ggplot and then join them, in the worst case by exporting them
> as svg and importing them into Inkscape. That is slightly complicated,
> and the scaling of the two plots is different.
>
> Any pointers on how this can be accomplished are very welcome!
>
> Thanks and greets
> Olsen
>
> I started a thread on Stack Overflow on that issue, but there have
> been no relevant pointers so far.
> http://stackoverflow.com/questions/36603268/how-to-plot-training-and-test-validation-data-in-r-using-ggbiplot
>
> ##MWE
> library(ggbiplot)
> data(wine)
>
> ##pca on the wine dataset used as training data
> wine.pca <- prcomp(wine, center = TRUE, scale. = TRUE)
>
> wine$class <- wine.class
>
> ##simulate test data by generating three new wine classes
> wine.new.1 <- wine[c(sample(1:nrow(wine), 25)),]
> wine.new.2 <- wine[c(sample(1:nrow(wine), 43)),]
> wine.new.3 <- wine[c(sample(1:nrow(wine), 36)),]
>
> ##Predict PCs for the new classes by transforming
> #them using the predict.prcomp function
> pred.new.1 <- predict(wine.pca, newdata = wine.new.1)
> pred.new.2 <- predict(wine.pca, newdata = wine.new.2)
> pred.new.3 <- predict(wine.pca, newdata = wine.new.3)
>
> #simulate the classes for the new sorts
> wine.new.1$class <- rep("new.wine.1", nrow(wine.new.1))
> wine.new.2$class <- rep("new.wine.2", nrow(wine.new.2))
> wine.new.3$class <- rep("new.wine.3", nrow(wine.new.3))
> wine.new.bind <- rbind(wine.new.1, wine.new.2, wine.new.3)
>
> ##compose the plot by joining the PCA ggbiplot of the training data with
> ##the test data from ggplot
> #plot the calibrated space resulting from the training data
> g.train <- ggbiplot(wine.pca, obs.scale = 1, var.scale = 1, groups =
> wine$class, ellipse = TRUE, circle = TRUE)
> g.train
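> #plot the test data resulting from the prediction
> #(use the predicted scores, not the raw variables in wine.new.bind)
> pred.new.bind <- rbind(pred.new.1, pred.new.2, pred.new.3)
> df.pred <- data.frame(PC1 = pred.new.bind[,1], PC2 = pred.new.bind[,2],
> PC3 = pred.new.bind[,3], PC4 = pred.new.bind[,4],
> classes = wine.new.bind$class)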
> g.test <- ggplot(df.pred, aes(PC1, PC2, color = classes, shape =
> classes)) + geom_point() + stat_ellipse()
> g.test
>
>
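
The projection step in the quoted example relies on predict() for a
prcomp fit. As a hedged illustration of what that step amounts to, the
sketch below repeats it by hand: the test data are centred and scaled
with the training means and standard deviations and then multiplied by
the training rotation matrix. The iris data and the object names
(train, test, manual, auto) are assumptions made for this sketch only.

```r
# Sketch: what predict() does for a prcomp fit, done by hand.
# Assumption: iris stands in for the wine data; all names are illustrative.
num <- iris[, 1:4]
train <- num[1:120, ]
test  <- num[121:150, ]

pca <- prcomp(train, center = TRUE, scale. = TRUE)

# Centre and scale the test data with the TRAINING means and standard
# deviations, then rotate with the training loadings
manual <- scale(test, center = pca$center, scale = pca$scale) %*% pca$rotation
auto   <- predict(pca, newdata = test)

# Compare the manual projection with predict()
same.test <- all.equal(manual, auto, check.attributes = FALSE)

# Re-predicting the training data should reproduce the stored scores,
# i.e. the training coordinates do not move when test sets are projected
same.train <- all.equal(predict(pca, newdata = train), pca$x,
                        check.attributes = FALSE)
print(c(same.test, same.train))
```

Both comparisons should come out equal, which is why the training
coordinates can serve as a fixed ruler for any number of later test
datasets.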
--
Our solar system is the cream of the crop
http://hasa-labs.org