[R] Find the ideal cluster
Jovani T. de Souza
jov@n|@ouz@5 @end|ng |rom gm@||@com
Mon Dec 14 23:11:59 CET 2020
Thank you so much!
Sorry Michael, I will include the link next time.
Best regards.
On Sat, 12 Dec 2020 at 14:06, Michael Dewey <lists using dewey.myzen.co.uk>
wrote:
> Dear Jovani
>
> If you cross-post on CrossValidated as well as here it is polite to give
> a link so people do not answer here when someone has already answered
> there, or vice versa.
>
> Michael
>
> On 12/12/2020 15:27, Jovani T. de Souza wrote:
> > Some colleagues and I developed a hierarchical clustering algorithm
> > to find the main clusters of agricultural industries around a
> > particular city (e.g. London). We implemented it in R and it is
> > working as intended. With the filters we built into the algorithm, we
> > generated 6 clustering scenarios for London. For example, the first
> > scenario produced 2 clusters, the second 5 clusters, and so on. I would
> > therefore like some help choosing the most appropriate scenario. I
> > saw that some packages help with this, such as `pvclust`, but I could
> > not apply it to my case. Below is a short executable example that
> > shows the essence of what I want.
> >
> > Any help is welcome! If you know how to do this with another package,
> > feel free to describe it.
> >
> > Best Regards.
> >
> >
> > library(rdist)
> > library(geosphere)
> > library(fpc)
> >
> >
> > df <- structure(list(Industries = c(1, 2, 3, 4, 5, 6),
> >                      Latitude = c(-23.8, -23.8, -23.9, -23.7, -23.7, -23.7),
> >                      Longitude = c(-49.5, -49.6, -49.7, -49.8, -49.6, -49.9),
> >                      Waste = c(526, 350, 526, 469, 534, 346)),
> >                 class = "data.frame", row.names = c(NA, -6L))
> >
> > df1<-df
> >
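> > # clusters
> > coordinates <- df[c("Latitude", "Longitude")]
> > # distm() expects (longitude, latitude), hence the column swap 2:1
> > d <- as.dist(distm(coordinates[, 2:1]))
> > # average-linkage hierarchical clustering on the geographic distances
> > fit.average <- hclust(d, method = "average")
> >
> > # scenario with 2 clusters
> > clusters <- cutree(fit.average, k = 2)
> > df$cluster <- clusters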
> > > df
> >   Industries Latitude Longitude Waste cluster
> > 1          1    -23.8     -49.5   526       1
> > 2          2    -23.8     -49.6   350       1
> > 3          3    -23.9     -49.7   526       1
> > 4          4    -23.7     -49.8   469       2
> > 5          5    -23.7     -49.6   534       1
> > 6          6    -23.7     -49.9   346       2
> > >
> > clusters1<-cutree(fit.average, k=5)
> > df1$cluster <- clusters1
> > > df1
> >   Industries Latitude Longitude Waste cluster
> > 1          1    -23.8     -49.5   526       1
> > 2          2    -23.8     -49.6   350       1
> > 3          3    -23.9     -49.7   526       2
> > 4          4    -23.7     -49.8   469       3
> > 5          5    -23.7     -49.6   534       4
> > 6          6    -23.7     -49.9   346       5
> > >
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> --
> Michael
> http://www.dewey.myzen.co.uk/home.html
>
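
For anyone following the thread: one common way to compare cuts of the same dendrogram is the average silhouette width, where a higher value suggests tighter, better-separated clusters. The sketch below is only an illustration on the toy data from the quoted message. It assumes the `cluster` package (not used in the original code) and assumes that each scenario corresponds to a value of `k` passed to `cutree()`; the names `candidate_k`, `avg_sil`, `sil_table` and `best_k` are made up for this example.

```r
library(geosphere)  # distm(), as in the original post
library(cluster)    # silhouette(); an assumption, not part of the original code

df <- data.frame(
  Industries = c(1, 2, 3, 4, 5, 6),
  Latitude   = c(-23.8, -23.8, -23.9, -23.7, -23.7, -23.7),
  Longitude  = c(-49.5, -49.6, -49.7, -49.8, -49.6, -49.9),
  Waste      = c(526, 350, 526, 469, 534, 346)
)

# Geographic distances; distm() expects (longitude, latitude) order
d <- as.dist(distm(as.matrix(df[, c("Longitude", "Latitude")])))
fit.average <- hclust(d, method = "average")

# Candidate numbers of clusters (silhouette needs 2 <= k <= n - 1)
candidate_k <- 2:5

# Average silhouette width for each cut of the same dendrogram
avg_sil <- sapply(candidate_k, function(k) {
  cl <- cutree(fit.average, k = k)
  mean(silhouette(cl, d)[, "sil_width"])
})

sil_table <- data.frame(k = candidate_k, avg_silhouette = avg_sil)
print(sil_table)

# Pick the k with the highest average silhouette width
best_k <- candidate_k[which.max(avg_sil)]
df$cluster <- cutree(fit.average, k = best_k)
```

With only six points the result is not very informative, and the silhouette criterion uses distance alone, so it ignores the `Waste` column. It is one heuristic among several, not a definitive answer to which scenario is most appropriate.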