[R] Find the ideal cluster

Mon Dec 14 23:11:59 CET 2020

Thank you so much!

Sorry, Michael, I will include the link next time.

Best regards.

On Sat, Dec 12, 2020 at 14:06, Michael Dewey <lists using dewey.myzen.co.uk>
wrote:

> Dear Jovani
>
> If you cross-post on CrossValidated as well as here it is polite to give
> a link so people do not answer here when someone has already answered
> there, or vice versa.
>
> Michael
>
> On 12/12/2020 15:27, Jovani T. de Souza wrote:
> > Some colleagues and I developed a hierarchical clustering algorithm in R
> > that finds the main clusters of agricultural industries for a given city
> > (e.g. London). It runs as intended. With the filters we built into the
> > algorithm, we generated 6 clustering scenarios for London. For example,
> > the first scenario produced 2 clusters, the second 5 clusters, and so on.
> > I would like help choosing the most appropriate scenario. I saw that some
> > packages, such as `pvclust`, help with this, but I could not apply it to
> > my case. Below is a brief executable example that shows the essence of
> > what I want.
> >
> > Any help is welcome! If you know how to do this with another package,
> > feel free to describe it.
> >
> > Best Regards.
> >
> >
> >      library(rdist)
> >      library(geosphere)
> >      library(fpc)
> >
> >
> >      df<-structure(list(Industries = c(1,2,3,4,5,6),
> >                         Latitude = c(-23.8, -23.8, -23.9, -23.7, -23.7, -23.7),
> >                         Longitude = c(-49.5, -49.6, -49.7, -49.8, -49.6, -49.9),
> >                         Waste = c(526, 350, 526, 469, 534, 346)),
> >                    class = "data.frame", row.names = c(NA, -6L))
> >
> >      df1<-df
> >
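> >      # clusters: average-linkage tree on geographic distances
> >      # (distm expects longitude first, hence the [,2:1] column order)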
> >      coordinates<-df[c("Latitude","Longitude")]
> >      d<-as.dist(distm(coordinates[,2:1]))
> >      fit.average<-hclust(d,method="average")
> >
> >      clusters<-cutree(fit.average, k=2)
> >      df$cluster <- clusters
> >      > df
> >        Industries Latitude Longitude Waste cluster
> >      1          1    -23.8     -49.5   526       1
> >      2          2    -23.8     -49.6   350       1
> >      3          3    -23.9     -49.7   526       1
> >      4          4    -23.7     -49.8   469       2
> >      5          5    -23.7     -49.6   534       1
> >      6          6    -23.7     -49.9   346       2
> >      >
> >      clusters1<-cutree(fit.average, k=5)
> >      df1$cluster <- clusters1
> >      > df1
> >        Industries Latitude Longitude Waste cluster
> >      1          1    -23.8     -49.5   526       1
> >      2          2    -23.8     -49.6   350       1
> >      3          3    -23.9     -49.7   526       2
> >      4          4    -23.7     -49.8   469       3
> >      5          5    -23.7     -49.6   534       4
> >      6          6    -23.7     -49.9   346       5
> >      >
> >
> >       [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> --
> Michael
> http://www.dewey.myzen.co.uk/home.html
>
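
One possible way to compare the scenarios in the quoted question, offered as a sketch rather than a definitive answer, is the average silhouette width: cut the same tree at each candidate number of clusters and see which cut gives the highest value. The code below rebuilds `d` and `fit.average` from the quoted example. The `cluster` package, the `candidate_k` range, and the names `avg_sil` and `best_k` are assumptions introduced here; they are not part of the original post.

```r
library(geosphere)  # distm(), as in the original post
library(cluster)    # silhouette(); an assumption, not loaded in the original post

df <- data.frame(
  Industries = 1:6,
  Latitude   = c(-23.8, -23.8, -23.9, -23.7, -23.7, -23.7),
  Longitude  = c(-49.5, -49.6, -49.7, -49.8, -49.6, -49.9),
  Waste      = c(526, 350, 526, 469, 534, 346)
)

# Same distance matrix and tree as in the original post (distm expects lon, lat)
d <- as.dist(distm(df[, c("Longitude", "Latitude")]))
fit.average <- hclust(d, method = "average")

# Average silhouette width for each candidate number of clusters.
# candidate_k is a hypothetical range; silhouette() needs 2 <= k <= n - 1.
candidate_k <- 2:5
avg_sil <- sapply(candidate_k, function(k) {
  sil <- silhouette(cutree(fit.average, k = k), d)
  mean(sil[, "sil_width"])
})
names(avg_sil) <- candidate_k

print(round(avg_sil, 3))

# Candidate with the highest average silhouette width
best_k <- candidate_k[which.max(avg_sil)]
print(best_k)
```

If the six scenarios come from different filters rather than from cutting a single tree, the same idea should still apply: pass each scenario's cluster labels and the matching distance object to `silhouette()`. With only six points the index is a rough guide at best, and it reflects location only, not `Waste`.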