Dendrogram fails to plot in R

I am doing hierarchical clustering in R, following this tutorial.

My code is like this but it ends in an error:

> distances = dist(movies[2:20], method="euclidean")
> clusterMovies = hclust(distances, method="ward")
> plot(clusterMovies)
Error in plot.hclust(clusterMovies) : 'merge' matrix has invalid contents
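
Before clustering, it can help to confirm that the columns passed to dist() loaded the way you expect. The following is only a minimal sketch on a made-up data frame: check_cluster_input is a hypothetical helper, not part of the tutorial, and the toy columns merely stand in for movies[2:20]. It does not establish what caused the error above; it only reports non-numeric columns and missing values.

```r
# Hypothetical helper (not from the tutorial): summarise problems in the
# columns that are about to be passed to dist().
check_cluster_input <- function(df) {
  # Columns that did not load as numbers (e.g. because of a parsing problem)
  non_numeric <- names(df)[!vapply(df, is.numeric, logical(1))]
  # Number of missing values in each column
  na_counts <- colSums(is.na(df))
  list(
    n_rows      = nrow(df),
    non_numeric = non_numeric,
    na_columns  = names(na_counts)[na_counts > 0]
  )
}

# Toy stand-in for movies[2:20]; the values are invented for illustration.
toy <- data.frame(
  Action = c(1, 0, NA),
  Comedy = c(0, 1, 1),
  Drama  = c("0", "1", "x"),
  stringsAsFactors = FALSE
)

report <- check_cluster_input(toy)
str(report)
```

On the real data you would call check_cluster_input(movies[2:20]) and expect no non-numeric columns and no missing values.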

It works fine for me. Make sure you download the movieLens.txt file exactly as shown in the previous video of the tutorial, i.e. do not use 'Save as' and Internet Explorer. Then this should work:

movies = read.table("movieLens.txt", header=FALSE, sep="|",quote="\"")

# Add column names
colnames(movies) = c("ID", "Title", "ReleaseDate", "VideoReleaseDate", "IMDB", "Unknown", "Action", "Adventure", "Animation", "Childrens", "Comedy", "Crime", "Documentary", "Drama", "Fantasy", "FilmNoir", "Horror", "Musical", "Mystery", "Romance", "SciFi", "Thriller", "War", "Western")

# Remove unnecessary variables
movies$ID = NULL
movies$ReleaseDate = NULL
movies$VideoReleaseDate = NULL
movies$IMDB = NULL

# Remove duplicates
movies = unique(movies)

# Compute distances
distances = dist(movies[2:20], method = "euclidean")

# Hierarchical clustering
clusterMovies = hclust(distances, method = "ward") 

# Plot the dendrogram
plot(clusterMovies)

This runs without errors, apart from a harmless message printed after the clusterMovies command:

The "ward" method has been renamed to "ward.D"; note new "ward.D2"
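
To avoid the message, you can name the method explicitly. The sketch below assumes R 3.1.0 or later and uses a small random 0/1 matrix (invented data, not the MovieLens file) in place of movies[2:20]. "ward.D" is the method the old "ward" name maps to; "ward.D2" squares the dissimilarities before updating clusters, which is the variant that implements Ward's criterion when given plain Euclidean distances.

```r
# Toy 0/1 genre matrix standing in for movies[2:20] (made-up values).
set.seed(1)
toy <- matrix(rbinom(60, 1, 0.4), nrow = 10)

# Same distance computation as in the answer
d <- dist(toy, method = "euclidean")

# Equivalent to the old method = "ward", but without the rename message
fit_d <- hclust(d, method = "ward.D")

# Alternative: dissimilarities are squared before cluster updating
fit_d2 <- hclust(d, method = "ward.D2")

# Plot the dendrogram and cut it into 3 groups (3 is an arbitrary choice here)
plot(fit_d)
groups <- cutree(fit_d, k = 3)
```

With the real data, replacing method = "ward" by method = "ward.D" in the hclust() call should give the same tree as before.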
