
Error in R: NA/NaN/Inf in foreign function call using hclust()

I'm trying to cluster MLB data by starting pitcher's name. I've combed through the data I'm using and there are no NA values, and I also omit them in the code below. clustdata looks completely fine to me, but I get this error:

NAs introduced by coercion
Error in hclust(d, method = "single", members = clustdata[, 1]): NA/NaN/Inf in foreign function call (arg 7)

I want to cluster that table by pitcher name using those attributes. Does anyone have any ideas? Thanks! I'm new to R.

data7 = read.csv("GL2007.csv", header = T)

data8 = data.frame(na.omit(data7[c(10,23,24,25,26,30,31,33,105)]))
scoreagg = aggregate(v_score ~ h_starting_pitcher_name, data8, mean)
hitsagg = aggregate(v_hits ~ h_starting_pitcher_name, data8, mean)
doubagg = aggregate(v_doubles~ h_starting_pitcher_name, data8, mean)
tripagg = aggregate(v_triples~ h_starting_pitcher_name, data8, mean)
hragg = aggregate(v_homeruns ~ h_starting_pitcher_name, data8, mean)
hbpagg = aggregate(v_hit_by_pitch ~ h_starting_pitcher_name, data8, mean)
walksagg = aggregate(v_walks~ h_starting_pitcher_name, data8, mean)
SOagg = aggregate(v_strikeouts~ h_starting_pitcher_name, data8, mean)

clustdata = data.frame(scoreagg$h_starting_pitcher_name, scoreagg$v_score,hitsagg$v_hits,doubagg$v_doubles,tripagg$v_triples,hragg$v_homeruns,hbpagg$v_hit_by_pitch,walksagg$v_walks,SOagg$v_strikeouts)


library(NbClust)
d = dist(as.matrix(clustdata[,2:9]), method = "euclidean")
hc_1 = hclust(d, method = "single", members = clustdata[,1])

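The question doesn't give many details, but it seems you are not using the members argument correctly. members is meant to hold the number of observations per cluster (a numeric vector), not labels, so passing the pitcher-name column forces a coercion to numeric, which would explain the "NAs introduced by coercion" warning.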

Just leave it as NULL if your aim is only to obtain a clustering.

hc_1 = hclust(d, method = "single")
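
If the aim is also to have the pitcher names show up on the result, one option is to set them as row names on the matrix before calling dist(), since hclust() takes its labels from there. This is only a minimal sketch: the toy data frame below is made up to stand in for clustdata (it is not the GL2007 data), and the column names pitcher, score and hits are assumptions for illustration.

```r
# Toy stand-in for clustdata: one name column plus numeric attributes.
# (Hypothetical values, not taken from GL2007.csv.)
toy <- data.frame(
  pitcher = c("A", "B", "C", "D"),
  score   = c(4.1, 3.9, 5.2, 5.0),
  hits    = c(8.5, 8.7, 9.9, 10.1),
  stringsAsFactors = FALSE
)

# Names cannot be turned into numbers, so coercing them yields NA.
coerced <- suppressWarnings(as.numeric(toy$pitcher))

# Keep only the numeric columns and carry the names along as row names.
m <- as.matrix(toy[, c("score", "hits")])
rownames(m) <- toy$pitcher

d  <- dist(m, method = "euclidean")
hc <- hclust(d, method = "single")   # members left at its default (NULL)

# Cut the tree into two groups; the result is named by pitcher.
groups <- cutree(hc, k = 2)
```

With the row names in place, plot(hc) should label the leaves with the pitcher names, and groups can be looked up by name.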
