R convert names to numbers
I have a data frame with donations and names of donors.
**donation** **Donor**
25.00 Steve Smith
20.00 Jack Johnson
50.00 Mary Jackson
... ...
I'm trying to do some clustering using the pvclust package. Unfortunately the package doesn't seem to take non-numerical data.
> rs1.pv1 <- parPvclust(cl, rs1, nboot=10)
Error in cor(x, method = "pearson", use = use.cor) : 'x' must be numeric
I have two questions.
1) Is there another package or method that would do this better?
2) Is there a way to "normalize" the donor names list? That is, get a list of unique donor names, assign each an id number, and then insert the id number into the data frame in place of the character name.
For number 2:
# If donor is a factor, then
as.numeric(donor)
# will convert it to its numeric codes.
# If it isn't, transform it to a factor and then to numeric:
as.numeric(as.factor(donor))
However, I'm not sure that transforming the donor list to numeric and then using cor makes sense at all.
HTH
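To make the factor-based conversion concrete, here is a minimal sketch on made-up data shaped like the example in the question. The fourth row (a repeat donor) and the names donor_factor, donor_id, and donor_lookup are assumptions added for illustration. It also keeps a lookup table, since the ids are arbitrary labels and the names would otherwise be lost.

```r
# Hypothetical data frame mirroring the question's example,
# with one repeat donor added for illustration
rs1 <- data.frame(
  donation = c(25, 20, 50, 10),
  Donor = c("Steve Smith", "Jack Johnson", "Mary Jackson", "Steve Smith"),
  stringsAsFactors = FALSE
)

# Convert the names to a factor; levels are the sorted unique names
donor_factor <- factor(rs1$Donor)

# The integer codes index into the levels, so repeat donors share an id
donor_id <- as.integer(donor_factor)

# Lookup table for translating ids back to names later
donor_lookup <- data.frame(
  id = seq_along(levels(donor_factor)),
  Donor = levels(donor_factor)
)

print(donor_id)
print(donor_lookup)
```

The ids follow the sorted order of the names, not the order of the rows, so they carry no meaning beyond identifying a donor.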
How about rs1 <- transform(rs1, Donor=as.numeric(factor(Donor)))? (Warning: I haven't thought about what you're doing enough to know whether that makes sense, so I'm only answering question #2, not question #1.) Typically Donor would already be a factor (this is what e.g. read.table or read.csv did by default before R 4.0.0, when stringsAsFactors defaulted to TRUE), so the factor() part would be redundant.
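As a rough sketch of the transform() approach, the snippet below runs it on hypothetical data shaped like the question's example. The data values and the names rs2, donor_names, all_numeric, and id_by_appearance are assumptions for illustration. It also shows one possible alternative, match() against unique(), which numbers donors by first appearance instead of alphabetically.

```r
# Hypothetical data, same shape as in the question
rs1 <- data.frame(
  donation = c(25, 20, 50, 10),
  Donor = c("Steve Smith", "Jack Johnson", "Mary Jackson", "Steve Smith"),
  stringsAsFactors = FALSE
)

# Save the names before overwriting the column so ids can be mapped back
donor_names <- levels(factor(rs1$Donor))

# Replace the Donor column with numeric ids (written to a copy here)
rs2 <- transform(rs1, Donor = as.numeric(factor(Donor)))

# Check that every column is now numeric
all_numeric <- all(vapply(rs2, is.numeric, logical(1)))

# Alternative: ids assigned in order of first appearance in the data
id_by_appearance <- match(rs1$Donor, unique(rs1$Donor))

print(rs2)
print(id_by_appearance)
```

Having only numeric columns addresses the 'x' must be numeric error message, but, as the warning above says, whether a correlation computed on arbitrary id numbers is meaningful is a separate question.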