R: convert names to numbers

I have a data frame with donations and names of donors.

**donation**              **Donor**
 25.00               Steve Smith
 20.00               Jack Johnson
 50.00               Mary Jackson
  ...                   ...

I'm trying to do some clustering using the pvclust package. Unfortunately, the package doesn't seem to accept non-numeric data.

> rs1.pv1 <- parPvclust(cl, rs1, nboot=10)
Error in cor(x, method = "pearson", use = use.cor) : 'x' must be numeric

I have two questions.

1) Is there another package or method that would do this better?

2) Is there a way to "normalize" the donor names list? I.e., get a list of unique donor names, assign each an id number, and then insert the id number into the data frame in place of the character name.

For number 2:

# If donor is already a factor, then

as.numeric(donor)

# will convert it to its numeric level codes.
# If it isn't, convert it to a factor and then to numeric:
as.numeric(as.factor(donor))

However, I'm not sure that converting the donor list to numeric and then using cor makes sense at all.
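To make that concern concrete, here is a minimal sketch of what the conversion produces. The donor names are the ones from the question; the repeated "Steve Smith" entry and the variable names `donor_factor` and `donor_id` are assumptions added for illustration.

```r
# Donor names from the question; the repeated "Steve Smith" is a made-up addition
donor <- c("Steve Smith", "Jack Johnson", "Mary Jackson", "Steve Smith")

# as.factor() sorts the unique names into levels;
# as.numeric() then returns the position of each row's level
donor_factor <- as.factor(donor)
donor_id <- as.numeric(donor_factor)

levels(donor_factor)  # the unique names, in sorted order
donor_id              # one id per row; a repeated donor gets the same id

# The ids only reflect the sort order of the names, so the numeric
# distance between two ids says nothing about how similar two donors are.
```

Since the ids are just labels, a correlation computed on them would mostly reflect alphabetical order, which is probably not what the clustering is meant to capture.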

HTH

How about rs1 <- transform(rs1, Donor=as.numeric(factor(Donor)))? (Warning: I haven't thought enough about what you're doing to know whether that makes sense, so I'm only answering question #2, not question #1.) Typically Donor would already be a factor (this is what e.g. read.table or read.csv did by default before R 4.0.0, when stringsAsFactors defaulted to TRUE), in which case the factor() part would be redundant.
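A small sketch of that in-place replacement, assuming a toy `rs1` built from the three rows shown in the question. The `donor_levels` lookup vector is a hypothetical addition, kept so that the ids can be mapped back to names afterwards.

```r
# Toy data frame built from the three rows shown in the question
rs1 <- data.frame(
  donation = c(25, 20, 50),
  Donor = c("Steve Smith", "Jack Johnson", "Mary Jackson"),
  stringsAsFactors = FALSE
)

# Save the id -> name mapping before the names are overwritten
donor_levels <- levels(factor(rs1$Donor))

# Replace the character column with numeric ids
rs1 <- transform(rs1, Donor = as.numeric(factor(Donor)))

str(rs1)                 # both columns are now numeric
donor_levels[rs1$Donor]  # recover the original names from the ids
```

Keeping the lookup vector is a design choice rather than a requirement: without it, the original names are lost once the column is overwritten.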
