
Assigning column names to df, stop characters from automatically converting to numbers?

I have a df of gene names and a df of values for those genes. I want to combine the two (label the data frame with the gene names).

I have a 1x23000 df, "genes":

A1BG, A1CF, ..., ZYY1

and an 8000x23000 df, "expression":

1 0... 3

0 0... 1

2 2... 0

I want to combine them into one df:

A1BG A1CF... ZYY1

1 0... 3

0 0... 1

2 2... 0

When I use rbind, the gene names are maintained but all non-zero values in "expression" turn to N/A.

When I use colnames(expression)=genes or colnames(expression)=paste(genes), the expression numbers are maintained but the gene names are converted to seemingly meaningless numbers.

Upon further investigation, I saw that the structure of "genes" is for some reason "factor". When I convert with as.character, every gene gets reassigned what seems like a random number (A1BG is now 1143, A1CF is now 967, etc).

I think there may be an issue with incompatible formats of the gene and expression dfs. How can I add character column names to a numeric df?

Numbers appear instead of strings because the columns of genes are factors. A factor is stored as integer codes, and those codes are what come through when the data frame is coerced directly. Instead, we can convert to character after unlisting, or convert to a matrix, which automatically turns the factor columns into character, and then remove the dim attribute with c (to get a plain vector):

 colnames(expression) <- as.character(unlist(genes))

Or

colnames(expression) <- c(as.matrix(genes)) 
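As a rough illustration, here is a small reproducible sketch of both approaches. The toy data and the names names_a and names_b are made up for this example and stand in for the real 1x23000 and 8000x23000 data frames; stringsAsFactors = TRUE is set only to reproduce the factor columns described in the question.

```r
# Hypothetical toy data, not the asker's real data.
genes <- data.frame(V1 = "A1BG", V2 = "A1CF", V3 = "ZYY1",
                    stringsAsFactors = TRUE)  # factor columns, as in the question
expression <- data.frame(c(1, 0, 2), c(0, 0, 2), c(3, 1, 0))

# Each column of genes is a factor, i.e. integer codes plus labels.
sapply(genes, class)

# Both conversions recover the labels rather than the integer codes.
names_a <- as.character(unlist(genes))  # unlist first, then convert to character
names_b <- c(as.matrix(genes))          # character matrix, then drop dim with c()

colnames(expression) <- names_a
expression
```

If the gene names are read from a file, passing stringsAsFactors = FALSE to read.table or read.csv should avoid the factor columns in the first place (this is already the default from R 4.0.0 onward).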
