
Assigning column names to df, stop characters from automatically converting to numbers?

I have a df of gene names and a df of values for those genes. I want to combine the two (label the data frame with the gene names).

I have a 1x23000 df, "genes":

AB1G, A1CF, ..., ZYY1

and a 8000x23000 df, "expression":

1 0... 3

0 0... 1

2 2... 0

I want to combine them into one df:

A1BG A1CF... ZYY1

1 0... 3

0 0... 1

2 2... 0

When I use rbind, the gene names are maintained but all non-zero values in "expression" turn to NA.

When I use colnames(expression)=genes or colnames(expression)=paste(genes), the expression numbers are maintained but the gene names are converted to seemingly meaningless numbers.

Upon further investigation, I saw that the structure of "genes" for some reason is "factor" - when I convert with as.character, all genes get reassigned what seems like a random number (AB1G is now 1143, A1CF is now 967, etc).

I think there may be an issue with incompatible formats of the gene and expression dfs. How can I add character column names to a numeric df?

Numbers appear instead of strings because the columns of "genes" are factors. A factor is stored internally as integer codes, and those codes are what you get when it is coerced to integer storage mode. Instead, convert to character class after unlisting, or convert to a matrix, which automatically converts the values to character, and then remove the dim attribute with c (to get a plain vector):

colnames(expression) <- as.character(unlist(genes))

Or

colnames(expression) <- c(as.matrix(genes)) 
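As a minimal sketch of both options on toy data: the three-column data frames below are hypothetical stand-ins for the real 1x23000 and 8000x23000 ones, and stringsAsFactors = TRUE is set explicitly to reproduce the factor columns described in the question.

```r
# Hypothetical toy data; the real "genes" and "expression" are much larger.
genes <- data.frame(V1 = "A1BG", V2 = "A1CF", V3 = "ZYY1",
                    stringsAsFactors = TRUE)
expression <- data.frame(V1 = c(1, 0, 2),
                         V2 = c(0, 0, 2),
                         V3 = c(3, 1, 0))

# Every column of genes is a factor, which is the source of the problem.
print(sapply(genes, class))

# Option 1: flatten the data frame into one factor, then take its labels.
labels_unlist <- as.character(unlist(genes))

# Option 2: as.matrix() gives a character matrix; c() drops the dim attribute.
labels_matrix <- c(as.matrix(genes))

# Either vector can be used as the column names; the values stay numeric.
colnames(expression) <- labels_unlist
print(expression)
str(expression)
```

Both options should produce the same character vector here, so the choice between them is mostly a matter of taste.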
