
Assign headers based on an existing row in a dataframe in R

After transforming a dataframe, I would like to assign headers/names to the columns based on an existing row. My headers are currently:

row.names   X2  X3  X4  X5  X6  X7  X8  X9  ...

I would like to get rid of that and use the following row as column headers (without having to type them out, since I have many).

The only solution I have for this is to export and re-load the data (with header=T).

The key here is to unlist the row first.

colnames(DF) <- as.character(unlist(DF[1,]))
DF = DF[-1, ]
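
To see why unlisting matters, here is a minimal sketch on made-up data (the column names and values are assumptions for illustration, not the asker's data). When the columns are factors, `unlist()` merges them into a single factor, so `as.character()` returns the labels rather than the underlying integer codes.

```r
# Toy data: the real header ended up in row 1 and the columns are factors
# (stringsAsFactors = TRUE was the default before R 4.0).
DF <- data.frame(
  X1 = c("id", "1", "2"),
  X2 = c("score", "9.5", "7.0"),
  stringsAsFactors = TRUE
)

# unlist() on a list of factors returns one factor with combined levels,
# so as.character() yields the labels, not the integer codes.
new_names <- as.character(unlist(DF[1, ]))

colnames(DF) <- new_names
DF <- DF[-1, ]  # drop the row that has just been promoted to the header

print(colnames(DF))
```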

Try this:

colnames(DF) = DF[1, ] # the first row will be the header
DF = DF[-1, ]          # removing the first row.

However, check that the data has been read properly. If your data.frame has numeric variables but the first row contained characters, all the data will have been read as character. To avoid this problem, it's better to save the data and read it again with header=TRUE, as you suggest. You can also have a look at this question: Reading a CSV file organized horizontally.
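
As an alternative to saving and re-reading, the column types can usually be recovered in place with base R's `type.convert()`. This is only a sketch on made-up data (the column names and values are assumptions), and it relies on `type.convert()` guessing the types you actually want.

```r
# Toy data: everything was read as character because of the header row.
DF <- data.frame(
  X1 = c("id", "1", "2"),
  X2 = c("score", "9.5", "7.0"),
  stringsAsFactors = FALSE
)

colnames(DF) <- as.character(unlist(DF[1, ]))
DF <- DF[-1, ]

# Re-guess each column's type from its character values;
# as.is = TRUE keeps non-numeric columns as character instead of factor.
DF[] <- lapply(DF, type.convert, as.is = TRUE)

str(DF)
```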

The cleanest way is to use a function from the janitor package that is built for exactly this purpose.

janitor::row_to_names(DF,1)

If you want to use any other row than the first one, pass it in the second parameter.
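
If installing janitor is not an option, the same idea can be sketched in base R. The helper below, `promote_row`, is a hypothetical name and is not janitor's implementation. As a design choice of this sketch, it also drops every row above the chosen one, on the assumption that those rows are leftover junk.

```r
# Hypothetical helper (not part of janitor): promote row `n` to the header
# and drop that row together with every row above it.
promote_row <- function(df, n = 1) {
  stopifnot(n >= 1, n <= nrow(df))
  names(df) <- as.character(unlist(df[n, ]))
  df[-seq_len(n), , drop = FALSE]
}

# Toy data: a junk line in row 1, the real header in row 2.
raw <- data.frame(
  X1 = c("exported 2020", "id", "1"),
  X2 = c("", "score", "9.5"),
  stringsAsFactors = FALSE
)

clean <- promote_row(raw, 2)
print(clean)
```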

Very similar to Vishnu's answer, but this uses lapply to convert all the data to character before assigning the first row as the headers. This is really helpful if your data was imported as factors.

DF[] <- lapply(DF, as.character)
colnames(DF) <- DF[1, ]
DF <- DF[-1 ,]

Note that if you have a lot of numeric data or factors that you want to keep, you'll need to convert them back. In this case it may make sense to store a character copy of the data frame, extract the row you want from it, and then apply it to the original data frame:

tempDF <- DF
tempDF[] <- lapply(DF, as.character)
colnames(DF) <- tempDF[1, ]
DF <- DF[-1 ,]
tempDF <- NULL

A new answer that uses dplyr and tidyr:

Extract the desired column names as a character vector:

library(tidyverse)

col_names <- raw_dta %>% 
  slice(2) %>%
  pivot_longer(
    cols = "X2":"X10", # until last named column
    names_to = "old_names",
    values_to = "new_names") %>% 
  pull(new_names)

Remove the incorrect rows and add the correct column names:

dta <- raw_dta %>% 
  slice(-1, -2) %>% # Removes the rows containing new and original names
  set_names(., nm = col_names)


 