
How do I get rid of multiple columns with the same name in R?

I'm gathering SAT scores by school district in Texas, along with each district's education spending. The SAT data come in CSV files that are split by year. I want to consolidate the scores into my data frame of education spending without creating multiple columns for Total, Math score, Reading score, etc.

I've tried the different join functions (semi_join, full_join, left_join, etc.), but none of them seems to address the issue I am having.

temp1 <- left_join(temp, sat17, by = c("District", "year")) %>%
  left_join(sat16, by = c("District", "year")) %>%
  left_join(sat15, by = c("District", "year")) %>%
  left_join(sat14, by = c("District", "year")) %>%
  left_join(sat13, by = c("District", "year")) %>%
  left_join(sat12, by = c("District", "year")) %>%
  left_join(sat11, by = c("District", "year"))

The output gives me columns Math.x, Math.y, Total.x, Total.y, and so on for each joined data frame. Also, sat17 includes a column called ERW instead of Reading, because the test changed that year. I want to keep ERW separate and have the remaining Reading, Math, and Total scores line up under one column each.

I think what you want is to bind the data frames together, that is, to stack them one on top of the other.

Try:

do.call(rbind, dfs) # dfs is the list of dataframes

or using dplyr

library(dplyr)
# Matches columns by name; columns missing from a data frame are filled with NA
bind_rows(dfs, .id = NULL)
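The two calls above differ when the yearly data frames do not share exactly the same columns, as with ERW in sat17. The following is a minimal sketch with made-up values; the list `dfs` and its contents are assumptions for illustration, not the asker's real data.

```r
library(dplyr)

# Hypothetical list of yearly data frames; sat17 has ERW instead of Reading
dfs <- list(
  sat16 = data.frame(District = "A", year = 2016, Reading = 480, Math = 500),
  sat17 = data.frame(District = "A", year = 2017, ERW = 510, Math = 505)
)

# Base rbind() requires matching column names, so this call raises an error
rbind_result <- tryCatch(do.call(rbind, dfs), error = function(e) "rbind failed")

# bind_rows() matches columns by name and fills the missing ones with NA;
# .id stores each list element's name in a new column
stacked <- bind_rows(dfs, .id = "source")
```

With data shaped like this, `bind_rows()` is likely the safer choice, because Reading and ERW end up as separate columns with NA in the years where they do not apply.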

Or, if you want to bind them at the .csv level to begin with, put all your files into a subdirectory called "data". You can then try something like this:

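library(tidyverse)  # loads dplyr, purrr, readr, tidyr, and tibble

# List every CSV file in ./data, read each one, and stack the rows
binded_data <- tibble(filenames = list.files("data", pattern = "\\.csv$", full.names = TRUE)) %>%
  mutate(yearly_sat = map(filenames, read_csv)) %>%
  unnest(cols = yearly_sat)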
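To try the file-level approach without real data, here is a rough self-contained sketch. It writes two tiny CSV files with invented values into a temporary folder, uses base `read.csv()` instead of readr, and the names `data_dir`, `yearly`, and `all_sat` are hypothetical.

```r
library(dplyr)

# Write two small made-up CSV files into a temporary "data" folder
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(District = "A", year = 2016, Math = 500),
          file.path(data_dir, "sat16.csv"), row.names = FALSE)
write.csv(data.frame(District = "A", year = 2017, Math = 505),
          file.path(data_dir, "sat17.csv"), row.names = FALSE)

# Read each file, name the list by file name, and stack the rows
files <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)
yearly <- lapply(files, read.csv)
names(yearly) <- basename(files)
all_sat <- bind_rows(yearly, .id = "filename")
```

Keeping the file name in its own column makes it possible to trace each row back to its source file, which helps if a yearly file lacks a year column.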

Explanation

dplyr automatically renames any column that is not a join key but has the same name in both data sets.

In your case, since you join only by=c("District", "year"), any other columns that share a name get renamed.

The columns of the starting data set get .x appended to their names, while the columns of the data set being left joined get .y appended to theirs.
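The renaming can be reproduced on a toy example. This is only a sketch: the data frames `spending`, `sat16`, and `sat15` and all their values are invented for illustration.

```r
library(dplyr)

# Toy data (values are made up)
spending <- tibble(District = c("A", "A"), year = c(2015, 2016), Spending = c(90, 100))
sat16 <- tibble(District = "A", year = 2016, Math = 500, Total = 1000)
sat15 <- tibble(District = "A", year = 2015, Math = 490, Total = 980)

# Math and Total are not join keys, so the second join has to disambiguate them
joined <- spending %>%
  left_join(sat16, by = c("District", "year")) %>%
  left_join(sat15, by = c("District", "year"))

names(joined)
```

Each year's scores land in their own pair of suffixed columns, with NA in the rows for the other year, which is the pattern described in the question.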

Solution

If you want Math, Reading, and Total each in a single column, you need to stack the data sets on top of each other with dplyr::bind_rows()

combined_sat <- dplyr::bind_rows(sat17, sat16, sat15, sat14, sat13, sat12, sat11)
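Once the yearly scores are stacked, a single join onto the spending data should be enough. The sketch below assumes the spending data frame is the `temp` object from the question and that it has District and year columns; all values are invented.

```r
library(dplyr)

# Toy stand-ins for the asker's data (values are made up)
sat16 <- tibble(District = "A", year = 2016, Reading = 480, Math = 500, Total = 980)
sat17 <- tibble(District = "A", year = 2017, ERW = 510, Math = 505, Total = 1015)
temp <- tibble(District = c("A", "A"), year = c(2016, 2017), Spending = c(100, 110))

# Stack the years first; ERW and Reading stay separate and are NA where absent
combined_sat <- bind_rows(sat17, sat16)

# One join, so no score column name collides and no .x/.y suffixes appear
temp1 <- left_join(temp, combined_sat, by = c("District", "year"))
```

The result has one row per district and year, with single Math and Total columns, a Reading column for the older tests, and an ERW column for 2017.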

