
How to combine all data from multiple Excel sheets per SubjectID in R (or Excel)

I have an Excel file (xlsx) which contains multiple datasheets. All sheets contain answers to different questionnaires answered by different subjects. Every subject has its own row in each sheet (with SubjectID), and the top row has the unique name of the specific question. Not all subjects have answered each questionnaire, so not all datasheets have the exact same number of rows, and the sheets are not ordered by SubjectID.

I want to create one file in which each subject has its own row and all answers from that subject are added to that row. If a subject has not answered a specific question (or does not appear in a sheet at all), their value for that column should remain empty.

I can't seem to find a way to combine all these steps (either in R or Excel).

Can anyone help me get going?

Hard to answer without a specific example, but the following might work:

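library(readxl)
# Read every sheet of the workbook into a list of data frames
sheets <- lapply(excel_sheets(path), read_excel, path = path)
# Merge them pairwise on the ID column; all = TRUE keeps subjects missing from some sheets
purrr::reduce(sheets, merge, by = "subjectID", all = TRUE)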

path is the path to the Excel file.

This creates a list with each sheet as one data.frame within it, and feeds it into 'reduce', which merges the first two data frames by subjectID, then merges the result with the third, and so on.
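To see what the pairwise merging does without needing the workbook, here is a minimal sketch in base R. The three data frames and their column names (q1_a, q2_a, q3_a) are made-up stand-ins for sheets, and base Reduce() is used in place of purrr::reduce so that no packages are required. It assumes the ID column is called subjectID in every sheet.

```r
# Toy stand-ins for three sheets (hypothetical data, not from the question).
# Rows are deliberately unordered and not every subject appears in every sheet.
sheets <- list(
  data.frame(subjectID = c(3, 1, 2), q1_a = c("yes", "no", "yes")),
  data.frame(subjectID = c(2, 4),    q2_a = c(5, 7)),
  data.frame(subjectID = c(1, 4, 3), q3_a = c(TRUE, FALSE, TRUE))
)

# Merge the first two sheets, then merge that result with the third, and so on.
# all = TRUE makes each step a full outer join, so a subject that is missing
# from one side is kept and gets NA in that side's columns.
combined <- Reduce(
  function(x, y) merge(x, y, by = "subjectID", all = TRUE),
  sheets
)

print(combined)
```

Without all = TRUE, merge() does an inner join and any subject missing from even one sheet would be dropped from the result.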

More information would be helpful to really answer the question.

You can probably accomplish this in R using the tidyverse package:

install.packages("tidyverse")
library("tidyverse")
library("readxl")  # installed with tidyverse, but not attached by library("tidyverse")

Then you need to import your Excel sheets:

# Use each sheet's actual name in place of the "Sheet_Name" placeholder
Sheet_A <- read_excel(File_Name, sheet = "Sheet_Name")
Sheet_B <- read_excel(File_Name, sheet = "Sheet_Name")
Sheet_C <- read_excel(File_Name, sheet = "Sheet_Name")

Then you need to join all of the sheets on whatever your ID column is called:

# full_join keeps every subject, including those that do not appear in Sheet_A
Come_Together <- Sheet_A %>%
    full_join(Sheet_B, by = "ID_COLUMN") %>%
    full_join(Sheet_C, by = "ID_COLUMN")
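The type of join matters for this question, because some subjects are missing from some sheets. As a rough illustration in base R (the two small data frames are invented for the example, and merge() stands in for the dplyr joins): a left join keeps only the IDs present in the first table, while a full join keeps every ID found in either table.

```r
# Two hypothetical sheets: subject 1 is only in a, subject 3 is only in b
a <- data.frame(ID_COLUMN = c(1, 2), q1 = c("x", "y"))
b <- data.frame(ID_COLUMN = c(2, 3), q2 = c(10, 20))

# Left join: every row of a is kept, subject 3 is dropped
left <- merge(a, b, by = "ID_COLUMN", all.x = TRUE)

# Full join: subjects 1, 2 and 3 are all kept, with NA where no answer exists
full <- merge(a, b, by = "ID_COLUMN", all = TRUE)

print(left)
print(full)
```

For the stated goal of one row per subject across all sheets, the full join is the one that matches.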

And then you can write them out to an Excel file in one sheet, if you'd like:

install.packages("xlsx")
library("xlsx")
write.xlsx(Come_Together, filepath, sheetName = "Sheet_Name")
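If installing the xlsx package is a problem (it depends on Java), one possible alternative is to write a CSV file with base R, which Excel can open. This is only a sketch: the small combined data frame and the temporary output path are placeholders for the example. Setting na = "" leaves unanswered questions as empty cells, as the question asks, instead of writing the text NA.

```r
# Hypothetical combined result with some unanswered questions (NA)
combined <- data.frame(
  subjectID = c(1, 2),
  q1 = c("yes", NA),
  q2 = c(NA, 5)
)

# Placeholder output location; replace with a real path
out_file <- tempfile(fileext = ".csv")

# na = "" writes missing values as empty cells; row.names = FALSE omits
# the extra column of row numbers
write.csv(combined, out_file, na = "", row.names = FALSE)

lines <- readLines(out_file)
print(lines)
```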
