Adding a (fixed) new row to the top of each dataset in a list of N datasets using apply

I have N datasets which were loaded into RStudio and stored in the list object "datasets". The problem is that the row I want as the top row of each dataset (or, better, as its column headers) is currently its third row.

The initial version of this question only had the paragraph below describing what each of the N datasets looks like, but I realized that was not nearly clear enough, so I am now including a screenshot of one of them right below that paragraph.

Each dataset is 503 by 31, and the 3rd row is "Y", "X1", "X2", ..., "X30" in every dataset. The first row in each of them is a row of dummy variables, so every entry is either 1 or 0 depending on a condition. The 2nd row of each is blank in the first spot, then '1', '2', '3', ..., '30'. [screenshot]

What I want to do from here is add a new row, equivalent to the 3rd row, to the top of each of these N dataframe elements within the list object datasets, or, even better, add proper headers to them instead. Alternatively, find a way to delete the 2nd row and then make the 1st and the new 2nd row switch places.

I also took the liberty of adding that new row to the source CSV file of one dataset and screenshotting it, to demonstrate what the dataframe in R for that dataset should look like once I have applied whatever answer to this question works: [screenshot]

Would it be possible to somehow combine the rbind() function with one of the apply functions to accomplish this task?

P.S. What is below the 3rd row of each dataframe is just 500 rows of observations on each of the 31 variables.

I already tried to add the aforementioned names as column names to each dataframe using the following:

lapply(datasets, function(i){
  colnames(i) <- c("Y", "X1","X2", "X3", "X4","X5", "X6", "X7","X8", "X9",
                   "X10","X11", "X12", "X13","X14", "X15", "X16","X17",
                   "X18", "X19","X20", "X21", "X22","X23", "X24", "X25",
                   "X26", "X27", "X28","X29", "X30") })

But to my surprise, this did not result in any permanent changes to datasets at all.

P.S. The 2nd thing I do in this script (after setting the workspace) is load the following libraries using the following unorthodox method:

# load all necessary packages using only 1 command/line
library_list <- c(library(stats),library(plyr),library(dplyr),
                  library(tidyverse),library(tibble),library(readr),
                  library(leaps),library(lars),library(stringi),
                  library(purrr),library(parallel), library(vroom))

I just run rm(library_list) immediately afterwards, and it's as if I never did anything weird. I do it that way because my hands are disabled, so the fewer thumb clicks needed to run each line individually, the better!

If I understand you correctly, this should work:


library(janitor)
library(purrr)
library(dplyr)

# create a list

df1 <- read.table(header = FALSE, 
           text = '
           1 0 1 1 0
           1 2 3 4 5
           X1 X2 X3 X4 X5
           no no no no no')

df2 <- read.table(header = FALSE, 
                  text = '
           1 1 0 0 0
           6 7 8 9 10
           X1 X2 X3 X4 X5
           no no no no no')


my_list <- list(df1, df2)

Base R

# create a custom function and then use it with lapply
my_renamer <- function(df, row=3){
  names(df) <- df[row,]
  df
}

lapply(my_list, function(x) my_renamer(x, 3))
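One likely reason the attempt in the question appeared to do nothing: lapply() never modifies its input, it returns a new list, and the anonymous function there ended on an assignment, so it returned the vector of names rather than the data frame. Below is a minimal sketch of the assign-back pattern. The toy `datasets` list and the helper name `set_names_from_row` are made up for illustration and are not from the question.

```r
# Toy stand-in for the asker's `datasets` list (hypothetical data)
datasets <- list(
  data.frame(V1 = c("1", "", "Y", "0.5"), V2 = c("0", "1", "X1", "1.2"),
             stringsAsFactors = FALSE),
  data.frame(V1 = c("0", "", "Y", "2.1"), V2 = c("1", "1", "X1", "3.4"),
             stringsAsFactors = FALSE)
)

# Same idea as my_renamer(): use the values found in row `row` as column names
set_names_from_row <- function(df, row = 3) {
  names(df) <- as.character(unlist(df[row, ], use.names = FALSE))
  df  # return the modified copy; an assignment alone would return only the names
}

# Store the result, otherwise the renamed copies are discarded
datasets <- lapply(datasets, set_names_from_row)

names(datasets[[1]])
```

The same applies to the purrr and janitor variants: their result has to be assigned to something (for example back to `datasets`) to be kept.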

OR with purrr and janitor's row_to_names:

map(my_list, ~row_to_names(., row_number = 3,
                           remove_row = FALSE,
                           remove_rows_above = FALSE))

OR with lapply and janitor:

lapply(my_list, function(x) row_to_names(x, row_number = 3, remove_row = FALSE,
                                         remove_rows_above = FALSE))

Output:
[[1]]
  X1 X2 X3 X4 X5
1  1  0  1  1  0
2  1  2  3  4  5
3 X1 X2 X3 X4 X5
4 no no no no no

[[2]]
  X1 X2 X3 X4 X5
1  1  1  0  0  0
2  6  7  8  9 10
3 X1 X2 X3 X4 X5
4 no no no no no
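In the output above the header row and the index row are still sitting in the data, so every column stays character. If the goal is the layout described in the question (headers on top, the dummy row next, then the observations), one possible follow-up is to drop rows 2 and 3 after renaming and re-infer the column types. This is only a sketch in base R: the helper name `promote_header`, its `drop_rows` argument and the toy data are assumptions made for illustration.

```r
# Toy list shaped like the question's data: dummy row, index row, header row,
# then observations (hypothetical values)
toy_list <- list(
  data.frame(V1 = c("1", "", "Y", "0.5", "1.5"),
             V2 = c("0", "1", "X1", "2.5", "3.5"),
             stringsAsFactors = FALSE),
  data.frame(V1 = c("0", "", "Y", "4.5", "5.5"),
             V2 = c("1", "1", "X1", "6.5", "7.5"),
             stringsAsFactors = FALSE)
)

promote_header <- function(df, header_row = 3, drop_rows = c(2, 3)) {
  # take the column names from the header row
  names(df) <- as.character(unlist(df[header_row, ], use.names = FALSE))
  # remove the index row and the now redundant header row, keep the dummy row
  df <- df[-drop_rows, , drop = FALSE]
  rownames(df) <- NULL
  # the columns were read as text because of the header row; re-infer types
  df[] <- lapply(df, type.convert, as.is = TRUE)
  df
}

cleaned <- lapply(toy_list, promote_header)
str(cleaned[[1]])
```

If a literal copy of the header row on top is wanted instead, `lapply(my_list, function(df) rbind(df[3, ], df))` is one way to combine rbind() with lapply(), although real column names are usually easier to work with afterwards.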
