
R: Identify non-NA values in one column and create a data frame with values from another column based on the rows selected

I have a data frame (df) with multiple columns (45) and rows (20,000):

I want to filter each variable column by selecting only the rows with non-NA values, and create a separate data frame with the corresponding ID and Name for the rows selected. I then want to save each data frame under the corresponding variable name. For example, the output data frames would look like this and would be saved as Var1 and Var2 respectively.

Var1:

[image of expected output not available]

Var2:

[image of expected output not available]

I am currently trying to use this function in R and am thinking of implementing a for loop.

df2 = lapply(df, function(x) {x[!is.na(x)]})

This hasn't worked well because it does not list the values from the corresponding ID and Name columns. It also doesn't create a data frame.

Any suggestions will be greatly appreciated!

Here is how it can be done using dplyr and purrr.

Note that next time, instead of posting an image of your data, please create sample data in R and paste the dput() output of that sample data.

library(purrr)
library(dplyr)

data <- tibble(ID = c("A", "B", "C"),
  Name = c("D", "E", "F"),
  Var1 = c(1, NA, 2),
  Var2 = c(2, 2, NA),
  Var4 = c(NA, NA, 4))

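# Names of the variable columns (those starting with "Var")
columns <- names(data)[grepl("^Var", names(data))]

# For one column name, keep the rows where that column is not NA
# and return only the ID and Name columns
extract_na_item <- function(column_name, df) {
  df %>%
    filter(!is.na(!!sym(column_name))) %>%
    select(ID, Name)
}

# Apply to every variable column and name the list elements accordingly
list_var_not_na <- map(columns, extract_na_item, df = data)
names(list_var_not_na) <- columns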

Here is the result:

list_var_not_na
#> $Var1
#> # A tibble: 2 x 2
#>   ID    Name 
#>   <chr> <chr>
#> 1 A     D    
#> 2 C     F    
#> 
#> $Var2
#> # A tibble: 2 x 2
#>   ID    Name 
#>   <chr> <chr>
#> 1 A     D    
#> 2 B     E    
#> 
#> $Var4
#> # A tibble: 1 x 2
#>   ID    Name 
#>   <chr> <chr>
#> 1 C     F
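
As a hedged aside, the same filtering can also be written with dplyr's .data pronoun instead of !!sym(), and purrr's set_names() can name the list in the same step. This is only a sketch on the toy data above, not a replacement for the answer's code; the helper name rows_with_value and the object name result are made up for illustration.

```r
library(dplyr)
library(purrr)

data <- tibble(ID = c("A", "B", "C"),
  Name = c("D", "E", "F"),
  Var1 = c(1, NA, 2),
  Var2 = c(2, 2, NA),
  Var4 = c(NA, NA, 4))

# Names of the variable columns (those starting with "Var")
columns <- grep("^Var", names(data), value = TRUE)

# Hypothetical helper: rows where `column_name` is not NA, ID and Name only
rows_with_value <- function(column_name, df) {
  df %>%
    filter(!is.na(.data[[column_name]])) %>%
    select(ID, Name)
}

# set_names() makes the input a named vector, so map() returns a named list
result <- map(set_names(columns), rows_with_value, df = data)
names(result)
```

Each element can then be accessed by name, for example result$Var1 or result[["Var4"]].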

If you really want the variables assigned in the global environment, as mentioned in the question, you can do the following (though I recommend simply using the list to access the data instead):

list2env(list_var_not_na, envir = globalenv())

Created on 2021-05-03 by the reprex package (v2.0.0)
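
If "save" in the question means writing each data frame to disk rather than keeping it in memory, one possible sketch is to loop over the named list and write one CSV per variable. This is an assumption about the intent, not something the question states; the directory name non_na_tables and the object names below are hypothetical, and the list is rebuilt here in base R only so the block runs on its own.

```r
# Toy data matching the example in this thread
df <- data.frame(ID = c("A", "B", "C"),
                 Name = c("D", "E", "F"),
                 Var1 = c(1, NA, 2),
                 Var2 = c(2, 2, NA),
                 Var4 = c(NA, NA, 4))

# Named list of ID/Name tables, one per Var column
var_cols <- grep("^Var", names(df), value = TRUE)
tables <- lapply(df[var_cols], function(x) df[!is.na(x), c("ID", "Name")])

# Hypothetical output location inside the session's temporary directory
out_dir <- file.path(tempdir(), "non_na_tables")
dir.create(out_dir, showWarnings = FALSE)

# Write one CSV per list element, named after the variable
for (nm in names(tables)) {
  write.csv(tables[[nm]], file.path(out_dir, paste0(nm, ".csv")),
            row.names = FALSE)
}

written <- sort(list.files(out_dir))
```

For real data, out_dir would be replaced with a permanent folder of your choice.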

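You can use lapply like so:

# Positions of the variable columns
cols <- grep('Var', names(df))
# For each variable column, keep the non-NA rows and drop the variable columns
df2 <- lapply(df[cols], function(x) df[!is.na(x), -cols])
df2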

#$Var1
#  ID Name
#1  A    D
#3  C    F

#$Var2
#  ID Name
#1  A    D
#2  B    E

#$Var4
#  ID Name
#3  C    F
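
A slightly more defensive variant of the lapply approach above, offered as a sketch rather than a correction: it anchors the pattern with ^ so that only names starting with "Var" match, and passes drop = FALSE so the result stays a data frame even if only one identifier column remains (by default, selecting a single column from a data.frame returns a plain vector). The names var_cols, id_cols and tables are illustrative.

```r
df <- data.frame(ID = c("A", "B", "C"),
                 Name = c("D", "E", "F"),
                 Var1 = c(1, NA, 2),
                 Var2 = c(2, 2, NA),
                 Var4 = c(NA, NA, 4))

# Variable columns by name, and everything else as identifier columns
var_cols <- grep("^Var", names(df), value = TRUE)
id_cols <- setdiff(names(df), var_cols)

# One data frame per variable column: non-NA rows, identifier columns only
tables <- lapply(df[var_cols], function(x) {
  df[!is.na(x), id_cols, drop = FALSE]
})

names(tables)
```

Because lapply() iterates over df[var_cols], the resulting list inherits the column names, so no separate names() assignment is needed.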

Data

df <- structure(list(ID = c("A", "B", "C"), Name = c("D", "E", "F"), 
    Var1 = c(1, NA, 2), Var2 = c(2, 2, NA), Var4 = c(NA, NA, 
    4)), class = "data.frame", row.names = c(NA, -3L))
