How to create independent, different data.frames in a loop in R

Good evening everybody,

I'm stuck on how to build a for loop. My code works, but I'd like to understand how to create "independent" data frames (copies of the original with some differences).

I wrote the code step by step (it works), but I think there may be a way to make it more compact with a for loop.

x is my original data.frame:

str(x)
Classes ‘data.table’ and 'data.frame':  13500 obs. of  6 variables:
 $ a: int  1 56 1058 567 987 574 1001...
 $ b: int  10 5 10 5 5 10 10 5 10 10 ...
 $ c: int  NA NA NA NA NA NA NA NA NA NA ...
 $ d: int  0 0 0 0 0 0 0 0 0 0 ...
 $ e: int  0 0 0 0 0 0 0 0 0 0 ...
 $ f: int  22 22 22 22 22 22 22 22 22 22 ...

My first goal is to delete, for every column, any NA and "" elements. I do this with these lines of code.

x_b <- x[!(!is.na(x$b) & x$b == ""), ]
x_c <- x[!(!is.na(x$c) & x$c == ""), ]
x_d <- x[!(!is.na(x$d) & x$d == ""), ]
x_e <- x[!(!is.na(x$e) & x$e == ""), ]
x_f <- x[!(!is.na(x$f) & x$f == ""), ]

After this, the second goal is to create, for each new data.frame, an ID code that I build with the function paste0(x_b$a, x_b$f).

x_b$ID_1 <- paste0(x_b$a, x_b$b)
x_c$ID_2 <- paste0(x_c$a, x_c$c)
x_d$ID_3 <- paste0(x_d$a, x_d$d)
x_e$ID_4 <- paste0(x_e$a, x_e$e)
x_f$ID_5 <- paste0(x_f$a, x_f$f)

I wrote this for loop to try to reduce the number of lines I use and to make the code easier to read.

z<-data.frame("a", "b","c","d","e","f")
zy<-data.frame("x_b", "x_c", "x_d", "x_e", "x_f")


for(i in z) {
  for (j in zy ) {
    target <- paste("_",i)
    x[[i]]<-(!is.na(x[[i]]) & x[[i]]=="") # With this I am able to create a column in the x data.frame
                                          # under the same name, but I don't want this. If I put a new
                                          # data frame here instead, the for loop doesn't work. I'd like
                                          # to create a data frame for each transformation.

                                          # At this point of the script I should have a new, different
                                          # data frame for each column (x_b, x_c, x_d, x_e, x_f), but I
                                          # don't know how to create them.

                                          # If I had these data frames, I would then call another
                                          # function inside the for loop:
    zy[[ID]]<-paste0(x_b$a, "_23X")
   }
}

I'd like to have this as output:

str(x_b)
Classes ‘data.table’ and 'data.frame':  13500 obs. of  7 variables:
 $ a: int  1 56 1058 567 987 574 1001...
 $ b: int  10 5 10 5 5 10 10 5 10 10 ...
 $ c: int  NA NA NA NA NA NA NA NA NA NA ...
 $ d: int  0 0 0 0 0 0 0 0 0 0 ...
 $ e: int  0 0 0 0 0 0 0 0 0 0 ...
 $ f: int  22 22 22 22 22 22 22 22 22 22 ...
 $ ID: chr  "1_23X" "56_23X" "1058_23X" "567_23X" "987_23X" "574_23X" "1001_23X"...

and so on.

I think there is some important concept about data frames that I am missing.

Where am I going wrong?

Thank you so much in advance for your support.

There is a simple way to do this with the tidyverse package(s):

First goal:

drop_na(df)

You can also use na_if if you want to convert "" to NA.
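
As a rough sketch of how these two functions could be combined for a single column: the toy data frame below is made up for illustration (it is not the asker's data), and b is assumed to be a character column, since "" can only occur in character data. na_if() turns "" into NA, and drop_na(b) then drops only the rows where b is missing. Calling drop_na(df) without naming a column checks every column, so on data where some column such as c is mostly NA it could remove most rows.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data; b is a character column containing "" and NA
df <- data.frame(
  a = c(1L, 56L, 1058L),
  b = c("10", "", NA),
  stringsAsFactors = FALSE
)

df_b <- df %>%
  mutate(b = na_if(b, "")) %>%  # convert "" to NA in column b
  drop_na(b)                    # drop rows where b is NA, ignoring other columns
```

Because the result is assigned to a new name (df_b), the original df is left untouched.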

Second goal: use mutate to create a new variable:

df <- df %>% 
 mutate(id = paste0(a, "_23X"))
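
For the loop part of the question, one possible base R sketch is shown below. It rests on several assumptions that are not in the original post: x is replaced by a small made-up data frame, the column names are held in a plain character vector (cols) rather than in a data.frame, rows with NA or "" in the current column are both dropped, and the results are collected in a named list (frames, a hypothetical name) instead of separate variables.

```r
# Toy stand-in for the asker's x; the values are made up for illustration
x <- data.frame(
  a = c(1L, 56L, 1058L, 567L),
  b = c("10", "", "5", NA),
  c = c(NA, "7", "", "8"),
  stringsAsFactors = FALSE
)

cols <- c("b", "c")  # plain character vector of column names to process
frames <- list()     # will hold one independent data frame per column

for (col in cols) {
  # keep rows whose value in this column is neither NA nor ""
  keep <- !is.na(x[[col]]) & x[[col]] != ""
  out <- x[keep, ]                    # filtered copy; x itself is not modified
  out$ID <- paste0(out$a, "_23X")     # ID built from column a
  frames[[paste0("x_", col)]] <- out  # stored under the name x_b, x_c, ...
}
```

Each result is then available as frames$x_b, frames$x_c, and so on. If separate variables are really needed, list2env(frames, envir = globalenv()) would create them from the list, although keeping the data frames together in a list usually makes later loops simpler.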
