How to create independent data.frames in a loop in R
Good evening everybody,
I'm stuck on how to build a for loop. My code has no errors, but I'd like to understand how I can create "independent" data frames (copies of the original with some differences).
I wrote the code step by step (it works), but I think there may be a way to compact it with a for loop.
x is my original data.frame:
str(x)
Classes ‘data.table’ and 'data.frame': 13500 obs. of 6 variables:
$ a: int 1 56 1058 567 987 574 1001...
$ b: int 10 5 10 5 5 10 10 5 10 10 ...
$ c: int NA NA NA NA NA NA NA NA NA NA ...
$ d: int 0 0 0 0 0 0 0 0 0 0 ...
$ e: int 0 0 0 0 0 0 0 0 0 0 ...
$ f: int 22 22 22 22 22 22 22 22 22 22 ...
My first goal is to delete, for every column, any NA and "" elements. I do this with these lines of code:
x_b<- x[!(!is.na(x$b) & x$b==""), ]
x_c<- x[!(!is.na(x$c) & x$c==""), ]
x_d<- x[!(!is.na(x$d) & x$d==""), ]
x_e<- x[!(!is.na(x$e) & x$e==""), ]
x_f<- x[!(!is.na(x$f) & x$f==""), ]
After this, the second goal is to create an ID code for each new data.frame, which I build with the paste0 function, e.g. paste0(x_b$a, x_b$f):
x_b$ID_1<-paste0(x_b$a, x_b$b)
x_c$ID_2<-paste0(x_c$a, x_c$c)
x_d$ID_3<-paste0(x_d$a, x_d$d)
x_e$ID_4<-paste0(x_e$a, x_e$e)
x_f$ID_5<-paste0(x_f$a, x_f$f)
I wrote this for loop to try to reduce the number of lines I use and to make the code easier to read.
z<-data.frame("a", "b","c","d","e","f")
zy<-data.frame("x_b", "x_c", "x_d", "x_e", "x_f")
for (i in z) {
  for (j in zy) {
    target <- paste("_", i)
    x[[i]] <- (!is.na(x[[i]]) & x[[i]] == "")
    # With this I am able to create a column in the x data.frame,
    # but if I use a new data.frame here the loop doesn't work.
    # I don't want to modify x. I'd like to create a separate
    # data.frame for each transformation.
    # At this point of the script I should have new, different
    # data frames (x_b, x_c, x_d, x_e, x_f), but I don't know
    # how to create them.
    # If I had these data frames, I would run this other function
    # in the for loop:
    zy[[ID]] <- paste0(x_b$a, "_23X")
  }
}
I'd like to have this as output:
str(x_b)
Classes ‘data.table’ and 'data.frame': 13500 obs. of 7 variables:
$ a: int 1 56 1058 567 987 574 1001...
$ b: int 10 5 10 5 5 10 10 5 10 10 ...
$ c: int NA NA NA NA NA NA NA NA NA NA ...
$ d: int 0 0 0 0 0 0 0 0 0 0 ...
$ e: int 0 0 0 0 0 0 0 0 0 0 ...
$ f: int 22 22 22 22 22 22 22 22 22 22 ...
$ ID: chr "1_23X" "56_23X" "1058_23X" "567_23X" "987_23X" "574_23X" "1001_23X"...
and so on.
I think I'm missing some important concept about data frames.
Where am I going wrong?
Thank you so much in advance for the support.
There is a simple way to do this with the tidyverse package(s):
First goal:
drop_na(df)
You can also use na_if if you want to convert "" to NA.
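As a rough illustration of the first goal, here is a minimal sketch that combines na_if and drop_na on one column. The toy data frame is hypothetical and uses a character column b, since "" can only occur in character data; the integer columns shown in the question's str() output could only contain NA.

```r
library(dplyr)
library(tidyr)

# Toy data (hypothetical values): column b holds an empty string and an NA
df <- data.frame(a = c(1L, 56L, 1058L),
                 b = c("10", "", NA),
                 stringsAsFactors = FALSE)

cleaned <- df %>%
  mutate(b = na_if(b, "")) %>%  # turn "" into NA in column b
  drop_na(b)                    # then drop the rows where b is NA

cleaned
```

Passing a column name to drop_na restricts the check to that column; calling it with no columns drops every row that has an NA anywhere.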
Second goal: use mutate to create a new variable:
df <- df %>%
  mutate(id = paste0(a, "_23X"))
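For the looping part of the question, one common pattern is to collect the per-column data frames in a named list instead of creating separate variables. The following is only a sketch in base R: the toy x, the cols vector, and the result list are assumptions made for illustration, not part of the original code.

```r
# Toy stand-in for x (hypothetical values; the real x has 13500 rows)
x <- data.frame(a = c(1L, 56L, 1058L),
                b = c(10L, 5L, NA),
                c = c(NA, 3L, 4L))

cols <- c("b", "c")   # columns to process, one data frame per column
result <- list()      # named list that will hold x_b, x_c, ...

for (col in cols) {
  # keep the rows where this column is neither NA nor ""
  keep <- !is.na(x[[col]]) & x[[col]] != ""
  tmp <- x[keep, ]                     # independent copy; x itself is untouched
  tmp$ID <- paste0(tmp$a, "_23X")      # ID built from column a
  result[[paste0("x_", col)]] <- tmp   # store under the name x_b, x_c, ...
}

str(result$x_b)
```

Each element is then available as result$x_b, result$x_c, and so on. If separate objects are really needed, list2env(result, envir = globalenv()) can copy them into the workspace, although keeping them in a list usually makes later loops easier.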