rbind two data tables and make a new column in R
Hi, I have two datasets that represent different groups:
student_details <- c("John", "Henrick", "Maria", "Lucas", "Ali")
student_class <- c("High School", "College", "Preschool", "High School", "college")
df1 <- data.frame(student_details, student_class)
# another data frame
Student_details <- c("Bracy", "Evin")
Student_class <- c("High school", "College")
Student_rank <- c("A", "A+")
df2 <- data.frame(Student_class, Student_details, Student_rank)
df2
I need to rbind df1 and df2 even though their lengths are not equal, and indicate in a final column named "dataset" which dataset each row came from:
You can use the rbindlist() function from the data.table package to do this.
It is important that the column names are the same in both data frames, because you are binding by column name.
#convert uppercase letters in column names to lower case.
names(df2) <- tolower(names(df2))
Next, bind them together:
library(data.table)
final_df <- rbindlist(list(df1, df2), use.names = TRUE, fill = TRUE, idcol = "dataset")
final_df
Output:
   dataset student_details student_class student_rank
1:       1            John   High School         <NA>
2:       1         Henrick       College         <NA>
3:       1           Maria     Preschool         <NA>
4:       1           Lucas   High School         <NA>
5:       1             Ali       college         <NA>
6:       2           Bracy   High school            A
7:       2            Evin       College           A+
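In the output above, idcol numbers the inputs 1 and 2 and puts the indicator first. If readable labels are preferred, one option is to name the list elements, since rbindlist() takes the idcol values from the list names when they are present. The sketch below assumes the labels "df1" and "df2" (any names would do) and uses setcolorder() to move the indicator to the end, as the question asks; it rebuilds the data so that it runs on its own.

```r
library(data.table)

# Same data as in the question, with df2's column names already lower-cased.
df1 <- data.frame(
  student_details = c("John", "Henrick", "Maria", "Lucas", "Ali"),
  student_class   = c("High School", "College", "Preschool", "High School", "college")
)
df2 <- data.frame(
  student_class   = c("High school", "College"),
  student_details = c("Bracy", "Evin"),
  student_rank    = c("A", "A+")
)

# Naming the list elements makes idcol use those names instead of 1 and 2.
labelled_df <- rbindlist(
  list(df1 = df1, df2 = df2),
  use.names = TRUE, fill = TRUE, idcol = "dataset"
)

# Optional: move the indicator column to the last position (modifies in place).
setcolorder(labelled_df, c(setdiff(names(labelled_df), "dataset"), "dataset"))
labelled_df
```

The labels are illustrative only; the binding itself is the same call as in the answer above.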
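I assume your column names student_details and student_class are the same in both data frames. You can use bind_rows, which is more flexible than rbind. It fills any column missing from one data frame with NA values.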
student_details <- c("John", "Henrick", "Maria", "Lucas", "Ali")
student_class <- c("High School", "College", "Preschool", "High School", "college")
df1 <- data.frame(student_details, student_class)
student_details <- c("Bracy", "Evin")
student_class <- c("High school", "College")
student_rank <- c("A", "A+")
df2 <- data.frame(student_details, student_class, student_rank)
library(dplyr)
df_full <- bind_rows(df1, df2)
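The bind_rows call above stacks the rows but does not record which data frame each row came from. A minimal sketch of one way to add that, assuming the labels "df1" and "df2" are acceptable values for the indicator: pass a named list together with the .id argument, then move the new column to the end with relocate() (available in dplyr 1.0.0 and later).

```r
library(dplyr)

df1 <- data.frame(
  student_details = c("John", "Henrick", "Maria", "Lucas", "Ali"),
  student_class   = c("High School", "College", "Preschool", "High School", "college")
)
df2 <- data.frame(
  student_details = c("Bracy", "Evin"),
  student_class   = c("High school", "College"),
  student_rank    = c("A", "A+")
)

# .id creates an identifier column; its values come from the list names.
df_labelled <- bind_rows(list(df1 = df1, df2 = df2), .id = "dataset")

# Move the indicator column after the last existing column.
df_labelled <- relocate(df_labelled, dataset, .after = last_col())
df_labelled
```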
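With your specific df1 and df2 (once their column names match), we can try merge from base R. With all = TRUE it performs a full outer join on the shared columns, and since no rows match here, every row from both data frames is kept: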
> merge(df1, df2, all = TRUE, sort = FALSE)
  student_details student_class student_rank
1            John   High School         <NA>
2         Henrick       College         <NA>
3           Maria     Preschool         <NA>
4           Lucas   High School         <NA>
5             Ali       college         <NA>
6           Bracy   High school            A
7            Evin       College           A+
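The merge result above has no "dataset" column. As a rough base R sketch that needs no packages, each data frame can be padded with the columns it lacks, tagged with a label, and then passed to rbind. The helper tag_and_pad and the labels "df1" and "df2" are hypothetical names introduced here for illustration; they are not part of the original answers.

```r
df1 <- data.frame(
  student_details = c("John", "Henrick", "Maria", "Lucas", "Ali"),
  student_class   = c("High School", "College", "Preschool", "High School", "college"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  student_class   = c("High school", "College"),
  student_details = c("Bracy", "Evin"),
  student_rank    = c("A", "A+"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: add any missing columns as NA, put the columns in a
# common order, and append a "dataset" column holding the source label.
tag_and_pad <- function(df, label, all_cols) {
  for (col in setdiff(all_cols, names(df))) df[[col]] <- NA
  df <- df[, all_cols, drop = FALSE]
  df$dataset <- label
  df
}

all_cols <- union(names(df1), names(df2))
combined <- rbind(
  tag_and_pad(df1, "df1", all_cols),
  tag_and_pad(df2, "df2", all_cols)
)
combined
```

Unlike merge, this never joins rows, so a student who appears in both data frames would show up twice rather than being collapsed into one row.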
However, the rbindlist option from data.table should work in the general case (see @Flap's answer).