
Combining data frames, rows not lining up properly

I'm trying to combine two data frames, one for 1960 and one for 2000 (yob1960 and yob2000). The data frames have different numbers of rows, but the column names are the same. My first attempt was to use the plyr package and bind based on the column names:

library(plyr)
# Stack the two data frames, filling columns missing from either one with NA
combined <- rbind.fill(yob1960[c("Name", "Gender", "1960")], yob2000[c("Name", "Gender", "2000")])

This ran without error, but I noticed that it wasn't merging the rows properly. A sample of the combined data frame shows no females called Aaron born in 1960 on the first row, yet the third row shows there are 20.

Name   Gender  1960  2000
Aaron  F       NA    35
Aaron  M       NA    9548
Aaron  F       20    NA
Aaron  M       1772  NA

I then tried smartbind from the gtools package but got the same result:

library(gtools)
t <- smartbind(yob1960, yob2000)

I'm not sure how to get the female and male entries to correspond. I've also tried merging the data frames, but I don't really like the output.

m <- merge(yob1960, yob2000, by = c("Name"), all = TRUE)
m[is.na(m)] <- 0

If anyone could advise how I can get the rows to line up properly based on name and gender, I'd really appreciate it.

EDIT: The two data frames each consist of 3 columns: Name, Gender and Total. The Total column represents the number of people born in that year with a particular name. The 1960 data frame shows the total per name for 1960, and the 2000 data frame shows the total per name for 2000. When I merge the two data frames the output is:

Name   Gender.x  1960  Gender.y  2000  
Aaron  F         20    F         35 
Aaron  F         20    M         9548 
Aaron  M         1772  F         35 
Aaron  M         1772  M         9548 

What I don't like about merging them is that the M and F genders show up on the same line. I could rearrange the output by hand so they line up, but I'd rather produce it properly with code.
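The four rows in that output are what a join on Name alone would be expected to give: each of the two Aaron rows from 1960 is paired with each of the two Aaron rows from 2000. A minimal sketch that reproduces this, assuming the data frames are called yob1960 and yob2000 and contain only the Aaron rows quoted in the question (the column layout is also an assumption):

```r
# Sample rows taken from the question; frame and column names are assumptions.
yob1960 <- data.frame(Name = c("Aaron", "Aaron"),
                      Gender = c("F", "M"),
                      `1960` = c(20, 1772),
                      check.names = FALSE)
yob2000 <- data.frame(Name = c("Aaron", "Aaron"),
                      Gender = c("F", "M"),
                      `2000` = c(35, 9548),
                      check.names = FALSE)

# Joining on Name only: Gender is not part of the key, so it is kept from
# both sides as Gender.x / Gender.y and every F/M pairing appears.
by_name <- merge(yob1960, yob2000, by = "Name", all = TRUE)
nrow(by_name)    # 2 x 2 pairings
names(by_name)
```

Because Gender is not part of the key here, merge has no way to know that F should only be matched with F.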

To conclude the question and for future reference:

m <- merge(yob1960, yob2000, by = c("Name", "Gender" ), all = TRUE)

This keeps each name and gender combination on a single row for both years.
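As a rough illustration of the difference, here is a small self-contained sketch. The Aaron counts come from the question. The "Example" row is a hypothetical name that appears only in 2000, added to show what all = TRUE does. The column layout is an assumption.

```r
# Aaron rows are from the question; column layout is an assumption.
yob1960 <- data.frame(Name = c("Aaron", "Aaron"),
                      Gender = c("F", "M"),
                      `1960` = c(20, 1772),
                      check.names = FALSE)
# "Example" is a hypothetical name present in only one year.
yob2000 <- data.frame(Name = c("Aaron", "Aaron", "Example"),
                      Gender = c("F", "M", "F"),
                      `2000` = c(35, 9548, 1),
                      check.names = FALSE)

# Name and Gender together form the key, so F is matched only with F
# and M only with M; all = TRUE keeps rows found in just one year.
m <- merge(yob1960, yob2000, by = c("Name", "Gender"), all = TRUE)

# Rows missing from one year come back as NA; replace them with 0
# only if "no record" should be read as a count of zero.
m[is.na(m)] <- 0
m
```

Whether NA should become 0 depends on the data: if names with very low counts are left out of the source files, a missing row does not necessarily mean nobody had that name.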
