
Join two pandas dataframes on timestamps

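I have two pandas dataframes with identical columns. Each one has timestamp columns. One dataframe contains text data from user A, and the other contains text data from user B. While user A is speaking, user B is not, so the data never overlaps. I want to merge them into a single dataframe ordered by timestamp.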

df_a
start  stop  words
0      2.1   i know honey but what happened we got a job
3.7    6.4   no know but thats a different kind of help but
8.2    11.5  because people that are supposed to be
12.9   15.4  yeah but where else can you go to get one

df_b
start  stop  words
2.2    3.6   but he never said
6.5    8.2   but what?
11.6   12.8  i dont think thats true
15.5   19.2  anywhere i dont know

desired_output
start  stop  words
0      2.1   i know honey but what happened we got a job
2.2    3.6   but he never said
3.7    6.4   no know but thats a different kind of help but
6.5    8.2   but what?
8.2    11.5  because people that are supposed to be
11.6   12.8  i dont think thats true
12.9   15.4  yeah but where else can you go to get one
15.5   19.2  anywhere i dont know

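This should do it (note that DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so this only runs on older versions):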

df = df_a.append(df_b).sort_values(by=['start'])

Since the operation feels more like a concatenation than a join, I would use pd.concat:

output = pd.concat([df_a,df_b]).sort_values(['start'])
print(output)
   start  stop                                           words
0    0.0   2.1     i know honey but what happened we got a job
0    2.2   3.6                               but he never said
1    3.7   6.4  no know but thats a different kind of help but
1    6.5   8.2                                       but what?
2    8.2  11.5          because people that are supposed to be
2   11.6  12.8                         i dont think thats true
3   12.9  15.4       yeah but where else can you go to get one
3   15.5  19.2                            anywhere i dont know
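
Note that the output above keeps each frame's original index (0, 0, 1, 1, ...). As a minimal, self-contained sketch of the same pd.concat approach, the following rebuilds the sample data from the question, sorts by start, and renumbers the rows. The speaker column and the name merged are assumptions added for illustration and are not part of the original question or answers.

```python
import pandas as pd

# Sample data copied from the question.
df_a = pd.DataFrame({
    "start": [0.0, 3.7, 8.2, 12.9],
    "stop": [2.1, 6.4, 11.5, 15.4],
    "words": [
        "i know honey but what happened we got a job",
        "no know but thats a different kind of help but",
        "because people that are supposed to be",
        "yeah but where else can you go to get one",
    ],
})
df_b = pd.DataFrame({
    "start": [2.2, 6.5, 11.6, 15.5],
    "stop": [3.6, 8.2, 12.8, 19.2],
    "words": [
        "but he never said",
        "but what?",
        "i dont think thats true",
        "anywhere i dont know",
    ],
})

# "speaker" is a hypothetical extra column so each row can still be
# traced back to its source frame after the two are combined.
merged = (
    pd.concat([df_a.assign(speaker="A"), df_b.assign(speaker="B")])
    .sort_values("start")
    .reset_index(drop=True)  # replace the duplicated 0,0,1,1,... index with 0..7
)
print(merged)
```

Dropping the assign calls gives exactly the columns of the desired output; reset_index(drop=True) is only needed if a clean 0..n-1 index matters downstream.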

