Join two Pandas DataFrames
We have two tables:
Table 1: EventLog
class EventLog(Base):
    """"""
    __tablename__ = 'event_logs'
    id = Column(Integer, primary_key=True, autoincrement=True)
    # Keys
    event_id = Column(Integer)
    data = Column(String)
    signature = Column(String)
    # Unique constraint
    __table_args__ = (UniqueConstraint('event_id', 'signature'),)
Table 2: Machine_Event_Logs
class Machine_Event_Logs(Base):
    """"""
    __tablename__ = 'machine_event_logs'
    id = Column(Integer, primary_key=True, autoincrement=True)
    # Keys
    machine_id = Column(String, ForeignKey("machines.id"))
    event_log_id = Column(String, ForeignKey("event_logs.id"))
    event_record_id = Column(Integer)
    time_created = Column(String)
    # Unique constraint
    __table_args__ = (UniqueConstraint('machine_id', 'event_log_id', 'event_record_id', 'time_created'),)
    # Relationships
    event_logs = relationship("EventLog")
The relationship between EventLog and Machine_Event_Logs is one-to-many: we register each unique event log once in the EventLog table, and then register millions of entries in Machine_Event_Logs, one for every time we encounter that event.
Goal: We're trying to join both tables to display the entire timeline of the event logs captured.
We've tried multiple combinations of the merge() function on pandas DataFrames, but the result is either full of NaN values or empty. For example:
pd.merge(event_logs, machine_event_logs, how='left', left_on='id', right_on='event_log_id')
Any ideas on how to solve this?
Thanks in advance for your assistance.
According to your data schema, the join keys have incompatible types: id in event_logs is an Integer column, while event_log_id in machine_event_logs is a String column. In Python, comparing a string with its equivalent numeric value for equality yields False:
print('0'==0)
# False
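The same mismatch can be checked on the key columns themselves before merging. The following is only a sketch: the two Series hold made-up values standing in for event_logs['id'] and machine_event_logs['event_log_id'], and the variable names are illustrative. Note also that, depending on the pandas version, merging an integer key against a string key may raise a ValueError instead of silently producing NaN.

```python
import pandas as pd

# Hypothetical sample values standing in for the real key columns.
ids = pd.Series([1, 2, 3])                   # like event_logs['id'] (Integer)
event_log_ids = pd.Series(["1", "2", "3"])   # like machine_event_logs['event_log_id'] (String)

# Key values shared by both columns as they are: integers never equal strings.
overlap_before = set(ids.tolist()) & set(event_log_ids.tolist())
print(overlap_before)

# Key values shared after casting the integer side to strings.
overlap_after = set(ids.astype(str).tolist()) & set(event_log_ids.tolist())
print(overlap_after)
```

An empty overlap before the cast and a non-empty one after it points to a type mismatch rather than to missing rows.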
Therefore your pandas left-join merge returns all NaN on the right-hand side, since no matches are found. Consider converting one of the key columns so that the types align and the merge can match rows:
event_logs['id'] = event_logs['id'].astype(str)
OR
machine_event_logs['event_log_id'] = machine_event_logs['event_log_id'].astype(int)
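Putting it together, here is a minimal end-to-end sketch of the timeline join. The sample rows are invented for illustration; only the column names come from the two models in the question. It assumes every event_log_id is a numeric string (otherwise astype(int) will fail) and that time_created holds timestamps pd.to_datetime can parse. It also starts from machine_event_logs rather than event_logs, which is a choice made here so that the result has one row per recorded occurrence.

```python
import pandas as pd

# Hypothetical sample rows; column names follow the two models in the question.
event_logs = pd.DataFrame({
    "id": [1, 2],
    "event_id": [4624, 4625],
    "data": ["logon", "failed logon"],
    "signature": ["sig-a", "sig-b"],
})
machine_event_logs = pd.DataFrame({
    "id": [10, 11, 12],
    "machine_id": ["m1", "m1", "m2"],
    "event_log_id": ["1", "2", "1"],  # stored as strings, as in the schema
    "event_record_id": [100, 101, 102],
    "time_created": [
        "2020-01-01 10:00:00",
        "2020-01-01 09:00:00",
        "2020-01-01 11:00:00",
    ],
})

# Align the key types: cast the string foreign key to integers.
machine_event_logs["event_log_id"] = machine_event_logs["event_log_id"].astype(int)

# One row per occurrence; each row picks up the columns of its event log.
timeline = machine_event_logs.merge(
    event_logs,
    how="left",
    left_on="event_log_id",
    right_on="id",
    suffixes=("", "_event_log"),  # both tables have an "id" column
    indicator=True,               # adds a "_merge" column for checking matches
)

# Parse the timestamps so that sorting is chronological, not alphabetical.
timeline["time_created"] = pd.to_datetime(timeline["time_created"])
timeline = timeline.sort_values("time_created").reset_index(drop=True)

print(timeline[["time_created", "machine_id", "event_id", "data", "_merge"]])
```

Rows whose _merge value is left_only are occurrences that found no matching event log, which makes any remaining key problems easy to spot.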