How to compare two dataframes of the same size and create a new one without the rows that have the same value in a column
I am building a data acquisition device that retrieves sensor data from an API every 5 minutes and saves it to CSV files, which are exported to a database every 24 hours. I would like to reduce the size of these files by saving a reading only when its value changes.
My idea is to save all the data in a "memory" CSV file (which will be deleted at the end of the day), compare its last X rows (df1, taken at T1) with the new dataframe (df2, taken at T2), and create a dataframe (df3, at T2) without the rows whose values are unchanged. This df3 will be written to another CSV, which will be exported to the database at the end of the day.
Is this the right way to proceed?
How can I compare two dataframes of the same size and create a third dataframe without the rows where the value does not change?
df1
Time Name Value
0 t1 Name1 3
1 t1 Name2 1
2 t1 Name3 5
3 t1 Name4 9
df2
Time Name Value
0 t2 Name1 3
1 t2 Name2 7
2 t2 Name3 5
3 t2 Name4 2
df3
Time Name Value
0 t2 Name2 7
1 t2 Name4 2
Use DataFrame.merge with indicator=True and keep only the right_only rows:
df = (df1.merge(df2, on=['Name','Value'], indicator=True, how='outer', suffixes=('_',''))
.query('_merge == "right_only"')[df2.columns])
print (df)
Time Name Value
4 t2 Name2 7
5 t2 Name4 2
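For reference, the merge approach can be reproduced end to end with the sample data from the question. This is a minimal sketch, assuming the two frames are built by hand as below. The index labels of the filtered rows depend on the row order of the merged frame, so the sketch resets the index to get the 0, 1 numbering shown in the desired df3.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "Time": ["t1"] * 4,
    "Name": ["Name1", "Name2", "Name3", "Name4"],
    "Value": [3, 1, 5, 9],
})
df2 = pd.DataFrame({
    "Time": ["t2"] * 4,
    "Name": ["Name1", "Name2", "Name3", "Name4"],
    "Value": [3, 7, 5, 2],
})

# Outer merge on (Name, Value): a pair present in both frames is tagged
# "both", a pair present only in df2 is tagged "right_only".
merged = df1.merge(
    df2,
    on=["Name", "Value"],
    how="outer",
    indicator=True,
    suffixes=("_", ""),  # df1's Time becomes "Time_", df2's stays "Time"
)

# Keep only the rows that are new in df2, with df2's original columns.
df3 = merged.loc[merged["_merge"] == "right_only", df2.columns]
df3 = df3.reset_index(drop=True)
print(df3)
```

Because the merge matches on the (Name, Value) pair rather than on row position, it does not require df1 and df2 to list the sensors in the same order.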
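Alternatively, if both dataframes have the same index and list the names in the same order, compare the Value columns element-wise:
df3 = df2[df2['Value'] != df1['Value']]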
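The element-wise comparison only works when both frames have the same length and row order, and comparing two Series directly also requires identical index labels. A minimal sketch of a position-based variant follows; the helper name `changed_rows` is hypothetical, and it assumes the rows of both frames are already in the same sensor order.

```python
import pandas as pd

def changed_rows(previous, current, column="Value"):
    """Return the rows of `current` whose `column` differs from `previous`.

    Hypothetical helper: rows are compared by position, so both frames
    must list the sensors in the same order.
    """
    if len(previous) != len(current):
        raise ValueError("both dataframes must have the same number of rows")
    # Compare the underlying arrays so that index labels are ignored.
    mask = current[column].to_numpy() != previous[column].to_numpy()
    return current[mask].reset_index(drop=True)

# Sample data copied from the question.
df1 = pd.DataFrame({
    "Time": ["t1"] * 4,
    "Name": ["Name1", "Name2", "Name3", "Name4"],
    "Value": [3, 1, 5, 9],
})
df2 = pd.DataFrame({
    "Time": ["t2"] * 4,
    "Name": ["Name1", "Name2", "Name3", "Name4"],
    "Value": [3, 7, 5, 2],
})

df3 = changed_rows(df1, df2)
print(df3)
```

If the row order is not guaranteed, sorting both frames by Name first, or using the merge approach, is the safer choice.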