Get difference values between rows in Pandas DataFrame

Hi, I have a result set from psycopg2 like so:

(
(timestamp1, val11, val12, val13, val14),
(timestamp2, val21, val22, val23, val24),
(timestamp3, val31, val32, val33, val34),
(timestamp4, val41, val42, val43, val44),
)

I have to return the difference between the values of consecutive rows (except for the timestamp column). Each row would subtract the previous row's values. The first row would be:

timestamp, 'NaN', 'NaN' ....

This then has to be returned as a generic object, i.e. something like an array of the following objects:

Group(timestamp=timestamp, rows=[val11, val12, val13, val14])

I was going to use Pandas to do the diff. Something like the below works OK on the values:

from pandas import DataFrame
df = DataFrame.from_records(data=results, columns=headers)
diffs = df.set_index('time', drop=False).diff()

But diff also operates on the timestamp column, and I can't get it to ignore a column while leaving the original timestamp column in place.

Also, I wasn't sure it was going to be efficient to get the data into my return format, as Pandas advises against row access.

What would be a fast way to get the result set differences in my required output format?

Why did you set drop=False? That puts the timestamps in the index (where they will not be touched by diff) but also leaves a copy of the timestamps as a proper column, to be processed by diff.

I think this will do what you want:

diffs = df.set_index('time').diff().reset_index()
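For illustration, here is a minimal sketch of that line on made-up data, followed by one possible way to build the Group objects from the question. The sample values, the column names a and b, and the namedtuple definition of Group are assumptions, not taken from the original post.

```python
from collections import namedtuple

import pandas as pd

# Hypothetical stand-in for the psycopg2 result set and its column headers.
results = [
    (pd.Timestamp("2014-06-01 00:00"), 1.0, 10.0),
    (pd.Timestamp("2014-06-01 00:01"), 3.0, 14.0),
    (pd.Timestamp("2014-06-01 00:02"), 6.0, 15.0),
]
headers = ["time", "a", "b"]

# Assumed shape of the return object described in the question.
Group = namedtuple("Group", ["timestamp", "rows"])

df = pd.DataFrame.from_records(results, columns=headers)

# With 'time' only in the index, diff() touches just the value columns;
# reset_index() then restores 'time' as the first column.
diffs = df.set_index("time").diff().reset_index()

# itertuples(index=False) yields one tuple per row: (time, a, b).
groups = [
    Group(timestamp=row[0], rows=list(row[1:]))
    for row in diffs.itertuples(index=False)
]
```

The first Group holds NaN for every value, since there is no previous row to subtract. itertuples is generally a cheaper way to walk rows than iterrows, though whether it is fast enough depends on the size of the result set.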

Since you mention psycopg2, take a look at the docs for pandas 0.14, released just a few days ago, which features improved SQL functionality, including new support for PostgreSQL. You can read and write directly between the database and pandas DataFrames.
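As a rough sketch of reading query results straight into a DataFrame, the example below uses pandas.read_sql_query. To keep it runnable anywhere, an in-memory SQLite database stands in for PostgreSQL, and the readings table and its columns are invented for the example. For PostgreSQL, the pandas documentation describes passing a SQLAlchemy connectable as the con argument instead.

```python
import sqlite3

import pandas as pd

# Assumption: in-memory SQLite replaces PostgreSQL so the sketch is
# self-contained; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (time TEXT, a REAL, b REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        ("2014-06-01 00:00:00", 1.0, 10.0),
        ("2014-06-01 00:01:00", 3.0, 14.0),
        ("2014-06-01 00:02:00", 6.0, 15.0),
    ],
)

# index_col puts 'time' in the index, so diff() skips it;
# parse_dates converts the stored text into timestamps.
df = pd.read_sql_query(
    "SELECT time, a, b FROM readings ORDER BY time",
    conn,
    index_col="time",
    parse_dates=["time"],
)

diffs = df.diff().reset_index()
conn.close()
```

Because the timestamp column arrives as the index, no separate set_index call is needed before diff.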
