Python pandas shift dataframe with time index value

I am quite new to Python and am struggling with shift in pandas.

I am comparing two sets of data, but they need to be aligned before they can be compared. To align them, I only need to shift the index values of one of the data sets.

Reference data:                        Data to be shifted:
                          acc                                   acc
index                                  index            
1480681219**96**0000000     1          1480681220**04**0000000    8
1480681220**00**0000000     2          1480681220**08**0000000    9    
1480681220**04**0000000     3          1480681220**12**0000000    7
1480681220**08**0000000     4          1480681220**16**0000000   10
1480681220**12**0000000     5          1480681220**20**0000000    6

(The bold editing option did not seem to work, but I wanted to highlight those parts of the indexes.)

I would like to shift my data frame by a given amount of extra time. Please note, the time is in nanoseconds. I realized that something like df.shift(2) shifts my data 2 places, but I would like to shift my data by -80000000 nanoseconds, which in this case is 2 places:

Input:

                     acc
index                   
1480681220040000000    8
1480681220080000000    9
1480681220120000000    7
1480681220160000000   10
1480681220200000000    6

Desired output:

                      acc
index          
1480681219960000000     8
1480681220000000000     9          
1480681220040000000     7
1480681220080000000    10
1480681220120000000     6
1480681220160000000   NaN
1480681220200000000   NaN
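
For contrast, here is a minimal sketch of what a positional shift does with the input above. The variable names `index`, `df` and `shifted` are only for illustration; the point is that `shift` with a number of periods moves the values and leaves the index labels where they were.

```python
import pandas as pd

# Same nanosecond timestamps and values as the input shown above.
index = [1480681220040000000,
         1480681220080000000,
         1480681220120000000,
         1480681220160000000,
         1480681220200000000]
df = pd.DataFrame({'acc': [8, 9, 7, 10, 6]},
                  index=pd.Index(index, name='index'))

# Positional shift: values move up two rows, the index labels stay put,
# and the two vacated rows at the bottom are filled with NaN.
shifted = df.shift(-2)
print(shifted)
```

The result keeps the original five timestamps, so it is not the desired output: the two earlier timestamps never appear and the values 8 and 9 are lost.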

This is a scaled-down version of my code:

import numpy as np
import pandas as pd


class device_data(object):
    def __init__(self):
        # Index values are timestamps in nanoseconds.
        _index = [1480681220040000000,
                  1480681220080000000,
                  1480681220120000000,
                  1480681220160000000,
                  1480681220200000000]

        self.df = pd.DataFrame({'acc': [8, 9, 7, 10, 6], 'index': _index})
        self.df = self.df.set_index('index')


if __name__ == '__main__':
    extratime = np.int64(-40000000)

    session = dict()
    session[2] = {'testnumber': '401',
                  'devicename': 'peanut'}
    session[2]['data_in_device_class'] = device_data()

    print(session[2]['data_in_device_class'].df)

    if hasattr(session[2]['data_in_device_class'], 'df'):
        # Note: shift() treats the argument as a number of rows, not nanoseconds.
        session[2]['data_in_device_class'].df = session[2]['data_in_device_class'].df.shift(int(round(extratime)))

    print(session[2]['data_in_device_class'].df)

When I ran the original code, it gave me this error: OverflowError: Python int too large to convert to C long

I used extratime = np.int64(extratime) to solve the problem. I notice that with the scaled-down version of my code it is not really needed.

My question still stands: how can I use shift to move my index by a value amount rather than by the number of places it needs to move?

Thank you

IIUC:
You can just reassign your index to itself plus the extra time.

Consider the dataframe df as an example

df = pd.DataFrame(np.arange(100).reshape(5, -1))
df

I can "shift" the entire dataframe down like this

df.index = df.index + 5
df
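
Applied to the data from the question, the same idea might look like the sketch below. The frame is rebuilt here so the snippet runs on its own, and `extratime` is assumed to be the -80000000 nanosecond offset mentioned in the question.

```python
import numpy as np
import pandas as pd

# Rebuild the question's input frame; timestamps are in nanoseconds.
index = [1480681220040000000,
         1480681220080000000,
         1480681220120000000,
         1480681220160000000,
         1480681220200000000]
df = pd.DataFrame({'acc': [8, 9, 7, 10, 6]},
                  index=pd.Index(index, name='index'))

extratime = np.int64(-80000000)  # assumed offset in nanoseconds

# Move every index label by the offset; the values are untouched.
df.index = df.index + extratime
print(df)
```

This moves the five rows to the earlier timestamps but does not by itself add NaN rows for the two original timestamps that no longer have data.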


Let me know if this is on the mark. Otherwise, I'll delete it.

First, shift your index by the desired amount, then reindex. To make things easier I take a copy here, shift its index, and then reindex on the union of the shifted index and the original index to introduce the NaN rows:

In [232]:
df1 = df.copy()
df1.index -= 80000000
df1.reindex(df1.index.union(df.index))

Out[232]:
                      acc
index                    
1480681219960000000   8.0
1480681220000000000   9.0
1480681220040000000   7.0
1480681220080000000  10.0
1480681220120000000   6.0
1480681220160000000   NaN
1480681220200000000   NaN
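
As a self-contained sketch, the copy, shift and reindex steps can be wrapped in a small helper. The function name `shift_index_by_value` and its parameters are hypothetical, chosen for this example; only `copy`, `reindex` and `Index.union` come from the answer above.

```python
import pandas as pd


def shift_index_by_value(df, offset):
    """Return a copy of df whose index labels are moved by `offset`.

    Original labels that end up without data are kept as NaN rows.
    (Hypothetical helper, not part of pandas.)
    """
    shifted = df.copy()
    shifted.index = shifted.index + offset
    # The union keeps both the shifted and the original labels, sorted.
    return shifted.reindex(shifted.index.union(df.index))


# Rebuild the question's input frame; timestamps are in nanoseconds.
index = [1480681220040000000,
         1480681220080000000,
         1480681220120000000,
         1480681220160000000,
         1480681220200000000]
df = pd.DataFrame({'acc': [8, 9, 7, 10, 6]},
                  index=pd.Index(index, name='index'))

result = shift_index_by_value(df, -80000000)
print(result)
```

Because the helper returns a new frame, the original `df` is left unchanged, which makes it easy to compare the shifted data against the reference data afterwards.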
