
Pandas dataframe resample without aggregation

I have a dataframe defined as follows:

import datetime
import pandas as pd
import random
import numpy as np

todays_date = datetime.datetime.today().date()
index = pd.date_range(todays_date - datetime.timedelta(10), periods=10, freq='D')
index = index.append(index)
idname = ['A']*10 + ['B']*10
values = random.sample(range(100), 20)
data = np.vstack((idname, values)).T

tmp_df = pd.DataFrame(data, columns=['id', 'value'])
tmp_index = pd.DataFrame(index, columns=['date'])
tmp_df = pd.concat([tmp_index, tmp_df], axis=1)
tmp_df = tmp_df.set_index('date')

Note that there are 2 values for each date. I would like to resample the dataframe tmp_df on a weekly basis but keep the two separate values. I tried tmp_df.resample('W-FRI') but it doesn't seem to work.

The solution you're looking for is groupby, which lets you perform operations on dataframe slices (here 'A' and 'B') independently:

df.groupby('id').resample('W-FRI')
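As a minimal sketch of the groupby approach: in recent pandas versions resample returns a deferred object, so an explicit aggregation such as mean() is probably needed to get a result. The sample data below is hypothetical (fixed dates and already-numeric values, not the random data from the question), and selecting the 'value' column before resampling is my own choice for the illustration.

```python
import pandas as pd

# Hypothetical sample data: two ids, ten consecutive days each, numeric values.
dates = pd.date_range("2024-01-01", periods=10, freq="D")
df = pd.DataFrame(
    {"id": ["A"] * 10 + ["B"] * 10,
     "value": list(range(10)) + list(range(100, 110))},
    index=dates.append(dates).rename("date"),
)

# Resample each id separately into weeks ending on Friday, then take the mean.
weekly = df.groupby("id")["value"].resample("W-FRI").mean()
print(weekly)
```

The result is indexed by (id, date), so the weekly values for 'A' and 'B' stay separate instead of being merged into one number per week.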

Note: your code raises an error (No numeric types to aggregate) because the 'value' column holds strings rather than integers. You need to convert it first:

df['value'] = pd.to_numeric(df['value'])
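A likely reason the column is not numeric is that np.vstack combines the id strings and the integer values into a single string array. The sketch below, using a small hypothetical three-row frame, checks the dtype before and after the conversion.

```python
import numpy as np
import pandas as pd

# Stacking a list of strings with a list of ints yields one string array,
# so the 'value' column starts out with a non-numeric dtype.
data = np.vstack((["A", "A", "B"], [1, 2, 3])).T
df = pd.DataFrame(data, columns=["id", "value"])
dtype_before = df["value"].dtype

# Convert the strings to numbers so that numeric aggregations can run.
df["value"] = pd.to_numeric(df["value"])
dtype_after = df["value"].dtype
print(dtype_before, "->", dtype_after)
```

Building the frame from a dict of columns, rather than from a stacked array, would avoid the string conversion in the first place.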
