Extract series objects from Pandas DataFrame

I have a dataframe with the columns

['CPL4', 'Part Number', 'Calendar Year/Month', 'Sales', 'Inventory']

Within each 'Part Number', every 'Calendar Year/Month' value is unique.

I want to convert each part number to a univariate Series with 'Calendar Year/Month' as the index and either 'Sales' or 'Inventory' as the value.

How can I accomplish this using pandas built-in functions, without iterating through the dataframe manually?

In pandas this is called a MultiIndex. Try:

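import pandas as pd
# Assumes `file` is a path to CSV data holding the columns listed in the question
df = pd.read_csv(file)
# Move both key columns into a two-level MultiIndex and keep the value columns
df = df.set_index(['Part Number', 'Calendar Year/Month'])[['Sales', 'Inventory']]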
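
Once 'Part Number' and 'Calendar Year/Month' form a MultiIndex, selecting one part number on the outer level leaves the month as the index, and picking a single column gives the univariate Series. A minimal sketch, assuming a small made-up dataframe (the part numbers, months, and values are hypothetical sample data, not taken from the question):

```python
import pandas as pd

# Hypothetical sample data using the question's column names.
df = pd.DataFrame({
    "CPL4": ["A", "A", "A", "B", "B"],
    "Part Number": ["P1", "P1", "P1", "P2", "P2"],
    "Calendar Year/Month": ["1.2015", "2.2015", "3.2015", "1.2015", "2.2015"],
    "Sales": [10.0, 12.0, 9.0, 4.0, 5.0],
    "Inventory": [100, 95, 90, 40, 38],
})

# Two-level index: part number (outer), month (inner).
indexed = df.set_index(["Part Number", "Calendar Year/Month"])[["Sales", "Inventory"]]

# xs() selects one part number and drops that level,
# so the remaining index is 'Calendar Year/Month'.
p1_sales = indexed.xs("P1", level="Part Number")["Sales"]
print(p1_sales)
```

Swapping "Sales" for "Inventory" in the last step would give the inventory series for the same part.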

You can use the groupby method, like this:

grouped_df = df.groupby('Part Number')

Then you can access the dataframe for a given part number and set its index, like this:

new_df = grouped_df.get_group('THEPARTNUMBERYOUWANT').set_index('Calendar Year/Month')

If you only want the two value columns, you can do:

print(new_df[['Sales', 'Inventory']])
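
The same groupby idea can be extended to build one Series per part number in a single pass. This is only a sketch under assumptions: the sample data below is hypothetical, and the loop runs over the groups produced by groupby rather than over individual rows.

```python
import pandas as pd

# Hypothetical sample data using the question's column names.
df = pd.DataFrame({
    "CPL4": ["A", "A", "A", "B", "B"],
    "Part Number": ["P1", "P1", "P1", "P2", "P2"],
    "Calendar Year/Month": ["1.2015", "2.2015", "3.2015", "1.2015", "2.2015"],
    "Sales": [10.0, 12.0, 9.0, 4.0, 5.0],
    "Inventory": [100, 95, 90, 40, 38],
})

# Map each part number to its own month-indexed Sales series.
sales_by_part = {
    part: group.set_index("Calendar Year/Month")["Sales"]
    for part, group in df.groupby("Part Number")
}

print(sales_by_part["P2"])
```

Each value in sales_by_part is a pandas Series indexed by 'Calendar Year/Month', so it can be looked up by part number whenever it is needed.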

From the answers and comments here, along with a little more research, I ended up with the following solution.

temp_series = df[df["Part Number"] == sku].pivot(columns=["Calendar Year/Month"], values="Sales").iloc[0]

where sku is a specific part number from df["Part Number"].unique().

This will give you a univariate time series (temp_series) indexed by "Calendar Year/Month" with values of "Sales", e.g.:

1.2015     NaN
1.2016     NaN
2.2015     NaN
2.2016     NaN
3.2015     NaN
3.2016     NaN
4.2015     NaN
4.2016     NaN
5.2015     NaN
5.2016     NaN
6.2015     NaN
6.2016     NaN
7.2015     NaN
7.2016     NaN
8.2015     NaN
8.2016     NaN
9.2015     NaN
10.2015    NaN
11.2015    NaN
12.2015    NaN
Name: 161, dtype: float64

<class 'pandas.core.series.Series'>

from the columns

['CPL4', 'Part Number', 'Calendar Year/Month', 'Sales', 'Inventory']
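
The NaN-filled output shown above is likely a side effect of calling pivot without an index argument: each filtered row keeps its own original row label, so every row of the pivoted frame has a value only under its own month, and .iloc[0] returns just the first of those rows (which would also explain a Series name such as 161). A sketch of a variant that keeps every month's value, assuming the same columns and hypothetical sample data:

```python
import pandas as pd

# Hypothetical sample data using the question's column names.
df = pd.DataFrame({
    "CPL4": ["A", "A", "A", "B", "B"],
    "Part Number": ["P1", "P1", "P1", "P2", "P2"],
    "Calendar Year/Month": ["1.2015", "2.2015", "3.2015", "1.2015", "2.2015"],
    "Sales": [10.0, 12.0, 9.0, 4.0, 5.0],
    "Inventory": [100, 95, 90, 40, 38],
})

sku = "P1"  # hypothetical part number taken from df["Part Number"].unique()

# Filter to one part, use the month as the index, keep only the Sales values.
temp_series = df[df["Part Number"] == sku].set_index("Calendar Year/Month")["Sales"]

print(temp_series)
print(type(temp_series))
```

Because 'Calendar Year/Month' is unique within a part number, the resulting index has no duplicates.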
