How to access index of string value in a cell of pandas data frame?

I'm working with Bureau of Labor Statistics data, which looks like this:

series_id           year    period         value
CES0000000001       2006    M01            135446.0

The characters at positions 3 and 4 of series_id (counting from zero) indicate the supersector. For example, CES10xxxxxx01 would be Mining & Logging. There are 15 supersectors that I'm concerned with, so I want to create 15 separate data frames, one per supersector, to perform time series analysis. So I'm trying to access each value as a list to achieve something like:

# *pseudocode*:
mining_and_logging = df[df.series_id[3]==1 and df.series_id[4]==0]

Can I avoid writing a for loop where I convert each value to a list, then access it by index and add the row to the new dataframe?

How can I achieve this?

One way to do what you want, storing the dataframes in a dictionary through a for loop, could be:

First, create an auxiliary column to make your life easier:

df['id'] = df['series_id'].str[3:5]  # Extract characters 3 and 4 of every string (counting from zero); without .str the slice would select rows 3 and 4 instead
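To see what that column ends up holding, here is a minimal sketch on a few rows shaped like the ones in the question. Only the first row comes from the question; the other two rows and their values are made up for illustration.

```python
import pandas as pd

# Sample rows in the shape shown in the question.
# Only the first row is from the question; the other two are invented.
df = pd.DataFrame({
    "series_id": ["CES0000000001", "CES1000000001", "CES2000000001"],
    "year": [2006, 2006, 2006],
    "period": ["M01", "M01", "M01"],
    "value": [135446.0, 684.0, 7601.0],
})

# The .str accessor applies the slice to each string in the column.
df["id"] = df["series_id"].str[3:5]

print(df[["series_id", "id"]])
```

The ids stay strings, so leading zeros such as '00' are preserved.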

Then, you create an empty dictionary and populate it:

dict_df = {}
for unique_id in df.id.unique():
    dict_df[unique_id] = df[df.id == unique_id]

Now you'll have a dictionary with 15 dataframes inside, keyed by the two-character id string. For example, if you want to access the dataframe associated with id = 10 (Mining & Logging in your example), you just do:

dict_df['10']
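If you only need a single supersector, something close to the pseudocode in the question may be enough: a boolean mask built from the sliced string. This is just a sketch; the sample rows other than the first are made up, and `mining_and_logging` is the name taken from the question.

```python
import pandas as pd

# Sample rows in the shape shown in the question.
# Only the first row is from the question; the second is invented.
df = pd.DataFrame({
    "series_id": ["CES0000000001", "CES1000000001"],
    "year": [2006, 2006],
    "period": ["M01", "M01"],
    "value": [135446.0, 684.0],
})

# Compare characters 3 and 4 of every series_id against '10' in one vectorized step.
mining_and_logging = df[df["series_id"].str[3:5] == "10"]

print(mining_and_logging)
```

Note that the comparison is against the string '10', not the number 10.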

Hope it helps!

Solved it by combining answers from Juan C and G. Anderson.

Select the characters at positions 3 and 4 (counting from zero):

    df['id'] = df.series_id.str.slice(start=3, stop=5)

And then the following to create the dataframes:

    dict_df = {}
    for unique_id in df.id.unique():
        dict_df[unique_id] = df[df.id == unique_id]
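As a possible alternative to filtering once per id, `groupby` can split the frame in a single pass. This is a sketch rather than what either answer proposed; the sample rows other than the first are made up for illustration.

```python
import pandas as pd

# Sample rows in the shape shown in the question.
# Only the first row is from the question; the others are invented.
df = pd.DataFrame({
    "series_id": ["CES0000000001", "CES0000000001", "CES1000000001"],
    "year": [2006, 2006, 2006],
    "period": ["M01", "M02", "M01"],
    "value": [135446.0, 135700.0, 684.0],
})

df["id"] = df.series_id.str.slice(start=3, stop=5)

# Iterating over a groupby yields (key, sub-dataframe) pairs, one per unique id.
dict_df = {supersector: group for supersector, group in df.groupby("id")}

print(sorted(dict_df))
print(dict_df["00"])
```

The result has the same shape as the loop version: a dictionary mapping each id string to its dataframe.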
