pandas: group by column values and replace grouped values in another column

Question

I have a dataframe like this:

Ticker	instrument_name	year	month	instrument_type	expiry_type
ABAN10SEPFUT	ABAN	10	SEP	FUT	NaN
ABAN10OCTFUT	ABAN	10	OCT	FUT	NaN
ABAN10NOVFUT	ABAN	10	NOV	FUT	NaN

I want to group by instrument_type ('FUT') and find the unique values in month. Then I want to compare those unique values with the month column and set expiry_type to 'I', 'II', 'III' accordingly.

Expected result:

Ticker	instrument_name	year	month	instrument_type	expiry_type
ABAN10SEPFUT	ABAN	10	SEP	FUT	I
ABAN10OCTFUT	ABAN	10	OCT	FUT	II
ABAN10NOVFUT	ABAN	10	NOV	FUT	III

My code looks like this. #1:

def condition(x):
    if x == 'SEP':
        return "I"
    elif x == 'OCT':
        return "II"
    elif x == 'NOV':
        return "III"
    else:
        return ''

#2:

import numpy as np
import pandas as pd

for index, row in path.iterrows():
    data = pd.read_parquet(row['location'])
    data['expiry_type'] = np.where(data['instrument_type'] == 'FUT', data['month'].apply(condition), '')

Since I already know the unique values in the month column, I created a custom function to replace the values in the expiry_type column. I have many similar files, so is there a way to find the unique values and replace them automatically? How do I do that? Thank you in advance!

Answer 1

Considering you have grouped by instrument_type, you could build a function like the one in #1, but taking a whole row instead of a single value:

def condition(x):
    if x.month =='SEP':
        return "I"
    elif x.month =='OCT':
        return "II"
    elif x.month =='NOV':
        return "III"
    else:
        return ''

Then apply this function row by row and assign the result to the expiry_type column:

df['expiry_type'] = df.apply(condition, axis=1)
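
A row-wise apply can be slow on large frames. As a minimal sketch of the same mapping without apply, a dictionary can be passed to Series.map and combined with a boolean mask. The sample frame and the name month_to_expiry are illustrative assumptions, not part of the original answer:

```python
import pandas as pd

# Illustrative sample data modelled on the table in the question.
df = pd.DataFrame({
    "Ticker": ["ABAN10SEPFUT", "ABAN10OCTFUT", "ABAN10NOVFUT"],
    "instrument_name": ["ABAN"] * 3,
    "year": [10, 10, 10],
    "month": ["SEP", "OCT", "NOV"],
    "instrument_type": ["FUT"] * 3,
})

# Hypothetical lookup table; it plays the role of condition() above.
month_to_expiry = {"SEP": "I", "OCT": "II", "NOV": "III"}

# Only rows whose instrument_type is 'FUT' receive a label.
is_fut = df["instrument_type"] == "FUT"
df["expiry_type"] = ""
df.loc[is_fut, "expiry_type"] = (
    df.loc[is_fut, "month"].map(month_to_expiry).fillna("")
)
```

Months missing from the dictionary map to NaN, which fillna("") turns into an empty string, matching the else branch of condition.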

Answer 2

You can find the unique values in a column with the pandas unique function. Loop over your DataFrames and, for each one, apply unique to the month column to obtain the unique values in order of first appearance. Then create a dictionary using those values as keys and the new representation (Roman numerals in this particular example) as values. Finally, use the map function to translate the values in the month column and assign the result to the expiry_type column.

import pandas as pd

def toRoman(n):
    # n is a zero-based position: 0 -> 'I', 1 -> 'II', ...
    roman = ['I', 'II', 'III', 'IV', 'V', 'VI', 'VII', 'VIII', 'IX', 'X', 'XI', 'XII']
    return roman[n]

df_list = ['df1.csv', 'df2.csv', 'df3.csv']
for df_file in df_list:
    df = pd.read_csv(df_file)
    g = df.groupby('instrument_type')
    # unique months of the first instrument_type group (only 'FUT' in the sample data)
    uniq = g['month'].unique().iloc[0]
    # create a dictionary using the unique values
    dict_map = {name: toRoman(idx) for idx, name in enumerate(uniq)}
    df['expiry_type'] = df['month'].map(dict_map)
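
If a file contains instrument types other than 'FUT', the loop above labels every row from the first group's months. A minimal sketch of a variant that only touches the 'FUT' rows is shown here. The helper name add_expiry_type, the ROMAN list, and the sample frame are assumptions made for illustration, and the sketch assumes the months appear in the file in expiry order:

```python
import pandas as pd

ROMAN = ["I", "II", "III", "IV", "V", "VI",
         "VII", "VIII", "IX", "X", "XI", "XII"]

def add_expiry_type(df, instrument_type="FUT"):
    """Hypothetical helper: label months by order of first appearance."""
    out = df.copy()
    out["expiry_type"] = ""
    mask = out["instrument_type"] == instrument_type
    # unique() keeps order of first appearance, so the first month seen gets 'I'.
    months = out.loc[mask, "month"].unique()
    mapping = {month: ROMAN[i] for i, month in enumerate(months)}
    out.loc[mask, "expiry_type"] = out.loc[mask, "month"].map(mapping)
    return out

# Illustrative data: four futures rows and one row of another type.
sample = pd.DataFrame({
    "month": ["SEP", "OCT", "NOV", "SEP", "DEC"],
    "instrument_type": ["FUT", "FUT", "FUT", "FUT", "OPT"],
})
result = add_expiry_type(sample)
```

Rows of other instrument types keep an empty expiry_type. The ROMAN list has twelve entries, so a group with more than twelve distinct month values would raise an IndexError.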
