I have a dataframe like this:
Ticker | instrument_name | year | month | instrument_type | expiry_type |
---|---|---|---|---|---|
ABAN10SEPFUT | ABAN | 10 | SEP | FUT | NaN |
ABAN10OCTFUT | ABAN | 10 | OCT | FUT | NaN |
ABAN10NOVFUT | ABAN | 10 | NOV | FUT | NaN |
I want to group by instrument_type ('FUT') and find the unique values in month, then compare those unique values with the month column and fill the expiry_type column with 'I', 'II', 'III' accordingly.
Expected result:
Ticker | instrument_name | year | month | instrument_type | expiry_type |
---|---|---|---|---|---|
ABAN10SEPFUT | ABAN | 10 | SEP | FUT | I |
ABAN10OCTFUT | ABAN | 10 | OCT | FUT | II |
ABAN10NOVFUT | ABAN | 10 | NOV | FUT | III |
My code looks like this:
#1
def condition(x):
    if x == 'SEP':
        return "I"
    elif x == 'OCT':
        return "II"
    elif x == 'NOV':
        return "III"
    else:
        return ''
#2
import numpy as np
import pandas as pd

# path is a DataFrame with one parquet file location per row
for index, row in path.iterrows():
    data = pd.read_parquet(row['location'])
    data['expiry_type'] = np.where((data['instrument_type'] == 'FUT'), data['month'].apply(condition), '')
Since I already know the unique values in the month column, I wrote a custom function to fill in the expiry_type column. I have many similar files, so is there a way to find the unique values and replace them automatically? How do I do that? Thank you in advance!
Assuming you have already grouped by instrument_type, you could build a function like the one in #1:
def condition(x):
    # x is one row of the DataFrame (apply is called with axis=1)
    if x.month == 'SEP':
        return "I"
    elif x.month == 'OCT':
        return "II"
    elif x.month == 'NOV':
        return "III"
    else:
        return ''
Then apply this function row by row and assign the result to the expiry_type column:
df['expiry_type'] = df.apply(condition, axis=1)
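When the month-to-label mapping is known in advance, the same lookup can probably be written without a row-wise apply by combining Series.map with Series.where. This is only a sketch: the sample rows and the 'OPT' instrument type are made up for illustration, and the name month_to_expiry is hypothetical.

```python
import pandas as pd

# Hypothetical sample mirroring the question's table; the 'OPT' row is invented
# to show that non-FUT rows are left blank.
df = pd.DataFrame({
    'month': ['SEP', 'OCT', 'NOV', 'SEP'],
    'instrument_type': ['FUT', 'FUT', 'FUT', 'OPT'],
})

# Fixed lookup table, equivalent to the if/elif chain in condition().
month_to_expiry = {'SEP': 'I', 'OCT': 'II', 'NOV': 'III'}

# Months missing from the dictionary become NaN, then ''.
mapped = df['month'].map(month_to_expiry).fillna('')

# Keep the mapped label only where the row is a future; otherwise use ''.
df['expiry_type'] = mapped.where(df['instrument_type'] == 'FUT', '')
```

This keeps the behaviour of the np.where version from the question, but the mapping lives in one dictionary that is easy to extend.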
You can find the unique values in a column with the pandas unique function. Loop over each DataFrame you have and apply unique to the month column to obtain its unique values, in order of first appearance. Then create a dictionary with those values as keys and the new representation (Roman numerals in this particular example) as values. You can then use the map function to translate the values in the month column and assign the result to the expiry_type column.
import pandas as pd

def toRoman(n):
    # n is a zero-based position: 0 -> 'I', 1 -> 'II', ...
    roman = ['I', 'II', 'III', 'IV', 'V', 'VI', 'VII', 'VIII', 'IX', 'X', 'XI', 'XII']
    return roman[n]

df_list = ['df1.csv', 'df2.csv', 'df3.csv']
for df_file in df_list:
    df = pd.read_csv(df_file)
    g = df.groupby('instrument_type')
    # unique months of the first group; .iloc is needed because the result is indexed by group name
    uniq = g['month'].unique().iloc[0]
    # create a dictionary using the unique values
    dict_map = {name: toRoman(idx) for idx, name in enumerate(uniq)}
    df['expiry_type'] = df['month'].map(dict_map)
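If the files can also contain rows that are not futures, one possible variant restricts the numbering to the 'FUT' rows and leaves the rest blank, as in the question's np.where call. The sketch below uses pd.factorize, which numbers distinct values in order of first appearance. It assumes that this order is the desired expiry order and that there are at most twelve distinct months; the helper name add_expiry_type and the constant ROMAN are hypothetical.

```python
import pandas as pd

ROMAN = ['I', 'II', 'III', 'IV', 'V', 'VI', 'VII', 'VIII', 'IX', 'X', 'XI', 'XII']

def add_expiry_type(df):
    """Return a copy of df with expiry_type filled in for FUT rows only."""
    out = df.copy()
    out['expiry_type'] = ''
    is_fut = out['instrument_type'] == 'FUT'
    if is_fut.any():
        # codes[i] is the zero-based rank of row i's month among the distinct
        # months, in order of first appearance; missing months get code -1.
        codes, _ = pd.factorize(out.loc[is_fut, 'month'])
        out.loc[is_fut, 'expiry_type'] = [ROMAN[c] if c >= 0 else '' for c in codes]
    return out
```

Calling add_expiry_type(pd.read_csv(df_file)) inside the loop would replace the groupby, unique and map steps. Note that the labels follow the order in which months first appear in the file, not the calendar order.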