Python: pandas dynamically fill in values in string in each row using parameters
This question builds on this SO thread. To keep it self-contained, here is the DataFrame again.
ID Static_Text Params
1 Today, {0} is quite Sunny. Tomorrow, {1} 1-10-2020
may be little {2}
1 Today, {0} is quite Sunny. Tomorrow, {1} 2-10-2020
may be little {2}
1 Today, {0} is quite Sunny. Tomorrow, {1} Cloudy
may be little {2}
2 Let's have a coffee break near {0}, if I Balcony
don't get any SO reply by {1}
2 Let's have a coffee break near {0}, if I 30
don't get any SO reply by {1} mins
This is the final DataFrame I want:
ID Final Text
1 Today, 1-10-2020 is quite Sunny. Tomorrow, 2-10-2020
may be little Cloudy
2 Let's have a coffee break near Balcony, if I
don't get any SO reply by 30 mins
One of the approaches I tried is the following:
df = df.groupby(['ID','Static_Text'])['Params'].agg(list).reset_index()
df['Final Text'] = df.apply(lambda x : x['Static_Text'].format(','.join(x['Params'])),axis=1)
But the approach above throws the following error:
IndexError: tuple index out of range
What am I missing here? I suspect the lambda x: part needs some trick. For simplicity, assume all dates are strings.
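For me it works to add * so that the list is unpacked and each value is passed to format as a separate argument. The join from the earlier solution is removed as well.

Your error is most likely caused by a number inside {} in Static_Text that does not match the length of the list after aggregation. For example, if ID=1 has only 2 rows but the text contains {2}, the 3rd value does not exist in the list and the solution fails.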
df = df.groupby(['ID','Static_Text'])['Params'].agg(list).reset_index()
df['Final Text'] = df.apply(lambda x : x['Static_Text'].format(*x['Params']),axis=1)
print (df)
ID Static_Text \
0 1 Today, {0} is quite Sunny. Tomorrow, {1} may b...
1 2 Let's have a coffee break near {0}, if I don't...
Params \
0 [1-10-2020, 2-10-2020, Cloudy]
1 [Balcony, 30]
Final Text
0 Today, 1-10-2020 is quite Sunny. Tomorrow, 2-1...
1 Let's have a coffee break near Balcony, if I d...
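The steps above can be sketched as a self-contained script. The DataFrame construction is an assumption (the question only shows the table, not the code that builds it), and the names text1, text2 and out are hypothetical. It uses a list comprehension with zip instead of apply, which is only a stylistic choice and gives the same result here.

```python
import pandas as pd

# Assumed reconstruction of the question's data: one row per parameter.
text1 = "Today, {0} is quite Sunny. Tomorrow, {1} may be little {2}"
text2 = "Let's have a coffee break near {0}, if I don't get any SO reply by {1} mins"
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2],
    "Static_Text": [text1] * 3 + [text2] * 2,
    "Params": ["1-10-2020", "2-10-2020", "Cloudy", "Balcony", "30"],
})

# Collect the parameters of each (ID, Static_Text) group into a list,
# keeping the original row order inside each group.
out = df.groupby(["ID", "Static_Text"])["Params"].agg(list).reset_index()

# Unpack each list so that every element becomes its own positional
# argument: {0} gets the first value, {1} the second, and so on.
out["Final Text"] = [
    text.format(*params)
    for text, params in zip(out["Static_Text"], out["Params"])
]

print(out["Final Text"].tolist())
```

The row order inside each group decides which value lands in which placeholder, so the parameter rows have to appear in the intended order before grouping.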
Test:
print (df)
ID Static_Text Params
0 1 Today, {0} is quite Sunny. Tomorrow, {1} may b... 1-10-2020
1 1 Today, {0} is quite Sunny. Tomorrow, {1} may b... 2-10-2020
2 2 Let's have a coffee break near {0}, if I don't... Balcony
3 2 Let's have a coffee break near {0}, if I don't... 30
df = df.groupby(['ID','Static_Text'])['Params'].agg(list).reset_index()
df['Final Text'] = df.apply(lambda x : x['Static_Text'].format(*x['Params']),axis=1)
print (df)
IndexError: Replacement index 2 out of range for positional args tuple
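The same failure can be reproduced without pandas, which may make the cause easier to see. This is a minimal sketch: the helper format_error is hypothetical, and the exact wording of the error message can differ between Python versions, so it is not checked here.

```python
def format_error(template, *args):
    """Return the IndexError message raised by str.format, or None on success."""
    try:
        template.format(*args)
    except IndexError as exc:
        return str(exc)
    return None


template = "Today, {0} is quite Sunny. Tomorrow, {1} may be little {2}"
params = ["1-10-2020", "2-10-2020", "Cloudy"]

# Joining first passes ONE string, so only {0} can be filled and {1} fails.
print(format_error(template, ",".join(params)))

# Unpacking passes three separate arguments, so every placeholder is filled.
print(format_error(template, *params))

# Unpacking a list that is too short fails again: there is no value for {2}.
print(format_error(template, *params[:2]))
```

The first and third calls report an error and the second one returns None, which matches the two situations described above: a joined string in the question, and a missing parameter row in the test data.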
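You can find all rows where the number of parameters does not match (run this on the original DataFrame, before the groupby):
# highest placeholder index per row, plus 1 = number of values the text needs
s1 = df['Static_Text'].str.extractall(r'\{(\d+)\}')[0].astype(int).groupby(level=0).max().add(1)
# number of parameter rows actually available in each group
s2 = df.groupby(['ID','Static_Text'])['Params'].transform('size')
df = df[s1.gt(s2)]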
print (df)
ID Static_Text Params
0 1 Today, {0} is quite Sunny. Tomorrow, {1} may b... 1-10-2020
1 1 Today, {0} is quite Sunny. Tomorrow, {1} may b... 2-10-2020
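The check can also be run as a self-contained script. The sketch below rebuilds the failing test data as an assumption (the names text1, text2, needed, supplied and short are hypothetical) and uses groupby(level=0).max(), since the level argument of Series.max is not available in recent pandas versions.

```python
import pandas as pd

# Assumed test data: ID 1 has only two parameter rows, but its text uses {2}.
text1 = "Today, {0} is quite Sunny. Tomorrow, {1} may be little {2}"
text2 = "Let's have a coffee break near {0}, if I don't get any SO reply by {1} mins"
df = pd.DataFrame({
    "ID": [1, 1, 2, 2],
    "Static_Text": [text1] * 2 + [text2] * 2,
    "Params": ["1-10-2020", "2-10-2020", "Balcony", "30"],
})

# Highest placeholder index found in each row's text, plus one:
# the number of values that the text needs.
needed = (
    df["Static_Text"]
    .str.extractall(r"\{(\d+)\}")[0]
    .astype(int)
    .groupby(level=0)
    .max()
    .add(1)
)

# Number of parameter rows supplied for each (ID, Static_Text) group,
# broadcast back to the original rows.
supplied = df.groupby(["ID", "Static_Text"])["Params"].transform("size")

# Rows whose group has fewer parameters than the text requires.
short = df[needed.gt(supplied)]
print(short[["ID", "Params"]])
```

Only the two rows of ID 1 are selected, because that group supplies two values for a text that needs three.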