Python: pandas, dynamically filling values into a string in each row using params

Question

This question refers to this SO thread. To keep it self-contained, here is the data frame again.

ID         Static_Text                                           Params
1      Today, {0} is quite Sunny. Tomorrow, {1}              1-10-2020  
       may be little {2}
1      Today, {0} is quite Sunny. Tomorrow, {1}              2-10-2020
       may be little {2}
1      Today, {0} is quite Sunny. Tomorrow, {1}              Cloudy
       may be little {2}
2      Let's have a coffee break near {0}, if I              Balcony
       don't get any SO reply by {1} mins
2      Let's have a coffee break near {0}, if I              30
       don't get any SO reply by {1} mins

This is the final data frame I want:

ID                     Final Text                 
1         Today, 1-10-2020 is quite Sunny. Tomorrow, 2-10-2020            
          may be little Cloudy
2         Let's have a coffee break near Balcony, if I              
          don't get any SO reply by 30 mins

One of the approaches I tried is the following:

df = df.groupby(['ID','Static_Text'])['Params'].agg(list).reset_index()
df['Final Text'] = df.apply(lambda x : x['Static_Text'].format(','.join(x['Params'])),axis=1)

But the approach above throws the following error:

IndexError: tuple index out of range

What am I missing here? I suspect some trick is needed in the lambda x: part. For simplicity, assume all dates are strings.

Answer 1

For me it works after adding * to unpack the list of values into format and removing the join.

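Your error most likely means that some number inside {} in Static_Text does not match the length of the list produced by the aggregation. For example, if ID=1 has only 2 rows but its text contains {2}, the list has no 3rd value and the solution fails.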

df = df.groupby(['ID','Static_Text'])['Params'].agg(list).reset_index()

df['Final Text'] = df.apply(lambda x : x['Static_Text'].format(*x['Params']),axis=1)

print (df)
   ID                                        Static_Text  \
0   1  Today, {0} is quite Sunny. Tomorrow, {1} may b...   
1   2  Let's have a coffee break near {0}, if I don't...   

                           Params  \
0  [1-10-2020, 2-10-2020, Cloudy]   
1                   [Balcony, 30]   

                                          Final Text  
0  Today, 1-10-2020 is quite Sunny. Tomorrow, 2-1...  
1  Let's have a coffee break near Balcony, if I d...
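
The difference between the two calls can be seen without pandas. The following is a minimal sketch in plain Python, using the first template and its three values from the question; the variable names are only illustrative.

```python
template = "Today, {0} is quite Sunny. Tomorrow, {1} may be little {2}"
params = ["1-10-2020", "2-10-2020", "Cloudy"]

# ','.join(params) builds ONE string, so format() receives a single
# positional argument and {1} has nothing to refer to.
try:
    template.format(",".join(params))
    join_failed = False
except IndexError:
    join_failed = True

# *params unpacks the list into three positional arguments: {0}, {1}, {2}.
filled = template.format(*params)

print(join_failed)
print(filled)
```

The joined version fails at {1} no matter how many values the list holds, because format only ever sees one argument.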

Test with the third Params row for ID 1 missing:

print (df)
   ID                                        Static_Text     Params
0   1  Today, {0} is quite Sunny. Tomorrow, {1} may b...  1-10-2020
1   1  Today, {0} is quite Sunny. Tomorrow, {1} may b...  2-10-2020
2   2  Let's have a coffee break near {0}, if I don't...    Balcony
3   2  Let's have a coffee break near {0}, if I don't...         30


df = df.groupby(['ID','Static_Text'])['Params'].agg(list).reset_index()

df['Final Text'] = df.apply(lambda x : x['Static_Text'].format(*x['Params']),axis=1)
print (df)

IndexError: Replacement index 2 out of range for positional args tuple
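
If a missing value is preferable to an exception, the format call can be wrapped. This is only a sketch under that assumption: safe_format is a hypothetical helper, not part of pandas, and the aggregated frame is built by hand to mirror the failing test data.

```python
import pandas as pd


def safe_format(template, params):
    """Fill template with params; return None if a placeholder has no value."""
    try:
        return template.format(*params)
    except IndexError:
        return None


# Already aggregated data: ID 1 has only two values but its text uses {2}.
grouped = pd.DataFrame({
    "ID": [1, 2],
    "Static_Text": [
        "Today, {0} is quite Sunny. Tomorrow, {1} may be little {2}",
        "Let's have a coffee break near {0}, if I don't get any SO reply by {1} mins",
    ],
    "Params": [["1-10-2020", "2-10-2020"], ["Balcony", "30"]],
})

grouped["Final Text"] = grouped.apply(
    lambda row: safe_format(row["Static_Text"], row["Params"]), axis=1
)
print(grouped[["ID", "Final Text"]])
```

Rows with too few values end up with a missing Final Text instead of stopping the whole apply.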

You can find all rows where the placeholders have no matching value:

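# highest placeholder index per row, plus 1 = number of values the text needs
s1 = df['Static_Text'].str.extractall(r'\{(\d+)\}')[0].astype(int).groupby(level=0).max().add(1)
# number of Params rows available in each (ID, Static_Text) group
s2 = df.groupby(['ID','Static_Text'])['Params'].transform('size')

# keep rows whose text needs more values than the group provides
df = df[s1.gt(s2)]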
print (df)
   ID                                        Static_Text     Params
0   1  Today, {0} is quite Sunny. Tomorrow, {1} may b...  1-10-2020
1   1  Today, {0} is quite Sunny. Tomorrow, {1} may b...  2-10-2020
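
For anyone who wants to run the check end to end, here is a self-contained sketch with the test data built inline. The names weather, coffee, needed, supplied and short are illustrative; the full template text is taken from the question, and the per-row maximum is computed with groupby(level=0).max().

```python
import pandas as pd

weather = "Today, {0} is quite Sunny. Tomorrow, {1} may be little {2}"
coffee = "Let's have a coffee break near {0}, if I don't get any SO reply by {1} mins"

# Long-form data: one row per parameter; the 'Cloudy' row for ID 1 is missing.
df = pd.DataFrame({
    "ID": [1, 1, 2, 2],
    "Static_Text": [weather, weather, coffee, coffee],
    "Params": ["1-10-2020", "2-10-2020", "Balcony", "30"],
})

# Values each text needs: highest placeholder index + 1.
needed = (
    df["Static_Text"]
    .str.extractall(r"\{(\d+)\}")[0]
    .astype(int)
    .groupby(level=0)
    .max()
    .add(1)
)
# Values each (ID, Static_Text) group actually supplies.
supplied = df.groupby(["ID", "Static_Text"])["Params"].transform("size")

# Rows whose text asks for more values than the group has.
short = df[needed.gt(supplied)]
print(short[["ID", "Params"]])
```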

Answer 1
0 Accepted 2020-10-02 12:58:01