Python: pandas dynamically fill in values in string in each row using parameters

This question refers to this SO thread. For freshness, I am providing the DataFrame again.

ID         Static_Text                                           Params
1      Today, {0} is quite Sunny. Tomorrow, {1}              1-10-2020  
       may be little {2}
1      Today, {0} is quite Sunny. Tomorrow, {1}              2-10-2020
       may be little {2}
1      Today, {0} is quite Sunny. Tomorrow, {1}              Cloudy
       may be little {2}
2      Let's have a coffee break near {0}, if I              Balcony
       don't get any SO reply by {1}
2      Let's have a coffee break near {0}, if I              30
       don't get any SO reply by {1} mins

This is the final DataFrame I want:

ID                     Final Text                 
1         Today, 1-10-2020 is quite Sunny. Tomorrow, 2-10-2020            
          may be little Cloudy
2         Let's have a coffee break near Balcony, if I              
          don't get any SO reply by 30 mins

One of the approaches I tried is as follows:

df = df.groupby(['ID','Static_Text'])['Params'].agg(list).reset_index()
df['Final Text'] = df.apply(lambda x : x['Static_Text'].format(','.join(x['Params'])),axis=1)

But the approach above throws the following error:

IndexError: tuple index out of range

What am I missing here? I suspect some trick is needed in the lambda x: part. For simplicity, let's assume all dates are strings.

For me it works to add * to unpack the list values into format, building on the solution from before, with the join removed.

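Your error is most likely caused by a number inside {} in Static_Text that does not match the length of the list produced by the aggregation. For example, if ID=1 has only 2 rows but its text contains {2}, a 3rd value does not exist in the list and the solution fails.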
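Both failure modes can be reproduced with plain str.format, without pandas. The snippet below is only a sketch: the variable names (template, params, joined_error, short_error, filled) are made up for illustration, and the template is the ID=1 text written on one line.

```python
template = "Today, {0} is quite Sunny. Tomorrow, {1} may be little {2}"
params = ["1-10-2020", "2-10-2020", "Cloudy"]

joined_error = None
short_error = None

# Joining first passes ONE string to format, so only {0} has a value
# and the lookup for {1} raises IndexError.
try:
    template.format(",".join(params))
except IndexError as exc:
    joined_error = str(exc)

# Unpacking with * passes three positional arguments, one per placeholder.
filled = template.format(*params)

# With only two parameters, {2} has nothing to refer to, even with unpacking.
try:
    template.format(*params[:2])
except IndexError as exc:
    short_error = str(exc)

print(filled)
print(joined_error)
print(short_error)
```

The exact wording of the IndexError message depends on the Python version, which is why the snippet prints it instead of hard-coding it.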

df = df.groupby(['ID','Static_Text'])['Params'].agg(list).reset_index()

df['Final Text'] = df.apply(lambda x : x['Static_Text'].format(*x['Params']),axis=1)

print (df)
   ID                                        Static_Text  \
0   1  Today, {0} is quite Sunny. Tomorrow, {1} may b...   
1   2  Let's have a coffee break near {0}, if I don't...   

                           Params  \
0  [1-10-2020, 2-10-2020, Cloudy]   
1                   [Balcony, 30]   

                                          Final Text  
0  Today, 1-10-2020 is quite Sunny. Tomorrow, 2-1...  
1  Let's have a coffee break near Balcony, if I d...  

Test, with the third parameter for ID=1 missing:

print (df)
   ID                                        Static_Text     Params
0   1  Today, {0} is quite Sunny. Tomorrow, {1} may b...  1-10-2020
1   1  Today, {0} is quite Sunny. Tomorrow, {1} may b...  2-10-2020
2   2  Let's have a coffee break near {0}, if I don't...    Balcony
3   2  Let's have a coffee break near {0}, if I don't...         30


df = df.groupby(['ID','Static_Text'])['Params'].agg(list).reset_index()

df['Final Text'] = df.apply(lambda x : x['Static_Text'].format(*x['Params']),axis=1)
print (df)

IndexError: Replacement index 2 out of range for positional args tuple

You can find all rows whose placeholders have no matching parameters:

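# highest placeholder index per row, plus 1 = number of parameters the text needs
# (groupby(level=0).max() replaces max(level=0), which was removed in pandas 2.0)
s1 = df['Static_Text'].str.extractall(r'{(\d+)}')[0].astype(int).groupby(level=0).max().add(1)
# number of parameters actually supplied for each ID / Static_Text group
s2 = df.groupby(['ID','Static_Text'])['Params'].transform('size')

# keep only the rows that need more parameters than they have
df = df[s1.gt(s2)]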
print (df)
   ID                                        Static_Text     Params
0   1  Today, {0} is quite Sunny. Tomorrow, {1} may b...  1-10-2020
1   1  Today, {0} is quite Sunny. Tomorrow, {1} may b...  2-10-2020
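If the goal is to fill the complete groups and skip the incomplete ones instead of raising, one possible variation is to count the placeholders before formatting. This is a minimal sketch, not part of the original answer: required_args and safe_fill are hypothetical helpers, the sample frame mirrors the test data above with the texts written on one line, and it assumes templates only use explicit numeric placeholders such as {0}.

```python
from string import Formatter

import pandas as pd


def required_args(template: str) -> int:
    """Return how many positional arguments the template needs (hypothetical helper)."""
    indices = [
        int(field)
        for _, field, _, _ in Formatter().parse(template)
        if field is not None and field.isdigit()
    ]
    return max(indices) + 1 if indices else 0


def safe_fill(row: pd.Series):
    """Fill the template, or return None when parameters are missing (hypothetical helper)."""
    if len(row["Params"]) < required_args(row["Static_Text"]):
        return None
    return row["Static_Text"].format(*row["Params"])


weather = "Today, {0} is quite Sunny. Tomorrow, {1} may be little {2}"
coffee = "Let's have a coffee break near {0}, if I don't get any SO reply by {1} mins"

# Same shape as the test data: ID=1 is missing its third parameter.
df = pd.DataFrame(
    {
        "ID": [1, 1, 2, 2],
        "Static_Text": [weather, weather, coffee, coffee],
        "Params": ["1-10-2020", "2-10-2020", "Balcony", "30"],
    }
)

out = df.groupby(["ID", "Static_Text"])["Params"].agg(list).reset_index()
out["Final Text"] = out.apply(safe_fill, axis=1)
print(out[["ID", "Final Text"]])
```

Here the ID=1 row ends up with a missing value in Final Text, while ID=2 is filled normally. Whether to skip, raise, or pad with a default is a design choice that depends on the data.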

