How to replace the start and end of a column's values with certain characters in a Python dataframe
I have a dataframe that looks like this:
clients_x clients_y coords_x coords_y
7110001002 7100019838 -23.63013,-46.704887 -23.657433,-46.744095
7110001002 7100021875 -23.63013,-46.704887 -23.7729,-46.591366
7110001002 0700245857 -23.63013,-46.704887 -23.7074,-46.5698
[7110052941, 7110107795] 7100019838 -23.609,-46.6974 -23.657433,-46.744095
[7110052941, 7110107795] 7100021875 -23.609,-46.6974 -23.7729,-46.591366
[7110052941, 7110107795] 0700245857 -23.609,-46.6974 -23.7074,-46.569
What I want is for every value in the clients_x column to start with "[" and end with "]". So my expected output is this:
clients_x clients_y coords_x coords_y
[7110001002] 7100019838 -23.63013,-46.704887 -23.657433,-46.744095
[7110001002] 7100021875 -23.63013,-46.704887 -23.7729,-46.591366
[7110001002] 0700245857 -23.63013,-46.704887 -23.7074,-46.5698
[7110052941, 7110107795] 7100019838 -23.609,-46.6974 -23.657433,-46.744095
[7110052941, 7110107795] 7100021875 -23.609,-46.6974 -23.7729,-46.591366
[7110052941, 7110107795] 0700245857 -23.609,-46.6974 -23.7074,-46.569
To do this, I first tried something like this:
df["clients_x"] = "[" + df["clients_x"] + "]"
This does add "[" and "]" around every value, but rows that already have brackets end up with them doubled. The output is this:
clients_x clients_y coords_x coords_y
[7110001002] 7100019838 -23.63013,-46.704887 -23.657433,-46.744095
[7110001002] 7100021875 -23.63013,-46.704887 -23.7729,-46.591366
[7110001002] 0700245857 -23.63013,-46.704887 -23.7074,-46.5698
[[7110052941, 7110107795]] 7100019838 -23.609,-46.6974 -23.657433,-46.744095
[[7110052941, 7110107795]] 7100021875 -23.609,-46.6974 -23.7729,-46.591366
[[7110052941, 7110107795]] 0700245857 -23.609,-46.6974 -23.7074,-46.569
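To avoid this, I tried the following code. Basically, I want to add "[" and "]" to the start and end of every value in the clients_x column that starts with a digit.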
df['clients_x'] = df['clients_x'].mask(df['clients_x'].astype(str).str.startswith(r'^\d'), f'[{df.clients_x}]')
However, the output of this line of code is identical to my original dataframe. If anyone has an idea of how to solve this, I would really appreciate the help.
Using np.where -
df['clients_x'] = np.where(df['clients_x'].str.startswith('['), df['clients_x'], '[' + df['clients_x'] + ']')
Using df.where -
df['clients_x'].where(df['clients_x'].str.startswith('['), '[' + df['clients_x'] + ']')
Output
0 [7110001002]
1 [7110001002]
2 [7110001002]
3 [7110052941,7110107795]
4 [7110052941,7110107795]
5 [7110052941,7110107795]
Name: clients_x, dtype: object
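For anyone who wants to try both variants, here is a minimal self-contained sketch. The three-row sample frame is a made-up subset of the question's data, and it assumes clients_x holds strings; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring clients_x from the question (values stored as strings).
df = pd.DataFrame({
    "clients_x": ["7110001002", "7110001002", "[7110052941, 7110107795]"],
})

# True where the value already starts with an opening bracket.
already_wrapped = df["clients_x"].str.startswith("[")
# Every value wrapped in brackets; used only where the condition is False.
wrapped = "[" + df["clients_x"] + "]"

# np.where returns a NumPy array, so it is wrapped back into a Series here.
via_np_where = pd.Series(
    np.where(already_wrapped, df["clients_x"], wrapped), index=df.index
)

# Series.where keeps values where the condition is True and replaces the rest.
via_series_where = df["clients_x"].where(already_wrapped, wrapped)

print(via_series_where.tolist())
```

Both variants should give the same result; assign either one back to df["clients_x"] to update the frame.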
You need to use where, not mask (see the documentation):
df["clients_x"] = df.clients_x.where(
df.clients_x.astype(str).str.startswith("["),
"[" + df.clients_x + "]"
)
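As a side note on why the mask attempt in the question changed nothing: str.startswith compares against a literal prefix and does not accept regular expressions, so the pattern r'^\d' never matches and the condition is False for every row. The f-string also formats the entire Series into a single string rather than wrapping each value. Below is a sketch of a mask-based variant that does work, assuming a small made-up sample and using str.match, which anchors the regex at the start of each value.

```python
import pandas as pd

# Hypothetical two-row sample; values are strings as in the question.
df = pd.DataFrame({"clients_x": ["7110001002", "[7110052941, 7110107795]"]})
col = df["clients_x"].astype(str)

# Literal prefix test: no value starts with the characters ^\d, so all False.
literal = col.str.startswith(r"^\d")

# Regex test anchored at the start of the string: True for values starting with a digit.
starts_with_digit = col.str.match(r"\d")

# mask replaces values where the condition is True, so the digit test fits it directly.
df["clients_x"] = col.mask(starts_with_digit, "[" + col + "]")

print(df["clients_x"].tolist())
```

where and mask are mirror images: where replaces values when the condition is False, mask replaces them when it is True. That is why the "starts with [" condition pairs with where and the "starts with a digit" condition pairs with mask.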