How to replace the start and end of a column with certain characters (Python dataframe)

Question

I have a dataframe that looks like this:

 clients_x                 clients_y              coords_x               coords_y 
7110001002                7100019838    -23.63013,-46.704887  -23.657433,-46.744095   
7110001002                7100021875    -23.63013,-46.704887    -23.7729,-46.591366   
7110001002                0700245857    -23.63013,-46.704887      -23.7074,-46.5698 
[7110052941, 7110107795]  7100019838        -23.609,-46.6974  -23.657433,-46.744095
[7110052941, 7110107795]  7100021875        -23.609,-46.6974    -23.7729,-46.591366
[7110052941, 7110107795]  0700245857        -23.609,-46.6974       -23.7074,-46.569

What I want is for every value in the clients_x column to start with "[" and end with "]". So my expected output is this:

 clients_x                 clients_y              coords_x               coords_y 
[7110001002]                7100019838    -23.63013,-46.704887  -23.657433,-46.744095   
[7110001002]                7100021875    -23.63013,-46.704887    -23.7729,-46.591366   
[7110001002]                0700245857    -23.63013,-46.704887      -23.7074,-46.5698 
[7110052941, 7110107795]  7100019838        -23.609,-46.6974  -23.657433,-46.744095
[7110052941, 7110107795]  7100021875        -23.609,-46.6974    -23.7729,-46.591366
[7110052941, 7110107795]  0700245857        -23.609,-46.6974       -23.7074,-46.569

To do this, I first tried something like this:

df["clients_x"] = "[" + df["clients_x"] + "]"

However, this adds "[" and "]" around every value, so the rows that already have brackets end up with them doubled. The output is this:

 clients_x                 clients_y              coords_x               coords_y 
[7110001002]                7100019838    -23.63013,-46.704887  -23.657433,-46.744095   
[7110001002]                7100021875    -23.63013,-46.704887    -23.7729,-46.591366   
[7110001002]                0700245857    -23.63013,-46.704887      -23.7074,-46.5698 
[[7110052941, 7110107795]]  7100019838        -23.609,-46.6974  -23.657433,-46.744095
[[7110052941, 7110107795]]  7100021875        -23.609,-46.6974    -23.7729,-46.591366
[[7110052941, 7110107795]]  0700245857        -23.609,-46.6974       -23.7074,-46.569

To avoid this, I tried the following code. The idea is to add "[" and "]" around only those values in the clients_x column that start with a digit.

df['clients_x'] = df['clients_x'].mask(df['clients_x'].astype(str).str.startswith(r'^\d'), f'[{df.clients_x}]')

However, the output of this line is identical to my original dataframe. If anyone has an idea of how to solve this, I would really appreciate the help.

Answer 1

Using np.where:

df['clients_x'] = np.where(df['clients_x'].str.startswith('['), df['clients_x'], '[' + df['clients_x'] + ']')

Using df.where:

df['clients_x'].where(df['clients_x'].str.startswith('['), '[' + df['clients_x'] + ']')

Output

0               [7110001002]
1               [7110001002]
2               [7110001002]
3    [7110052941,7110107795]
4    [7110052941,7110107795]
5    [7110052941,7110107795]
Name: clients_x, dtype: object
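
For anyone who wants to run the np.where variant end to end, here is a minimal self-contained sketch. The sample frame is rebuilt by hand from the question and is only an assumption about the real data. In particular it assumes clients_x holds plain strings, since the .str accessor yields NaN for non-string entries (the question's own attempt guards against that with astype(str)).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data modelled on the question; all values are strings.
df = pd.DataFrame({
    "clients_x": [
        "7110001002",
        "7110001002",
        "[7110052941, 7110107795]",
    ],
    "clients_y": ["7100019838", "7100021875", "0700245857"],
})

# True where the value is already wrapped in brackets.
already_wrapped = df["clients_x"].str.startswith("[")

# Keep wrapped values unchanged; add brackets around the rest.
df["clients_x"] = np.where(
    already_wrapped,
    df["clients_x"],
    "[" + df["clients_x"] + "]",
)

print(df["clients_x"].tolist())
```

np.where returns a NumPy array, so the result is assigned back to the column explicitly.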

Answer 2

You need to use where rather than mask (see the documentation):

df["clients_x"] = df.clients_x.where(
  df.clients_x.astype(str).str.startswith("["), 
  "[" + df.clients_x + "]"
)
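
A note on why the attempt in the question left the frame unchanged, as far as I can tell: str.startswith compares against a literal prefix rather than a regular expression, so r'^\d' never matches and the condition is False for every row. In addition, the f-string formats the whole Series into one string instead of wrapping each element. mask itself can still be used if the condition is a real regex test, for example with str.match. The sketch below uses a made-up two-row Series for illustration:

```python
import pandas as pd

# Hypothetical sample values, stored as strings.
s = pd.Series(["7110001002", "[7110052941, 7110107795]"], name="clients_x")

# startswith() takes a literal prefix, so the regex text is never matched.
literal = s.str.startswith(r"^\d")

# str.match() applies a regex anchored at the start of each string.
starts_with_digit = s.str.match(r"\d")

# mask() replaces values where the condition is True;
# where() replaces values where the condition is False.
wrapped = s.mask(starts_with_digit, "[" + s + "]")

print(wrapped.tolist())
```

Whether to use where with a "starts with [" test or mask with a "starts with a digit" test is mostly a matter of which condition reads more naturally for the data.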

Solution 1
2 Accepted 2021-05-03 17:40:35

Solution 2
1 2021-05-03 17:42:16
