Stripping part of a string in pandas columns

Question

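I have a small dataframe containing entries about Balance of Performance in motorsport.

I am trying to get rid of the part of each string after the "@".

This works with the following code: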

for col in df_engine.columns[1:]:
    df_engine[col] = df_engine[col].str.rstrip(r"[\ \@ \d.[0-9]+]")

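but it leaves the last column unchanged, and I don't understand why. As additional information, the Ferrari column also has a NaN entry in the last position.

Can anyone offer some help?

Thanks in advance!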

Answer 1

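rstrip does not work with regular expressions. According to the documentation:

to_strip : str or None, default None

Specifying the set of characters to be removed. All combinations of this set of characters will be stripped. If None then whitespaces are removed.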

>>> "1.76 @ 0.88".rstrip(r"[\ \@ \d.[0-9]+]")  # the string ends in '8', which is not in the character set, so nothing is stripped
'1.76 @ 0.88'
>>> "1.76 @ 0.88".rstrip(r"[\ \@ \d.[0-8]+]")  # not treated as a regex: trailing characters are stripped for as long as they belong to the set
'1.76'

You can use the replace method instead.

for col in df.columns[1:]:
    df[col] = df[col].str.replace(r"\s@\s[\d\.]+$", "", regex=True)
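To try the loop above end to end, here is a minimal sketch. The question does not show the real dataframe, so the column names other than Ferrari and all of the values are made up for illustration; the only assumption carried over from the question is that the entries look like "1.76 @ 0.88" and that the Ferrari column ends with NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the real df_engine is not shown in the question.
df = pd.DataFrame({
    "Track": ["Spa", "Monza", "Le Mans"],
    "Porsche": ["1.76 @ 0.88", "1.80 @ 0.90", "1.72 @ 0.85"],
    "Ferrari": ["1.70 @ 0.87", "1.74 @ 0.93", np.nan],
})

# Remove " @ <number>" from the end of every value in all but the first column.
for col in df.columns[1:]:
    df[col] = df[col].str.replace(r"\s@\s[\d\.]+$", "", regex=True)

print(df)
```

The NaN entry should pass through untouched, because the .str methods leave missing values as missing.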

Answer 2

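How about str.split()? https://pandas.pydata.org/docs/reference/api/pandas.Series.str.split.html#pandas.Series.str.split

The function splits a Series into dataframe columns around the given delimiter (when expand=True).

The following example splits the Series df_engine[col] and produces a dataframe. The first column of the new dataframe contains the part of each value before the first "@" delimiter.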

df_engine[col].str.split('@', expand=True)[0]
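One thing to watch for with this approach: splitting on "@" alone keeps the space that preceded it, so the result may need a str.strip() afterwards. A small sketch on a made-up Series (the values are hypothetical, modelled on the "1.76 @ 0.88" format from the other answer):

```python
import numpy as np
import pandas as pd

# Hypothetical values in the same "x @ y" format, with a trailing NaN.
s = pd.Series(["1.76 @ 0.88", "1.74 @ 0.93", np.nan])

# Column 0 of the expanded result holds everything before the first "@".
left = s.str.split("@", expand=True)[0]

# The space before "@" is still attached, so strip surrounding whitespace.
cleaned = left.str.strip()

print(cleaned)
```

Whether the extra strip is needed depends on how the cleaned values are used afterwards; converting them with pd.to_numeric, for example, would tolerate the trailing space anyway.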
