How to remove carriage returns in a DataFrame

Question

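I have a DataFrame with columns named id, country_name, location and total_deaths. While cleaning the data, I came across a value in one row that has '\r' appended. After cleaning, I store the resulting DataFrame in destination.csv. Because that row has the trailing \r, it always creates a new line in the file.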

id                               29
location            Uttar Pradesh\r
country_name                  India
total_deaths                     20

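I want to remove the \r. I tried df.replace({'\r': ''}, regex=True). It did not work for me.

Is there another solution? Can anyone help?

Edit:

In the process above, I iterate over df to check whether \r is present. If it is, it needs to be replaced. Here row.replace() or row.str.strip() does not seem to work, or I may be doing it the wrong way.

I don't want to specify a column name or row number when using replace(), because I can't be sure that only the 'location' column will contain \r. Please find the code below.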

import re

count = 0
for row_index, row in df.iterrows():
    if re.search(r"\\r", str(row)):
        print(type(row))               # Return type is pandas.Series
        row.replace({r'\\r': ''}, regex=True)
        print(row)
        count += 1
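
One likely reason the loop above has no effect: Series.replace returns a new object instead of modifying row, and the rows produced by iterrows() are generally copies, so nothing is written back to df. pandas also displays a carriage-return character as \r, so the printed output alone does not show whether the cell holds a control character or the two characters backslash and r. A minimal sketch, assuming the cell holds a real carriage return and using made-up sample rows:

```python
import pandas as pd

# Hypothetical sample data; the second location ends with a real carriage return.
df = pd.DataFrame({
    "id": [28, 29],
    "location": ["Kerala", "Uttar Pradesh\r"],
    "country_name": ["India", "India"],
    "total_deaths": [128, 20],
})

# Row-wise replace: the returned Series is discarded, so df is unchanged.
for _, row in df.iterrows():
    row.replace({"\r": ""}, regex=True)

still_dirty = df["location"].str.contains("\r").any()

# DataFrame-wide replace with the result assigned; no column name is needed.
cleaned = df.replace({"\r": ""}, regex=True)
print(cleaned)
```

Calling replace on the whole DataFrame avoids naming any column, which matches the requirement in the question.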

Answer 1

Another solution is to use str.strip:

df['29'] = df['29'].str.strip(r'\\r')
print(df)
             id             29
0      location  Uttar Pradesh
1  country_name          India
2  total_deaths             20
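
A caveat worth illustrating: str.strip treats its argument as a set of characters, not as a substring, so r'\\r' strips any leading or trailing backslash or letter r. The sketch below uses made-up values to show that side effect, and shows str.rstrip with real control characters as one alternative when the cell ends with an actual carriage return or newline.

```python
import pandas as pd

# Values ending with real control characters (hypothetical sample data).
s = pd.Series(["Uttar Pradesh\r", "river\r\n", "India"])
stripped = s.str.rstrip("\r\n")  # removes trailing CR/LF characters only

# Values containing the literal two characters backslash + r.
literal = pd.Series(["Uttar Pradesh\\r", "river"])
# The argument is a character set {backslash, r}, so the leading and
# trailing "r" of "river" are removed as well.
over_stripped = literal.str.strip(r"\\r")

print(stripped.tolist())
print(over_stripped.tolist())
```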

If you want to use replace, add the r prefix and one extra \:

print(df.replace({r'\\r': ''}, regex=True))
             id             29
0      location  Uttar Pradesh
1  country_name          India
2  total_deaths             20

In replace you can also specify which column the replacement applies to, for example:

print(df)
               id               29
0        location  Uttar Pradesh\r
1    country_name            India
2  total_deaths\r               20

print(df.replace({'29': {r'\\r': ''}}, regex=True))
               id             29
0        location  Uttar Pradesh
1    country_name          India
2  total_deaths\r             20

print(df.replace({r'\\r': ''}, regex=True))
             id             29
0      location  Uttar Pradesh
1  country_name          India
2  total_deaths             20

Edit, following the comments:

import pandas as pd

df = pd.read_csv('data_source_test.csv')
print(df)
   id country_name           location  total_deaths
0   1        India          New Delhi           354
1   2        India         Tamil Nadu            48
2   3        India          Karnataka             0
3   4        India      Andra Pradesh            32
4   5        India              Assam           679
5   6        India             Kerala           128
6   7        India             Punjab             0
7   8        India      Mumbai, Thane             1
8   9        India  Uttar Pradesh\r\n            20
9  10        India             Orissa            69

print(df.replace({r'\r\n': ''}, regex=True))
   id country_name       location  total_deaths
0   1        India      New Delhi           354
1   2        India     Tamil Nadu            48
2   3        India      Karnataka             0
3   4        India  Andra Pradesh            32
4   5        India          Assam           679
5   6        India         Kerala           128
6   7        India         Punjab             0
7   8        India  Mumbai, Thane             1
8   9        India  Uttar Pradesh            20
9  10        India         Orissa            69

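If you only need to replace in the location column:

# regex=True is passed explicitly; pandas 2.0 and later treat the pattern as a plain string by default
df['location'] = df['location'].str.replace(r'\r\n', '', regex=True)
print(df)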
   id country_name       location  total_deaths
0   1        India      New Delhi           354
1   2        India     Tamil Nadu            48
2   3        India      Karnataka             0
3   4        India  Andra Pradesh            32
4   5        India          Assam           679
5   6        India         Kerala           128
6   7        India         Punjab             0
7   8        India  Mumbai, Thane             1
8   9        India  Uttar Pradesh            20
9  10        India         Orissa            69
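
Since the original problem was an extra line in destination.csv, it can help to check the written output directly. The sketch below is an illustration only: the sample rows are made up, an in-memory buffer stands in for the file, and the pattern [\r\n]+ is a broader variant of the one used above (it also removes a lone \r or \n).

```python
import io

import pandas as pd

# Hypothetical sample data with a trailing CR LF in one cell.
df = pd.DataFrame({
    "id": [9, 10],
    "location": ["Uttar Pradesh\r\n", "Orissa"],
    "total_deaths": [20, 69],
})

# Remove any run of CR/LF characters from every string cell.
cleaned = df.replace({r"[\r\n]+": ""}, regex=True)

# Write to an in-memory buffer instead of destination.csv and count the lines.
buffer = io.StringIO()
cleaned.to_csv(buffer, index=False)
lines = buffer.getvalue().splitlines()
print(lines)
```

With the cell cleaned, the output has one header line plus one line per row.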

Answer 2

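With str.replace, you need to escape the backslash so that the pattern matches the literal text \r (a backslash followed by r) rather than being read as a carriage return. Pass regex=True explicitly, because pandas 2.0 and later treat the pattern as a plain string by default:

In [15]:
df['29'] = df['29'].str.replace(r'\\r', '', regex=True)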
df

Out[15]:
             id             29
0      location  Uttar Pradesh
1  country_name          India
2  total_deaths             20
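
To make the role of the regex argument of str.replace concrete, here is a small sketch with made-up values that end in real control characters. Passing regex explicitly keeps the behaviour the same regardless of the default in the installed pandas version.

```python
import pandas as pd

# Hypothetical sample data; the first value ends with a real CR LF.
s = pd.Series(["Uttar Pradesh\r\n", "Orissa"])

# Plain substring replacement: the pattern is the two control characters themselves.
plain = s.str.replace("\r\n", "", regex=False)

# Regular-expression replacement: remove any CR/LF run at the end of the value.
by_pattern = s.str.replace(r"[\r\n]+$", "", regex=True)

print(plain.tolist())
print(by_pattern.tolist())
```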

Answer 3

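The code below removes \t tabs, \n newlines and \r carriage returns, which is handy for condensing a value onto a single line. The answer is taken from https://gist.github.com/smram/d6ded3c9028272360eb65bcab564a18a. Here <INPLACE> is a placeholder for True or False.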

df.replace(to_replace=[r"\\t|\\n|\\r", "\t|\n|\r"], value=["",""], regex=True, inplace=<INPLACE>)
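
The two patterns in that call cover different cases: the first matches the literal text \t, \n or \r (a backslash followed by a letter), the second matches the actual control characters. A minimal sketch with made-up values, assigning the result instead of using inplace:

```python
import pandas as pd

# Hypothetical sample data: the first value holds real control characters,
# the second holds the literal two characters backslash + t.
df = pd.DataFrame({"note": ["a\tb\nc\r", "x\\ty"]})

flat = df.replace(
    to_replace=[r"\\t|\\n|\\r", "\t|\n|\r"],  # literal sequences, then control characters
    value=["", ""],
    regex=True,
)
print(flat["note"].tolist())
```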

Answer 4

For some reason the accepted answer did not work for me. In the end I found a solution as follows:

df["29"] = df["29"].replace(r'\r', '', regex=True)

The difference is that I use \r instead of \\r.
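
Which of the two patterns works depends on what the cell actually contains. The following sketch, with made-up column names and values, puts both cases side by side: as a regular expression, r'\\r' matches a backslash followed by the letter r, while r'\r' matches a single carriage-return character.

```python
import pandas as pd

df = pd.DataFrame({
    "literal": ["Uttar Pradesh\\r"],  # backslash + r, two ordinary characters
    "control": ["Uttar Pradesh\r"],   # one carriage-return character
})

only_literal = df.replace({r"\\r": ""}, regex=True)  # cleans the "literal" column only
only_control = df.replace({r"\r": ""}, regex=True)   # cleans the "control" column only

print(only_literal.iloc[0].tolist())
print(only_control.iloc[0].tolist())
```

Data read from a file with Windows line endings usually falls into the second case, which may explain why the accepted answer did not work here.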

Answer 5

Just assign the result of df.replace back to df, then print df.

df = df.replace({'\r': ''}, regex=True)
print(df)

5 answers

Answer 1
15 (accepted) 2016-05-11 11:17:16

Answer 2
3 2016-05-11 11:14:23

Answer 3
3 2019-10-27 23:34:37

Answer 4
0 2021-03-23 13:45:55

Answer 5
-1 2020-03-17 19:45:43
