Strip spaces/tabs/newlines - python

Question

I am trying to remove all spaces/tabs/newlines in Python 2.7 on Linux.

I wrote this, which I expected to do the job:

myString="I want to Remove all white \t spaces, new lines \n and tabs \t"
myString = myString.strip(' \n\t')
print myString

Output:

I want to Remove all white   spaces, new lines 
 and tabs

This seems like a simple thing to do, yet I am missing something here. Should I be importing something?

Answer 1

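Use str.split([sep[, maxsplit]]) with no sep or with sep=None:

From the docs:

If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.

Demo: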

>>> myString.split()
['I', 'want', 'to', 'Remove', 'all', 'white', 'spaces,', 'new', 'lines', 'and', 'tabs']

Use str.join on the returned list to get this output:

>>> ' '.join(myString.split())
'I want to Remove all white spaces, new lines and tabs'
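
To make the difference from the question's attempt concrete, here is a minimal sketch. It is written in Python 3 syntax (print as a function), and the names stripped, collapsed, and removed are illustrative, not from the original post. str.strip only trims the ends of the string, whereas split() followed by join deals with whitespace anywhere in it.

```python
my_string = "I want to Remove all white \t spaces, new lines \n and tabs \t"

# strip() only trims the listed characters from both ends of the string;
# whitespace in the middle is left untouched.
stripped = my_string.strip(' \n\t')

# split() with no arguments splits on runs of any whitespace;
# joining with ' ' collapses each run into a single space.
collapsed = ' '.join(my_string.split())

# Joining with '' instead removes the whitespace altogether.
removed = ''.join(my_string.split())

print(repr(stripped))
print(repr(collapsed))
print(repr(removed))
```

Use ' '.join(...) if you want to keep word boundaries, and ''.join(...) if you really want every whitespace character gone.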

Answer 2

If you want to remove runs of whitespace and replace each of them with a single space, the easiest way is a regular expression like this:

>>> import re
>>> myString="I want to Remove all white \t spaces, new lines \n and tabs \t"
>>> re.sub(r'\s+', ' ', myString)
'I want to Remove all white spaces, new lines and tabs '

You can then remove the trailing space with .strip() if you want to.
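
Putting the two steps together, a minimal sketch (Python 3 syntax; the variable name normalized is only illustrative) might look like this:

```python
import re

my_string = "I want to Remove all white \t spaces, new lines \n and tabs \t"

# Collapse every run of whitespace into one space, then trim both ends.
normalized = re.sub(r'\s+', ' ', my_string).strip()

print(repr(normalized))
```

The raw string prefix (r'...') keeps the backslash in \s from being treated as a string escape.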

Answer 3

Use the re library:

import re
myString = "I want to Remove all white \t spaces, new lines \n and tabs \t"
myString = re.sub(r"\s+", "", myString)  # \s already covers \n and \t
print myString

Output:

IwanttoRemoveallwhitespaces,newlinesandtabs

Answer 4

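This removes tabs, newlines, and spaces (along with any other whitespace that \s matches, such as carriage returns) and nothing else.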

import re
myString = "I want to Remove all white \t spaces, new lines \n and tabs \t"
output = re.sub(r"\s+", "", myString)  # \s already covers \n and \t

Output:

IwanttoRemoveallwhitespaces,newlinesandtabs

Good day!

Answer 5

import re

mystr = "I want to Remove all white \t spaces, new lines \n and tabs \t"
print re.sub(r"\W", "", mystr)

Output : IwanttoRemoveallwhitespacesnewlinesandtabs
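
Note that \W matches every non-word character, so it also drops punctuation such as the comma. A small sketch of the difference from \s (Python 3 syntax; the names no_non_word and no_whitespace are illustrative):

```python
import re

my_str = "I want to Remove all white \t spaces, new lines \n and tabs \t"

# \W matches anything that is not a letter, digit, or underscore,
# so the comma disappears together with the whitespace.
no_non_word = re.sub(r"\W", "", my_str)

# \s matches whitespace only, so the comma survives.
no_whitespace = re.sub(r"\s", "", my_str)

print(no_non_word)
print(no_whitespace)
```

If punctuation has to be preserved, \s is probably the safer choice.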

Answer 6

The solutions above that suggest regex are not ideal: this is a very small task, and regex carries more overhead than its simplicity justifies.

Here is what I do:

myString = myString.replace(' ', '').replace('\t', '').replace('\n', '')

Or, if you have so many things to remove that a single-line solution would get gratuitously long:

removal_list = [' ', '\t', '\n']
for s in removal_list:
  myString = myString.replace(s, '')
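
As a rough sketch, the loop can be wrapped in a small helper so it is reusable. The function name remove_all and its default argument are my own assumptions, not part of the answer (Python 3 syntax):

```python
def remove_all(text, removal_list=(' ', '\t', '\n')):
    """Return text with every occurrence of each listed substring removed."""
    for s in removal_list:
        # str.replace returns a new string; strings are immutable.
        text = text.replace(s, '')
    return text


print(remove_all("I want to Remove all white \t spaces, new lines \n and tabs \t"))
```

Only the listed characters are removed, so something like '\r' stays in place unless you add it to the list.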

Answer 7

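Since none of the other answers covered anything more intricate, I wanted to share this because it helped me out.

This is what I originally used: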

import requests
import re

url = 'https://stackoverflow.com/questions/10711116/strip-spaces-tabs-newlines-python' # noqa
headers = {'user-agent': 'my-app/0.0.1'}
r = requests.get(url, headers=headers)
print("{}".format(r.content))

Undesired result:

b'<!DOCTYPE html>\r\n\r\n\r\n    <html itemscope itemtype="http://schema.org/QAPage" class="html__responsive">\r\n\r\n    <head>\r\n\r\n        <title>string - Strip spaces/tabs/newlines - python - Stack Overflow</title>\r\n        <link

This is what I changed it to:

import requests
import re

url = 'https://stackoverflow.com/questions/10711116/strip-spaces-tabs-newlines-python' # noqa
headers = {'user-agent': 'my-app/0.0.1'}
r = requests.get(url, headers=headers)
regex = r'\s+'
print("CNT: {}".format(re.sub(regex, " ", r.content.decode('utf-8'))))

Desired result:

<!DOCTYPE html> <html itemscope itemtype="http://schema.org/QAPage" class="html__responsive"> <head> <title>string - Strip spaces/tabs/newlines - python - Stack Overflow</title>

The exact regex that @MattH mentioned was what worked for me when fitting it into my code. Thanks!

Note: this is python3
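
The same idea can be tried without a network request. In this sketch the bytes literal content is a shortened, made-up stand-in for r.content, and the variable names are illustrative. It also shows why the first attempt printed b'...': formatting a bytes object in Python 3 gives its repr, escapes included.

```python
import re

# Stand-in for r.content: a response body is a bytes object.
content = b'<!DOCTYPE html>\r\n\r\n\r\n    <html>\r\n\r\n    <head>\r\n'

# Formatting bytes directly yields the repr, with the b'' wrapper and
# literal \r\n escapes, which is the undesired result above.
raw_repr = "{}".format(content)

# Decode to str first, then collapse each run of whitespace to one space.
text = content.decode('utf-8')
collapsed = re.sub(r'\s+', ' ', text)

print(raw_repr)
print(collapsed)
```

Decoding as UTF-8 is an assumption here; a real response may declare a different encoding.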

Answer 8

How about a one-liner using a list comprehension inside join?

>>> foobar = "aaa bbb\t\t\tccc\nddd"
>>> print(foobar)
aaa bbb                 ccc
ddd

>>> print(''.join([c for c in foobar if c not in [' ', '\t', '\n']]))
aaabbbcccddd
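
A possible variation on the one-liner, sketched under my own naming (unwanted, result, and result_any_ws are not from the answer): a set makes the membership test cheaper, and str.isspace covers every whitespace character rather than a fixed list.

```python
foobar = "aaa bbb\t\t\tccc\nddd"

# Same filter as above, with a set for the membership test and a
# generator expression instead of a temporary list.
unwanted = {' ', '\t', '\n'}
result = ''.join(c for c in foobar if c not in unwanted)

# Variation: drop anything str.isspace() considers whitespace.
result_any_ws = ''.join(c for c in foobar if not c.isspace())

print(result)
print(result_any_ws)
```

The isspace version also removes characters such as '\r', so pick whichever matches what you actually want to strip.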
