Strip spaces/tabs/newlines - python

Question

I am trying to remove all spaces/tabs/newlines in Python 2.7 on Linux.

I wrote this, which should do the job:

myString="I want to Remove all white \t spaces, new lines \n and tabs \t"
myString = myString.strip(' \n\t')
print myString

Output:

I want to Remove all white   spaces, new lines 
 and tabs

It seems like a simple thing to do, yet I am missing something here. Should I be importing something?
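
For context on why the attempt above leaves the inner whitespace in place: str.strip only removes characters from the two ends of the string and stops at the first character that is not in the given set. A minimal sketch of that behaviour, written for Python 3 (the variable names are only illustrative):

```python
# str.strip(chars) trims only leading and trailing characters found in `chars`;
# whitespace in the middle of the string is left untouched.
my_string = "I want to Remove all white \t spaces, new lines \n and tabs \t"

stripped = my_string.strip(' \n\t')

# The trailing " \t" is removed, but the inner tab and newline survive.
print(repr(stripped))
```

So no import is missing; strip is simply the wrong tool for whitespace inside the string.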

Answer 1

Use str.split([sep[, maxsplit]]) with no sep or sep=None:

From the docs:

If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.

Demo:

>>> myString.split()
['I', 'want', 'to', 'Remove', 'all', 'white', 'spaces,', 'new', 'lines', 'and', 'tabs']

Use str.join on the returned list to get this output:

>>> ' '.join(myString.split())
'I want to Remove all white spaces, new lines and tabs'
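
If the goal is to drop the whitespace entirely rather than collapse it to single spaces, the same split() call can be joined with an empty string instead. A small sketch, assuming Python 3 and the sample string from the question:

```python
my_string = "I want to Remove all white \t spaces, new lines \n and tabs \t"

# split() with no arguments splits on any run of whitespace
# and discards empty strings at the start and end.
words = my_string.split()

collapsed = ' '.join(words)  # one space between words
removed = ''.join(words)     # no whitespace at all

print(collapsed)
print(removed)
```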

Answer 2

If you want to remove multiple whitespace items and replace them with single spaces, the easiest way is with a regexp like this:

>>> import re
>>> myString="I want to Remove all white \t spaces, new lines \n and tabs \t"
>>> re.sub(r'\s+',' ',myString)
'I want to Remove all white spaces, new lines and tabs '

You can then remove the trailing space with .strip() if you want to.
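
The two steps can be chained in one expression. This is only a sketch of that combination, assuming Python 3:

```python
import re

my_string = "I want to Remove all white \t spaces, new lines \n and tabs \t"

# Collapse each run of whitespace to a single space, then trim both ends.
cleaned = re.sub(r'\s+', ' ', my_string).strip()

print(repr(cleaned))
```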

Answer 3

Use the re library:

import re
myString = "I want to Remove all white \t spaces, new lines \n and tabs \t"
myString = re.sub(r"[\n\t\s]*", "", myString)
print myString

Output:

IwanttoRemoveallwhitespaces,newlinesandtabs

Answer 4

This removes only whitespace (tabs, newlines, spaces, and anything else matched by \s) and nothing else.

import re
myString = "I want to Remove all white \t spaces, new lines \n and tabs \t"
output   = re.sub(r"[\n\t\s]*", "", myString)

Output:

IwanttoRemoveallwhitespaces,newlinesandtabs

Good day!

Answer 5

import re

mystr = "I want to Remove all white \t spaces, new lines \n and tabs \t"
print re.sub(r"\W", "", mystr)

Output : IwanttoRemoveallwhitespacesnewlinesandtabs
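
Note that \W matches every character that is not a letter, digit, or underscore, so it also removes punctuation such as the comma in the sample string. The comparison below is a sketch (Python 3) that contrasts it with \s, which matches whitespace only:

```python
import re

mystr = "I want to Remove all white \t spaces, new lines \n and tabs \t"

# \W drops whitespace and punctuation (here, the comma).
no_nonword = re.sub(r"\W", "", mystr)

# \s drops whitespace only, so the comma is kept.
no_whitespace = re.sub(r"\s", "", mystr)

print(no_nonword)
print(no_whitespace)
```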

Answer 6

The above solutions suggesting the use of regex aren't ideal because this is such a small task and regex requires more resource overhead than the simplicity of the task justifies.

Here's what I do:

myString = myString.replace(' ', '').replace('\t', '').replace('\n', '')

or if you had a bunch of things to remove such that a single line solution would be gratuitously long:

removal_list = [' ', '\t', '\n']
for s in removal_list:
  myString = myString.replace(s, '')
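
The loop can also be wrapped in a small helper so it is reusable. The function name remove_chars and its default argument are hypothetical, chosen only for this sketch (Python 3):

```python
def remove_chars(text, removal_list=(' ', '\t', '\n')):
    """Return `text` with every string in `removal_list` removed."""
    for s in removal_list:
        text = text.replace(s, '')
    return text


print(remove_chars("I want \t spaces, new lines \n and tabs \t"))
```

Unlike the \s-based answers, this removes only the characters that are listed, so a carriage return, for example, stays in place unless '\r' is added to the list.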

Answer 7

Since none of the other answers covered a more involved case, I wanted to share this as it helped me out.

This is what I originally used:

import requests
import re

url = 'https://stackoverflow.com/questions/10711116/strip-spaces-tabs-newlines-python' # noqa
headers = {'user-agent': 'my-app/0.0.1'}
r = requests.get(url, headers=headers)
print("{}".format(r.content))

Undesired Result:

b'<!DOCTYPE html>\r\n\r\n\r\n    <html itemscope itemtype="http://schema.org/QAPage" class="html__responsive">\r\n\r\n    <head>\r\n\r\n        <title>string - Strip spaces/tabs/newlines - python - Stack Overflow</title>\r\n        <link

This is what I changed it to:

import requests
import re

url = 'https://stackoverflow.com/questions/10711116/strip-spaces-tabs-newlines-python' # noqa
headers = {'user-agent': 'my-app/0.0.1'}
r = requests.get(url, headers=headers)
regex = r'\s+'
print("CNT: {}".format(re.sub(regex, " ", r.content.decode('utf-8'))))

Desired Result:

<!DOCTYPE html> <html itemscope itemtype="http://schema.org/QAPage" class="html__responsive"> <head> <title>string - Strip spaces/tabs/newlines - python - Stack Overflow</title>

The precise regex that @MattH had mentioned was what worked for me when fitting it into my code. Thanks!

Note: This is Python 3.
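
The same decode-then-collapse step can be tried without a network call. In the sketch below, the bytes literal is a hypothetical stand-in for r.content, not the real response:

```python
import re

# Hypothetical stand-in for r.content (bytes), so no request is needed.
content = b'<!DOCTYPE html>\r\n\r\n\r\n    <html>\r\n\r\n    <head>\r\n'

# Decode bytes to str first; printing bytes directly shows the \r\n escapes.
text = content.decode('utf-8')

# Collapse every run of whitespace to one space and trim the ends.
flattened = re.sub(r'\s+', ' ', text).strip()

print(flattened)
```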

Answer 8

How about a one-liner using a list comprehension within join?

>>> foobar = "aaa bbb\t\t\tccc\nddd"
>>> print(foobar)
aaa bbb                 ccc
ddd

>>> print(''.join([c for c in foobar if c not in [' ', '\t', '\n']]))
aaabbbcccddd
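
A small variation on the same idea, offered only as a sketch: a set for the membership test and a generator expression instead of a list, so no intermediate list is built. The name unwanted is illustrative:

```python
foobar = "aaa bbb\t\t\tccc\nddd"

# Characters to drop; a set gives constant-time membership checks.
unwanted = {' ', '\t', '\n'}

result = ''.join(c for c in foobar if c not in unwanted)

print(result)
```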

8 answers

Answer 1
144 2012-05-22 22:42:54

Answer 2
64 2012-05-22 22:40:43

Answer 3
18 2017-12-30 16:36:26

Answer 4
13 2017-12-12 09:49:51

Answer 5
12 2012-12-31 11:32:23

Answer 6
7 2019-05-01 20:09:55

Answer 7
2 2019-05-15 06:54:50

Answer 8
1 2020-09-30 14:11:04
