
Python script to strip white spaces

I'm fairly new to Python and looking for help. I have a string that contains XML content, and I need to strip the white space between the different tags.

<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>

Afterwards it should look like this:

<SIMPLE_RETURN><RESPONSE><DATETIME>2018-05-09T12:47:24Z</DATETIME><CODE>2014</CODE><TEXT>Too many concurrent login(s)</TEXT></RESPONSE></SIMPLE_RETURN>

I'd appreciate it if anyone can help!

If you don't want to use regex, you could do the following. It seems easier to me for someone new to understand how it works, but I don't know whether it is the best way to do it. Note that it removes every space in the string, including the ones inside the TEXT element.

my_str = '<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>'
new_str = ''
for character in my_str:
    if character != ' ':
        new_str = new_str + character

And then, if you do:

print(new_str)

the output is:

<SIMPLE_RETURN><RESPONSE><DATETIME>2018-05-09T12:47:24Z</DATETIME><CODE>2014</CODE><TEXT>Toomanyconcurrentlogin(s)</TEXT></RESPONSE></SIMPLE_RETURN>

A second way I can come up with is this:

new_str = ''.join(my_str.split())

It says 'split my_str at white space and then join the resulting pieces with no character in between'. The output of print is the same.
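To make the behaviour of split() with no arguments concrete, here is a small sketch. The sample string is made up for this illustration and is not taken from the question. It suggests that any run of whitespace (spaces, tabs, newlines) acts as a separator, which is also why the spaces inside the element text disappear once the pieces are joined.

```python
# Hypothetical sample string, chosen to mix spaces, a tab and a newline.
sample = "<A>  <B>Too many\tlogins</B>\n</A>"

# With no argument, split() cuts the string at every run of whitespace.
pieces = sample.split()
print(pieces)   # ['<A>', '<B>Too', 'many', 'logins</B>', '</A>']

# Joining with an empty string glues the pieces back with nothing in between.
joined = ''.join(pieces)
print(joined)   # <A><B>Toomanylogins</B></A>
```

So this approach is compact, but it does not distinguish between whitespace between tags and whitespace inside the text.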

Hope this helps, but again, I don't know whether these are the best ways to do it.

Another way to do it:

k = "<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>"
k.replace(" ","")
'<SIMPLE_RETURN><RESPONSE><DATETIME>2018-05-09T12:47:24Z</DATETIME><CODE>2014</CODE><TEXT>Toomanyconcurrentlogin(s)</TEXT></RESPONSE></SIMPLE_RETURN>'

Use regex.

Example:

import re
s = """<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>"""
# Raw strings keep the backslashes intact; \g<1> and \g<2> put back the captured '>' and '<'.
print(re.sub(r"(>)\s+(<)", r"\g<1>\g<2>", s))

You can use the re.sub function:

import re

string = "<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>"

# Replace one or more spaces between '>' and '<' with nothing.
result = re.sub(r'> +<', '><', string)
print(result)

Here you go:

import re

# Named xml_str rather than str so the built-in str type is not shadowed.
xml_str = "<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>"

# First variant: drop any whitespace that follows a '>'.
print(re.sub(r">\s+", ">", xml_str))

# Stricter variant: only drop spaces that sit between a '>' and the next '<'.
print(re.sub(r"> +<", "><", xml_str))

I think it's fairly simple. You just need a regex that matches the whitespace between the tags:

import re

string = "<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>"
# Remove any run of whitespace that sits between '>' and '<'.
string = re.sub(r">\s+<", "><", string)
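As a minimal sketch of that idea, the helper below wraps the substitution in a function and compares the result with the expected output from the question. The name strip_between_tags is hypothetical and does not come from any of the answers. It assumes simple XML like the sample; for input with CDATA sections or comments, an XML parser would probably be a safer choice.

```python
import re

def strip_between_tags(xml_text):
    """Remove whitespace that sits directly between a '>' and the next '<'."""
    return re.sub(r">\s+<", "><", xml_text)

# Input and expected output copied from the question.
raw = "<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>"
expected = "<SIMPLE_RETURN><RESPONSE><DATETIME>2018-05-09T12:47:24Z</DATETIME><CODE>2014</CODE><TEXT>Too many concurrent login(s)</TEXT></RESPONSE></SIMPLE_RETURN>"

cleaned = strip_between_tags(raw)
print(cleaned == expected)   # True
```

Unlike the approaches that remove every space, this keeps the spaces inside "Too many concurrent login(s)" because they are not enclosed by '>' on the left and '<' on the right.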
