
Remove whitespace before and after a special character and join them in Python

My_string =  My Awesome Company billing @ example . com Contractor Invoice # 000015 Acme Projects - Taxable Product Contractor Invoice Summary Account Information Don Test don @ example . com Contractor Invoice Date : 10 / 26 / 2016 Amount Due $ 21 .


Desired_string = My Awesome Company billing@example.com Contractor Invoice#000015 Acme Projects-Taxable Product Contractor Invoice Summary Account Information Don Test don@example.com Contractor Invoice Date:10/26/2016 Amount Due$21.

In simple words, I need to remove the spaces before and after each special character. Also, can you share a good resource for learning regex?

with open('sentence.txt') as txtfile:
    string = str(txtfile.read())
list_of_str = string.split()
new_list = []
for d in range(len(list_of_str)):
    if not (list_of_str[d].isalpha() or list_of_str[d].isalnum()):
        print(list_of_str[d-1], list_of_str[d:])
        new_list.append(str(list_of_str[d-1]) + str(list_of_str[d]) + str(list_of_str[d+1]))
    else:
        new_list.append(list_of_str[d])
print(new_list)

Output: ['OnlineMyAwesome', 'Awesome', 'Company', 'billing', 'billing@example', 'example', 'example.com', 'com', 'Contractor', 'Invoice', 'Invoice#000015', '000015', 'Acme', 'Projects', 'Projects-Taxable', 'Taxable', 'Product', 'Contractor', 'Invoice', 'Summary', 'Account', 'Information', 'Don', 'Test', 'don', 'don@example', 'example', 'example.com', 'com', 'Contractor', 'Invoice', 'Date', 'Date:10', '10', '10/26', '26', '26/2016', '2016', 'Amount', 'Due', 'Due$21', '21']

At first I tried the code above, but I think regex could help.

Thank you.

Yes, you can solve this easily with a regex instead of your current code.

You can use this regex (note that it begins with a literal space):

 ([@.#$\/:-]) ?

The bracketed set lists the special characters to handle. You can add more characters to the set as needed.

This regex matches a space, followed by one character from the character set, followed by an optional space, and replaces the whole match with the character captured in group 1.
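To see what the pattern actually matches, here is a small sketch that lists each match next to its captured group. The sample string and the variable names are illustrative only and are not taken from the question.

```python
import re

# Same pattern as above: a space, one special character (group 1), an optional space.
pattern = re.compile(r' ([@.#$\/:-]) ?')

# Illustrative sample, shortened from the kind of text in the question.
sample = 'don @ example . com Due $ 21 .'

# Each match covers the surrounding spaces; group 1 holds only the special character.
matches = [(m.group(0), m.group(1)) for m in pattern.finditer(sample)]
for whole, captured in matches:
    print(repr(whole), '->', repr(captured))

# Replacing each match with group 1 drops the surrounding spaces.
result = pattern.sub(r'\1', sample)
print(result)
```

Because the leading space is required, a character that only has a space after it (as in 'a@ b') is left unchanged.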


Sample Python code:

import re
s = 'My Awesome Company billing @ example . com Contractor Invoice # 000015 Acme Projects - Taxable Product Contractor Invoice Summary Account Information Don Test don @ example . com Contractor Invoice Date : 10 / 26 / 2016 Amount Due $ 21 .'
# Raw string so the backslash reaches the regex engine without an invalid-escape warning
s = re.sub(r' ([@.#$\/:-]) ?', r'\1', s)
print(s)

which gives the following output:

My Awesome Company billing@example.com Contractor Invoice#000015 Acme Projects-Taxable Product Contractor Invoice Summary Account Information Don Test don@example.com Contractor Invoice Date:10/26/2016 Amount Due$21.

Let me know if this works for you.
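If the set of special characters needs to change often, one possible variation is to build the character set from a string. This is only a sketch: the function name join_around and its specials parameter are hypothetical, and it uses \s* on both sides, so it behaves differently from the regex above (it also joins when the space before the character is missing or repeated).

```python
import re

def join_around(text, specials='@.#$/:-'):
    """Remove whitespace on either side of any character in `specials`."""
    # re.escape makes every character safe inside the [...] set,
    # including '-' and ']'.
    char_class = '[' + re.escape(specials) + ']'
    # \s* on both sides also handles a missing space or repeated spaces.
    return re.sub(r'\s*(' + char_class + r')\s*', r'\1', text)

print(join_around('billing @ example . com Invoice # 000015'))
```

Keep in mind that this variant is more aggressive: ordinary punctuation such as the period in 'end. Next' is also joined to the following word, so it only suits text where every listed character should be glued to its neighbours.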
