
How to remove \t, \n, and \r from a list in Python when web scraping?

How can I remove the \n, \r, and \t characters from ['Company Name', 'Headquarters Location', 'Company Type\n\r\n\t\t\t\t\t\t\t\t?', 'Fleet Size']?

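a = []
for i in head:  # head is assumed to hold the scraped header elements
    c = i.text.strip()  # strip() trims only the ends, not embedded whitespace
    a.append(c)
print(a)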

**Output**
['Company Name',
 'Headquarters Location',
 'Company Type\n\r\n\t\t\t\t\t\t\t\t?',
 'Fleet Size'] 

Use re.sub:

import re

lst = ['Company Name', 'Headquarters Location', 'Company Type\n\r\n\t\t\t\t\t\t\t\t?', 'Fleet Size']

output = [re.sub(r'[\r\n\t]', '', x) for x in lst]
print(output) # ['Company Name', 'Headquarters Location', 'Company Type?', 'Fleet Size']
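
As a possible alternative to the regex, the same characters can be deleted with str.translate. This is only a sketch: the names DELETE_TABLE and strip_control_whitespace are made up for illustration and do not appear in the original answers.

```python
# Translation table that maps \r, \n and \t to "delete".
# DELETE_TABLE and strip_control_whitespace are hypothetical names.
DELETE_TABLE = str.maketrans("", "", "\r\n\t")


def strip_control_whitespace(items):
    # Build a new list; the input list is left unchanged.
    return [item.translate(DELETE_TABLE) for item in items]


lst = ['Company Name', 'Headquarters Location',
       'Company Type\n\r\n\t\t\t\t\t\t\t\t?', 'Fleet Size']

cleaned = strip_control_whitespace(lst)
print(cleaned)  # ['Company Name', 'Headquarters Location', 'Company Type?', 'Fleet Size']
```

For a fixed set of single characters like this one, str.translate avoids importing re; the regex version is easier to extend to patterns.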

Please don't judge my variable names.

a=['Company Name',
 'Headquarters Location',
 'Company Type\n\r\n\t\t\t\t\t\t\t\t?',
 'Fleet Size'] 

b = []
unwanted = ["\n","\t","\r"]

for i in a:
    to_add = ""
    for char in i:
        if char not in unwanted:
            to_add += char
    b.append(to_add)

print(b)
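
The nested loop above can probably be condensed with a generator expression and str.join, which also avoids repeated string concatenation. A minimal sketch, assuming the same three unwanted characters; the helper name drop_unwanted is hypothetical.

```python
# Set membership tests are cheap; the characters are the same as in the loop version.
UNWANTED = {"\n", "\t", "\r"}


def drop_unwanted(text, unwanted=UNWANTED):
    # Keep every character that is not in the unwanted set.
    return "".join(ch for ch in text if ch not in unwanted)


a = ['Company Name',
     'Headquarters Location',
     'Company Type\n\r\n\t\t\t\t\t\t\t\t?',
     'Fleet Size']

b = [drop_unwanted(s) for s in a]
print(b)  # ['Company Name', 'Headquarters Location', 'Company Type?', 'Fleet Size']
```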

I think the regular expression module would work well here.

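import re

a = []
for i in head:
    c = i.text.strip()
    # ==== regex substitution ====
    # Remove every carriage return, newline, and tab inside the string.
    # re.MULTILINE only affects ^ and $, so it is not needed here.
    c = re.sub(r'[\r\n\t]', '', c)
    a.append(c)
print(a)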

**Output**
['Company Name',
 'Headquarters Location',
 'Company Type?',
 'Fleet Size'] 
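
If the stray whitespace sits between words rather than before punctuation, deleting it outright would glue the words together. In that case one option is to collapse each whitespace run into a single space instead. The sketch below assumes that behaviour is wanted; normalize_whitespace is a hypothetical helper, not part of the answers above.

```python
def normalize_whitespace(text):
    # str.split() with no arguments splits on any run of whitespace
    # (spaces, \t, \n, \r, ...) and drops leading/trailing whitespace.
    return " ".join(text.split())


headers = ['Company Name',
           'Headquarters Location',
           'Company Type\n\r\n\t\t\t\t\t\t\t\t?',
           'Fleet Size']

normalized = [normalize_whitespace(h) for h in headers]
print(normalized)  # ['Company Name', 'Headquarters Location', 'Company Type ?', 'Fleet Size']
```

Note that this leaves a space before the question mark, so it differs from the output shown above; which variant fits depends on the scraped page.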


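Notice: The technical posts on this site are licensed under CC BY-SA 4.0. If you republish them, please credit this site's URL or the original post's address. For any questions, contact: yoyou2525@163.com.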

 