How can I remove punctuation and make all the reviews lowercase for multiple strings in Python?

I have 3 strings. How can I remove the punctuation, make all the reviews lowercase, and then print out all 3 reviews?

Review1 = "My great auntie has lived at Everton Park for decades, and once upon a time I even lived here too, and I remember the days before when there was nothing remotely hipster about this housing block.  It is really cool to see cute new cafes and coffee shops moving in, and I've been to Nylon every time I'm back in town."

Review2 = 'Solid coffee in the Outram Park neighborhood. Location is hidden in a HDB block so you definitely need to search for it. Minus one star for limited seating options'

Review3 = 'Deserve it, truly deserves this much reviews. I will describe coffee here as honest, sincere, decent, strong, smart.'
Review1 = "My great auntie has lived at Everton Park for decades, and once upon a time I even lived here too, and I remember the days before when there was nothing remotely hipster about this housing block.  It is really cool to see cute new cafes and coffee shops moving in, and I've been to Nylon every time I'm back in town."

import string
Review1_Fixed = Review1.lower().translate(str.maketrans('', '', string.punctuation))
print(Review1_Fixed)

Output:

my great auntie has lived at everton park for decades and once upon a time i even lived here too and i remember the days before when there was nothing remotely hipster about this housing block  it is really cool to see cute new cafes and coffee shops moving in and ive been to nylon every time im back in town

For more information on what this command does, or for other ways of doing this, see this post.
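The snippet above cleans only Review1, while the question asks about all three reviews. A minimal sketch of one way to extend it, assuming the reviews are collected in a list; the names PUNCT_TABLE, clean, reviews, and cleaned are made up for this illustration and are not part of the original answer:

```python
import string

# Translation table that deletes every character listed in string.punctuation.
PUNCT_TABLE = str.maketrans('', '', string.punctuation)

def clean(text):
    """Lowercase the text and drop ASCII punctuation (hypothetical helper)."""
    return text.lower().translate(PUNCT_TABLE)

reviews = [
    "My great auntie has lived at Everton Park for decades, and once upon a time I even lived here too, and I remember the days before when there was nothing remotely hipster about this housing block.  It is really cool to see cute new cafes and coffee shops moving in, and I've been to Nylon every time I'm back in town.",
    "Solid coffee in the Outram Park neighborhood. Location is hidden in a HDB block so you definitely need to search for it. Minus one star for limited seating options",
    "Deserve it, truly deserves this much reviews. I will describe coffee here as honest, sincere, decent, strong, smart.",
]

# Clean every review, then print them one per line.
cleaned = [clean(review) for review in reviews]
for review in cleaned:
    print(review)
```

Building the table once and reusing it avoids recomputing it for every string. Note that string.punctuation only covers ASCII punctuation, so characters such as curly quotes would be left in place.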

Using the re module:

Review1 = '''My great auntie has lived at Everton Park for decades, and once upon a time I even lived here too, and I remember the days before when there was nothing remotely hipster about this housing block. It is really cool to see cute new cafes and coffee shops moving in, and I've been to Nylon every time I'm back in town.'''
Review2 = '''Solid coffee in the Outram Park neighborhood. Location is hidden in a HDB block so you definitely need to search for it. Minus one star for limited seating options'''
Review3 = '''Deserve it, truly deserves this much reviews. I will describe coffee here as honest, sincere, decent, strong, smart.'''

import re

def strip_punctuation_make_lowercase(*strings):
    return map(lambda s: re.sub(r'[^\s\w]+', '', s).lower(), strings)

Review1, Review2, Review3 = strip_punctuation_make_lowercase(Review1, Review2, Review3)

print(Review1)
print()

print(Review2)
print()

print(Review3)
print()

Prints:

my great auntie has lived at everton park for decades and once upon a time i even lived here too and i remember the days before when there was nothing remotely hipster about this housing block it is really cool to see cute new cafes and coffee shops moving in and ive been to nylon every time im back in town

solid coffee in the outram park neighborhood location is hidden in a hdb block so you definitely need to search for it minus one star for limited seating options

deserve it truly deserves this much reviews i will describe coffee here as honest sincere decent strong smart
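One difference between the regex above and the string.punctuation approach is worth knowing: \w matches the underscore, so the pattern [^\s\w]+ leaves underscores in place, whereas string.punctuation includes "_" and removes it. A small sketch of the difference, using a made-up sample string:

```python
import re
import string

sample = "Snake_Case, rocks!"  # hypothetical input, not from the question

# Regex approach: removes anything that is neither whitespace nor a word character.
regex_result = re.sub(r'[^\s\w]+', '', sample).lower()

# Translation-table approach: removes every character in string.punctuation.
table_result = sample.lower().translate(str.maketrans('', '', string.punctuation))

print(regex_result)  # snake_case rocks
print(table_result)  # snakecase rocks
```

For the three reviews in the question the two approaches give the same words, since none of them contains an underscore.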
In [22]: import string

In [23]: whitelist = set(string.ascii_letters)

In [24]: rev1 = "My great auntie has lived at Everton Park for decades, and once upon a time I even lived here too, and I remember the days before when there was nothing remotely hipster about this housing block. It is really cool to see cute new cafes and coffee shops moving in, and I've been to Nylon every time I'm back in town."

In [25]: ''.join([char for char in rev1 if char in whitelist])
Out[25]: 'MygreatauntiehaslivedatEvertonParkfordecadesandonceuponatimeIevenlivedheretooandIrememberthedaysbeforewhentherewasnothingremotelyhipsteraboutthishousingblockItisreallycooltoseecutenewcafesandcoffeeshopsmovinginandIvebeentoNyloneverytimeImbackintown'

In [26]: whitelist = set(string.ascii_letters + ' ')

In [27]: ''.join([char for char in rev1 if char in whitelist])
Out[27]: 'My great auntie has lived at Everton Park for decades and once upon a time I even lived here too and I remember the days before when there was nothing remotely hipster about this housing block It is really cool to see cute new cafes and coffee shops moving in and Ive been to Nylon every time Im back in town'
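The whitelist session above keeps the original casing, and because only ASCII letters and the space are allowed, digits are dropped along with the punctuation. A rough sketch that also lowercases and makes the allowed set a parameter; keep_only, letters, and letters_and_digits are names invented for this example:

```python
import string

def keep_only(text, allowed):
    """Lowercase text, then keep only the characters found in `allowed`."""
    return ''.join(ch for ch in text.lower() if ch in allowed)

letters = set(string.ascii_lowercase + ' ')
letters_and_digits = letters | set(string.digits)

sample = "Minus 1 star!"  # hypothetical input
print(keep_only(sample, letters))             # minus  star (digit dropped, both spaces kept)
print(keep_only(sample, letters_and_digits))  # minus 1 star
```

A whitelist is stricter than removing punctuation: anything not listed disappears, including accented letters, so it suits text that is known to be plain ASCII.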

The __contains__ method defines how instances of a class behave when they appear on the right side of the in and not in operators.

from string import ascii_letters

Review1 = "My great auntie has lived at Everton Park for decades, and once upon a time I even lived here too, and I remember the days before when there was nothing remotely hipster about this housing block.  It is really cool to see cute new cafes and coffee shops moving in, and I've been to Nylon every time I'm back in town."

key = set(ascii_letters + ' ') # key = set('abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ')
Review1_ = ''.join(filter(key.__contains__, Review1)).lower()
print(Review1_)

Output:

my great auntie has lived at everton park for decades and once upon a time i even lived here too and i remember the days before when there was nothing remotely hipster about this housing block  it is really cool to see cute new cafes and coffee shops moving in and ive been to nylon every time im back in town
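Passing key.__contains__ to filter works because the expression c in key is evaluated by calling key.__contains__(c). A short sketch showing that the bound method, a lambda, and a generator expression all select the same characters; the sample text is a fragment of Review1 and the variable names are illustrative only:

```python
from string import ascii_letters

key = set(ascii_letters + ' ')
text = "I've been to Nylon."

# Three equivalent ways of keeping only the characters that are in `key`.
via_dunder = ''.join(filter(key.__contains__, text))
via_lambda = ''.join(filter(lambda c: c in key, text))
via_genexpr = ''.join(c for c in text if c in key)

print(via_dunder)  # Ive been to Nylon
```

Which form to use is mostly a matter of taste; the generator expression is probably the most familiar to other readers.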

To remove the punctuation, use s.translate(None, string.punctuation) (this form works in Python 2 only; in Python 3 use s.translate(str.maketrans('', '', string.punctuation))), or create your own function:

def Punctuation(string):
    punctuations = '''!()-[]{};:'"\\,<>./?@#$%^&*_~'''

    # Remove each punctuation character found in the string
    for x in string.lower():
        if x in punctuations:
            string = string.replace(x, "")

    # Print string without punctuation
    print(string)

For lowercase, use string.lower().
