How do I remove punctuation from items in a list and save it as separate items in the list?

Question

I am trying to compress items from one list into another list, and I need to save punctuation as separate items in the list. If I don't, "you" and "you;" are saved as separate items in the list.

For example, the original list is:

['Ask', 'not', 'what', 'your', 'country', 'can', 'do', 'for', 'you;', 'ask', 'what', 'you', 'can', 'do', 'for', 'your', 'country!', 'This', 'is', 'a', 'quote', 'from', 'JFK', 'who', 'is', 'a', 'former', 'American', 'President.']

and the compressed list is currently:

['Ask', 'not', 'what', 'your', 'country', 'can', 'do', 'for', 'you;', 'ask', 'you', 'country!', 'This', 'is', 'a', 'quote', 'from', 'JFK', 'who', 'former', 'American', 'President.']

but I want it to have punctuation as separate items in the list.

My intended output is:

['Ask', 'not', 'what', 'your', 'country', 'can', 'do', 'for', 'you', ';', 'ask', '!', 'This', 'is', 'a', 'quote', 'from', 'JFK', 'who', 'former', 'American', 'President', '.']

Answer 1

You can implement this with a regex.

import re
a = ['Ask', 'not', 'what', 'your', 'country', 'can', 'do', 'for', 'you;', 'ask', 'what', 'you', 'can', 'do', 'for', 'your', 'country!', 'This', 'is', 'a', 'quote', 'from', 'JFK', 'who', 'is', 'a', 'former', 'American', 'President.']
result = re.findall(r"[\w']+|[.,!?;]",' '.join(a))

Output

['Ask', 'not', 'what', 'your', 'country', 'can', 'do', 'for', 'you', ';', 'ask', 'what', 'you', 'can', 'do', 'for', 'your', 'country', '!', 'This', 'is', 'a', 'quote', 'from', 'JFK', 'who', 'is', 'a', 'former', 'American', 'President', '.']

Here is a demo to understand more about regex.
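The token list above still contains repeats, so one more step is needed to reach the compressed list from the question. A minimal sketch, assuming first-seen order should be kept and that matching stays case-sensitive (so 'Ask' and 'ask' remain distinct, as in the intended output). The helper name `compress` is made up for illustration:

```python
import re

def compress(items):
    """Tokenise into words and punctuation, then drop repeats (hypothetical helper)."""
    # [\w']+ matches a run of word characters or apostrophes;
    # [.,!?;] matches a single punctuation mark as its own token.
    tokens = re.findall(r"[\w']+|[.,!?;]", ' '.join(items))
    # dict keys keep insertion order (Python 3.7+), so the first occurrence wins.
    return list(dict.fromkeys(tokens))

a = ['Ask', 'not', 'what', 'your', 'country', 'can', 'do', 'for', 'you;',
     'ask', 'what', 'you', 'can', 'do', 'for', 'your', 'country!', 'This',
     'is', 'a', 'quote', 'from', 'JFK', 'who', 'is', 'a', 'former',
     'American', 'President.']
print(compress(a))
```

Note that the pattern only recognises the five marks listed in the character class; anything else, such as ':' or '-', is silently dropped rather than kept as a separate item.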

Answer 2

This is the code to separate the non-alphabetic characters and also remove duplicates. Hope it helps.

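def separate(mylist):
    """Split each item into its letters and its last non-alphabetic
    character, then drop duplicates while keeping first-seen order."""
    newlist = []
    for e in mylist:
        test = ''  # alphabetic characters of the current item
        a = ''     # last non-alphabetic character seen in the current item
        for c in e:
            if not c.isalpha():
                a = c
            else:
                test = test + c
        newlist.append(test)
        if a != '':
            newlist.append(a)
    # Remove duplicates, keeping the first occurrence of each item.
    noduplicates = []
    for i in newlist:
        if i not in noduplicates:
            noduplicates.append(i)
    return noduplicates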

I'm sure someone else can do better, because this is a bit messy, but at least it works.
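One possible tidier variant, offered as a sketch rather than a drop-in replacement: it assumes punctuation only needs to be split off the end of each item, and it uses `string.punctuation` from the standard library as the set of marks. The function names `split_trailing_punctuation` and `unique` are made up for illustration. Unlike the loop above, it keeps every trailing mark (not just the last one) and leaves digits inside the word.

```python
import string

def split_trailing_punctuation(items):
    """Yield each word, then each of its trailing punctuation marks."""
    for item in items:
        word = item.rstrip(string.punctuation)
        if word:
            yield word
        # Whatever rstrip removed is the trailing punctuation;
        # iterating the string emits one mark at a time.
        yield from item[len(word):]

def unique(seq):
    """Drop repeats while keeping first-seen order."""
    seen = set()
    result = []
    for x in seq:
        if x not in seen:
            seen.add(x)
            result.append(x)
    return result

words = ['for', 'you;', 'ask', 'what', 'you', 'country!', 'Wait...']
print(unique(split_trailing_punctuation(words)))
# → ['for', 'you', ';', 'ask', 'what', 'country', '!', 'Wait', '.']
```

Using a set for the membership check avoids rescanning the result list for every item, which matters only for long inputs.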

2 answers

Answer 1
Score 2, accepted, 2016-06-15 11:10:10

Answer 2
Score 0, 2016-06-15 11:11:48
