Creating a function that returns all capitalized words in a sentence (commas excluded)

I need to create a function that returns all capitalized words from a sentence as a list. If a word ends with a comma, the comma should be excluded. This is what I came up with:

def find_cap(sentence):
    s = []
    for word in sentence.split():
        if word.startswith(word.capitalize()):
            s.append(word)
        if word.endswith(","):
            word.replace(",", "")
    return s

My problem: the function seems to work, but if a word in the sentence is in quotes, it is returned (quotes included) even when it isn't capitalized. Also, the commas aren't removed, even though I used word.replace(",", ""). Any tips would be appreciated.

Strings are an immutable type in Python. This means that word.replace(",", "") does not mutate the string that word points at; it returns a new string with the commas removed. Your code also appends word to the list before it attempts the replacement, so even a correct replacement would come too late.
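As a minimal sketch of that point (the sample word is made up for illustration), the value returned by replace() has to be assigned to a name, otherwise it is simply discarded:

```python
word = "Comma,"

# replace() returns a new string; the original is left untouched.
word.replace(",", "")
unchanged = word  # still "Comma,"

# Binding a name to the returned value is what keeps the result.
cleaned = word.replace(",", "")

print(unchanged)  # Comma,
print(cleaned)    # Comma
```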

Also, since this is a stripping problem (commas are unlikely to appear in the middle of words), why not use str.strip() instead? Stripping all punctuation from both ends of each word also takes care of the quoted words.

Try something like this:

import string

def find_cap(sentence):
    s = []
    for word in sentence.split():

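        # strip() removes any leading and trailing characters that appear
        # in its argument; here, every ASCII punctuation character
        word = word.strip(string.punctuation)

        # skip tokens that consisted only of punctuation and are now empty
        if word and word.startswith(word.capitalize()):
            s.append(word)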
    return s
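To see why quoted words slipped through the original check, here is a small sketch with a made-up token. It assumes nothing beyond the standard library; the variable names are only for illustration.

```python
import string

quoted = '"hello",'

# capitalize() upper-cases the first character and lower-cases the rest.
# A quote has no upper-case form, so the result equals the input and the
# startswith() test passes by accident.
passes_before = quoted.startswith(quoted.capitalize())

# Stripping punctuation from both ends leaves the bare word.
bare = quoted.strip(string.punctuation)
passes_after = bare.startswith(bare.capitalize())

print(passes_before)  # True
print(bare)           # hello
print(passes_after)   # False
```

Note that the capitalize() comparison only accepts words whose remaining letters are all lower case, so an all-caps word such as "USA" is not reported by this approach.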

You can use a regular expression to do this:

>>> import re
>>> string = 'This Is a String With a Comma, Capital and small Letters'
>>> newList = re.findall(r'([A-Z][a-z]*)', string)
>>> newList
['This', 'Is', 'String', 'With', 'Comma', 'Capital', 'Letters']

Using re.findall:

>>> import re
>>> a = "Hellow how, Are You"
>>> re.findall('[A-Z][a-z]+', a)
['Hellow', 'Are', 'You']
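One caveat with both patterns above is that they can start a match in the middle of a word: [A-Z][a-z]* would report "Phone" for "iPhone" and split "USA" into three single letters. A possible variant, sketched here under the assumption that all-caps words should count as capitalized, anchors the match at a word boundary. The function name find_cap_regex is hypothetical.

```python
import re

def find_cap_regex(sentence):
    # \b anchors the match at the start of a word, so a capital letter
    # in the middle of a word (the "P" in "iPhone") cannot start a match.
    return re.findall(r"\b[A-Z][A-Za-z]*", sentence)

result = find_cap_regex('The new iPhone, made in the USA, is "expensive"')
print(result)  # ['The', 'USA']
```

Because the pattern only matches letters, trailing commas and surrounding quotes are never part of a result.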
