
How to find a word - First letter will be capital & other will be lower

Problem statement: From the complete set of words in text6, filter the words whose first letter is upper case and whose remaining letters are all lower case. Store the result in the variable title_words. Print the number of words present in title_words.

I have tried every approach I can think of, but I don't know where I am going wrong.

import nltk
from nltk.book import text6
title_words = 0
for item in set(text6):
    if item[0].isupper() and item[1:].islower():
        title_words += 1
print(title_words)

I have also tried it this way:

title_words = 0
for item in text6:
    if item[0].isupper() and item[1:].islower():
        title_words += 1
print(title_words)

I am not sure what count is expected; whatever count I get does not let me pass the challenge. Please let me know if I am doing anything wrong in this code.

One of the above suggestions did work for me. Sample code below.

title_words = [word for word in text6 if (len(word)==1 and word[0].isupper()) or (word[0].isupper() and word[1:].islower()) ]
print(len(title_words))

The question says: "Store the result in variable title_words. print the number of words present in title_words."

The result of filtering a list of elements is a list of the same type of elements. In your case, filtering the list text6 (assuming it's a list of strings) would result in a (smaller) list of strings. Your title_words variable should be this filtered list, not the number of strings; the number of strings is just the length of that list.
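As a minimal sketch of that distinction, the snippet below filters a small hand-written token list and only then takes its length. The `tokens` list and the `is_title_case` helper are illustrative assumptions, not part of the original challenge; with NLTK installed, `text6` would take the place of `tokens`.

```python
# Hypothetical sample standing in for text6 (an assumption for illustration).
tokens = ["SCENE", "1", ":", "King", "Arthur", "of", "the", "Britons", "King", "I"]


def is_title_case(word):
    """Return True when the first letter is upper case and the rest are lower case."""
    return word[:1].isupper() and word[1:].islower()


# title_words is the filtered list itself, not a counter.
title_words = [word for word in tokens if is_title_case(word)]

print(title_words)       # the words that passed the filter
print(len(title_words))  # the number of words is the length of that list
```

Note that this check, like the one in the question, rejects single-letter tokens such as "I".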

It's also ambiguous from the question whether capitalized words should be filtered out (i.e. removed from the list) or filtered in (i.e. kept in the list), so try both in case you're interpreting it incorrectly.
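The two readings can be sketched side by side. The sample `tokens` list and the `is_title_case` helper are assumptions made for illustration only.

```python
# Hypothetical sample standing in for text6 (an assumption for illustration).
tokens = ["King", "Arthur", "of", "the", "Britons"]


def is_title_case(word):
    """Return True when the first letter is upper case and the rest are lower case."""
    return word[:1].isupper() and word[1:].islower()


# Reading 1: keep only the capitalized words.
kept = [word for word in tokens if is_title_case(word)]

# Reading 2: remove the capitalized words and keep everything else.
removed = [word for word in tokens if not is_title_case(word)]

print(len(kept), kept)
print(len(removed), removed)
```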

Give regular expressions a try:

>>> import re
>>> from nltk.book import text6
>>>
>>> text = ' '.join(set(text6))
>>> title_words = re.findall(r'([A-Z]{1}[a-z]+)', text)
>>> len(title_words)
461
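One caveat with this approach: the pattern has no anchors, so `re.findall` on the joined string can also return fragments from inside mixed-case tokens. A hedged sketch of the difference, using a made-up token list (an assumption, not data from text6) and `fullmatch` applied per token:

```python
import re

# Hypothetical tokens chosen to expose the difference (not taken from text6).
tokens = ["King", "McDonald", "KNIGHTs", "I", "of"]

# Unanchored search over the joined text picks up pieces of longer tokens.
joined_matches = re.findall(r"[A-Z][a-z]+", " ".join(tokens))

# Matching each token in full only accepts tokens that are entirely
# one upper-case letter followed by one or more lower-case letters.
title_pattern = re.compile(r"[A-Z][a-z]+")
per_token_matches = [tok for tok in tokens if title_pattern.fullmatch(tok)]

print(joined_matches)
print(per_token_matches)
```

Both variants require at least two characters and only recognise ASCII letters, so they may not agree with the `isupper()`/`islower()` checks on every token.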

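There are 50 singleton elements (elements of length 1) in text6; however, your code will not let any of them through, for example "I" or "W". Is that correct, or do you need words with a minimum length of 2?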
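The reason is that slicing a one-character string from index 1 gives an empty string, and `"".islower()` is False. A small sketch (the variable names are illustrative):

```python
word = "I"

rest = word[1:]               # empty string for a one-character word
rest_is_lower = rest.islower()  # False: islower() needs at least one cased character

# The check from the question rejects single upper-case letters.
strict = word[0].isupper() and word[1:].islower()

# A variant that also accepts a single upper-case letter.
lenient = word[0].isupper() and (len(word) == 1 or word[1:].islower())

print(repr(rest), rest_is_lower, strict, lenient)
```

Whether single letters should count depends on how the challenge defines a word, which the problem statement does not spell out.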

I think the problem is with set(text6). I suggest you iterate over text6.tokens.

Update, explanation

The code you've provided is correct.

The issue is that the text can contain the same word multiple times. Calling set(words) removes the duplicates and so reduces the number of available words, which means you start with an incomplete data set.

The other responses are not necessarily wrong in how they check the validity of a word, but they iterate over the same wrong data set.
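To see how much deduplication can change the count, here is a minimal sketch on a made-up token list (an assumption for illustration; the real numbers for text6 will differ):

```python
# Hypothetical tokens with a repeated word (not taken from text6).
tokens = ["King", "Arthur", "King", "of", "King"]


def is_title_case(word):
    """Return True when the first letter is upper case and the rest are lower case."""
    return word[:1].isupper() and word[1:].islower()


# Counting over every token keeps the repeats.
count_all = sum(1 for word in tokens if is_title_case(word))

# Counting over set(tokens) counts each distinct word once.
count_unique = sum(1 for word in set(tokens) if is_title_case(word))

print(count_all, count_unique)
```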

Just a few changes according to what the question asks.

from nltk.book import text6
title_words = []
for item in set(text6):
    if item[0].isupper() and item[1:].islower():
        title_words.append(item)
print(len(title_words))

Try this one:

title_words = [ word for word in text6 if word.istitle()]

print(len(title_words))
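Note that `str.istitle()` is not exactly equivalent to the manual check used in the question. A short comparison on a few hand-picked strings (the `samples` list and the `manual_check` helper are assumptions for illustration):

```python
# Hand-picked strings where the two checks can disagree (illustrative only).
samples = ["King", "I", "Don't", "SCENE", "of"]


def manual_check(word):
    """The check from the question: upper-case first letter, lower-case rest."""
    return word[:1].isupper() and word[1:].islower()


istitle_words = [word for word in samples if word.istitle()]
manual_words = [word for word in samples if manual_check(word)]

print(istitle_words)  # istitle() accepts the single letter "I"
print(manual_words)   # the manual check accepts "Don't" but not "I"
```

`istitle()` treats the letter after an apostrophe as the start of a new word, so "Don't" fails it, while a single upper-case letter passes.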

