
Python, using dictionaries and sets to keep track of all occurrences of a word

I am trying to solve this problem:

You are given n words. Some words may repeat. For each word, output its number of occurrences. The output order should correspond with the input order of appearance of the word. See the sample input/output for clarification.

Note: Each input line ends with a "\n" character.

Input Format

The first line contains the integer, n.

The next n lines each contain a word.

Output Format

Output 2 lines. On the first line, output the number of distinct words from the input. On the second line, output the number of occurrences for each distinct word according to their appearance in the input.

I have implemented a solution like this:

# Enter your code here. Read input from STDIN. Print output to STDOUT

n  = int(input())

mySet = set()
myDict = {}

for i in range(n):
    inp = input()[:-1]
    if inp not in mySet:
        mySet.add(inp)
        myDict[inp] = 1
    else:
        myDict[inp] += 1

print(len(mySet))
# print(' '.join(list(map(str, myDict.values()))))
print(*myDict.values())

My strategy is as follows:

If the word is not in mySet, add it to the set and create a key-value pair in myDict with the word as key and 1 as value.

If the word is already in the set, increment the value of that word in the dictionary.

However, only half of the test cases pass; the rest fail with "Wrong Answer". Can anybody point out what I am missing in this code?

Your mistake is in inp = input()[:-1]. The [:-1] cuts off the word's last character. I guess you're trying to remove the newline character, but input() is already "stripping a trailing newline". Demo:

>>> [input() for _ in range(2)]
foo
bar
['foo', 'bar']
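
To see why the slice breaks some test cases but not others, here is a small sketch. The helper name count_words and the sample words are made up for illustration; the point is that chopping the last character merges distinct words that differ only in their final letter, while inputs without such words still pass.

```python
def count_words(words, chop_last=False):
    """Count occurrences in first-appearance order.

    chop_last=True mimics the buggy input()[:-1] from the question.
    """
    counts = {}
    for word in words:
        key = word[:-1] if chop_last else word
        counts[key] = counts.get(key, 0) + 1
    return counts


# Hypothetical input, as input() would return it (newline already stripped)
words = ["cat", "car", "dog", "cat"]

correct = count_words(words)
buggy = count_words(words, chop_last=True)

print(len(correct), *correct.values())  # 3 2 1 1
print(len(buggy), *buggy.values())      # 2 3 1
```

With the slice, "cat" and "car" both collapse to "ca", so the distinct count and the per-word counts are both wrong.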

Simpler solution, btw:

from collections import Counter

ctr = Counter(input() for _ in range(int(input())))
print(len(ctr))
print(*ctr.values())

And for fun, a tricky version of that (also gets accepted):

from collections import Counter

ctr = Counter(map(input, [''] * int(input())))
print(len(ctr))
print(*ctr.values())

Another one:

from collections import Counter
import sys

next(sys.stdin)
ctr = Counter(map(str.strip, sys.stdin))
print(len(ctr))
print(*ctr.values())

This one reads whole lines, so here the strings do include the newline character. That wouldn't matter if all lines had it, but no, HackerRank commits the cardinal sin of not ending the last line with a newline. So I strip it off every line. Sigh.
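
A small sketch of that pitfall, using io.StringIO as a stand-in for sys.stdin. The input text and the helper count_lines are made up for illustration; the only assumption taken from the answer is that the last line has no trailing newline.

```python
import io
from collections import Counter

# Simulated stdin whose last line has no trailing newline (made-up data)
raw = "3\nfoo\nbar\nfoo"


def count_lines(text, strip):
    stream = io.StringIO(text)
    next(stream)  # skip the first line, which holds n
    lines = map(str.strip, stream) if strip else stream
    return Counter(lines)


unstripped = count_lines(raw, strip=False)
stripped = count_lines(raw, strip=True)

print(dict(unstripped))  # {'foo\n': 1, 'bar\n': 1, 'foo': 1}
print(dict(stripped))    # {'foo': 2, 'bar': 1}
```

Without stripping, 'foo\n' and 'foo' are counted as two different words.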

In this problem you don't need the set at all, and the way you read the input, i.e. input()[:-1], is wrong: it drops the last character of each word, so the input abc is stored as ab.

Now to the solution. As the OP figured out, a set gives you the unique words and a dictionary gives you the frequency of each word. But the set is redundant here: a dictionary's keys are already unique, so the dictionary alone gives you the distinct words, and you can keep counting frequencies exactly as you do now.

A new entry can be added either with dict[key] = value or with the built-in method dict1.update(dict2), where dict1 is the original dictionary and dict2 = {key: value} is a new dictionary holding the entry to add. Both keep the elements in insertion order, which is guaranteed for dict since Python 3.7.
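
A quick check, assuming Python 3.7 or later, that plain assignment and update() build the same dictionary in the same order. The word list and the names by_assignment and by_update are made up for illustration.

```python
by_assignment = {}
by_update = {}

for word in ["dog", "cat", "dog", "pig"]:  # made-up sample words
    if word in by_assignment:
        by_assignment[word] += 1
        by_update[word] += 1
    else:
        by_assignment[word] = 1       # plain item assignment
        by_update.update({word: 1})   # same effect via update()

# Comparing the item lists checks both the contents and the order
print(list(by_assignment.items()) == list(by_update.items()))  # True
print(*by_update.values())                                     # 2 1 1
```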

The number of unique words is the length of the dictionary, i.e. len(dic), and the frequency of each word is printed from its values, i.e. print(*dic.values()).

n = int(input())
dic  = {}

for _ in range(n):
  w = input()
  if w in dic.keys():
    dic[w]+=1
  else:
    dic.update({w:1})

print(len(dic.keys()))
print(*dic.values())

n=int(input())
word_list=[]
for i in range(n):
    word_list.append(input())
    
# Collect each distinct word once, in order of first appearance
New_word_list=[]
for element in word_list:
    if element not in New_word_list:
        New_word_list.append(element)

print(len(New_word_list))
for element in New_word_list:
    print(word_list.count(element), end=" ")

Input:

10
cat
dog
dog
rabbit
cat
pig
horse
pig
pig
goat

Results:

6
2 2 1 3 1 1 
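
The same sample can be checked without stdin. This is only a sketch: it swaps in collections.Counter (from the earlier answer) for the list-based code, and the variable names are made up. One reason to prefer it on large inputs is that list.count() rescans the whole list for every distinct word, while Counter makes a single pass.

```python
from collections import Counter

# The ten sample words from the input above
sample = ["cat", "dog", "dog", "rabbit", "cat",
          "pig", "horse", "pig", "pig", "goat"]

ctr = Counter(sample)  # keeps first-appearance order on Python 3.7+

print(len(ctr))        # 6
print(*ctr.values())   # 2 2 1 3 1 1
```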
