How to get values from a dictionary for a list of words?

(Sorry if I have written the post incorrectly or made serious mistakes in the code; this is my first week doing this.)

I have a dictionary, obtained with this code:

import json

valores = {}
# Each line of the file has the form "<term>\t<value>"
with open("Sentimientos.txt") as sentimientos:
    for linea in sentimientos:
        termino, valor = linea.split("\t")
        valores[termino] = int(valor)
print(valores.items())

That looks like this:

dict_items([('abandon', -2), ('abandoned', -2), ('abandons', -2), ('abducted', -2)...

But with a ton of words.

Then I have a list of words (obtained from tweets with the method .split(" ")), and I need to check, for each word of the second list, whether that word exists in the dictionary and, if so, take its value from the dictionary.

The code with which I obtained the list of words is:

tw = open("salida_tweets.txt")
tweets = []
for linea in tw:
    # Each line of the file is one JSON-encoded tweet
    clean_tweet = json.loads(linea)
    tweets.append(clean_tweet["text"])
# Split once, after all tweets have been read
words = [tweet.split(" ") for tweet in tweets]
print(words)

And I get something like:

[['@Brenamae_', 'I', 'WHALE', 'SLAP', 'YOUR', 'FIN', 'AND', 'TELL', 'YOU', 'ONE', 'LAST', 'TIME:'...

But, as before, with a lot of words.

As I said, I need to build a list that holds, for each tweet, the value of the tweet's words that are in the dictionary (the sum of the values if more than one word of the tweet is in it).

I'm having serious problems trying to do that.

Thanks, everyone!

PS: What I've tried is:

import json
sentimientos=open("Sentimientos.txt")
valores={}
for linea in sentimientos:
    termino, valor=linea.split("\t")
    valores[termino]=(int(valor))
tw = open("salida_tweets.txt")
tweets = []
for linea in tw:
    clean_tweet = json.loads(linea)
    tweets.append(clean_tweet["text"])
    words = [tweet.split(" ") for tweet in tweets]
    if words in valores:
        valorestweet.append(sum(valores.get(words) for valor in valores))
print(valorestweet)

And what I get is:

<ipython-input-68-30a0230d33a7> in <module>()
    19         tweets.append(clean_tweet["text"])
    20         words = [tweet.split(" ") for tweet in tweets] 
    ---> 21         if words in valores:
    22             valorestweet.append(sum(valores.get(words) for valor in valores))
    23 print(valorestweet)

TypeError: unhashable type: 'list'

Lines 22 and 23 are highlighted in red.

I am really not sure I understood correctly, but let's say you have this input:

tweet0 = "Hello, I am groot"
tweet1 = "My name is red"
tweets = [tweet0, tweet1]

With this dictionary:

word_values = {'Hello': 1, 'I': -2, 'Yellow': -2, 'blue': -5, 'red': 4}

Then the expected output would be a list like this one:

[sum of the word values for tweet0, sum of the word values for tweet1]

If that is really what you want, then this code does the trick:

# Named word_values rather than dict to avoid shadowing the built-in dict type
word_values = {'Hello': 1, 'I': -2, 'Yellow': -2, 'blue': -5, 'red': 4}

tweet0 = "Hello, I am groot"
tweet1 = "My name is red"
tweets = [tweet0, tweet1]

# One list of words per tweet
words = [tweet.split(" ") for tweet in tweets]

results = []

for tweet_words in words:
    # tweet_words are the words of one tweet
    value = 0
    for word in tweet_words:
        # Only words present in the dictionary contribute to the sum
        if word in word_values:
            value += word_values[word]
    results.append(value)

print(results)

The output with this example is: 此示例的输出是:

[-2, 4]

-2 because only "I" is present in tweet0, and 4 because "red" is present in tweet1.
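
The same per-tweet sum can also be sketched more compactly with dict.get and sum. This is only an alternative way of writing the loop, not a different method; the helper name score_tweet and the variable name word_values are made up for this illustration.

```python
# Same sample data as in the example above.
word_values = {'Hello': 1, 'I': -2, 'Yellow': -2, 'blue': -5, 'red': 4}
tweets = ["Hello, I am groot", "My name is red"]


def score_tweet(tweet, values):
    """Sum the values of the words in tweet; unknown words count as 0.

    score_tweet is a hypothetical helper name used only for this sketch.
    """
    return sum(values.get(word, 0) for word in tweet.split(" "))


results = [score_tweet(tweet, word_values) for tweet in tweets]
print(results)  # [-2, 4]
```

values.get(word, 0) returns 0 for a word that is not a key, so no separate membership test is needed.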

As you can see, because "Hello" is followed by a ",", that word is not taken into account. This can be fixed with another in statement, and we can also apply the .lower() method to each str to avoid any problem with capital letters.
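
One possible way to handle both points is sketched below. Note that, instead of an extra in test, it strips punctuation from both ends of each word with str.strip(string.punctuation) and lower-cases it. It assumes the dictionary keys are stored in lower case (they are rewritten that way here), and the helper names normalize and score_tweet are invented for the example.

```python
import string

# Assumption: keys are lower-case so that they match the normalized words.
word_values = {'hello': 1, 'i': -2, 'yellow': -2, 'blue': -5, 'red': 4}
tweets = ["Hello, I am groot", "My name is red"]


def normalize(word):
    """Strip leading/trailing punctuation and lower-case the word."""
    return word.strip(string.punctuation).lower()


def score_tweet(tweet, values):
    """Sum the values of the normalized words; unknown words count as 0."""
    return sum(values.get(normalize(word), 0) for word in tweet.split())


results = [score_tweet(tweet, word_values) for tweet in tweets]
print(results)  # [-1, 4]
```

With this change "Hello," is counted as "hello", so the first tweet scores 1 + (-2) = -1. Stripping punctuation also removes characters such as "@" and "_" from the ends of a word, which may or may not be what you want for mentions.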

Since I'm not sure what you want, I only wrote this proof of concept. I could improve it if you gave us clear examples.
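
As for the TypeError in the question: words is a list of lists, and a list is not hashable, so Python cannot look it up as a dictionary key. Each individual word (a str) has to be tested instead. A minimal sketch of the difference, using two sample entries from the question's dictionary and made-up, already split tweets:

```python
valores = {'abandon': -2, 'abandoned': -2}  # two sample entries from the question
# Hypothetical tweets, already split into words
words = [['I', 'abandon', 'ship'], ['nothing', 'here']]

try:
    words in valores  # tries to hash a list, which is not possible
except TypeError as error:
    message = str(error)
    print("TypeError:", message)

# Test each word (a hashable str) instead of the whole list.
valorestweet = [
    sum(valores[word] for word in tweet if word in valores)
    for tweet in words
]
print(valorestweet)  # [-2, 0]
```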
