
Python list counting elements

I have the code below.

How can I check that abc is a list made up of lists?

What's wrong with my map function?

I want my function to return the count of each element in my input list divided by the length of the list.

Something like:

{'brown': 0.16666666666666666, 'lazy': 0.16666666666666666, 'jumps': 0.16666666666666666, 'fox': 0.16666666666666666,  'dog': 0.16666666666666666, 'quick': 0.16666666666666666}

My code:

quickbrownfox1=['quick', 'brown', 'fox', 'jumps', 'lazy', 'dog']
print quickbrownfox1


def tf(tokens):

    abc=([[x,(tokens.count(x))] for x in set(tokens)])
    print type(abc)#how to know that abc is made up of lists
    print type(abc[1])
    answer=abc.map(lambda input:(input(0)),input(1)/len(tokens)))

    return answer
    #return <FILL IN>

print tf((quickbrownfox1)) # Should give { 'quick': 0.1666 ... }
#print tf(tokenize(quickbrownfox)) # Should give { 'quick': 0.1666 ... }

_______________________________________

Update 1

I updated my code as below. I get the result [('brown', 0), ('lazy', 0), ('jumps', 0), ('fox', 0), ('dog', 0), ('quick', 0)]. Any idea why? If I instead do return list(map(lambda input: (input[0], input[1]), abc)), it gives the correct result: [('brown', 1), ('lazy', 1), ('jumps', 1), ('fox', 1), ('dog', 1), ('quick', 1)]

from __future__ import division
quickbrownfox1=['quick', 'brown', 'fox', 'jumps', 'lazy', 'dog']

def islistoflists(i):
    if isinstance(i, list):
        if len(i) > 0 and all(isinstance(t, list) for t in i):
            return True
    return False


def tf(tokens):

    print(islistoflists(tokens))

    abc = ([[x,tokens.count(x)] for x in set(tokens)])
    return list(map(lambda input: (input[0], input[1] / len(tokens)), abc))

print tf(quickbrownfox1)

Update 2

I am using pyspark/spark. Could that be a reason for the issues I am facing in update 1?

The Counter solution will definitely be better. Your use of tokens.count gives the code quadratic time complexity. Here's your code fixed up. You should note that map is a standalone function, not a member function of a list or any other type.

from __future__ import division
quickbrownfox1=['quick', 'brown', 'fox', 'jumps', 'lazy', 'dog']

def islistoflists(i):
    if isinstance(i, list):
        if len(i) > 0 and all(isinstance(t, list) for t in i):
            return True
    return False


def tf(tokens):

    print(islistoflists(tokens))

    abc = ([[x,tokens.count(x)] for x in set(tokens)])
    return list(map(lambda input: (input[0], input[1] / len(tokens)), abc))

print tf(quickbrownfox1)
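
As a minimal sketch of the point about map being a built-in function rather than a list method (the names pairs and total are illustrative and not taken from the question):

```python
# Illustrative [word, count] pairs, shaped like abc in the question.
pairs = [['quick', 1], ['brown', 1], ['fox', 1]]
total = 6  # assumed token count, as in the six-word example

# Lists have no .map attribute, so abc.map(...) raises AttributeError.
has_map_method = hasattr(pairs, 'map')

# map is called as map(function, iterable); list() materializes the
# result (needed on Python 3, harmless on Python 2).
scaled = list(map(lambda pair: (pair[0], pair[1] / float(total)), pairs))

print(has_map_method)
print(scaled)
```

The float(total) conversion is only there so the sketch behaves the same on Python 2 and Python 3.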

To test whether you have a list of lists, use isinstance to check the type of the parent object. If it is a list and has at least one element, loop through its elements and use isinstance to check that each child object is a list.

Note that I made your function return a list of tuples, implying that the items are read-only, but you could make it return a list of lists by changing the return line to:
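
The check described above can be sketched as follows. Note that the fixed-up code calls the helper on tokens, which is a flat list of strings; to answer the original question about abc, the helper would be applied to abc instead. The name is_list_of_lists is a renamed, illustrative variant of the helper.

```python
def is_list_of_lists(obj):
    """Return True if obj is a non-empty list whose items are all lists."""
    return (isinstance(obj, list)
            and len(obj) > 0
            and all(isinstance(item, list) for item in obj))

tokens = ['quick', 'brown', 'fox']
abc = [[x, tokens.count(x)] for x in set(tokens)]

print(is_list_of_lists(tokens))  # a list of strings
print(is_list_of_lists(abc))     # a list of [word, count] lists
```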

return list(map(lambda input: [input[0], input[1] / len(tokens)], abc))

If you look at it closely, you'll see that a pair of parentheses has been replaced with square brackets, making each element a list.

If you have an older version of Python 2 that does not support from __future__ import division, you can use the following workaround to force float division:

return list(map(lambda input: (input[0], (input[1] * 1.0) / len(tokens)), abc))
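
The workaround is needed because, in Python 2 without the __future__ import in effect, / on two integers performs floor division. The sketch below uses // to stand in for that behaviour so that it runs identically on Python 2 and Python 3; count and n_tokens are illustrative names.

```python
count = 1      # occurrences of one word
n_tokens = 6   # length of the token list

# Floor division: what `count / n_tokens` yields on Python 2 for ints
# when `from __future__ import division` is not in effect.
floored = count // n_tokens

# Promoting one operand to float forces true division on any version.
true_ratio = (count * 1.0) / n_tokens

print(floored)
print(true_ratio)
```

A floored value of 0 matches the all-zero result reported in update 1, which is what one would see if the __future__ import were not in effect where tf is evaluated. Whether that is what happens in the asker's PySpark setup is an assumption, not something the thread confirms.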

Based on what I think you're asking, you could do something like:

token_size = len(tokens)
word_counter_list = {}
for word in tokens:
    if word in word_counter_list:
        word_counter_list[word] += 1
    else:
        word_counter_list[word] = 1

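# .items() yields (word, count) pairs; iterating the dict alone yields only keys
for word, amount in word_counter_list.items():
    # float() avoids integer division on Python 2
    print("The word " + word + " was used " + str(amount / float(token_size)))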

That said, the question isn't very clear: you mention type() of a list, but the expected output is the relative frequency of each word in the list.

You should be able to do this fairly easily with a Counter:

$ python3
Python 3.4.2 (default, Oct  8 2014, 10:45:20) 
[GCC 4.9.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from collections import Counter
>>> c = Counter(['quick', 'brown', 'fox', 'jumps', 'lazy', 'dog'])
>>> total = sum(c.values())
>>> result = dict()
>>> for key, value in c.items():
...   result[key] = value/total
...
>>> result
{'dog': 0.16666666666666666, 'quick': 0.16666666666666666, 'fox': 0.16666666666666666, 'brown': 0.16666666666666666, 'jumps': 0.16666666666666666, 'lazy': 0.16666666666666666}

Or, to make it more Pythonic:

dict([ (key, value/total) for key,value in c.items() ])
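
Putting the Counter approach into the tf function from the question might look like the sketch below. It is one possible arrangement, not the answerer's code: it counts in a single pass, which avoids the repeated tokens.count calls, and it assumes Python 2.7 or later for the dict comprehension.

```python
from collections import Counter

def tf(tokens):
    """Map each token to its count divided by the number of tokens."""
    counts = Counter(tokens)      # single pass over the tokens
    total = float(len(tokens))    # float keeps true division on Python 2
    return {word: count / total for word, count in counts.items()}

result = tf(['quick', 'brown', 'fox', 'jumps', 'lazy', 'dog'])
print(result['quick'])
```

An empty token list returns an empty dict, since the comprehension never performs a division in that case.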
