"reduce" function in Python does not work on "namedtuple"?

I have a log file that is formatted in the following way:

datetimestring \t username \t transactionName \r\n

I am attempting to run some stats over this dataset. I have the following code:

import time
import collections
file = open('Log.txt', 'r')

TransactionData = collections.namedtuple('TransactionData', ['transactionDate', 'user', 'transactionName'])
transactions = list()

for line in file:
    fields = line.split('\t')

    transactionDate = time.strptime(fields[0], '%Y-%m-%d %H:%M:%S')
    user = fields[1]
    transactionName = fields[2]

    transdata = TransactionData(transactionDate, user, transactionName)
    transactions.append(transdata)

file.close()

minDate = reduce(lambda x,y: min(x.transactionDate, y.transactionDate), transactions)
print minDate

I did not want to define a class for such a simple dataset, so I used a named tuple. When I attempt to run it, I get this error:

Traceback (most recent call last):
  File "inquiriesStat.py", line 20, in <module>
    minDate = reduce(lambda x,y: min(x.transactionDate, y.transactionDate), transactions)
  File "inquiriesStat.py", line 20, in <lambda>
    minDate = reduce(lambda x,y: min(x.transactionDate, y.transactionDate), transactions)
AttributeError: 'time.struct_time' object has no attribute 'transactionDate'

It appears that the lambda function is operating on the 'transactionDate' property directly instead of passing in the full tuple. If I change the lambda to:

lambda x,y: min(x, y)

It works as I would expect. Any ideas why this would be the case?

Simply use:

minDate = min(t.transactionDate for t in transactions)
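If the whole record with the earliest date is wanted rather than just the date, min also accepts a key function. The following is a minimal sketch; the three sample records are made up for illustration and stand in for the parsed log file:

```python
import collections
import time

TransactionData = collections.namedtuple(
    'TransactionData', ['transactionDate', 'user', 'transactionName'])

def parse(stamp):
    # Same timestamp format as in the question.
    return time.strptime(stamp, '%Y-%m-%d %H:%M:%S')

# Hypothetical sample data, not taken from the original log.
transactions = [
    TransactionData(parse('2013-03-02 10:00:00'), 'alice', 'login'),
    TransactionData(parse('2013-03-01 09:30:00'), 'bob', 'query'),
    TransactionData(parse('2013-03-03 08:15:00'), 'carol', 'logout'),
]

# Earliest date only.
minDate = min(t.transactionDate for t in transactions)

# Earliest record as a whole named tuple, compared by its date field.
earliest = min(transactions, key=lambda t: t.transactionDate)

print(earliest.user)
```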

Below is an explanation of why your code isn't working.

Let's say transactions = [t1, t2, t3], where t1 ... t3 are three named tuples.

By the definition of reduce, your code:

reduce(lambda x,y: min(x.transactionDate, y.transactionDate), transactions)

is equivalent to

min(min(t1.transactionDate, t2.transactionDate).transactionDate, t3.transactionDate)

Clearly, the inner min() returns a time.struct_time instead of a named tuple, so when reduce passes that result back into the lambda as x, the x.transactionDate lookup fails.
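The two steps that reduce performs can be traced by hand. This is only an illustrative sketch: the records t1 ... t3 are invented sample data, and the import from functools is there because reduce is a builtin only in Python 2.

```python
import collections
import time
from functools import reduce  # builtin in Python 2, in functools on Python 3

TransactionData = collections.namedtuple(
    'TransactionData', ['transactionDate', 'user', 'transactionName'])

def parse(stamp):
    return time.strptime(stamp, '%Y-%m-%d %H:%M:%S')

# Hypothetical sample records standing in for the parsed log file.
t1 = TransactionData(parse('2013-03-02 10:00:00'), 'alice', 'login')
t2 = TransactionData(parse('2013-03-01 09:30:00'), 'bob', 'query')
t3 = TransactionData(parse('2013-03-03 08:15:00'), 'carol', 'logout')

# Step 1: reduce calls the lambda with (t1, t2).
# The result is a bare struct_time, not a TransactionData.
step1 = min(t1.transactionDate, t2.transactionDate)
print(isinstance(step1, time.struct_time))

# Step 2: reduce calls the lambda with (step1, t3), so x is the struct_time
# from step 1 and x.transactionDate raises AttributeError.
error_message = None
try:
    reduce(lambda x, y: min(x.transactionDate, y.transactionDate), [t1, t2, t3])
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

By contrast, lambda x,y: min(x, y) always returns one of the named tuples, so the accumulator keeps its type from step to step. It compares whole tuples field by field, and transactionDate happens to be the first field, which is why that version appears to work. Note that its result is a full record rather than a date.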

There are ways to fix this and still use reduce for this problem. However, there seems to be little point, given that a direct application of min does the job and, to my eye, is a lot clearer than anything involving reduce.
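For completeness, here is a sketch of two reduce-based variants. Neither comes from the original code; both simply keep the accumulator and the items the same type. The sample records are hypothetical.

```python
import collections
import time
from functools import reduce  # builtin in Python 2, in functools on Python 3

TransactionData = collections.namedtuple(
    'TransactionData', ['transactionDate', 'user', 'transactionName'])

def parse(stamp):
    return time.strptime(stamp, '%Y-%m-%d %H:%M:%S')

# Hypothetical sample data, not taken from the original log.
transactions = [
    TransactionData(parse('2013-03-02 10:00:00'), 'alice', 'login'),
    TransactionData(parse('2013-03-01 09:30:00'), 'bob', 'query'),
    TransactionData(parse('2013-03-03 08:15:00'), 'carol', 'logout'),
]

# Variant 1: the lambda always returns a named tuple, so the next call
# can still read .transactionDate from its first argument.
earliest = reduce(
    lambda x, y: x if x.transactionDate <= y.transactionDate else y,
    transactions)
minDate = earliest.transactionDate

# Variant 2: extract the dates first, then reduce over plain struct_time values.
minDate2 = reduce(min, (t.transactionDate for t in transactions))

print(time.strftime('%Y-%m-%d %H:%M:%S', minDate))
```

Both variants give the same date as min(t.transactionDate for t in transactions), which remains the shorter way to write it.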
