“reduce” function in Python not working on “namedtuple”?
I have a log file that is formatted in the following way:
datetimestring \t username \t transactionName \r\n
I am attempting to run some stats over this dataset. I have the following code:
import time
import collections
file = open('Log.txt', 'r')
TransactionData = collections.namedtuple('TransactionData', ['transactionDate', 'user', 'transactionName'])
transactions = list()
for line in file:
    fields = line.split('\t')
    transactionDate = time.strptime(fields[0], '%Y-%m-%d %H:%M:%S')
    user = fields[1]
    transactionName = fields[2]
    transdata = TransactionData(transactionDate, user, transactionName)
    transactions.append(transdata)
file.close()
minDate = reduce(lambda x,y: min(x.transactionDate, y.transactionDate), transactions)
print minDate
I did not want to define a class for such a simple dataset, so I used a named tuple. When I attempt to run it, I get this error:
Traceback (most recent call last):
  File "inquiriesStat.py", line 20, in <module>
    minDate = reduce(lambda x,y: min(x.transactionDate, y.transactionDate), transactions)
  File "inquiriesStat.py", line 20, in <lambda>
    minDate = reduce(lambda x,y: min(x.transactionDate, y.transactionDate), transactions)
AttributeError: 'time.struct_time' object has no attribute 'transactionDate'
It appears that the lambda function is operating on the 'transactionDate' property directly instead of passing in the full tuple. If I change the lambda to:
lambda x,y: min(x, y)
it works as I would expect. Any ideas why this would be the case?
Simply use:
minDate = min(t.transactionDate for t in transactions)
Below is an explanation of why your code isn't working.
Let's say transactions = [t1, t2, t3], where t1 ... t3 are three named tuples.
By the definition of reduce, your code:
reduce(lambda x,y: min(x.transactionDate, y.transactionDate), transactions)
is equivalent to
min(min(t1.transactionDate, t2.transactionDate).transactionDate, t3.transactionDate)
Clearly, the inner min() returns a time.struct_time instead of a named tuple, so when reduce passes that result back in as x and the lambda tries to read .transactionDate from it, that fails.
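To see this happen, here is a minimal sketch that records what reduce hands to the function as its first argument on each call. The sample records, the parse helper, and the step function are made up for illustration and stand in for the parsed log file; the functools import is only required on Python 3, where reduce is no longer a built-in.

```python
import collections
import time
from functools import reduce  # built-in on Python 2, must be imported on Python 3

TransactionData = collections.namedtuple(
    'TransactionData', ['transactionDate', 'user', 'transactionName'])

def parse(stamp):
    # Hypothetical helper: same format string as in the question.
    return time.strptime(stamp, '%Y-%m-%d %H:%M:%S')

# Made-up sample records standing in for the contents of Log.txt.
transactions = [
    TransactionData(parse('2013-03-02 10:00:00'), 'alice', 'inquiry'),
    TransactionData(parse('2013-03-01 09:30:00'), 'bob', 'payment'),
    TransactionData(parse('2013-03-03 08:15:00'), 'carol', 'inquiry'),
]

first_arg_was_record = []

def step(x, y):
    # Note whether the accumulator x is still a full record.
    first_arg_was_record.append(isinstance(x, TransactionData))
    return min(x.transactionDate, y.transactionDate)

try:
    reduce(step, transactions)
    error = None
except AttributeError as exc:
    # Second call: x is the struct_time returned by the first call.
    error = exc
```

On the first call x is t1, a full record. On the second call x is whatever the first call returned, which is a bare time.struct_time, and that is where the AttributeError comes from.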
There are ways to fix this, and to make use of reduce for this problem. However, there seems to be little point, given that a direct application of min does the job and, to my eye, is a lot clearer than anything involving reduce.
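For completeness, here is a rough sketch of two ways reduce could be made to work, next to the direct min call. The sample records and the parse helper are invented for illustration. The point in both reduce variants is that the function must return the same kind of object it receives as its first argument.

```python
import collections
import time
from functools import reduce  # built-in on Python 2, must be imported on Python 3

TransactionData = collections.namedtuple(
    'TransactionData', ['transactionDate', 'user', 'transactionName'])

def parse(stamp):
    # Hypothetical helper: same format string as in the question.
    return time.strptime(stamp, '%Y-%m-%d %H:%M:%S')

# Made-up sample records standing in for the contents of Log.txt.
transactions = [
    TransactionData(parse('2013-03-02 10:00:00'), 'alice', 'inquiry'),
    TransactionData(parse('2013-03-01 09:30:00'), 'bob', 'payment'),
    TransactionData(parse('2013-03-03 08:15:00'), 'carol', 'inquiry'),
]

# Variant 1: the accumulator stays a full record, so .transactionDate
# is always available on both arguments.
earliest = reduce(
    lambda x, y: x if x.transactionDate <= y.transactionDate else y,
    transactions)

# Variant 2: extract the dates first, then fold with min, so both
# arguments are always struct_time values.
min_date = reduce(min, (t.transactionDate for t in transactions))

# Direct use of min: key= returns the whole record with the earliest date.
earliest_direct = min(transactions, key=lambda t: t.transactionDate)
```

Variant 2 is just a longer spelling of min(t.transactionDate for t in transactions). The key= form is handy when the user or transaction name of the earliest entry is needed as well as its date.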