How can I use lambda to count the number of words in a file?

I'm trying to count the number of words in a file using reduce, lambda, and readlines in an unconventional way:

import functools as ft
f=open("test_file.txt")
words=ft.reduce(lambda a,b:(len(a.split())+len(b.split())),f.readlines())
print(words)

This raises an AttributeError because I end up trying to split integers (indices). How do I get this code to split the elements of the iterable returned by f.readlines() and successively add their lengths (i.e., the number of words in those lines) to ultimately calculate the total number of words in the file?

If you're trying to get a count of the words in a file, f.read() makes more sense than f.readlines() because it obviates the need to sum line-by-line counts. You get the whole file in one chunk and can then split on whitespace by calling split without arguments.

>>> with open("foo.py") as f:
...     len(f.read().split())
...
1530
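
To see why calling split with no arguments suits word counting, here is a small sketch on a made-up string (the sample text is an assumption for illustration, not taken from the question). With no argument, split treats any run of whitespace, newlines included, as a single separator and never produces empty strings; an explicit separator does not behave that way.

```python
# Hypothetical sample text standing in for the contents of a file.
text = "one  two\nthree\n\nfour \n"

# No argument: any run of spaces, tabs, or newlines is one separator,
# and leading/trailing whitespace produces no empty strings.
words = text.split()
print(words)  # ['one', 'two', 'three', 'four']

# Explicit separator: consecutive spaces yield an empty string and
# newlines are not treated as separators at all.
pieces = text.split(" ")
print(pieces)  # ['one', '', 'two\nthree\n\nfour', '\n']
```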

If you really want to use readlines, it's easier to avoid functools.reduce in any event and sum the lengths of the split lines (sum is a very succinct reduction operation on an iterable that does away with the distracting accumulator business):

>>> with open("foo.py") as f:
...     sum(len(x.split()) for x in f.readlines())
...
1530
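
As a side note, a file object is itself an iterable of lines, so the readlines call can be dropped and the file read lazily rather than loaded into a list first. The sketch below wraps that in a function; count_words and the temporary sample file are names made up for illustration.

```python
import os
import tempfile


def count_words(path):
    """Return the number of whitespace-separated words in the file at path."""
    with open(path) as f:
        # Iterating over f yields one line at a time, so the lines are
        # never collected into a list the way readlines() does.
        return sum(len(line.split()) for line in f)


# Build a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("the quick brown fox\njumps over\n\nthe lazy dog\n")
    sample_path = tmp.name

total = count_words(sample_path)
print(total)  # 9

os.remove(sample_path)  # clean up the throwaway file
```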

It's good practice to use a with context manager so your resource is automatically closed. Use whitespace around all operators so the code is readable.

As for getting functools.reduce to work: its first argument is a function (here a lambda) that takes the accumulator as its first parameter and the current element as its second. The second argument to functools.reduce is an iterable and the third initializes the accumulator. Leaving the third argument out, as you've done, sets the accumulator to the first item in the iterable, which is probably not what you want, since the idea is to perform a numerical summation using the accumulator.
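
To make the failure concrete, here is a small trace of the question's lambda, using a made-up list of lines in place of f.readlines() (the list and variable names are assumptions for illustration). On the first call the accumulator is the first line, a string, so the lambda returns an integer; on the second call that integer comes back in as a, and a.split() raises the AttributeError.

```python
import functools as ft

# Hypothetical stand-in for f.readlines().
lines = ["one two\n", "three\n", "four five six\n"]


def add_lengths(a, b):
    # Same logic as the lambda in the question.
    return len(a.split()) + len(b.split())


# With only two lines there is a single call, add_lengths(lines[0], lines[1]),
# so both arguments are strings and it happens to work.
two_line_total = ft.reduce(add_lengths, lines[:2])
print(two_line_total)  # 3

# With three lines the second call is add_lengths(3, lines[2]):
# the accumulator is now an int, which has no split method.
error_message = None
try:
    ft.reduce(add_lengths, lines)
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)  # 'int' object has no attribute 'split'
```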

You can use

>>> import functools as ft
>>> with open("foo.py") as f:
...     ft.reduce(lambda acc, line: len(line.split()) + acc, f.readlines(), 0)
...
1530

but this strikes me as a rather Rube Goldbergian way to solve the problem.
