Reading multiple lines from an external text file in Python

Question

This program works fine when I use the code:

for character in infile.readline():

The problem is that readline only reads one line of text. When I add an "s" to the readline call:

for character in infile.readlines():

I end up getting all 0's in my output.

import os

os.chdir(r'M:\Project\Count')

def main():
    infile = open("module3.txt","r")
    uppercasecount = 0
    lowercasecount = 0
    digitcount = 0
    spacecount = 0
    for character in infile.readlines():
        if character.isupper() == True:
            uppercasecount += 1
        if character.islower() == True:
            lowercasecount += 1
        if character.isdigit() == True:
            digitcount += 1
        if character.isspace() == True:
            spacecount += 1

    print ("Total count is %d Upper case, %d Lower case, %d Digit(s) and %d spaces." %(uppercasecount, lowercasecount, digitcount, spacecount))

main()

Also, if anyone could give me advice: can I make that directory a default location, so that I can take this script and use it on someone else's machine?

Answer 1

You can use the two-argument form of iter to read the file an arbitrary number of characters at a time, and itertools.chain.from_iterable to treat the chunks as one long stream of characters. Instead of keeping track of several variables, you can use the str methods as keys to a collections.Counter, e.g.:

from collections import Counter
from itertools import chain

counts = Counter()
with open('yourfile') as fin:
    chars = chain.from_iterable(iter(lambda: fin.read(4096), ''))
    for ch in chars:
        for fn in (str.isupper, str.islower, str.isdigit, str.isspace):
            counts[fn] += fn(ch)

#Counter({<method 'islower' of 'str' objects>: 39, <method 'isspace' of 'str' objects>: 10, <method 'isdigit' of 'str' objects>: 0, <method 'isupper' of 'str' objects>: 0})

Then counts[str.islower] will give you 39, for instance.
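To see what the two-argument iter call is doing, here is a minimal sketch that reads from an in-memory io.StringIO instead of a real file. The sample text and the 4-character chunk size are made up for illustration. iter(callable, sentinel) keeps calling the callable until it returns the sentinel, which here is the empty string that read() returns at end of file.

```python
import io
from collections import Counter
from itertools import chain

# Hypothetical sample text standing in for the file contents.
fin = io.StringIO("Ab 1\ncd")

# iter(callable, sentinel): call fin.read(4) repeatedly until it returns ''.
chunks = iter(lambda: fin.read(4), '')

counts = Counter()
for ch in chain.from_iterable(chunks):  # flatten the chunks into characters
    for fn in (str.isupper, str.islower, str.isdigit, str.isspace):
        counts[fn] += fn(ch)  # True adds 1, False adds 0

print(counts[str.isupper], counts[str.islower],
      counts[str.isdigit], counts[str.isspace])
```

Because the keys are the method objects themselves, the lookup has to use the same names, such as str.islower, that were used in the loop.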

Answer 2

If you just want to check the types of characters contained in the file, I wouldn't use readlines but a regular read.

# Number of characters per read() call (in text mode, read(n) counts characters).
STEP_BYTES = 1024

def main():
    uppercasecount = 0
    lowercasecount = 0
    digitcount = 0
    spacecount = 0
    # The with block closes the file even if an error occurs.
    with open("module3.txt", "r") as infile:
        data = infile.read(STEP_BYTES)
        # read() returns an empty string at end of file, which ends the loop.
        while data:
            for character in data:
                if character.isupper():
                    uppercasecount += 1
                if character.islower():
                    lowercasecount += 1
                if character.isdigit():
                    digitcount += 1
                if character.isspace():
                    spacecount += 1
            data = infile.read(STEP_BYTES)

    print("Total count is %d Upper case, %d Lower case, %d Digit(s) and %d spaces." % (uppercasecount, lowercasecount, digitcount, spacecount))

main()

If you really need to use readlines, keep in mind that it reads every line of the file into memory as a list of lines, which is a poor fit for very large files.

For instance, assuming that your module3.txt file contains:

this Is a TEST
and this is another line

Using readlines() will return:

['this Is a TEST\n', 'and this is another line']
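This is also the likely reason the loop in the question reports zeros: iterating over readlines() yields whole lines, so each str test is applied to an entire line rather than to one character. A small sketch using the two example lines above shows the effect. The variable names are only illustrative.

```python
# The two lines returned by readlines() in the example above.
lines = ['this Is a TEST\n', 'and this is another line']

# Each test runs against the whole line, not against a single character.
results = [
    (line.isupper(), line.islower(), line.isdigit(), line.isspace())
    for line in lines
]

for line, result in zip(lines, results):
    print(repr(line), result)
```

A mixed-case line fails both isupper() and islower(), and a line with any letters fails isdigit() and isspace(), so at most one counter moves per line.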

With that in mind, you can walk the file contents using a nested for loop:

def main():
    uppercasecount = 0
    lowercasecount = 0
    digitcount = 0
    spacecount = 0
    # The with block closes the file even if an error occurs.
    with open("module3.txt", "r") as infile:
        lines = infile.readlines()
    # Outer loop walks the lines, inner loop walks the characters of each line.
    for line in lines:
        for character in line:
            if character.isupper():
                uppercasecount += 1
            if character.islower():
                lowercasecount += 1
            if character.isdigit():
                digitcount += 1
            if character.isspace():
                spacecount += 1
    print("Total count is %d Upper case, %d Lower case, %d Digit(s) and %d spaces." % (uppercasecount, lowercasecount, digitcount, spacecount))

main()
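If memory is the concern, one alternative is to iterate over the file object directly, which yields one line at a time instead of building the whole list. The sketch below is not part of the original answer. The function name count_characters is hypothetical, and the demo writes the two sample lines to a temporary file so that it runs on its own.

```python
import os
import tempfile

def count_characters(path):
    """Return (upper, lower, digit, space) counts for the text file at path."""
    upper = lower = digit = space = 0
    with open(path, "r") as infile:
        for line in infile:  # the file object yields one line at a time
            for character in line:
                if character.isupper():
                    upper += 1
                if character.islower():
                    lower += 1
                if character.isdigit():
                    digit += 1
                if character.isspace():
                    space += 1
    return upper, lower, digit, space

# Demo on a temporary file holding the two sample lines from above.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("this Is a TEST\nand this is another line")

result = count_characters(tmp.name)
os.remove(tmp.name)
print("Total count is %d Upper case, %d Lower case, %d Digit(s) and %d spaces." % result)
```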

As for the directory question: if your code and your text file (module3.txt) are going to be shipped in the same directory, you usually don't need the chdir. A relative path is resolved against the current working directory, which is the directory the script was launched from. When the script is started from its own directory, that is the directory where the script is.

Let's say you ship it in a directory like:

  |-> Count
     |-> script.py
     |-> module3.txt

You can just use a relative path to open module3.txt from within script.py: the line open("module3.txt", "r") will look for a file called module3.txt in the current working directory (here Count\, provided the script is launched from there). In that case you don't need the call to os.chdir. If you still want to make sure, you can chdir to the directory where the script is located.

Knowing that, change your hardcoded chdir line (os.chdir(r'M:\Project\Count') at the top of your file) to:

# Assumes "import os" appears earlier in the file.
print("Changing current directory to %s" % os.path.dirname(os.path.realpath(__file__)))
os.chdir(os.path.dirname(os.path.realpath(__file__)))
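A variation that avoids changing the working directory at all is to build the path to the data file from the script's own location. This is only a sketch: the helper name data_file_path is hypothetical, and the demo call passes a made-up script path so the block runs anywhere. In a real script you would pass __file__.

```python
import os

def data_file_path(script_file, filename="module3.txt"):
    """Return the absolute path of filename next to the given script file."""
    script_dir = os.path.dirname(os.path.realpath(script_file))
    return os.path.join(script_dir, filename)

# Demo with a made-up script location; realpath() makes it absolute.
example = data_file_path(os.path.join("Count", "script.py"))
print(example)

# In a real script (assumption: module3.txt ships next to the script):
# infile = open(data_file_path(__file__), "r")
```

The process-wide working directory stays untouched, so other relative paths in the program keep their meaning.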

2 answers

Answer 1
2 2014-04-06 17:38:02

Answer 2
1 Accepted 2014-04-06 17:21:14
