Reading multiple lines from an external text file in Python

This program works fine when I use the code:

for character in infile.readline():  

The problem is that readline only reads one line of text. When I add an "s" to the readline command:

for character in infile.readlines():  

I end up getting four 0's as my output.

import os

os.chdir(r'M:\Project\Count')

def main():
    infile = open("module3.txt","r")
    uppercasecount = 0
    lowercasecount = 0
    digitcount = 0
    spacecount = 0
    for character in infile.readlines():
        if character.isupper() == True:
            uppercasecount += 1
        if character.islower() == True:
            lowercasecount += 1
        if character.isdigit() == True:
            digitcount += 1
        if character.isspace() == True:
            spacecount += 1

    print ("Total count is %d Upper case, %d Lower case, %d Digit(s) and %d spaces." %(uppercasecount, lowercasecount, digitcount, spacecount))

main()

Also, if anyone could give me advice: can I make that directory a default location, so that I can take this script and use it on someone else's machine?

You can use the two-argument form of iter to read the file a fixed number of characters at a time (the file is opened in text mode, so read(4096) counts characters rather than bytes), and itertools.chain.from_iterable to treat the chunks as one long input. Instead of keeping track of several variables, you can use the str methods as keys to a collections.Counter, e.g.:

from collections import Counter
from itertools import chain

counts = Counter()
with open('yourfile') as fin:
    chars = chain.from_iterable(iter(lambda: fin.read(4096), ''))
    for ch in chars:
        for fn in (str.isupper, str.islower, str.isdigit, str.isspace):
            counts[fn] += fn(ch)

#Counter({<method 'islower' of 'str' objects>: 39, <method 'isspace' of 'str' objects>: 10, <method 'isdigit' of 'str' objects>: 0, <method 'isupper' of 'str' objects>: 0})

Then counts[str.islower] will give you 39, for instance.
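To make the two-argument iter idiom above more concrete, here is a minimal sketch. It assumes io.StringIO as a stand-in for a real file and uses an arbitrary chunk size of 4 purely for the demo. iter(callable, sentinel) keeps calling the callable until it returns the sentinel, which here is the empty string that read() returns at end of file.

```python
import io
from itertools import chain

# io.StringIO is a stand-in for an opened text file (assumption for the demo).
fin = io.StringIO("abcdefghij")

# Call fin.read(4) repeatedly until it returns '' (end of file).
chunks = list(iter(lambda: fin.read(4), ''))
print(chunks)

# Rewind, then flatten the chunks into individual characters.
fin.seek(0)
chars = list(chain.from_iterable(iter(lambda: fin.read(4), '')))
print(chars)
```

Because each chunk is a string, chain.from_iterable yields its characters one by one, which is what lets the counting loop treat the whole file as a single stream.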

If you just want to check the types of characters contained in the file, I wouldn't use readlines but a regular read.

STEP_BYTES = 1024

def main():
    uppercasecount = 0
    lowercasecount = 0
    digitcount = 0
    spacecount = 0
    # The with block closes the file even if an error occurs.
    with open("module3.txt", "r") as infile:
        data = infile.read(STEP_BYTES)
        while data:  # read() returns '' at end of file
            for character in data:
                if character.isupper():
                    uppercasecount += 1
                if character.islower():
                    lowercasecount += 1
                if character.isdigit():
                    digitcount += 1
                if character.isspace():
                    spacecount += 1
            data = infile.read(STEP_BYTES)

    print("Total count is %d Upper case, %d Lower case, %d Digit(s) and %d spaces." % (uppercasecount, lowercasecount, digitcount, spacecount))

main()

If you really need to use readlines, keep in mind that the method reads all the lines of the file into memory as a list of lines (not so good with very large files).

For instance, assuming that your module3.txt file contains:

this Is a TEST
and this is another line

Using readlines() will return:

['this Is a TEST\n', 'and this is another line']
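This also shows why the loop in the question counts almost nothing: each iteration receives a whole line, so isupper(), islower(), isdigit() and isspace() test the entire string at once. A small sketch, assuming io.StringIO as a stand-in for the opened file:

```python
import io

# io.StringIO stands in for the opened file (assumption for this demo).
infile = io.StringIO("this Is a TEST\nand this is another line")

# Each tuple describes a whole line, not a single character.
per_line = [
    (line.isupper(), line.islower(), line.isdigit(), line.isspace())
    for line in infile.readlines()
]
print(per_line)
```

A mixed-case line fails every test, so none of the counters move for it, and even a line that passes one test adds only 1 for the whole line.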

With that in mind, you can walk the file contents using a nested for loop:

def main():
    uppercasecount = 0
    lowercasecount = 0
    digitcount = 0
    spacecount = 0
    # The with block closes the file even if an error occurs.
    with open("module3.txt", "r") as infile:
        lines = infile.readlines()  # list with one string per line
    for line in lines:
        for character in line:
            if character.isupper():
                uppercasecount += 1
            if character.islower():
                lowercasecount += 1
            if character.isdigit():
                digitcount += 1
            if character.isspace():
                spacecount += 1
    print("Total count is %d Upper case, %d Lower case, %d Digit(s) and %d spaces." % (uppercasecount, lowercasecount, digitcount, spacecount))

main()
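If the memory cost of readlines is a concern, one possible variation is to iterate over the file object directly, which yields one line at a time. The sketch below is only an illustration: count_characters is a hypothetical helper name, and io.StringIO stands in for the real file so the example runs anywhere.

```python
import io


def count_characters(stream):
    """Return (upper, lower, digit, space) counts for a text stream."""
    upper = lower = digit = space = 0
    for line in stream:  # the stream yields one line at a time
        for character in line:
            if character.isupper():
                upper += 1
            if character.islower():
                lower += 1
            if character.isdigit():
                digit += 1
            if character.isspace():
                space += 1
    return upper, lower, digit, space


# Stand-in for open("module3.txt", "r"), using the sample contents above.
sample = io.StringIO("this Is a TEST\nand this is another line")
result = count_characters(sample)
print("Total count is %d Upper case, %d Lower case, %d Digit(s) and %d spaces." % result)
```

With a real file you would pass the object returned by open() inside a with block; the counting logic stays the same.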

As for the directory question: if your code and your text file (module3.txt) are going to be shipped in the same directory, you usually don't need the chdir. A relative path is resolved against the current working directory, which is the directory the script was launched from. When the script is started from its own folder, that is the directory where the script is.

Let's say you ship it in a directory like:

  |-> Count
     |-> script.py
     |-> module3.txt

You can just use a relative path to open module3.txt from within script.py: the line open("module3.txt", "r") will look for a file called module3.txt within the current working directory (meaning Count\ when the script is started from there). In that case you don't need the call to os.chdir. If you still want to make sure, you could chdir to the directory where the script is located.

Knowing that, change your hardcoded chdir line (os.chdir(r'M:\Project\Count') at the top of your file) to:

print("Changing current directory to %s" % os.path.dirname(os.path.realpath(__file__)))
os.chdir(os.path.dirname(os.path.realpath(__file__)))
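An alternative that avoids changing the working directory at all is to build the file's path from the script's own location. This is a sketch under assumptions: path_next_to_script is a hypothetical helper name, and a literal path replaces __file__ only so the example can run outside a script.

```python
import os


def path_next_to_script(script_file, filename):
    """Build the path of `filename` in the same directory as `script_file`."""
    script_dir = os.path.dirname(os.path.realpath(script_file))
    return os.path.join(script_dir, filename)


# In a real script you would pass __file__; a literal path is used here
# only so the sketch can run anywhere.
example = path_next_to_script(os.path.join("Count", "script.py"), "module3.txt")
print(example)
```

The result can be passed straight to open(), so the script finds module3.txt regardless of where it was launched from.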
