Differences between enumerate(fileinput.input(file)) and enumerate(file)

I'm looking for some help with my code, which is right below:

for file in file_name :
    if os.path.isfile(file):
        for line_number, line in enumerate(fileinput.input(file, inplace=1)):
            print file
            os.system("pause")
            if line_number ==1:
                line = line.replace('Object','#Object')
                sys.stdout.write(line)

I want to modify some previously extracted files in order to plot them with matplotlib. To do so, I remove some lines and comment out some others.

My problem is the following:

  • Using for line_number, line in enumerate(fileinput.input(file, inplace=1)): gives me only 4 of the 5 previously extracted files (even though the file_name list contains 5 entries!)

  • Using for line_number, line in enumerate(file): gives me all 5 previously extracted files, BUT I don't know how to modify a file in place without creating another one...

Do you have any idea about this issue? Is this normal behaviour?

There are a number of things that might help you.

Firstly, file_name appears to be a list of file names. It might be better named file_names, and then you could use file_name for each one. You have verified that this does hold 5 entries.

The enumerate() function helps when looping over a list of items by providing both an index and the item on each iteration. This saves you from having to use a separate counter variable, e.g.

for index, item in enumerate(["item1", "item2", "item3"]):
    print(index, item)

would print:

0 item1
1 item2
2 item3
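Since line numbers are usually counted from 1, it may help to know that enumerate() also accepts a start argument. A small sketch, with made-up list contents purely for illustration:

```python
# Illustrative data only; any iterable, including an open file, works the same way.
lines = ["header", "Object A", "Object B"]

# start=1 makes the counter begin at 1 instead of the default 0.
numbered = list(enumerate(lines, start=1))
for line_number, line in numbered:
    print(line_number, line)
```

With the default start of 0, the test line_number == 1 in the question matches the second line of the file, not the first.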

This is not really required here, as you have chosen to use the fileinput library. It is designed to take a list of files and iterate over all of the lines in all of the files in one single loop. As such, you need to tweak your approach a bit. Assuming your list of files is called file_names, you could write something as follows:

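import fileinput
import os
import sys

# Keep only the entries that are existing files
file_names = [file_name for file_name in file_names if os.path.isfile(file_name)]

# Iterate over all lines in all files
for line in fileinput.input(file_names, inplace=1):
    # filelineno() is 1-based and restarts for each file
    if fileinput.filelineno() == 1:
        line = line.replace('Object','#Object')
        # With inplace=1, only what is written to stdout is kept in the file
        sys.stdout.write(line)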

The main point here is that it is better to filter out anything that is not an existing file before passing the list to fileinput. I will leave it up to you to fix the output: with inplace=1, standard output is redirected into the file being processed, so any line that is not written back is dropped from the file.
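One possible way to handle the output is to write every line back and edit only the one you care about. The following is a minimal sketch rather than a definitive fix: comment_first_line and the sample file are hypothetical names made up for this example, and it edits the first line of each file (filelineno() counts from 1).

```python
import fileinput
import os
import sys
import tempfile


def comment_first_line(file_names, old="Object", new="#Object"):
    """Hypothetical helper: rewrite each existing file in place, editing only its first line."""
    existing = [name for name in file_names if os.path.isfile(name)]
    if not existing:
        # An empty list would make fileinput fall back to reading sys.stdin.
        return existing
    with fileinput.input(existing, inplace=True) as stream:
        for line in stream:
            if stream.filelineno() == 1:
                line = line.replace(old, new)
            # Write every line back, edited or not, so nothing is lost.
            sys.stdout.write(line)
    return existing


# Demonstration on a throwaway file (assumed sample content).
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.txt")
    with open(path, "w") as f:
        f.write("Object name\nObject value\n")
    comment_first_line([path, os.path.join(tmp_dir, "missing.txt")])
    with open(path) as f:
        result = f.read()
```

Note that any debugging print() inside such a loop also ends up in the file, because standard output is redirected while in-place editing is active.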

fileinput provides a number of functions to help you figure out which file and which line number are currently being processed.
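As a rough illustration of those functions, the sketch below reads two throwaway files (their names and contents are made up for this example) and records what filename(), filelineno(), lineno() and isfirstline() report for each line:

```python
import fileinput
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create two small sample files (assumed content, for illustration only).
    paths = []
    for name, text in [("a.txt", "a1\na2\n"), ("b.txt", "b1\n")]:
        path = os.path.join(tmp_dir, name)
        with open(path, "w") as f:
            f.write(text)
        paths.append(path)

    seen = []
    with fileinput.input(paths) as stream:
        for line in stream:
            seen.append((
                os.path.basename(stream.filename()),  # file currently being read
                stream.filelineno(),                  # line number within that file
                stream.lineno(),                      # cumulative line number
                stream.isfirstline(),                 # True on the first line of each file
            ))

for record in seen:
    print(record)
```

filelineno() restarts at 1 for each file, while lineno() keeps counting across all of them.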

Assuming you're still having trouble, my typical approach is to open the file read-only, read its contents into a variable, close the file, build an edited variable, open the file for writing (wiping out the original file), and finally write the edited contents.

I like this approach since I can simply change the file_name that gets written out if I want to test my edits without wiping out the original file.

Also, I recommend naming containers using plural nouns, as @Martin Evans suggests.

import os

file_names = ['file_1.txt', 'file_2.txt', 'file_3.txt', 'file_4.txt', 'file_5.txt']
file_names = [x for x in file_names if os.path.isfile(x)] # see @Martin's answer again

for file_name in file_names:
    # Open read-only and put contents into a list of line strings
    with open(file_name, 'r') as f_in:
        lines = f_in.read().splitlines()

    # Put the lines you want to write out in out_lines
    out_lines = []
    for index_no, line in enumerate(lines):
        if index_no == 1:
            out_lines.append(line.replace('Object', '#Object'))
        # elif ...: add further conditions here
        else:
            out_lines.append(line)

    # Uncomment to write to different file name for edits testing
    # with open(file_name + '.out', 'w') as f_out:
    #     f_out.write('\n'.join(out_lines) + '\n')

    # Write out the file, clobbering the original
    # (splitlines() drops the line endings, so add the final newline back)
    with open(file_name, 'w') as f_out:
        f_out.write('\n'.join(out_lines) + '\n')

The downside of this approach is that each file needs to be small enough to fit into memory twice (lines + out_lines).
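If memory is a concern, a different technique is to stream the file line by line into a temporary file and then swap it into place with os.replace(). This is only a sketch under assumptions: edit_file_streaming and comment_second_line are hypothetical names invented for this example, and the replaced file ends up with the temporary file's permissions, which may differ from the original's.

```python
import os
import tempfile


def edit_file_streaming(file_name, edit_line):
    """Hypothetical helper: apply edit_line(index, line) to each line, one line at a time."""
    directory = os.path.dirname(os.path.abspath(file_name))
    # Create the temporary file next to the original so the final rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f_out, open(file_name, "r") as f_in:
            for index_no, line in enumerate(f_in):
                f_out.write(edit_line(index_no, line))
        # Swap the edited copy into place only after it was written completely.
        os.replace(tmp_name, file_name)
    except BaseException:
        os.remove(tmp_name)
        raise


def comment_second_line(index_no, line):
    # Same edit as in the loop above: index 1 is the second line.
    return line.replace("Object", "#Object") if index_no == 1 else line


# Demonstration on a throwaway file (assumed sample content).
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.txt")
    with open(path, "w") as f:
        f.write("header\nObject value\nObject other\n")
    edit_file_streaming(path, comment_second_line)
    with open(path) as f:
        streamed_result = f.read()
```

Only one line is held in memory at a time, and the original file stays untouched if an error occurs before the swap.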

Best of luck!
