
How to merge all files of a directory except a specified one?

I want to merge all files with the extension .asc in my current working directory into a file called outfile.asc.

My problem is that I don't know how to exclude a specific file ("BigTree.asc"), or how to overwrite an existing "outfile.asc" if there is one in the directory.

if len(sys.argv) < 2:
    print("Please supply the directory of the ascii files and an output-file as argument:")
    print("python merge_file.py directory outfile")
    exit()
directory = sys.argv[1]

os.chdir(directory)
currwd = os.getcwd()

filename = sys.argv[2]
fileobj_out = open(filename, "w") 

starttime = time.time()

read_files = glob.glob(currwd+"\*.asc")

with open("output.asc", "wb") as outfile:
    for f in read_files:
        with open(f, "rb") as infile:
            if f == "BigTree.asc":
                continue
            else:
                outfile.write(infile.read())

endtime = time.time()
runtime = int(endtime-starttime)
sys.stdout.write("The script took %i sec." %runtime)


The problem is that glob returns the filenames with their full path, so the comparison f == "BigTree.asc" never matches. I made some minor changes to your code that should work now. For instance, use in instead of ==.

if len(sys.argv) < 2:
    print("Please supply the directory of the ascii files and an output-file as argument:")
    print("python merge_file.py directory outfile")
    exit()
directory = sys.argv[1]

os.chdir(directory)

filename = sys.argv[2]
fileobj_out = open(filename, "w") 

starttime = time.time()

read_files = glob.glob(os.path.join(os.getcwd(), "*.asc"))  # currwd is not defined in this version

# Change [1]
with open("output.asc", "ab") as outfile:
    for f in read_files:
        with open(f, "rb") as infile:
            # Change [2] '==' for 'in'
            if "BigTree.asc" in f:
                continue
            else:
                outfile.write(infile.read())

endtime = time.time()
runtime = int(endtime-starttime)
sys.stdout.write("The script took %i sec." %runtime)

Explanation

[1] Changed the file mode from 'wb' (write binary mode) to 'ab' (append binary mode). This way, if the file already exists, new data is appended to it instead of replacing its contents; keep 'wb' if you want the existing file to be overwritten.

[2] Changed "==" to "in". This way, if the file path f contains the string BigTree.asc, that file is skipped and the loop continues.
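The difference between the two checks can be sketched as follows. This is a minimal illustration, not part of the original answer: the helper name is_excluded and the example paths are hypothetical. It also shows one caveat of the substring check, namely that it matches any path containing the text, and a stricter alternative that compares only the final path component with os.path.basename.

```python
import os


def is_excluded(path, excluded_name="BigTree.asc"):
    """Return True if the last path component equals excluded_name exactly."""
    return os.path.basename(path) == excluded_name


# Hypothetical paths of the kind glob returns when given a full-path pattern.
target = os.path.join("some", "dir", "BigTree.asc")
lookalike = os.path.join("some", "dir", "NotBigTree.asc")

print(target == "BigTree.asc")       # the whole path is compared, so this is False
print("BigTree.asc" in target)       # substring check matches
print("BigTree.asc" in lookalike)    # substring check also matches this file
print(is_excluded(target))           # exact match on the file name
print(is_excluded(lookalike))        # the lookalike is not excluded
```

Whether the substring check is good enough depends on the file names actually present in the directory.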

Please let me know if this helps!

Thanks :D

Given the following section of your code:

with open("output.asc", "wb") as outfile:
    for f in read_files:
        with open(f, "rb") as infile:
            if f == "BigTree.asc":
                continue
            else:
                outfile.write(infile.read())

Updated section

  1. Change open("output.asc", "wb") to use the file name from filename = sys.argv[2]
    • If output.asc is passed as the second argument, it will be overwritten ( wb ) or appended to ( ab ), depending on the specified mode.
  2. Opening a file that is then skipped is inefficient
    • Check for BigTree.asc before opening the file, using not in
    • Use negative conditionals as guard clauses to flatten your code.
with open(filename, "wb") as outfile:
    for f in read_files:
        if "BigTree.asc" not in f:
            with open(f, "rb") as infile:
                outfile.write(infile.read())

Reading and writing modes

| Access Modes | Description                                                   |
|--------------|---------------------------------------------------------------|
| r            | Opens a file for reading only.                                |
| rb           | Opens a file for reading only in binary format.               |
| r+           | Opens a file for both reading and writing.                    |
| rb+          | Opens a file for both reading and writing in binary format.   |
| w            | Opens a file for writing only.                                |
| wb           | Opens a file for writing only in binary format.               |
| w+           | Opens a file for both writing and reading.                    |
| wb+          | Opens a file for both writing and reading in binary format.   |
| a            | Opens a file for appending.                                   |
| ab           | Opens a file for appending in binary format.                  |
| a+           | Opens a file for both appending and reading.                  |
| ab+          | Opens a file for both appending and reading in binary format. |
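For the question's "overwrite an existing outfile.asc" requirement, the relevant contrast is between wb and ab. The following is a small sketch, assuming a throwaway temporary file; the helper name write_twice is made up for illustration. It writes two chunks, reopening the file each time, and returns what ends up on disk.

```python
import os
import tempfile


def write_twice(mode):
    """Write two chunks to a fresh temp file, reopening it with `mode` each time."""
    fd, path = tempfile.mkstemp(suffix=".asc")
    os.close(fd)
    try:
        for chunk in (b"first\n", b"second\n"):
            with open(path, mode) as fh:
                fh.write(chunk)
        with open(path, "rb") as fh:
            return fh.read()
    finally:
        os.remove(path)  # clean up the temporary file


print(write_twice("wb"))  # b'second\n'  (each open truncates the file)
print(write_twice("ab"))  # b'first\nsecond\n'  (each open appends)
```

So a script that should replace an old merge result on every run wants wb, while ab keeps growing the file across runs.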

Full Program

  • The line fileobj_out = open(filename, "w") should be removed: it opens the output file a second time, and that handle is never used or closed.
import glob
import sys
import os
import time


if len(sys.argv) < 3:  # both arguments (directory and outfile) are required
    print("Please supply the directory of the ascii files and an output-file as argument:")
    print("python merge_file.py directory outfile")
    exit()
directory = sys.argv[1]

os.chdir(directory)

currwd = os.getcwd()

filename = sys.argv[2]

starttime = time.time()

read_files = glob.glob(os.path.join(currwd, "*.asc"))  # portable; avoids the invalid "\*" escape

with open(filename, "wb") as outfile:  # "wb" or "ab", if you want to append or not
    for f in read_files:
        if "BigTree.asc" not in f:
            with open(f, "rb") as infile:
                outfile.write(infile.read())

endtime = time.time()
runtime = int(endtime-starttime)
sys.stdout.write("The script took %i sec." %runtime)
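One case the program above does not handle is an output file that itself ends in .asc and lives in the same directory: it would be picked up by the glob pattern as well. The following is a possible variant, a sketch rather than the answer's own code. The function name merge_asc_files, the sorted file order, the use of os.path.samefile to skip the output file, and shutil.copyfileobj for chunked copying are all choices made for this illustration.

```python
import glob
import os
import shutil


def merge_asc_files(directory, out_name, excluded_name="BigTree.asc"):
    """Concatenate every *.asc file in `directory` into `out_name`, overwriting it.

    Skips `excluded_name` and the output file itself.
    Returns the list of file names that were merged, in merge order.
    """
    out_path = os.path.join(directory, out_name)
    merged = []
    with open(out_path, "wb") as outfile:  # "wb" truncates any existing output
        for path in sorted(glob.glob(os.path.join(directory, "*.asc"))):
            name = os.path.basename(path)
            # Guard clause: skip the excluded file and the output file itself.
            if name == excluded_name or os.path.samefile(path, out_path):
                continue
            with open(path, "rb") as infile:
                shutil.copyfileobj(infile, outfile)  # copies in chunks
            merged.append(name)
    return merged
```

A call such as merge_asc_files(".", "outfile.asc") would then merge everything in the current directory except BigTree.asc and outfile.asc. Copying in chunks avoids reading each input file fully into memory, which may matter for large .asc files.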

Notice: The technical posts on this site are licensed under CC BY-SA 4.0. If you repost them, please cite this site's URL or the original address.
