Python: Is there a way to extract and concatenate several series of text files, dropping the top 3 rows of each file as I go along?

I have a folder containing several series of text files, each containing a single column of residuals from some analysis. Their file names are like this:

'residual_x01'
'residual_x02'
...
'residual_y01'
'residual_y02'
...
'residual_z01'
'residual_z02'

The contents of the files look like this:

1 ### This is the file number in the series
c:\file\location\goes\here
983 1051 0 0 983 1051 ### other identifier
1.1 ### this is where the data I want starts
3.5
0.8
0.7
1.3
... ## so on for about a million lines.

Using Python, I would like to extract the residuals from these files, concatenate them into one long file for each series (i.e. x, y, z), and remove the top three lines of each file as I go, i.e. to form this:

1.1 ### data from first file of series 'residual_x01 / _y01 / _z01'
3.5
0.8
0.7
1.3
...
1.1 ### data from second file of series 'residual_x02 / _y02 / _z02'
3.5
0.8
0.7
1.3
...
1.1 ### data from third file of series 'residual_x03 / _y03 / _z03'
3.5
0.8
0.7
1.3
... ... and so on.

I am at a loss as to how to do this. Can anyone help?

You didn't provide much data, so I made some bogus data. I didn't want to make a bunch of files, so I only made three fake data files, but the code should work for any number of files, and the length of each file can vary, too.

Let's say you've got the following three text files:

files/residual_x01.txt

1
c:\file\location\goes\here
983 1051 0 0 983 1051
1.1
3.5
0.8
0.7
1.3

files/residual_x02.txt

2
c:\file\location\goes\here
983 1051 0 0 983 1051
7.1
8.4
0.3
2.3
0.1

files/residual_y01.txt

1
c:\file\location\goes\here
983 1051 0 0 983 1051
4.2
4.3
1.3
0.2
0.0
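
To reproduce the example, the three sample files above can be generated with a small script. This is only a convenience sketch: the helper name write_sample_files and the SAMPLES table are made up for illustration, and the header lines are copied from the listings above.

```python
from pathlib import Path

# The two header lines that follow the file number in every sample file.
HEADER = "c:\\file\\location\\goes\\here\n983 1051 0 0 983 1051\n"

# File name -> (file number on line 1, data values), as listed above.
SAMPLES = {
    "residual_x01.txt": ("1", ["1.1", "3.5", "0.8", "0.7", "1.3"]),
    "residual_x02.txt": ("2", ["7.1", "8.4", "0.3", "2.3", "0.1"]),
    "residual_y01.txt": ("1", ["4.2", "4.3", "1.3", "0.2", "0.0"]),
}


def write_sample_files(directory="files"):
    """Create the sample files in `directory` and return their paths."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, (file_number, values) in SAMPLES.items():
        path = out_dir / name
        # Three header lines, then one value per line.
        path.write_text(file_number + "\n" + HEADER + "\n".join(values) + "\n")
        paths.append(path)
    return paths
```

Calling write_sample_files() with no argument creates the files in a files/ folder next to the script.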

Code:

from itertools import islice
from pathlib import Path


def get_file_lines(path_to_file, number_of_lines_to_skip=3):
    # Yield the data lines of one file, skipping the header lines.
    with path_to_file.open("r") as file:
        for line in islice(file, number_of_lines_to_skip, None):
            line = line.strip()
            if line:  # ignore blank lines, e.g. at the end of the file
                yield line


def get_all_floats(path_to_dir):
    # glob() does not guarantee any order, so sort the paths to get
    # x01, x02, ..., y01, ... deterministically.
    for path in sorted(Path(path_to_dir).glob("residual_*.txt")):
        for line in get_file_lines(path):
            yield float(line)


def main():
    for f in get_all_floats("files/"):
        print(f)

    return 0


if __name__ == "__main__":
    import sys
    sys.exit(main())

Output:

1.1
3.5
0.8
0.7
1.3
7.1
8.4
0.3
2.3
0.1
4.2
4.3
1.3
0.2
0.0
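
The code above prints the values of every series as one stream. To get one output file per series, as the question asks, the files can first be grouped by their series letter. The following is a minimal sketch under stated assumptions: names follow the pattern residual_<letter><number> with an optional .txt suffix, and the function names and the combined_<letter>.txt output names are hypothetical.

```python
import re
from collections import defaultdict
from itertools import islice
from pathlib import Path

# Assumed naming scheme: residual_<series letter><number>, optional .txt suffix.
NAME_PATTERN = re.compile(r"residual_([a-z])(\d+)(?:\.txt)?$")


def group_by_series(directory):
    """Map each series letter to its files, ordered by file number."""
    groups = defaultdict(list)
    for path in Path(directory).iterdir():
        match = NAME_PATTERN.match(path.name)
        if match and path.is_file():
            groups[match.group(1)].append((int(match.group(2)), path))
    return {
        series: [path for _, path in sorted(items)]
        for series, items in sorted(groups.items())
    }


def concatenate_series(directory, output_dir, lines_to_skip=3):
    """Write one combined_<series>.txt per series, skipping each file's header."""
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = {}
    for series, paths in group_by_series(directory).items():
        target = out_dir / f"combined_{series}.txt"
        with target.open("w") as outfile:
            for path in paths:
                with path.open() as infile:
                    # Stream line by line so a million-line file is never
                    # held in memory as a whole.
                    for line in islice(infile, lines_to_skip, None):
                        # Guard against a missing newline at the end of a file.
                        outfile.write(line if line.endswith("\n") else line + "\n")
        written[series] = target
    return written
```

For example, concatenate_series("files", "combined") would write combined/combined_x.txt and combined/combined_y.txt for the sample data.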

For each series, you can create a file containing all the lines from its files, except the first 3 lines of each, using this code:

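from itertools import islice

# "..." is a placeholder: list the remaining files of the series here.
filenames = ['residual_x01', 'residual_x02', ...]
output_file = 'path/to/output/residual_x'
lines_to_skip = 3
with open(output_file, 'w') as outfile:
    for fname in filenames:
        with open(fname) as infile:
            # Skip the header lines, then copy the rest without
            # loading the whole file into memory.
            outfile.writelines(islice(infile, lines_to_skip, None))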

Change the filenames list and output_file according to your needs. You can also adjust the lines_to_skip variable.
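
Rather than typing the filenames list by hand, it could be built from the folder contents. This is a small sketch, assuming the files of a series share the prefix residual_<letter> and use zero-padded numbers; the helper name series_filenames is hypothetical.

```python
from pathlib import Path


def series_filenames(directory, series):
    """Return the paths of one series (e.g. "x") as strings, sorted by name."""
    # Zero-padded numbers (x01, x02, ...) sort correctly as plain strings.
    return sorted(
        str(path)
        for path in Path(directory).glob(f"residual_{series}*")
        if path.is_file()
    )
```

The result of series_filenames("path/to/input", "x") can be assigned to filenames. Keep output_file in a different folder, since a file named residual_x in the input folder would also match the pattern.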
