
Read data from text files having same name structure and append all data into a new file

I have some data files, say data1.txt, data2.txt, and so on. I want to read all of these files in a single loop and append their values to a single file, say data-all.txt.

I am fine with any of the following programming languages: C, Python, MATLAB.

The pathlib module is great for globbing matching files and for easy reading and writing:

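from pathlib import Path

def all_lines(directory, mask, skip):
    # Yield every line of every matching file, in sorted name order.
    for path in sorted(Path(directory).glob(mask)):
        if path.name == skip:
            continue  # do not read an older output file back in
        with path.open() as f:
            yield from f

Path('data_all.txt').write_text(''.join(all_lines('.', 'data*.txt', 'data_all.txt')))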
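One thing the one-liner above does not handle is an input file whose last line has no trailing newline: its last value would run into the first value of the next file. A minimal sketch of a streaming variant that guards against this is below. The function name concat_text_files and its parameters are made up for illustration, and it assumes plain text files in the platform's default encoding.

```python
from pathlib import Path


def concat_text_files(directory, mask, out_name):
    """Concatenate every file matching mask in directory into out_name.

    Returns the list of input paths that were used.
    """
    out_path = Path(directory) / out_name
    # Collect the inputs before the output is opened, and skip the output
    # itself in case its name matches the mask.
    inputs = [p for p in sorted(Path(directory).glob(mask))
              if p.resolve() != out_path.resolve()]
    with out_path.open("w") as out:
        for path in inputs:
            last = ""
            with path.open() as src:
                for line in src:
                    out.write(line)
                    last = line
            # Add a newline if the file's last line lacked one.
            if last and not last.endswith("\n"):
                out.write("\n")
    return inputs
```

Calling concat_text_files('.', 'data*.txt', 'data_all.txt') would then rebuild data_all.txt from scratch on every run, reading one line at a time instead of holding all the data in memory.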

In Python, first create the list of all file paths. You can use the glob library for this.

import glob
import pandas as pd
# Replace the folder with your own; data*.txt matches data1.txt, data2.txt, ...
path_list = sorted(glob.glob('Path/To/Your/DataFolder/data*.txt'))

Then you can read the data using a list comprehension. It gives you a list of data frames, one per data file in your folder.

list_data = [pd.read_csv(x,sep='\t') for x in path_list]

pd.concat combines them into a single data frame:

data_all = pd.concat(list_data,ignore_index=True)

Now you can write the data frame to a single file.

data_all.to_csv('Path',sep=',')

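Use Python's zip and the csv module to achieve this within a single for loop. Note that zip places the files side by side (one column per file, one row per line number) and stops at the end of the shortest file.

For example: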

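import csv

# newline="" lets the csv module control line endings itself.
with open("data_all.csv", "w", newline="") as f, \
        open("data1.txt") as f1, open("data2.txt") as f2, open("data3.txt") as f3:
    csv_writer = csv.writer(f)
    for d1, d2, d3 in zip(f1, f2, f3):
        # Strip the trailing newline so it does not end up inside the CSV cells.
        csv_writer.writerow([d1.rstrip("\n"), d2.rstrip("\n"), d3.rstrip("\n")])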
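If the number of files is not fixed at three, the same idea can be sketched with contextlib.ExitStack and itertools.zip_longest. This is only an illustration: files_to_columns is a hypothetical helper name, and padding shorter files with empty cells is an assumption about what you would want.

```python
import csv
from contextlib import ExitStack
from itertools import zip_longest


def files_to_columns(in_names, out_name):
    """Write one CSV row per line number, with one column per input file."""
    with ExitStack() as stack:
        # ExitStack closes every opened file when the block exits.
        sources = [stack.enter_context(open(name)) for name in in_names]
        out = stack.enter_context(open(out_name, "w", newline=""))
        writer = csv.writer(out)
        # zip_longest runs until the longest file ends; missing cells become "".
        for cells in zip_longest(*sources, fillvalue=""):
            writer.writerow([cell.rstrip("\n") for cell in cells])
```

For example, files_to_columns(["data1.txt", "data2.txt", "data3.txt"], "data_all.csv") would give the same layout as the loop above, except that shorter files are padded rather than cutting the output short.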

This can be done by reading the content of each file and writing it to a single output file handle. The file names in your description contain numbers, so we may want to call sorted before reading them; note that sorted compares names as strings, so data10.txt comes before data2.txt. Set files_search_pattern to point at the input directory, e.g. 'PATH/*.txt', and adjust the output path "data-all.txt" in the same way.

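import glob

files_search_pattern = "*.txt"
output_name = "data-all.txt"

# The output also matches *.txt, so leave it out of the inputs on a rerun.
files = sorted(f for f in glob.glob(files_search_pattern) if f != output_name)

# Binary mode copies the bytes as-is, whatever the encoding or line endings.
with open(output_name, "wb") as output:
    for f in files:
        with open(f, "rb") as inputFile:
            output.write(inputFile.read())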
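With ten or more files, plain sorted puts data10.txt before data2.txt, because it compares the names character by character. A common workaround is a "natural" sort key that compares the digit runs as integers. The sketch below uses a hypothetical helper called natural_key and assumes the data*.txt naming from the question.

```python
import glob
import re


def natural_key(name):
    """Split a name into text and integer chunks so data2 sorts before data10."""
    # re.split with a capturing group keeps the digit runs in the result,
    # e.g. "data10.txt" -> ["data", "10", ".txt"].
    return [int(chunk) if chunk.isdecimal() else chunk
            for chunk in re.split(r"(\d+)", name)]


# Same glob as before, but ordered by the numeric part of each name.
files = sorted(glob.glob("data*.txt"), key=natural_key)
```

The resulting files list can be passed to the same write loop as above.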

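On Windows, use copy data*.txt data-all.txt

On Unix, use cat data*.txt >> data-all.txt

Because data-all.txt itself matches data*.txt, an existing output file can be picked up as input; writing to a name outside the pattern avoids this.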

