Extract columns from multiple data files and save them into separate columns in a new data frame

I have 12 data files in a folder, each of which looks like this:

,Unnamed: 0,Date & time Time (s)    00 Nil  01 P1   02 P2   03 P3
0,0,9/9/2013  03:24:36.956017   0.00E+00    0.00E+00    -6.25E-01   -6.25E-01   -4.93E-03
1,1,9/9/2013  03:24:36.966017   1.00E-02    0.00E+00    -6.25E-01   -6.25E-01   -4.49E-03
2,2,9/9/2013  03:24:36.976017   2.00E-02    0.00E+00    -6.25E-01   -6.25E-01   -5.29E-03
3,3,9/9/2013  03:24:36.986017   3.00E-02    0.00E+00    -6.25E-01   -6.25E-01   -4.08E-03
4,4,9/9/2013  03:24:36.996017   4.00E-02    0.00E+00    -6.25E-01   -6.25E-01   -3.32E-03
5,5,9/9/2013  03:24:37.006017   5.00E-02    0.00E+00    -6.25E-01   -6.25E-01   -4.80E-03
6,6,9/9/2013  03:24:37.016017   6.00E-02    0.00E+00    -6.25E-01   -6.25E-01   -4.37E-03
7,7,9/9/2013  03:24:37.026017   7.00E-02    0.00E+00    -6.25E-01   -6.25E-01   -3.68E-03
8,8,9/9/2013  03:24:37.036017   8.00E-02    0.00E+00    -6.25E-01   -6.25E-01   -4.59E-03
9,9,9/9/2013  03:24:37.046017   9.00E-02    0.00E+00    -6.25E-01   -6.25E-01   -3.65E-03
10,10,9/9/2013  03:24:37.056017 1.00E-01    0.00E+00    -6.25E-01   -6.25E-01   -3.93E-03
11,11,9/9/2013  03:24:37.066017 1.10E-01    0.00E+00    -6.25E-01   -6.26E-01   -3.86E-03
12,12,9/9/2013  03:24:37.076017 1.20E-01    0.00E+00    -6.25E-01   -6.25E-01   -3.89E-03

I want to extract one column (column 4, for example) from each data file and save them all in a new data frame as separate columns, with each column's header being the name of the file it came from. I have found this code:

    import glob
    import numpy as np
    import matplotlib.pyplot
    import pandas as pd

    file_list = glob.glob('*.dat')
    cols = [4] # add more columns here

    df = pd.DataFrame()

    for f in file_list:
        df = df.append(
            pd.read_csv(f, delimiter=r'\s+', header=None, usecols=cols),
            ignore_index=True,    
        )

    arr = df.values

However, the columns from all the data files are merged into a single column in the data frame. I want them in separate columns, with each column header being the name of the file the column was extracted from. Can anyone help me with this? I am a beginner in Python.

You can read each CSV file as a separate data frame, collect them in a list, and then process those data frames:

    # Read the chosen column(s) of each file into its own data frame
    dataFrames = []
    for f in file_list:
        df = pd.read_csv(f, delimiter=r'\s+', header=None, usecols=cols)
        dataFrames.append(df)
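
The loop above only collects the per-file data frames; they still have to be placed side by side. One possible way to finish the job, as a minimal sketch: `pd.concat` with `axis=1` joins along the columns rather than stacking rows, and passing a dict makes its keys the column headers. The helper name `columns_from_files` and its `skiprows` parameter are hypothetical, introduced only for illustration, and the sketch assumes whitespace-delimited files as in the question's code.

```python
import os

import pandas as pd


def columns_from_files(file_list, col=4, skiprows=0):
    """Return a DataFrame with one column per file, named after the file.

    Hypothetical helper: assumes whitespace-delimited files in which
    column index `col` exists on every row that is read.
    """
    series_by_name = {}
    for path in file_list:
        # usecols keeps only the requested column, so each frame has one column
        frame = pd.read_csv(path, sep=r'\s+', header=None,
                            usecols=[col], skiprows=skiprows)
        # Use the file name (without its directory) as the column header
        series_by_name[os.path.basename(path)] = frame.iloc[:, 0]
    # axis=1 places the series side by side instead of stacking the rows
    return pd.concat(series_by_name, axis=1)
```

It could then be called as `combined = columns_from_files(glob.glob('*.dat'), col=4)`. If the files start with a header line like the sample shown in the question, passing `skiprows=1` would probably be needed so that the header text does not end up in the data. Files of different lengths are aligned on the row index, with `NaN` filling the shorter columns.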
