I have many text files, each containing a single column of data with no header. The dtypes differ between files (float64, date).
I'm trying to write code which will:
- get all file names without extension -> create a list (this works!)
- read all files in one directory and concatenate them into one data frame with a single numbered index.
My code:
import os
import pandas as pd
filelist = os.listdir(path)  # Make a file list
file_names = [os.path.splitext(x)[0] for x in filelist]  # Remove file extension
Tried this (first option):
df_list = [pd.read_table(file) for file in filelist]
df = pd.concat(df_list,ignore_index=True)
...but I got 3 columns from 6 files, with completely messed-up data.
Also tried this (second option):
df = pd.DataFrame(columns=file_names)
for file in filelist:
    frame = pd.read_csv(file)
    df = df.append(frame, ignore_index=True)
...this also doesn't work.
Any advice would be appreciated.
Input
The Q*.txt files start with only zeros (about 100 values); after that, nonzero numbers appear.
Q1.txt Q2.txt T21.txt T22.txt
0 0 51.06 77.46
0 0 50.32 77.33
0 0 50.90 77.45
When I run the first option, I get:
filelist
>>>['Q1.txt', 'Q2.txt','T21.txt', 'T22.txt']
file_names
>>>['Q1', 'Q2','T21', 'T22']
df.dtypes
>>>0 object
>>>51.06 object
>>>77.46 object
>>>dtype: object
Output file
0 51.06 77.46
0 0
1 0
2 0
It looks like the first 2 files (those with zeros at the beginning) end up in one column. The second and third column names are the first values of files T21 and T22.
Thanks to @Viktor Kerkez, I've added header=None to the pd.read_table call, and now all files end up in one column with dtype=object.
How can I split the files into separate columns?
You can do the following:
import os
import pandas as pd

file_names = []
data_frames = []
for filename in os.listdir(path):
    name = os.path.splitext(filename)[0]
    file_names.append(name)
    # Join with path so the file is found even if path is not the working directory
    df = pd.read_csv(os.path.join(path, filename), header=None)
    df.rename(columns={0: name}, inplace=True)
    data_frames.append(df)
# axis=1 places each file's data in its own column
combined = pd.concat(data_frames, axis=1)
Here I renamed every DataFrame column to match the file name. You can leave that step out and just pass ignore_index=True to pd.concat.
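To try this without touching real data, here is a minimal, self-contained sketch of the same idea. It assumes pandas is installed; the helper name combine_columns and the sample values (modelled on the input shown in the question) are made up for illustration. It writes four small files to a temporary directory, reads each with header=None so the first value stays data, and concatenates along axis=1 so every file becomes its own column.

```python
import os
import tempfile

import pandas as pd


def combine_columns(directory):
    """Read every single-column file in `directory` into one DataFrame,
    one column per file, named after the file (hypothetical helper)."""
    frames = []
    for filename in sorted(os.listdir(directory)):
        name = os.path.splitext(filename)[0]
        # header=None keeps the first value as data instead of a column label;
        # names=[name] labels the single column with the file name.
        frame = pd.read_csv(
            os.path.join(directory, filename), header=None, names=[name]
        )
        frames.append(frame)
    # axis=1 places the frames side by side, aligned on the default row index.
    return pd.concat(frames, axis=1)


# Hypothetical sample data modelled on the question's input.
sample = {
    "Q1.txt": "0\n0\n0\n",
    "Q2.txt": "0\n0\n0\n",
    "T21.txt": "51.06\n50.32\n50.90\n",
    "T22.txt": "77.46\n77.33\n77.45\n",
}

with tempfile.TemporaryDirectory() as tmp_dir:
    for filename, content in sample.items():
        with open(os.path.join(tmp_dir, filename), "w") as handle:
            handle.write(content)
    combined = combine_columns(tmp_dir)

print(combined)
print(combined.dtypes)
```

Because each file is parsed on its own, every column keeps a dtype inferred from that file alone, rather than everything collapsing to object. If the files have different lengths, the shorter columns are padded with NaN at the end.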