Basically the original data has no headers but only value (but i have header list). The delimiter is '|'. Now what i try to do is to convert txt file to csv file by using. The csv file contains headers i have and corresponding values.
For example:
txt file looks like:
sadasd|dsdads|adsasd
value 1|value 2|value 3|value 4| value 5| value 100|value 101|value 102|value 103|value 104|value 105 value 200|value 201|value 202|value 203|value 204|value 205
sdasd|dsa|dsdad
and after converting .csv file will look like :
header 1,header 2, header 3, header 4, header 5,
value 1,value 2,value 3,value 4,value 5,
value 100,value 101,value 102,value 103,value 104,value 105
value 200,value 201,value 202,value 203,value 204,value 205
I just start to learn python and what my idea is:
delete first and last line.
use dictionary list: every column is a list with key (header i have). to dataframe
convert to .csv
so it looks like {'header 1': [value 1, value 100, value 200],'header 2': [value 2, value 101, value 201]. and then convert to .csv.
That's just my thought, or you have the easiest way but only using python.
Using csv
module
Ex:
import csv
with open(filename, "r") as infile:
data = []
for i in infile.readlines()[1:-1]: #Strip first and last line.
if i.strip():
data.extend(i.strip().split("|"))
data = [data[i:i+5] for i in range(0, len(data), 5)] #Split list to sub-list of 5 elements
print(data)
header = ["header 1","header 2", "header 3", "header 4", "header 5"]
with open(outfile, "w") as outfile: #Output CSV file
writer = csv.writer(outfile, delimiter=",")
writer.writerow(header) #Write Header
writer.writerows(data) #Write content.
Stitching up from parts in stackoverflow yields the following solution
import pandas as pd
mycolnames = ['col1','col2','col3','col4','col5']
# Use the sep argument to change your delimiter accordingly
df = pd.read_csv("foo.txt", sep="|")
# Set your column names to the data frame
df.columns = mycolnames
# Write your desired columns to csv
df['col1'].to_csv("bar.csv", sep=",")
Credits
@atomh33ls - How to read csv into record array in numpy?
@LangeHaare - set column names in pandas data frame from_dict with orient = 'index'
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.