I have the following CSV file, myFile.csv, which was exported from a pandas DataFrame:
# Comment line with information related to the business
customer_id column_1 column_2 column_3
123 A XX AG
456 B YY TT
# Comment line with other information
customer_id column_1 column_2 column_3
789 AA XX AG
111 BB YY TT
I want to edit this CSV so that all lines starting with # are grouped at the beginning of the file. That way, I can concatenate both pieces of data into a single table with one header row. Like this:
# Comment line with information related to the business
# Comment line with other information
customer_id column_1 column_2 column_3
123 A XX AG
456 B YY TT
789 AA XX AG
111 BB YY TT
Any ideas? Thank you very much!
Update:
I have this Python code to generate a test DataFrame:
import pandas as pd

input_data = {
    'customer_id': [123, 456],
    'column_1': ['A', 'B'],
    'column_2': ['XX', 'YY'],
    'column_3': ['AG', 'TT']
}
input_df = pd.DataFrame(input_data, columns=['customer_id', 'column_1', 'column_2', 'column_3'])
input_df.to_csv("test-matrix.csv", index=False)
a = "# Information as a comment"
# I am running the following twice, so I can have the concatenated tables, as this will happen in my code
with open("test-matrix.csv", 'a') as file:
    file.write(a + '\n')
    input_df.to_csv(file, index=False)
    print("APPENDING!")
with open("test-matrix.csv", 'a') as file:
    file.write(a + '\n')
    input_df.to_csv(file, index=False)
    print("APPENDING!")
df = pd.read_csv("test-matrix.csv")
print(df)
You can convert one CSV to another with the following script:
comments = []
header = ''
data = []
with open('myFile.csv', 'r') as f:
    lines = f.readlines()
for i, line in enumerate(lines):
    if line.startswith('#'):
        # Collect every comment line so they can be written first
        comments.append(line)
    elif not header or lines[i - 1].startswith('#'):
        # The first non-comment line, or the line right after a comment, is a header row
        header = line
    else:
        data.append(line)
with open('result.csv', 'w') as f:
    f.writelines(comments)
    f.write(header)
    f.writelines(data)
Output file will be:
# Comment line with information related to the business
# Comment line with other information
customer_id column_1 column_2 column_3
123 A XX AG
456 B YY TT
789 AA XX AG
111 BB YY TT
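As a variation on the same idea, here is a minimal sketch that wraps the logic in a function working on a list of lines, so it can be tried without touching any files. The function name hoist_comments and the comma-separated sample are my own illustration, not part of the original files. It assumes every table repeats exactly the same header line, and it detects repeated headers by comparing them with the first one instead of looking at the previous line.

```python
def hoist_comments(lines):
    """Return lines reordered as: all '#' comments, one header, then data rows.

    Assumption: every table in the input repeats the same header line.
    """
    comments, data = [], []
    header = None
    for raw in lines:
        if not raw.strip():
            continue  # skip blank lines
        # Normalise line endings so a last line without '\n' still compares equal
        line = raw.rstrip('\r\n') + '\n'
        if line.startswith('#'):
            comments.append(line)
        elif header is None:
            header = line  # first non-comment line is taken as the header
        elif line == header:
            continue  # drop repeated copies of the header
        else:
            data.append(line)
    return comments + ([header] if header is not None else []) + data


# Hypothetical sample mirroring the file in the question (comma-separated)
sample = [
    "# Comment line with information related to the business\n",
    "customer_id,column_1,column_2,column_3\n",
    "123,A,XX,AG\n",
    "456,B,YY,TT\n",
    "# Comment line with other information\n",
    "customer_id,column_1,column_2,column_3\n",
    "789,AA,XX,AG\n",
    "111,BB,YY,TT\n",
]
result = hoist_comments(sample)
print("".join(result), end="")
```

To use it on a real file, read the lines with f.readlines(), pass them to the function, and write the returned list with f.writelines(). The resulting file should then load as a single table with pd.read_csv("result.csv", comment="#"), since the comment parameter tells pandas to skip the # lines.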