
Edit concatenated CSV with comment lines at the top - Python

I have the following CSV file, myFile.csv, which was exported from a pandas DataFrame:

# Comment line with information related to the business
customer_id    column_1     column_2     column_3
    123           A            XX           AG
    456           B            YY           TT
# Comment line with other information
customer_id    column_1     column_2     column_3
    789           AA           XX           AG
    111           BB           YY           TT

I want to edit this CSV so that all lines starting with # are grouped at the beginning of the file. That way, I end up with a single table that concatenates both blocks of data under one header row, like this:

# Comment line with information related to the business
# Comment line with other information
customer_id    column_1     column_2     column_3
    123           A            XX           AG
    456           B            YY           TT
    789           AA           XX           AG
    111           BB           YY           TT

My csv file looks like this:

[image: enter image description here]

Any ideas? Thank you very much!

Update:

I have this Python code to generate a test DataFrame and the concatenated CSV file:

    import pandas as pd

    input_data = {
        'customer_id': [123, 456],
        'column_1': ['A', 'B'],
        'column_2': ['XX', 'YY'],
        'column_3': ['AG', 'TT'],
    }
    input_df = pd.DataFrame(input_data, columns=['customer_id', 'column_1', 'column_2', 'column_3'])

    # Write the first table (header + rows) to a fresh file
    input_df.to_csv("test-matrix.csv", index=False)

    a = "# Information as a comment"

    # I am running the following twice, so I can have the concatenated tables, as this will happen in my code
    for _ in range(2):
        with open("test-matrix.csv", 'a') as file:
            file.write(a + '\n')
            input_df.to_csv(file, index=False)
            print("APPENDING!")

    df = pd.read_csv("test-matrix.csv")

    print(df)

You can convert the file into the desired layout with the following script:

comments = []
header = ''
data = []

with open('myFile.csv', 'r') as f:
    for line in f:
        if not line.endswith('\n'):
            line += '\n'            # the last line may lack a trailing newline
        if line.startswith('#'):
            comments.append(line)   # collect every comment line
        elif not header:
            header = line           # the first non-comment line is the header
        elif line != header:
            data.append(line)       # keep data rows, skip repeated headers

# Write the comments first, then one header, then all data rows
with open('result.csv', 'w') as f:
    f.writelines(comments)
    f.write(header)
    f.writelines(data)

The output file will be:

# Comment line with information related to the business
# Comment line with other information
customer_id    column_1     column_2     column_3
    123           A            XX           AG
    456           B            YY           TT
    789           AA           XX           AG
    111           BB           YY           TT
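
Since the file comes from pandas in the first place, the same regrouping can also be sketched with pandas itself. The example below is only an illustration under some assumptions: the helper name split_comments_and_table and the in-memory sample text are hypothetical, the data is comma-separated as produced by to_csv, and no data value contains a # character (the comment='#' option of read_csv would cut such a value off).

```python
import io

import pandas as pd


def split_comments_and_table(text):
    """Return (comment_lines, DataFrame) for CSV text with repeated '#'/header blocks.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Keep the comment lines ourselves, because pandas discards them.
    comments = [line for line in text.splitlines() if line.startswith('#')]

    # comment='#' makes read_csv skip the comment lines; dtype=str keeps every
    # value as text so the repeated header rows can be compared and dropped.
    df = pd.read_csv(io.StringIO(text), comment='#', dtype=str)

    # Each repeated header shows up as a data row whose first cell equals
    # the first column name; filter those rows out.
    first = df.columns[0]
    df = df[df[first] != first].reset_index(drop=True)
    return comments, df


# Assumed sample, shaped like the file produced by the test code in the question.
sample = (
    "# Information as a comment\n"
    "customer_id,column_1,column_2,column_3\n"
    "123,A,XX,AG\n"
    "456,B,YY,TT\n"
    "# Information as a comment\n"
    "customer_id,column_1,column_2,column_3\n"
    "123,A,XX,AG\n"
    "456,B,YY,TT\n"
)

comments, table = split_comments_and_table(sample)

# Write the comments first, then the single combined table.
out = io.StringIO()
out.write('\n'.join(comments) + '\n')
table.to_csv(out, index=False)
print(out.getvalue())
```

Because every column is read as text here, numeric columns such as customer_id would need an explicit conversion afterwards (for example with pd.to_numeric) if they are used for calculations.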
