How to join columns in CSV files using Pandas in Python

I have a CSV file that looks something like this:

# data.csv (this line is not there in the file)
Names, Age, Names
John, 5, Jane
Rian, 29, Rath

And when I read it through Pandas in Python I get something like this:

import pandas as pd

data = pd.read_csv("data.csv")
print(data)

And the output of the program is:

  Names   Age  Names
0  John     5   Jane
1  Rian    29   Rath

Is there any way to get:

  Names   Age  
0  John     5   
1  Rian    29   
2  Jane
3  Rath

First, I'd suggest having unique names for each column. Either go into the CSV file and change one of the column headers, or rename the column in pandas.
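A minimal sketch of the rename-in-pandas option, assuming the file has the three columns shown in the question. The in-memory `csv_text` is a stand-in for data.csv, and 'Names2' is only an illustrative label:

```python
import io
import pandas as pd

# Stand-in for data.csv so the sketch is self-contained (assumption).
csv_text = "Names, Age, Names\nJohn, 5, Jane\nRian, 29, Rath\n"

# header=0 discards the file's own header row and names= supplies
# replacement labels; 'Names2' is an arbitrary, hypothetical choice.
# skipinitialspace=True drops the blank that follows each comma.
df = pd.read_csv(
    io.StringIO(csv_text),
    header=0,
    names=["Names", "Age", "Names2"],
    skipinitialspace=True,
)
print(df.columns.tolist())
```

Renaming at read time leaves the file on disk untouched, which may be preferable when the CSV is produced by another tool.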

Using 'Names2' as the header for the second occurrence of the duplicated column name, try this:

Starting from

datalist = [['John', 5, 'Jane'], ['Rian', 29, 'Rath']]
df = pd.DataFrame(datalist, columns=['Names', 'Age', 'Names2'])

We have

  Names  Age Names2
0  John    5   Jane
1  Rian   29   Rath

So, use:

names = pd.concat([df['Names'], df['Names2']]).reset_index(drop=True)
dff = (pd.concat([names, df.iloc[:, 1]], ignore_index=True, axis=1)
         .fillna('')
         .rename(columns=dict(enumerate(['Names', 'Age']))))

to get the layout you asked for.

From the inside out:
The first pd.concat stacks the two name columns into a single Series. (Older pandas versions offered Series.append for this; it was removed in pandas 2.0.)
The second pd.concat( ... ) places that Series next to the Age column.

To discover what the other commands do, I suggest removing them one-by-one and looking at the results.

Please forgive the formatting of dff. I'm trying to make everything clear from an educational perspective. Adjust the indentation and line breaks as needed so the code runs.
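For anyone who prefers to see the intermediate results rather than remove calls one by one, here is a rough step-by-step sketch of the same idea. The variable names (`stacked`, `names`, `combined`, `result`) are illustrative only:

```python
import pandas as pd

df = pd.DataFrame([["John", 5, "Jane"], ["Rian", 29, "Rath"]],
                  columns=["Names", "Age", "Names2"])

# Step 1: stack the two name columns; the index is 0, 1, 0, 1 at this point.
stacked = pd.concat([df["Names"], df["Names2"]])

# Step 2: renumber the rows so the index becomes 0, 1, 2, 3.
names = stacked.reset_index(drop=True)

# Step 3: put the ages alongside. Rows 2 and 3 have no age, so they
# become NaN, which also turns the Age column into floats.
combined = pd.concat([names, df["Age"]], axis=1)

# Step 4: label the columns and blank out the missing ages.
combined.columns = ["Names", "Age"]
result = combined.fillna("")
print(result)
```

Printing `stacked`, `names`, and `combined` in turn shows what each call contributes.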

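You can use:
usecols, which reads only the selected columns.
low_memory, which controls whether pandas internally processes the file in chunks. Chunked processing is the default (low_memory=True); passing low_memory=False turns it off, so column types are inferred from the whole file at once.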

import pandas as pd

data = pd.read_csv("data.csv", usecols=['Names', 'Age'], low_memory=False)
print(data)

Please use unique column names in your CSV file.
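As a small illustration of what usecols does, assuming the headers have already been made unique ('Names2' is a hypothetical label, and `csv_text` stands in for the file):

```python
import io
import pandas as pd

# Stand-in for a CSV whose headers are already unique (assumption).
csv_text = "Names,Age,Names2\nJohn,5,Jane\nRian,29,Rath\n"

# Only the listed columns are parsed; Names2 is never loaded.
subset = pd.read_csv(io.StringIO(csv_text), usecols=["Names", "Age"])
print(subset.columns.tolist())
print(len(subset))
```

Note that this selects columns rather than stacking them: the frame still has two rows, and the values from the second name column are not appended under Names.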
