How to combine many python dataframes to a single dataframe?

I have over 1500 Python dataframes that I need to combine into one large dataframe. The issue is that the dataframes have unique column headers and different sizes.

For example dataframe 1 is:

type    sc98*c.firstname    sc98*c.lastname    sc98*c.username    text                  createdAt    statusofExpiration
need    John                Doe                johndoe            I need a new car.     111111       expired

And dataframe 2 is:

type    l8!7s4fn.firstname    l8!7s4fn.lastname    l8!7s4fn.username    text                    tags.0    tags.1    image.0        createdAt    statusOfExpiration
need    Matt                  Smith                mattsmith            I need a yoga trainer.  yoga      trainer   blankurl.com/  222222       fulfilled

And I want to end up with a data frame like:

type    firstname    lastname    username    text                    createdAt    statusofExpiration    tags.0    tags.1    image.0
need    John         Doe         johndoe     I need a new car.       111111       expired       
need    Matt         Smith       mattsmith   I need a yoga trainer.  222222       fulfilled             yoga      trainer   blankurl.com/

As I mentioned, I can't select the values by index because the dataframes vary in size, and I can't select them by column name because each dataframe has a unique identifier (e.g. id.username) in its column headers.

Is there any way to get around this problem?

Since the data frames have unique column headers and different sizes, there is no simple way to concatenate them directly. I would recommend looking into the following:

df.filter(like='firstname')  # select columns containing the word firstname

This way you can loop through the column names in all of the data frames and rename them based on partial matches.
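The renaming step could be sketched as follows. This is only an illustration, not the answerer's exact method: the helper name strip_id_prefix and the PREFIXED_FIELDS tuple are hypothetical, and it assumes that firstname, lastname and username are the only fields that carry a per-dataframe identifier. Columns such as tags.0 or image.0 are left untouched.

```python
import pandas as pd

# Assumption: only these fields carry a per-dataframe id prefix.
PREFIXED_FIELDS = ("firstname", "lastname", "username")


def strip_id_prefix(df):
    """Return a copy of df with columns like 'sc98*c.firstname' renamed to 'firstname'."""
    mapping = {}
    for col in df.columns:
        for field in PREFIXED_FIELDS:
            # Match on the suffix so the unique id in front does not matter.
            if col.endswith("." + field):
                mapping[col] = field
                break
    return df.rename(columns=mapping)


# Dataframe 1 from the question.
df1 = pd.DataFrame({
    "type": ["need"],
    "sc98*c.firstname": ["John"],
    "sc98*c.lastname": ["Doe"],
    "sc98*c.username": ["johndoe"],
    "text": ["I need a new car."],
    "createdAt": [111111],
    "statusofExpiration": ["expired"],
})

renamed = strip_id_prefix(df1)
print(list(renamed.columns))
```

Applying the same function to every dataframe in a list gives them a shared set of column names, which is what a later concatenation needs in order to line the values up.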

Look into this post: Pandas rename colums with wildcard

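You can do this to concatenate or merge multiple data frames. Hope this helps!

import pandas as pd

# Placeholder data so the example runs; substitute your own lists.
firstname_list = ['John', 'Matt']
lastname_list = ['Doe', 'Smith']
value_list1 = [1, 2]
value_list2 = [3, 4]

df1 = pd.DataFrame(
    {
        'First Name': firstname_list,
        'Last Name': lastname_list,
    }
)

df2 = pd.DataFrame(
    {
        'Key1': value_list1,
        'Key2': value_list2,
    }
)

frames = [df1, df2]

# Columns that a frame does not have are filled with NaN in the result.
concatenated_df = pd.concat(frames)
concatenated_df.to_csv(r'dataset.csv', sep=',', index=False)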
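To show how pd.concat treats frames of different widths, here is a minimal sketch. It assumes the column names have already been made consistent across frames; the two small frames are cut-down versions of the examples in the question. Columns are aligned by name, and a column that one frame lacks is filled with NaN for that frame's rows.

```python
import pandas as pd

# Cut-down versions of the question's two frames, with consistent column names.
a = pd.DataFrame({
    "type": ["need"],
    "firstname": ["John"],
    "text": ["I need a new car."],
})
b = pd.DataFrame({
    "type": ["need"],
    "firstname": ["Matt"],
    "text": ["I need a yoga trainer."],
    "tags.0": ["yoga"],
})

# ignore_index=True renumbers the rows 0..n-1 instead of repeating each
# frame's own index; sort=False keeps columns in order of first appearance.
combined = pd.concat([a, b], ignore_index=True, sort=False)
print(combined)
```

With many frames, passing the whole list to a single pd.concat call is usually preferable to concatenating pairwise in a loop, since each call copies the data.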
