I have two pandas DataFrames, data and data1, both extracted from an unstructured Excel file.
data is a one-row DataFrame. data1 has multiple rows (the number varies depending on the original Excel file).
I want to concatenate them so that the values from data repeat for each row in data1, resulting in something like this:
data | data | data | data1 | data1 | data1 |
---|---|---|---|---|---|
One | Two | Three | asda | dsad | dsass |
One | Two | Three | dsad | dasda | dasds |
One | Two | Three | asda | asdsss | dsass |
One | Two | Three | adsa | dsad | asdds |
Is there an efficient way to do this? I've been doing it manually, but that takes too long because there are more than 1,000 files.
Best regards.
Try something like this:
pd.concat([data.reindex(data.index.repeat(len(data1))).reset_index(drop=True),
           data1],
          axis=1,
          ignore_index=True)
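Details: data.index.repeat(len(data1)) repeats the single index label of data once per row of data1, reindex then duplicates the row accordingly, and reset_index(drop=True) renumbers the rows so that they line up with data1 (which is assumed to have the default 0..n-1 index). Note that ignore_index=True with axis=1 relabels the columns as 0, 1, 2, ...; leave it out to keep the original column names.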
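As a minimal, runnable sketch of the approach above: the sample values come from the question, while the column names a, b, c and the helper name repeat_and_concat are assumptions made for illustration. It leaves out ignore_index=True so the original column names survive, and it resets the index of data1 in case that frame does not have a default index.

```python
import pandas as pd


def repeat_and_concat(data, data1):
    """Repeat the single row of `data` for every row of `data1`, side by side.

    Hypothetical helper name; assumes `data` has exactly one row.
    """
    # Repeat the one index label len(data1) times, then renumber the rows
    repeated = data.reindex(data.index.repeat(len(data1))).reset_index(drop=True)
    # Reset data1's index too, so both frames align on 0..n-1
    return pd.concat([repeated, data1.reset_index(drop=True)], axis=1)


# Sample frames; the column names a, b, c are assumed for illustration
data = pd.DataFrame([["One", "Two", "Three"]], columns=["a", "b", "c"])
data1 = pd.DataFrame({"col1": ["asda", "dsad", "adsa"],
                      "col2": ["dsad", "dasda", "asdsss"],
                      "col3": ["dsass", "dasds", "asdds"]})

result = repeat_and_concat(data, data1)
print(result)
```

Since the helper only needs the two frames, it could be called once per file inside a loop.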
You can do something like this:
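import pandas as pd

# Build a one-row DataFrame by transposing a single column
data = pd.DataFrame(data=['One', 'Two', 'Three'])
data = data.T
data1 = pd.DataFrame({"col1": ['asda', 'dsad', 'adsa'],
                      "col2": ['dsad', 'dasda', 'asdsss'],
                      "col3": ['dsass', 'dasds', 'asdds']})
# A cross join pairs the single row of data with every row of data1 (needs pandas 1.2+)
data.merge(data1, how='cross')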
which should give:
     0    1      2  col1    col2   col3
0  One  Two  Three  asda    dsad  dsass
1  One  Two  Three  dsad   dasda  dasds
2  One  Two  Three  adsa  asdsss  asdds
How you handle the column names is then up to you. Duplicate column names are best avoided, since they make column selection ambiguous, so the header data shouldn't be reused for several columns as in your example table.
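One possible way to keep the two sources distinguishable is to prefix the column names before merging. The sketch below assumes the prefixes data_ and data1_, which are purely illustrative, and a pandas version that supports how='cross' (1.2 or later).

```python
import pandas as pd

data = pd.DataFrame([["One", "Two", "Three"]])
data1 = pd.DataFrame({"col1": ["asda", "dsad", "adsa"],
                      "col2": ["dsad", "dasda", "asdsss"],
                      "col3": ["dsass", "dasds", "asdds"]})

# Prefix each frame's columns (illustrative prefixes), then cross join
merged = data.add_prefix("data_").merge(data1.add_prefix("data1_"), how="cross")

print(merged.columns.tolist())
# ['data_0', 'data_1', 'data_2', 'data1_col1', 'data1_col2', 'data1_col3']
```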