Suppose I have the following dataframe with columns name, preference, fruits:
name preference fruits
adam likes apples
mike dislikes orange
Now suppose the relationship is one-to-many: every value in column name should be paired with every value in columns preference and fruits. For example, the output dataframe I am looking for is:
name preference fruits
adam likes apples
adam likes orange
adam dislikes apples
adam dislikes orange
mike likes apples
mike likes orange
mike dislikes apples
mike dislikes orange
Is this possible? From what I know about pandas so far, I believe I will have to use groupby. Any help is appreciated! Thanks!
This looks like just a cross product of the columns:
(pd.MultiIndex.from_product([df[col] for col in df],
                            names=df.columns)
   .to_frame().reset_index(drop=True)
)
Output:
name preference fruits
0 adam likes apples
1 adam likes orange
2 adam dislikes apples
3 adam dislikes orange
4 mike likes apples
5 mike likes orange
6 mike dislikes apples
7 mike dislikes orange
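If you are on pandas 1.2 or later, the same table can probably also be built with a cross join through merge(how="cross"). This is a different technique from the MultiIndex one above, and the df below is my reconstruction of the question's data, so treat both as assumptions rather than the answerer's method:

```python
import pandas as pd

# Assumed reconstruction of the dataframe from the question.
df = pd.DataFrame({
    "name": ["adam", "mike"],
    "preference": ["likes", "dislikes"],
    "fruits": ["apples", "orange"],
})

# Start from the first column and cross-join each remaining
# single-column frame onto it, one column at a time.
result = df[["name"]]
for col in ["preference", "fruits"]:
    result = result.merge(df[[col]], how="cross")

print(result)
```

Each cross join multiplies the row count by the length of the joined column, so two rows in three columns give 2 * 2 * 2 = 8 rows.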
I'd use itertools.product:
import pandas as pd
from itertools import product

df = pd.DataFrame({
    'name': ['adam', 'mike'],
    'preference': ['likes', 'dislikes'],
    'fruits': ['apples', 'oranges']
})

# Build every combination of the column values, one tuple per output row.
ndf = pd.DataFrame(
    product(*[df[c] for c in df.columns]),
    columns=df.columns
)
print(ndf)
#    name preference   fruits
# 0  adam      likes   apples
# 1  adam      likes  oranges
# 2  adam   dislikes   apples
# 3  adam   dislikes  oranges
# 4  mike      likes   apples
# 5  mike      likes  oranges
# 6  mike   dislikes   apples
# 7  mike   dislikes  oranges
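One thing to watch for: if a column holds repeated values, the product will repeat combinations too. A minimal sketch of one way around that, assuming you only want distinct combinations, is to take the unique values of each column first. The extra third row below is made up purely for illustration:

```python
import pandas as pd
from itertools import product

# Hypothetical data with repeated values in every column.
df = pd.DataFrame({
    "name": ["adam", "mike", "adam"],
    "preference": ["likes", "dislikes", "likes"],
    "fruits": ["apples", "oranges", "oranges"],
})

# Series.unique() keeps values in order of first appearance,
# so the row order of the result stays predictable.
unique_values = [df[c].unique() for c in df.columns]

ndf = pd.DataFrame(product(*unique_values), columns=df.columns)
print(ndf)
```

Without the unique() step the same three-row frame would give 3 * 3 * 3 = 27 rows, many of them duplicates.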
As for speed, this also seems to be faster on this small two-row example, roughly 0.6 ms versus 3.5 ms per loop:
%%timeit
ndf = pd.DataFrame(
    product(*[df[c] for c in df.columns]),
    columns=df.columns
)
# 624 µs ± 32.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%%timeit
(pd.MultiIndex.from_product([df[col] for col in df],
                            names=df.columns)
   .to_frame().reset_index(drop=True)
)
# 3.51 ms ± 176 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
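The %%timeit cells above only work in IPython or Jupyter. Outside a notebook, a rough equivalent could be sketched with the standard-library timeit module. The function names and the loop count here are my own choices, not from the answers, and the timings will differ by machine and pandas version:

```python
import timeit
from itertools import product

import pandas as pd

df = pd.DataFrame({
    "name": ["adam", "mike"],
    "preference": ["likes", "dislikes"],
    "fruits": ["apples", "oranges"],
})

def with_itertools(frame):
    # Cross product via itertools.product, one tuple per row.
    return pd.DataFrame(
        product(*[frame[c] for c in frame.columns]),
        columns=frame.columns,
    )

def with_multiindex(frame):
    # Cross product via MultiIndex.from_product, then back to a frame.
    return (
        pd.MultiIndex.from_product(
            [frame[c] for c in frame.columns],
            names=list(frame.columns),
        )
        .to_frame()
        .reset_index(drop=True)
    )

# Sanity check that both approaches produce the same rows in the same order.
same_rows = with_itertools(df).values.tolist() == with_multiindex(df).values.tolist()
print("same rows:", same_rows)

# Total seconds for 100 calls of each approach.
loops = 100
t_itertools = timeit.timeit(lambda: with_itertools(df), number=loops)
t_multiindex = timeit.timeit(lambda: with_multiindex(df), number=loops)
print(f"itertools.product: {t_itertools / loops * 1e6:.0f} µs per loop")
print(f"MultiIndex:        {t_multiindex / loops * 1e6:.0f} µs per loop")
```

On a frame this small the numbers mostly reflect fixed overhead, so it is worth repeating the comparison with data closer to your real size before picking one.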