I am working with two dataframes and am trying to create multiple dataframes from df1 based on the row values of df2. I have been unable to find any documentation on how to do this.
import pandas as pd
import numpy as np

df1 = pd.DataFrame({
    'A': 'foo bar bro bir fin car zoo loo'.split(),
    'B': 'one one two three two two one three'.split(),
    'C': np.arange(8),
    'D': np.arange(8) * 2,
})
print(df1)

df2 = pd.DataFrame({
    'col1': 'foo bar bro bir'.split(),
    'col2': 'B B C B'.split(),
    'col3': 'D C D D'.split(),
})
print(df2)
How do I create a dataframe called 'foo' that takes only columns B and D from df1 (these column names are the inputs from df2)? The same goes for the other dataframes 'bar', 'bro' and 'bir'. For example, the output for df_foo and df_bar would be:
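# values are wrapped in lists because pd.DataFrame rejects a dict of bare scalars without an index
df_foo = pd.DataFrame({'B': ['one'], 'D': [0]})
df_bar = pd.DataFrame({'B': ['one'], 'C': [1]})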
I could not find any documentation on how this can be done.
What about using loc for (label-based) indexing? An example:
df1_ = df1.set_index('A')  # use column A as the row labels
print(df1_.loc[['foo'], ['B', 'D']])  # use `.loc` to select values by their row and column labels
#
#        B  D
# A
# foo  one  0
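As a small sketch of how the shape of the `.loc` result depends on whether you pass scalar labels or lists of labels (the variable names `cell`, `row` and `frame` are only illustrative):

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    'A': 'foo bar bro bir fin car zoo loo'.split(),
    'B': 'one one two three two two one three'.split(),
    'C': np.arange(8),
    'D': np.arange(8) * 2,
})
df1_ = df1.set_index('A')

cell = df1_.loc['foo', 'B']            # scalar row label, scalar column label -> single value
row = df1_.loc['foo', ['B', 'D']]      # scalar row label, list of columns -> Series
frame = df1_.loc[['foo'], ['B', 'D']]  # list of rows, list of columns -> DataFrame

print(type(cell).__name__, type(row).__name__, type(frame).__name__)
```

Passing lists for both axes is what keeps the result a one-row dataframe rather than a Series.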
So, to build a new dataframe that uses df2's rows as the inputs for selecting from df1, you can do
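df_all = pd.concat(
    (
        # one single-row frame per df2 row: row label from col1, columns from col2 and col3
        df1_.loc[[row.col1], [row.col2, row.col3]]
        for _, row in df2.iterrows()
    ),
    sort=True,  # order the combined columns alphabetically (B, C, D)
)
print(df_all)
#          B    C    D
# A
# foo    one  NaN  0.0
# bar    one  1.0  NaN
# bro    NaN  2.0  4.0
# bir  three  NaN  6.0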
And finally, an example with 'bar' (replace 'bar' with 'foo' or any other label):
df_bar = df_all.loc['bar'].dropna()
print(df_bar)
# B one
# C 1
# Name: bar, dtype: object
# or, to keep playing with dataframes
print(df_all.loc[['bar'], :].dropna(axis=1))
#        B    C
# A
# bar  one  1.0
If you have more than three columns, say 70-80 columns in df1, you can avoid naming each df2 column explicitly and do
idx = 'col1'
cols = [c for c in df2.columns if c != idx]  # every df2 column except the label column
df_agno = pd.concat(
    (
        df1_.loc[[row[idx]], row[cols].tolist()]
        for _, row in df2.iterrows()
    ),
    sort=True,  # order the combined columns alphabetically
)
print(df_agno)
#          B    C    D
# A
# foo    one  NaN  0.0
# bar    one  1.0  NaN
# bro    NaN  2.0  4.0
# bir  three  NaN  6.0
print(df_agno.loc[['bar'], :].dropna(axis=1))
#        B    C
# A
# bar  one  1.0
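If what you want is literally one dataframe per name, a possible alternative is to skip the concatenation and collect the per-row selections in a dict keyed by the label. This is only a sketch: the names `frames` and `selector_cols` are made up for illustration, and it assumes each label in col1 appears once in column A of df1. Since nothing is concatenated, no NaN padding is introduced and the integer columns keep their dtype.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    'A': 'foo bar bro bir fin car zoo loo'.split(),
    'B': 'one one two three two two one three'.split(),
    'C': np.arange(8),
    'D': np.arange(8) * 2,
})
df2 = pd.DataFrame({
    'col1': 'foo bar bro bir'.split(),
    'col2': 'B B C B'.split(),
    'col3': 'D C D D'.split(),
})

df1_ = df1.set_index('A')
# every df2 column except the one holding the row label
selector_cols = [c for c in df2.columns if c != 'col1']

# map each label in col1 to a one-row dataframe with only the requested columns
frames = {
    row['col1']: df1_.loc[[row['col1']], row[selector_cols].tolist()]
    for _, row in df2.iterrows()
}

print(frames['foo'])
print(frames['bar'])
```

A dict lookup such as `frames['foo']` then plays the role of a variable named df_foo, without having to create variables dynamically.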