pandas groupby on multiple columns with Python and Streamlit

I have a groupby call that should group by multiple columns so that I can plot a chart later. The dataframe's columns are dynamic: the user selects them from a selectbox and a multiselect widget. The problem is that right now I can only take the first or the last item from the multiselect widget, like so:

 some_columns_df = df.loc[:,['gender','country','city','hoby','company','status']]
 some_collumns = some_columns_df.columns.tolist()

 select_box_var= st.selectbox("Choose X Column",some_collumns)
 multiselect_var= st.multiselect("Select Columns To GroupBy",some_collumns)  

 test_g3 = df.groupby([select_box_var,multiselect_var[0]]).size().reset_index(name='count')

If the user selects more than one item from the multiselect, say four items, I would like it to behave like this (pseudocode):

 test_g3 = df.groupby([select_box_var,multiselect_var[0,1,2,3]]).size().reset_index(name='count')

Is this possible?

multiselect_var is a list, while select_box_var is a single value. Put select_box_var inside a list of its own and concatenate the two lists.

Try this:

 test_g3 = df.groupby([select_box_var] + multiselect_var).size().reset_index(name='count')

According to the Streamlit docs for multiselect, the API always returns a list. Your selectbox returns a string, because its options are a list of strings.

So your code can be modified to:

 test_g3 = df.groupby([select_box_var] + multiselect_var).size().reset_index(name='count')
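
To make the list concatenation concrete, here is a minimal sketch that runs without Streamlit. The sample data, the helper name count_by_columns, and the hard-coded widget values are assumptions for illustration, not part of the original code. The helper also removes duplicate keys, because choosing the same column in both widgets would otherwise make reset_index fail when it tries to insert a column that already exists.

```python
import pandas as pd


def count_by_columns(df, x_column, group_columns):
    """Count rows for each combination of x_column and group_columns."""
    # Build one flat list of keys. dict.fromkeys drops duplicates while
    # keeping the original order, so a column picked in both widgets is
    # only used once.
    keys = list(dict.fromkeys([x_column, *group_columns]))
    return df.groupby(keys).size().reset_index(name="count")


# Made-up sample data standing in for the real dataframe.
df = pd.DataFrame({
    "gender": ["F", "M", "F", "M"],
    "country": ["LB", "LB", "FR", "LB"],
    "city": ["Beirut", "Beirut", "Paris", "Tripoli"],
})

# Plain values standing in for what st.selectbox (a string) and
# st.multiselect (a list of strings) would return.
select_box_var = "gender"
multiselect_var = ["country", "city"]

result = count_by_columns(df, select_box_var, multiselect_var)
print(result)
```

An empty multiselect returns an empty list, so the same call then simply groups by the selectbox column alone.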
