
How to crosstab columns from dataframe based on a condition?

I often need cross tables for pre-analysis of my data. I can produce a basic cross table with pd.crosstab(df['column'], df['column']), but I fail to add a criterion (a logical expression) that restricts the cross table to a subset of my dataframe.

I've tried pd.crosstab(df['health'], df['money']) if df['year']==1988, with the if in several positions. I hope it's easy to solve, but I'm relatively new to Python and Pandas.

import pandas as pd
df = pd.DataFrame({'year': ['1988', '1988', '1988', '1988', '1989', '1989', '1989', '1989'],
                   'health': ['2', '2', '3', '1', '3', '5', '2', '1'],
                   'money': ['5', '7', '8', '8', '3', '3', '7', '8']}).astype(int)

# cross table for 1988 and 1989 combined
pd.crosstab(df['health'], df['money'])

Filter the rows with boolean indexing before calling crosstab:

df1 = df[df['year']==1988]
df2 = pd.crosstab(df1['health'], df1['money'])
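
As a rough sketch of why the attempts in the question fail and what the filtered table contains: comparing a column to a value gives a boolean Series, which a plain `if` cannot evaluate (pandas raises a ValueError), whereas indexing with that Series keeps only the matching rows. The data is the question's, written as integers directly; the flag `raised` is only there for illustration.

```python
import pandas as pd

df = pd.DataFrame({'year':   [1988, 1988, 1988, 1988, 1989, 1989, 1989, 1989],
                   'health': [2, 2, 3, 1, 3, 5, 2, 1],
                   'money':  [5, 7, 8, 8, 3, 3, 7, 8]})

# A boolean Series has no single truth value, so `if` cannot use it.
try:
    if df['year'] == 1988:
        pass
    raised = False
except ValueError:
    raised = True

# Boolean indexing keeps only the rows where the condition is True.
df1 = df[df['year'] == 1988]
df2 = pd.crosstab(df1['health'], df1['money'])
print(df2)
```

With this data the result should have health values 1, 2, 3 as rows and money values 5, 7, 8 as columns, counting only the four rows from 1988.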

EDIT: You can also apply the same mask to each column separately:

mask = df['year']==1988
df2 = pd.crosstab(df.loc[mask, 'health'], df.loc[mask, 'money'])
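
If the condition reads more naturally as a string, `DataFrame.query` is another way to select the rows first. This is only a sketch of an alternative, not part of the original answer; the names `by_mask`, `by_query`, `sub` and `same` are made up for the example, and the data is again the question's, written as integers.

```python
import pandas as pd

df = pd.DataFrame({'year':   [1988, 1988, 1988, 1988, 1989, 1989, 1989, 1989],
                   'health': [2, 2, 3, 1, 3, 5, 2, 1],
                   'money':  [5, 7, 8, 8, 3, 3, 7, 8]})

# Mask applied to each column separately, as in the EDIT.
mask = df['year'] == 1988
by_mask = pd.crosstab(df.loc[mask, 'health'], df.loc[mask, 'money'])

# Same subset selected with a query string, then cross-tabulated.
sub = df.query('year == 1988')
by_query = pd.crosstab(sub['health'], sub['money'])

# Both routes select the same rows, so the tables should match.
same = by_mask.equals(by_query)
print(same)
```

Which form to use is mostly a matter of taste: the mask is handy when the same condition is reused in several places, while the query string keeps short conditions readable.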
