
Transforming pandas dataframe into a summary table based on a list of values

I am trying to create a summary table from a dataframe like the example below. Every column draws its values from the same fixed list of unique values.

import pandas as pd

tdf = pd.DataFrame({"A": ["ind1", "ind2", "ind1", "ind3", "ind3", "ind1", "ind1"],
                   "B": ["ind3", "ind1", "ind3", "ind1", "ind1","ind3", "ind2"],
                   "C": ["ind1","ind1","ind2","ind2","ind3","ind3","ind3"],
                   "D": ["ind3","ind1","ind2","ind3","ind2","ind1","ind3"],
                   "E": ["ind1","ind3","ind1","ind1","ind2","ind2","ind2"]})

I then need to create a new table-like object whose header corresponds to the columns, with 3 rows holding the frequency counts of the set index values.

setvalues = ['ind1','ind2','ind3']

result = pd.DataFrame({"A": [4,1,2],
                   "B": [3,1,3],
                   "C": [2,2,3],
                   "D": [2,2,3],
                   "E": [3,3,1]})

I tried pivot tables, but they did not return the required format. In Excel I could just set the index values and do a simple COUNTIF on the columns, but I am struggling to implement this in Python.

You can use value_counts here, applied to each column:

tdf.apply(pd.Series.value_counts)
      A  B  C  D  E
ind1  4  3  2  2  3
ind2  1  1  2  2  3
ind3  2  3  3  3  1
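
If the rows need to follow the fixed list from the question, including values that might not occur in any column, one option is to reindex the counts against that list. A minimal sketch, assuming the tdf and setvalues from the question; the name summary is only illustrative:

```python
import pandas as pd

tdf = pd.DataFrame({"A": ["ind1", "ind2", "ind1", "ind3", "ind3", "ind1", "ind1"],
                    "B": ["ind3", "ind1", "ind3", "ind1", "ind1", "ind3", "ind2"],
                    "C": ["ind1", "ind1", "ind2", "ind2", "ind3", "ind3", "ind3"],
                    "D": ["ind3", "ind1", "ind2", "ind3", "ind2", "ind1", "ind3"],
                    "E": ["ind1", "ind3", "ind1", "ind1", "ind2", "ind2", "ind2"]})
setvalues = ["ind1", "ind2", "ind3"]

summary = (
    tdf.apply(pd.Series.value_counts)  # per-column frequency counts
       .reindex(setvalues)             # force the row order and keep every listed value
       .fillna(0)                      # a value absent from a column gets 0 instead of NaN
       .astype(int)                    # counts back to integers after fillna
)
print(summary)
```

With reindex, a value listed in setvalues that never appears in the data still gets a row of zeros, and any value not in the list is dropped.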

A complete script that also fills missing counts with 0:

import pandas as pd

tdf = pd.DataFrame({"A": ["ind1", "ind2", "ind1", "ind3", "ind3", "ind1", "ind1"],
                   "B": ["ind3", "ind1", "ind3", "ind1", "ind1","ind3", "ind2"],
                   "C": ["ind1","ind1","ind2","ind2","ind3","ind3","ind3"],
                   "D": ["ind3","ind1","ind2","ind3","ind2","ind1","ind3"],
                   "E": ["ind1","ind3","ind1","ind1","ind2","ind2","ind2"]})

# Count each value per column; fillna(0) replaces NaN for values absent from a column.
full = tdf.apply(pd.Series.value_counts).fillna(0)
print(full)
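
The COUNTIF idea from the question can also be written fairly directly: compare the whole frame with each value and sum the matches per column. This is only a sketch under the same assumptions (tdf and setvalues as in the question); the name countif is hypothetical:

```python
import pandas as pd

tdf = pd.DataFrame({"A": ["ind1", "ind2", "ind1", "ind3", "ind3", "ind1", "ind1"],
                    "B": ["ind3", "ind1", "ind3", "ind1", "ind1", "ind3", "ind2"],
                    "C": ["ind1", "ind1", "ind2", "ind2", "ind3", "ind3", "ind3"],
                    "D": ["ind3", "ind1", "ind2", "ind3", "ind2", "ind1", "ind3"],
                    "E": ["ind1", "ind3", "ind1", "ind1", "ind2", "ind2", "ind2"]})
setvalues = ["ind1", "ind2", "ind3"]

# tdf.eq(value) is a boolean frame; summing it counts the matching cells per column.
# Each value becomes one column of the intermediate frame, so transpose at the end
# to get one row per value and one column per original column.
countif = pd.DataFrame({value: tdf.eq(value).sum() for value in setvalues}).T
print(countif)
```

Because each count comes from a boolean sum, a value that never occurs in a column yields 0 directly, so no fillna step is needed.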


 