
What is the easiest way to transpose/invert a pandas dataframe?

I have the following pandas dataframe:

Person     Item1      Item2     Item3     Item4
Adam       Apple      Eggs      Cookie
Alex       Chocolate  Orange    Eggs      Potato
Gina       Eggs       Apple     Orange    Milk

I want to convert it into this:

Item      Count     Person1     Person2     Person3
Apple     2         Adam        Gina
Eggs      3         Adam        Alex        Gina
Cookie    1         Adam
Chocolate 1         Alex
Orange    2         Alex        Gina
Potato    1         Alex
Milk      1         Gina

I have thoroughly searched for my query before posting, but I did not find any matches (maybe there is a better way to rephrase my question). I am sorry if this is a duplicate, but if it is, please direct me to where this question was previously answered.

Use melt to reshape the data into long format first:

df = df.melt('Person', value_name='Item')
print (df)
   Person variable       Item
0    Adam    Item1      Apple
1    Alex    Item1  Chocolate
2    Gina    Item1       Eggs
3    Adam    Item2       Eggs
4    Alex    Item2     Orange
5    Gina    Item2      Apple
6    Adam    Item3     Cookie
7    Alex    Item3       Eggs
8    Gina    Item3     Orange
9    Adam    Item4        NaN
10   Alex    Item4     Potato
11   Gina    Item4       Milk
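
The NaN in row 9 is Adam's empty Item4 slot. The groupby in the next step skips missing keys by default, but the row can also be dropped explicitly. A minimal sketch, assuming the sample frame is built by hand with None for the missing value (the names `wide` and `long_df` are illustrative, not from the answer):

```python
import pandas as pd

# Sample data from the question; None marks Adam's missing fourth item.
wide = pd.DataFrame({
    "Person": ["Adam", "Alex", "Gina"],
    "Item1": ["Apple", "Chocolate", "Eggs"],
    "Item2": ["Eggs", "Orange", "Apple"],
    "Item3": ["Cookie", "Eggs", "Orange"],
    "Item4": [None, "Potato", "Milk"],
})

# Wide -> long: one row per (Person, item slot) pair.
long_df = wide.melt("Person", value_name="Item")

# Remove the rows where a person had no item in that slot.
long_df = long_df.dropna(subset=["Item"]).reset_index(drop=True)
```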

Then group by Item and aggregate Person with both a custom function that collects the names into a list and GroupBy.size. Expand the lists into a new DataFrame with the constructor and join it to the size column:

f = lambda x: x.tolist()
f.__name__ = 'Person'
df1 = df.groupby('Item', sort=False)['Person'].agg([f, 'size'])

df2 = pd.DataFrame(df1.pop('Person').values.tolist(), index=df1.index).add_prefix('Person')
df3 = df1.join(df2).reset_index()
print (df3)
        Item  size Person0 Person1 Person2
0      Apple     2    Adam    Gina    None
1  Chocolate     1    Alex    None    None
2       Eggs     3    Gina    Adam    Alex
3     Orange     2    Alex    Gina    None
4     Cookie     1    Adam    None    None
5     Potato     1    Alex    None    None
6       Milk     1    Gina    None    None
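
The melt and groupby steps can also be wrapped into one helper that produces the Count / Person1, Person2, ... naming from the question. This is only a sketch, not the answerer's code: `items_to_people` and its `id_col` parameter are hypothetical names, it uses `agg(list)` in place of the renamed lambda, and it assumes missing items are None or NaN.

```python
import pandas as pd


def items_to_people(df, id_col="Person"):
    """Return one row per item with a count and the people who have it."""
    # Long format, one row per (person, item); drop empty item slots.
    long_df = df.melt(id_col, value_name="Item").dropna(subset=["Item"])

    # One list of people per item, in order of first appearance.
    grouped = long_df.groupby("Item", sort=False)[id_col].agg(list)

    # Spread each list over columns; shorter lists are padded with missing values.
    people = pd.DataFrame(grouped.tolist(), index=grouped.index)
    people.columns = [f"{id_col}{i + 1}" for i in range(people.shape[1])]

    # Put the number of people per item in front of the name columns.
    people.insert(0, "Count", grouped.map(len))
    return people.reset_index()


df = pd.DataFrame({
    "Person": ["Adam", "Alex", "Gina"],
    "Item1": ["Apple", "Chocolate", "Eggs"],
    "Item2": ["Eggs", "Orange", "Apple"],
    "Item3": ["Cookie", "Eggs", "Orange"],
    "Item4": [None, "Potato", "Milk"],
})
result = items_to_people(df)
```

Numbering the columns from 1 is a choice made here to match the question's layout; `add_prefix` in the answer above numbers them from 0.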

This isn't quite what you're looking for, but I'm not sure that this kind of "transposition" exists as a single function. (By the way, transpose, following linear algebra, usually means swapping a dataframe's rows and columns, i.e. flipping it across its diagonal, rather than rotating it 90°.)
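
For comparison, this is what a plain transpose does to the sample data. A small sketch, assuming the frame is built by hand and that Person is used as the index (the name `flipped` is illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "Person": ["Adam", "Alex", "Gina"],
    "Item1": ["Apple", "Chocolate", "Eggs"],
    "Item2": ["Eggs", "Orange", "Apple"],
    "Item3": ["Cookie", "Eggs", "Orange"],
    "Item4": [None, "Potato", "Milk"],
})

# .T is shorthand for .transpose(): the two axes are swapped, so people
# become the columns and the item slots (Item1..Item4) become the rows.
flipped = df.set_index("Person").T
```

The result still has one cell per (person, slot) pair, so it does not group people by item the way the question asks.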

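# collect the unique items from every column except 'Person'
items = []
for c in df.columns[1:]:
    items.extend(df[c].values)
# drop duplicates and missing values (None or NaN)
items = [i for i in set(items) if pd.notna(i)]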

people = df.Person.values
counts = {}
for p in people:
    counts[p] = [1 if item in df[df['Person'] == p].values else 0 for item in items]

new = pd.DataFrame(counts, index=items)
new['Count'] = new.sum(axis=1)

Output:

|           | Adam | Alex | Gina | Count |
|-----------|------|------|------|-------|
| Cookie    | 1    | 0    | 0    | 1     |
| Chocolate | 0    | 1    | 0    | 1     |
| Potato    | 0    | 1    | 0    | 1     |
| Eggs      | 1    | 1    | 1    | 3     |
| Milk      | 0    | 0    | 1    | 1     |
| Orange    | 0    | 1    | 1    | 2     |
| Apple     | 1    | 0    | 1    | 2     |
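
The same 0/1 table can probably be built without explicit loops. The sketch below is a different technique from the code above: it melts the frame and uses `pd.crosstab`, assuming missing items are None or NaN. The names `long_df` and `table` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Person": ["Adam", "Alex", "Gina"],
    "Item1": ["Apple", "Chocolate", "Eggs"],
    "Item2": ["Eggs", "Orange", "Apple"],
    "Item3": ["Cookie", "Eggs", "Orange"],
    "Item4": [None, "Potato", "Milk"],
})

# Long format with the empty item slots removed.
long_df = df.melt("Person", value_name="Item").dropna(subset=["Item"])

# Rows are items, columns are people, cells count the occurrences.
table = pd.crosstab(long_df["Item"], long_df["Person"])

# Cap at 1 so a person who lists the same item twice still counts once.
table = table.clip(upper=1)

# Number of people per item.
table["Count"] = table.sum(axis=1)
```

Unlike the set-based version, the rows come out sorted alphabetically by item.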

EDIT: as usual, jezrael has the correct answer, but I tweaked this to get the output you want. It might be a bit easier to understand for a beginner.

Given 'df' as your example, and reusing items and people from the code above:

item_counts = {}
for item in items:
    counts = {}
    count = 0
    for p in people:
        if item in df[df['Person'] == p].values:
            count += 1
            counts['Person' + str(count)] = p
    counts['count'] = count
    item_counts[item] = counts

new = pd.DataFrame.from_dict(item_counts, orient='index')
new = new[['count', 'Person1', 'Person2', 'Person3']] # rearrange columns, optional

Output:

|           | count | Person1 | Person2 | Person3 |
|-----------|-------|---------|---------|---------|
| Apple     | 2     | Adam    | Gina    | NaN     |
| Chocolate | 1     | Alex    | NaN     | NaN     |
| Cookie    | 1     | Adam    | NaN     | NaN     |
| Eggs      | 3     | Adam    | Alex    | Gina    |
| Milk      | 1     | Gina    | NaN     | NaN     |
| Orange    | 2     | Alex    | Gina    | NaN     |
| Potato    | 1     | Alex    | NaN     | NaN     |
