
Efficiently creating multiple masks from pandas series

Given a series that looks like:

0    foo
1    bar
2    foo
3    foo
4    bar
5    baz

How can I create a dataframe where each column is a mask for a unique value in the series? In this example, it would look like:

    foo     bar     baz
0   True    False   False
1   False   True    False
2   True    False   False
3   True    False   False
4   False   True    False
5   False   False   True

Using get_dummies:

s.str.get_dummies().astype(bool)
Out[392]: 
     bar    baz    foo
0  False  False   True
1   True  False  False
2  False  False   True
3  False  False   True
4   True  False  False
5  False   True  False
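
Note that s.str.get_dummies() splits each string on "|" by default, so it is mainly aimed at delimiter-separated labels. A minimal sketch of an alternative with the top-level pd.get_dummies, assuming a pandas version in which it accepts a dtype argument (the variable name masks is only illustrative):

```python
import pandas as pd

s = pd.Series(["foo", "bar", "foo", "foo", "bar", "baz"])

# One boolean column per unique value; the columns come back sorted.
masks = pd.get_dummies(s, dtype=bool)

# Optional: reorder the columns by first appearance in the series.
masks = masks[s.unique()]
print(masks)
```

The reordering step is only needed if the column order of the question's expected output matters.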

Or we can try something different, crosstab:

pd.crosstab(s.index,s).astype(bool)
Out[395]: 
a        bar    baz    foo
row_0                     
0      False  False   True
1       True  False  False
2      False  False   True
3      False  False   True
4       True  False  False
5      False   True  False
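
The crosstab result carries axis names ("row_0" and the name of the series). A small sketch of how they might be dropped, assuming the index of s is unique so that every count is 0 or 1:

```python
import pandas as pd

s = pd.Series(["foo", "bar", "foo", "foo", "bar", "baz"])

# crosstab counts occurrences per (row label, value);
# astype(bool) turns the 0/1 counts into masks.
masks = pd.crosstab(s.index, s).astype(bool)

# Remove the automatically generated axis names.
masks = masks.rename_axis(index=None, columns=None)
print(masks)
```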

Here's one with array initialization -

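import numpy as np
import pandas as pd

def series_hotencode(s):
    # a: integer code per row, b: unique values in order of first appearance
    a, b = s.factorize()
    ar = np.zeros((len(a), len(b)), dtype=bool)
    # For each row, set the column that matches its code to True
    ar[np.arange(len(a)), a] = True
    return pd.DataFrame(ar, columns=b)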

Sample run -

In [40]: s
Out[40]: 
0    foo
1    bar
2    foo
3    foo
4    bar
5    baz
Name: 1, dtype: object

In [41]: series_hotencode(s)
Out[41]: 
     foo    bar    baz
0   True  False  False
1  False   True  False
2   True  False  False
3   True  False  False
4  False   True  False
5  False  False   True
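
One caveat with series_hotencode: factorize assigns the code -1 to missing values, and indexing with -1 would mark the last column. It also returns a default integer index rather than the index of s. A hedged sketch of a variant that handles both; the name series_hotencode_safe is hypothetical and not part of the original answer:

```python
import numpy as np
import pandas as pd

def series_hotencode_safe(s):
    # Hypothetical variant of series_hotencode, not from the original answer.
    codes, uniques = s.factorize()      # missing values get code -1
    out = np.zeros((len(codes), len(uniques)), dtype=bool)
    valid = codes >= 0                  # skip rows with missing values
    out[np.flatnonzero(valid), codes[valid]] = True
    # Keep the original index so the masks align with s.
    return pd.DataFrame(out, columns=uniques, index=s.index)

s = pd.Series(["foo", None, "bar"], index=[10, 11, 12])
print(series_hotencode_safe(s))
```

Rows with a missing value come out all False here, which is one reasonable choice among several.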

Let's try pd.factorize + np.eye for a fast, concise solution.

x,y = pd.factorize(s)
pd.DataFrame(np.eye(len(y), dtype=bool)[x], columns=y)

     foo    bar    baz
0   True  False  False
1  False   True  False
2   True  False  False
3   True  False  False
4  False   True  False
5  False  False   True
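
To see why this works, the two steps can be spelled out as below. This is an annotated sketch of the same idea; the names identity and masks, and passing index=s.index, are additions for illustration:

```python
import numpy as np
import pandas as pd

s = pd.Series(["foo", "bar", "foo", "foo", "bar", "baz"])

x, y = pd.factorize(s)
# x holds one integer code per row: [0, 1, 0, 0, 1, 2]
# y holds the unique values in order of first appearance: foo, bar, baz

identity = np.eye(len(y), dtype=bool)  # 3x3 boolean identity matrix
# Row i of the identity matrix is the mask row for code i,
# so indexing with x picks the right row for every element of s.
masks = pd.DataFrame(identity[x], columns=y, index=s.index)
print(masks)
```

Since np.eye builds a k-by-k matrix for k unique values, the array-initialization approach may be lighter on memory when the series has many distinct values.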
