
How to apply multiple functions onto a single DataFrame column?

Say I have a df:

Name         Sequence
Bob             IN,IN
Marley         OUT,IN
Jack     IN,IN,OUT,IN
Harlow               

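The df has names and IN/OUT sequences. The Sequence column can contain blank values. How can I apply these two functions to the Sequence column efficiently? Something like this pseudocode:

df['Sequence'] = converter(sequencer(df['Sequence']))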

# takes string of IN/OUT, converts to bits (IN = 1, OUT = 0), returns bitstring. 'IN,OUT,IN' -> '101'
def sequencer(seq):
    # 'IN,IN' -> ['IN', 'IN']
    seq = seq.split(',')
    # get sequence of up to 3 alternating digits. [0,0,1,1,0] = sequence 010
    seq = [1 if x == 'IN' else 0 for x in seq]
    a = seq[0]
    try:
        # position of the first digit that differs from the first one
        b = seq.index(1-a, 1)
    except ValueError:
        return str(a)
    # does the first digit occur again after the switch?
    if a not in seq[b+1:]:
        return str(a) + str(1-a)

    return str(a) + str(1-a) + str(a)

# converts bitstring back into in/out format
def converter(seq):
    return '-'.join(['IN' if x == '1' else 'OUT' for x in seq])

so that it results in this dataframe?

Name         Sequence
Bob                IN
Marley         OUT-IN
Jack        IN-OUT-IN
Harlow  

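I looked at this post here, where the comments say not to use apply because it is inefficient. I need efficiency because I am working with a large dataset.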

itertools

  • Use groupby to collapse consecutive repeats, leaving the unique (non-repeating) items
  • Use islice to get the first 3.

from itertools import islice, groupby

def f(s):
    return '-'.join([k for k, _ in islice(groupby(s.split(',')), 3)])

df.assign(Sequence=[*map(f, df.Sequence.fillna(''))])

     Name   Sequence
0     Bob         IN
1  Marley     OUT-IN
2    Jack  IN-OUT-IN
3  Harlow           
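
To make the two itertools steps easier to follow, here is a small sketch that runs them one at a time on a single value from the example. The variable names (tokens, runs, first_three) are only illustrative, and the empty string stands in for a blank cell after fillna('').

```python
from itertools import groupby, islice

s = 'IN,IN,OUT,IN'

# Split the comma-separated string into tokens.
tokens = s.split(',')                    # ['IN', 'IN', 'OUT', 'IN']

# groupby yields one (key, group) pair per run of equal consecutive tokens,
# so keeping only the keys collapses 'IN', 'IN' into a single 'IN'.
runs = [k for k, _ in groupby(tokens)]   # ['IN', 'OUT', 'IN']

# islice keeps at most the first 3 items.
first_three = list(islice(runs, 3))

result = '-'.join(first_three)
print(result)  # IN-OUT-IN

# A blank cell (empty string) splits into [''], which joins back to ''.
empty_result = '-'.join(k for k, _ in islice(groupby(''.split(',')), 3))
print(repr(empty_result))  # ''
```

Note that groupby only merges adjacent duplicates, which is exactly what an alternating IN/OUT sequence needs.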

A variant with a better closure, for maximum future flexibility.

from itertools import islice, groupby

def get_f(n, splitter=',', joiner='-'):
    def f(s):
        return joiner.join([k for k, _ in islice(groupby(s.split(splitter)), n)])
    return f

df.assign(Sequence=[*map(get_f(3), df.Sequence.fillna(''))])
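
As a rough illustration of the flexibility the closure gives, the sketch below builds two differently configured functions from the same get_f. The names first_two and arrows, and the ';' and '>' separators, are made up for this example and do not come from the question.

```python
from itertools import groupby, islice

def get_f(n, splitter=',', joiner='-'):
    # Returns a function that keeps the first n runs of a delimited string.
    def f(s):
        return joiner.join([k for k, _ in islice(groupby(s.split(splitter)), n)])
    return f

# Keep only the first two runs, with the default separators.
first_two = get_f(2)

# Hypothetical alternative format: ';' as input delimiter, '>' as output joiner.
arrows = get_f(3, splitter=';', joiner='>')

print(first_two('IN,IN,OUT,IN'))   # IN-OUT
print(arrows('OUT;OUT;IN;OUT'))    # OUT>IN>OUT
```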

Another variant that makes what I'm doing more obvious (less obnoxious Python bling)

from itertools import islice, groupby

def get_f(n, splitter=',', joiner='-'):
    def f(s):
        return joiner.join([k for k, _ in islice(groupby(s.split(splitter)), n)])
    return f

f = get_f(3)
df['Sequence-InOut'] = [f(s) for s in df.Sequence.fillna('')]
df

     Name      Sequence Sequence-InOut
0     Bob         IN,IN             IN
1  Marley        OUT,IN         OUT-IN
2    Jack  IN,IN,OUT,IN      IN-OUT-IN
3  Harlow          None               
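
As a sanity check, the sketch below compares the question's two-step converter(sequencer(...)) with the single itertools function on a few sample strings. It assumes the membership test in sequencer is meant to be the slice seq[b+1:], and it uses plain Python lists instead of a DataFrame so it runs without pandas. The samples list is invented for illustration.

```python
from itertools import groupby, islice

def sequencer(seq):
    # IN -> 1, anything else -> 0
    bits = [1 if x == 'IN' else 0 for x in seq.split(',')]
    a = bits[0]
    try:
        b = bits.index(1 - a, 1)      # first position where the value flips
    except ValueError:
        return str(a)
    if a not in bits[b + 1:]:         # assumed fix: slice, not a single element
        return str(a) + str(1 - a)
    return str(a) + str(1 - a) + str(a)

def converter(seq):
    return '-'.join(['IN' if x == '1' else 'OUT' for x in seq])

def f(s):
    return '-'.join([k for k, _ in islice(groupby(s.split(',')), 3)])

samples = ['IN,IN', 'OUT,IN', 'IN,IN,OUT,IN', 'OUT,OUT,IN,IN,OUT,IN']
two_step = [converter(sequencer(s)) for s in samples]
one_step = [f(s) for s in samples]
print(two_step == one_step)  # True

# The two approaches differ on a blank value: sequencer treats '' as OUT.
print(converter(sequencer('')))  # OUT
print(repr(f('')))               # ''
```

On non-blank IN/OUT strings the two approaches agree here; for blank cells only the itertools version returns an empty string, which matches the desired output for Harlow.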

