
Converting a long list of sequences of 0's and 1's into a numpy array or pandas dataframe

I have a very long list of sequences (say, each of length 16) consisting of 0's and 1's. For example:

s = ['0100100000010111', '1100100010010101', '1100100000010000', '0111100011110111', '1111100011010111']

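Now I want to treat each bit as a feature, so I need to convert the list into a numpy array or a pandas dataframe. To do that, I would have to comma-separate every bit in each sequence, which is not feasible by hand for a large dataset.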

So what I tried was to generate all the positions in the string:

slices = []
for j in range(len(s[0])):
    slices.append((j,j+1)) 

print(slices)
[(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8), (8, 9), (9, 10), (10, 11), (11, 12), (12, 13), (13, 14), (14, 15), (15, 16)]


new = []
for i in range(len(s)):
    seq = s[i]
    for j in range(len(s[i])):
        ## I have tried both of these lines of code (one at a time)
        ## but couldn't figure out how it could be done
        new.append([s[slice(*slc)] for slc in slices])
        new.append(s[j:j+1])
print(new)

Expected output:

new = [[0,1,0,0,1,0,0,0,0,0,0,1,0,1,1,1], [1,1,0,0,1,0,0,0,1,0,0,1,0,1,0,1], [1,1,0,0,1,0,0,0,0,0,0,1,0,0,0,0], [0,1,1,1,1,0,0,0,1,1,1,1,0,1,1,1], [1,1,1,1,1,0,0,0,1,1,0,1,0,1,1,1]]

Thanks in advance!!
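For reference, the loop attempt in the question slices the outer list s instead of the current string. A minimal sketch of a working loop version, assuming every sequence contains only the characters 0 and 1 (the name row is introduced here for illustration):

```python
s = ['0100100000010111', '1100100010010101', '1100100000010000',
     '0111100011110111', '1111100011010111']

new = []
for seq in s:
    # Slice the individual string (seq), not the outer list (s),
    # and convert each one-character piece to an integer.
    row = [int(seq[j:j + 1]) for j in range(len(seq))]
    new.append(row)

print(new[0])
```

This gives a plain list of lists; the answers that follow build a numpy array directly.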

Use the np.array constructor with a list comprehension:

import numpy as np
np.array([list(row) for row in s], dtype=int)

array([[0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1],
       [1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1],
       [1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
       [0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1],
       [1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1]])
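The question also mentions a pandas dataframe. A minimal sketch of that step, assuming pandas is installed; the column names bit_0 to bit_15 are hypothetical and chosen only for illustration:

```python
import numpy as np
import pandas as pd

s = ['0100100000010111', '1100100010010101', '1100100000010000',
     '0111100011110111', '1111100011010111']

# Build the integer array first, one row per sequence and one column per bit.
arr = np.array([list(row) for row in s], dtype=int)

# Wrap it in a DataFrame; the column names are an assumption, not from the original post.
df = pd.DataFrame(arr, columns=[f"bit_{i}" for i in range(arr.shape[1])])

print(df.shape)
```

Each column can then be used as one feature.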

In one line, without a for loop:

np.array(s).view('<U1').astype(int).reshape(len(s), -1)

array([[0, 1, 0, ..., 1, 1, 1],
       [1, 1, 0, ..., 1, 0, 1],
       [1, 1, 0, ..., 0, 0, 0],
       [0, 1, 1, ..., 1, 1, 1],
       [1, 1, 1, ..., 1, 1, 1]])
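To see why the one-liner works, it may help to split it into steps. This is only a sketch, and it assumes all strings have the same length and a little-endian platform (that is what the '<' in '<U1' stands for); the intermediate names a, chars and bits are introduced here for illustration:

```python
import numpy as np

s = ['0100100000010111', '1100100010010101', '1100100000010000',
     '0111100011110111', '1111100011010111']

# Step 1: an array of fixed-width unicode strings, one element per sequence.
a = np.array(s)

# Step 2: reinterpret the same memory as single-character strings.
# Five 16-character strings become 80 one-character elements.
chars = a.view('<U1')

# Step 3: convert each character to an integer and restore one row per sequence.
bits = chars.astype(int).reshape(len(s), -1)

print(a.shape, chars.shape, bits.shape)
```

If the strings had different lengths, numpy would pad the shorter ones and the conversion to int would likely fail on the padding, so a length check beforehand is advisable.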

It is still a bit slower than the list comprehension, though.
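Relative speed will depend on the machine, the numpy version and the size of the data, so it is worth measuring on your own input. A rough sketch using timeit; the function names list_comp and view_trick and the repetition factor are assumptions made for this example:

```python
import timeit
import numpy as np

base = ['0100100000010111', '1100100010010101', '1100100000010000',
        '0111100011110111', '1111100011010111']
s = base * 2000  # a larger list, so the timings are less noisy

def list_comp():
    # Approach 1: list comprehension plus the np.array constructor.
    return np.array([list(row) for row in s], dtype=int)

def view_trick():
    # Approach 2: reinterpret the string buffer as single characters.
    return np.array(s).view('<U1').astype(int).reshape(len(s), -1)

# Both approaches should produce identical arrays.
same = np.array_equal(list_comp(), view_trick())

for fn in (list_comp, view_trick):
    elapsed = timeit.timeit(fn, number=3)
    print(f"{fn.__name__}: {elapsed:.3f} s for 3 runs")
```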

