Applying RLE on a NumPy 2D array
I have a NumPy 2D array like this:
np.array([[1,1,1,0], [1,0,0,1]])
How can I efficiently apply RLE (run-length encoding) to this 2D array? My dataset has shape (4000, 3000).
I can do RLE on a string without NumPy, using this logic:
for i in new_bin_data:
    if i == '0':
        if prev != i:
            final_result.append(count)
            count = 0
            prev = '0'
        count += 1
    else:
        if prev != i:
            final_result.append(count)
            count = 0
        count += 1
        prev = '1'
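For reference, a self-contained version of that loop might look like the sketch below. The snippet above does not show how `prev`, `count`, and `final_result` are initialised, or whether the last run is appended after the loop, so the choices here (start with no previous character, flush the final run) are assumptions, and the name `rle_string` is hypothetical.

```python
def rle_string(bits):
    """Return the lengths of the runs of identical characters in `bits`."""
    runs = []
    prev = None   # assumption: no previous character before the loop starts
    count = 0
    for ch in bits:
        if prev is not None and ch != prev:
            runs.append(count)  # the previous run just ended
            count = 0
        count += 1
        prev = ch
    if count:
        runs.append(count)  # flush the final run (assumed; not in the snippet)
    return runs


print(rle_string("11100011"))  # [3, 3, 2]
```

This walks the input one character at a time in pure Python, which is what becomes slow on a (4000, 3000) array.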
I'm not sure exactly what you are looking for, but here is some code that computes the RLE encoding of each row.
import numpy as np


def rle(inarray):
    """
    From: https://stackoverflow.com/questions/1066758/find-length-of-sequences-of-identical-values-in-a-numpy-array-run-length-encodi
    run length encoding. Partial credit to R rle function.
    Multi datatype arrays catered for including non Numpy
    returns: tuple (runlengths, startpositions, values)
    """
    ia = np.asarray(inarray)  # force numpy
    n = len(ia)
    if n == 0:
        return (None, None, None)
    else:
        y = ia[1:] != ia[:-1]  # pairwise unequal (string safe)
        i = np.append(np.where(y), n - 1)  # must include last element posi
        z = np.diff(np.append(-1, i))  # run lengths
        p = np.cumsum(np.append(0, z))[:-1]  # positions
        return (z, p, ia[i])


def rle_2d(a, unused_value):
    """
    compute rle encoding of each row in the input array
    Args:
        a: 2d numpy array
        unused_value: a value that does not appear in the input array a
    Returns:
        list of (length, positions, values) tuples. The length of the list is the number of rows in
        the input matrix
    """
    r, c = a.shape  # rows, columns
    # append a sentinel column so that no run can continue across a row boundary
    a = np.hstack([a, np.ones((r, 1), dtype=a.dtype) * unused_value])
    a = a.reshape(-1)
    l, p, v = rle(a)  # length, positions, values
    w = c + 1  # row width after padding with the sentinel column
    y = p // w  # row index of each run
    x = p % w  # start column of each run within its row
    rl, rp, rv = rle(y)  # group the runs by row
    result = []
    for i in range(r):
        start = rp[i]
        end = rp[i] + rl[i] - 1  # the last run of each row is the sentinel; skip it
        li = l[start:end]
        pi = x[start:end]
        vi = v[start:end]
        assert rv[i] == i
        result.append((li, pi, vi))
    return result
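
As a cross-check, here is a sketch of a different way to get the same per-row (lengths, start positions, values) tuples without a sentinel value. It is not the code from the answer above: it marks every position that starts a new run (column 0 always does) and then splits the flat run arrays by row. The name `rle_rows` is hypothetical, and the sketch assumes a 2D array with at least one row and one column.

```python
import numpy as np


def rle_rows(a):
    """Run-length encode each row of a 2D array.

    Returns a list with one (lengths, starts, values) tuple per row.
    Assumes `a` has at least one row and one column.
    """
    a = np.asarray(a)
    r, c = a.shape
    # change[i, j] is True where a[i, j] starts a new run;
    # the first column of every row always starts one
    change = np.ones((r, c), dtype=bool)
    change[:, 1:] = a[:, 1:] != a[:, :-1]
    rows, starts = np.nonzero(change)  # run starts in row-major order
    flat_starts = rows * c + starts
    # a run ends where the next run starts (or at the end of the array)
    lengths = np.diff(np.append(flat_starts, r * c))
    values = a[rows, starts]
    # split the flat run arrays into one chunk per row
    cuts = np.cumsum(np.bincount(rows, minlength=r))[:-1]
    return list(zip(np.split(lengths, cuts),
                    np.split(starts, cuts),
                    np.split(values, cuts)))


example = np.array([[1, 1, 1, 0], [1, 0, 0, 1]])
for lengths, starts, values in rle_rows(example):
    print(lengths.tolist(), starts.tolist(), values.tolist())
# [3, 1] [0, 3] [1, 0]
# [1, 2, 1] [0, 1, 3] [1, 0, 1]
```

The comparison and the run bookkeeping are done in whole-array NumPy operations; the only per-row work is the final split.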