
Determine number of consecutive identical points in a grid

I have a DataFrame describing points on a 12×8 grid.
I want to count the consecutive red dots (in the vertical direction only) and store the count in a new column ("consecutive red"). For blue dots the value should be zero.

For example:

x  y  red/blue  consecutive red
1  1  blue      0
1  3  red       3
1  4  red       3
1  2  blue      0
1  5  red       3
9  4  red       5

I already have the data for the first three columns.

from sklearn.neighbors import BallTree 

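# The column name contains a slash, so it must be accessed with brackets and compared to a string
red_points = df[df['red/blue'] == 'red']
blue_points = df[df['red/blue'] != 'red']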

tree = BallTree(red_points[['x','y']], leaf_size=40, metric='minkowski')

distance, index = tree.query(df[['x','y']], k=2)

I am not aware of an existing algorithm for this (there may very well be one), but writing one isn't hard. I work with NumPy because I'm used to it, and because the code can easily be accelerated with CUDA or ported to other Python data-science tools.

The data (0=blue, 1=red):

import numpy as np
import pandas as pd
# Generating dummy data for testing
ROWS=10
COLS=20
X = np.random.randint(2, size=(ROWS, COLS))
# Visualizing
df = pd.DataFrame(data=X)
bg='background-color: '
df.style.apply(lambda x: [bg+'red' if v>=1 else bg+'blue' for v in x])

[image: the dummy grid rendered with red and blue cell backgrounds]

The algorithm:

result = np.zeros((ROWS,COLS),dtype=int)  # np.int was removed in NumPy 1.24; the built-in int works
for y,x in np.ndindex(X.shape):
    if X[y, x]==0:
        continue
    cons = 1 # consecutive reds in the vertical direction, including the current cell
    # Going backward while we can
    prev = y-1
    while prev>=0:
        if X[prev,x]==0:
            break
        cons+=1
        prev-=1
    # Going forward while we can
    nxt = y+1
    while nxt<=ROWS-1:
        if X[nxt,x]==0:
            break
        cons+=1
        nxt+=1
    result[y,x]=cons
df2 = pd.DataFrame(data=result)
df2.style.apply(lambda x: [bg+'red' if v>=1 else bg+'blue' for v in x])

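And the result is a grid of the same shape in which every red cell holds the length of its vertical run and every blue cell holds 0:

[image: the result grid rendered with red and blue cell backgrounds]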
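As an alternative to walking up and down from every cell, each column can be scanned once and every finished run written back in a single slice. This is only a sketch of the same idea; the function name vertical_run_lengths and the small example grid are made up for illustration and are not part of the original answer.

```python
import numpy as np

def vertical_run_lengths(grid):
    """For every cell, return the length of the vertical run of non-zero
    cells that contains it (0 for cells that are 0 themselves)."""
    grid = np.asarray(grid)
    rows, cols = grid.shape
    result = np.zeros((rows, cols), dtype=int)
    for x in range(cols):
        start = None  # row where the current run of reds began, if any
        # Iterate one step past the last row so a run touching the bottom is closed
        for y in range(rows + 1):
            is_red = y < rows and grid[y, x] != 0
            if is_red and start is None:
                start = y
            elif not is_red and start is not None:
                result[start:y, x] = y - start  # fill the whole run at once
                start = None
    return result

# Hypothetical 4x3 grid (0 = blue, 1 = red)
example = np.array([
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 1],
    [1, 1, 1],
])
print(vertical_run_lengths(example))
```

Each cell is visited once per column scan, so the cost no longer grows with the length of the runs.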

Please note that in NumPy the first index is the row (y in your case) and the second is the column (x in your case). You can transpose your data if you want to index it as x, y.
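If the data stays in the long format from the question (one row per point, with x, y and a colour column), the same count can be sketched directly in pandas without building a 2-D array first. The helper name add_consecutive_red is hypothetical, the sample rows are invented, and the sketch assumes each (x, y) pair appears at most once.

```python
import pandas as pd

def add_consecutive_red(df, x="x", y="y", color="red/blue"):
    """Add a 'consecutive red' column: the length of the vertical run of red
    points containing each point, or 0 for non-red points."""
    out = df.sort_values([x, y]).copy()
    is_red = out[color].eq("red")
    # A new run starts when the column changes, when y is not the previous
    # y + 1, or when the colour flips between red and non-red.
    new_run = (
        out[x].ne(out[x].shift())
        | out[y].ne(out[y].shift() + 1)
        | is_red.ne(is_red.shift())
    )
    run_id = new_run.cumsum()
    run_size = out.groupby(run_id)[y].transform("size")
    out["consecutive red"] = run_size.where(is_red, 0)
    return out.sort_index()  # restore the original row order

# Invented sample: only the points listed here exist in this frame
df = pd.DataFrame({
    "x": [1, 1, 1, 1, 1, 9],
    "y": [1, 3, 4, 2, 5, 4],
    "red/blue": ["blue", "red", "red", "blue", "red", "red"],
})
print(add_consecutive_red(df))
```

In this sample the point at x=9 has no neighbours in the frame, so it gets 1 rather than the 5 shown in the question, where the rest of that column is presumably present in the full data.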

