I have a DataFrame of points on a 12×8 grid.
I want to count the consecutive red dots (in the vertical direction only) and store the count in a new column (consecutive red). For blue dots the value should be zero.
For example:
X y red/blue consecutive red
1 1 blue 0
1 3 red 3
1 4 red 3
1 2 blue 0
1 5 red 3
9 4 red 5
I already have data for the first three columns. This is what I have tried so far:
from sklearn.neighbors import BallTree
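# The column name contains a slash, so it needs bracket access, and the colour is a string
red_points = df[df['red/blue'] == 'red']
blue_points = df[df['red/blue'] != 'red']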
tree = BallTree(red_points[['x','y']], leaf_size=40, metric='minkowski')
distance, index = tree.query(df[['x','y']], k=2)
I am not aware of an existing algorithm for this (there may very well be one), but writing one isn't hard. I work with NumPy because I'm used to it, and because it is easy to accelerate with CUDA and to port to other Python data science tools.
The data (0=blue, 1=red):
import numpy as np
import pandas as pd
# Generating dummy data for testing
ROWS=10
COLS=20
X = np.random.randint(2, size=(ROWS, COLS))
# Visualizing
df = pd.DataFrame(data=X)
bg='background-color: '
df.style.apply(lambda x: [bg+'red' if v>=1 else bg+'blue' for v in x])
The algorithm:
# np.int was removed in NumPy 1.24; the builtin int is the replacement
result = np.zeros((ROWS, COLS), dtype=int)
for y, x in np.ndindex(X.shape):
    if X[y, x] == 0:
        continue
    cons = 1  # consecutive reds in this column, including the current cell
    # Going backward (up the column) while we can
    prev = y - 1
    while prev >= 0:
        if X[prev, x] == 0:
            break
        cons += 1
        prev -= 1
    # Going forward (down the column) while we can
    nxt = y + 1
    while nxt <= ROWS - 1:
        if X[nxt, x] == 0:
            break
        cons += 1
        nxt += 1
    result[y, x] = cons
df2 = pd.DataFrame(data=result)
df2.style.apply(lambda x: [bg+'red' if v>=1 else bg+'blue' for v in x])
And the result:
Please note that in NumPy the first index is the row (y in your case) and the second is the column (x in your case). You can transpose your data if you want to swap to x, y order.
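If the data stays in the long format from the question (one row per point), the same run-length idea can probably be expressed directly in pandas without building a grid first. The following is only a sketch under assumptions: the column names `x`, `y` and `red/blue` are taken from the question's code and table, the helper name `add_consecutive_red` and the output column `consecutive_red` are made up for illustration, and the sample rows are the first few lines of the question's example rather than the full data.

```python
import pandas as pd

# Hypothetical sample in the question's long format (first rows of the example table)
df = pd.DataFrame({
    "x": [1, 1, 1, 1, 1, 9],
    "y": [1, 3, 4, 2, 5, 4],
    "red/blue": ["blue", "red", "red", "blue", "red", "red"],
})

def add_consecutive_red(points):
    """Return a copy with a 'consecutive_red' column (vertical runs only)."""
    # Sort so that vertically adjacent points sit next to each other
    out = points.sort_values(["x", "y"]).copy()
    is_red = out["red/blue"].eq("red")
    # A new run starts when x changes, the colour changes, or y skips a cell
    new_run = (
        out["x"].ne(out["x"].shift())
        | is_red.ne(is_red.shift(fill_value=False))
        | out["y"].diff().ne(1)
    )
    run_id = new_run.cumsum()
    # Every point gets the size of the run it belongs to
    run_len = out.groupby(run_id)["y"].transform("size")
    # Blue points are forced to zero, as requested in the question
    out["consecutive_red"] = run_len.where(is_red, 0)
    # Restore the original row order
    return out.sort_index()

result = add_consecutive_red(df)
print(result)
```

In this small sample the point at x=9 has no red neighbours, so it gets 1; with the full grid it would pick up the length of whatever run it belongs to.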