
Grouping columns from data by same value in first column

I'm trying to find a way to group all the rows in my data that share the same value in the first column.

Say I have:

col 1:     col 2:
0          3
0          4
0          5
1          9
1          10
2          7

I want to use basic Python or NumPy to read col 1, collect every row that has a 0 there into a list (or something similar), then every row that has a 1 in col 1, and so on. I was able to figure this out when the numbers in col 1 just increase by 1, but my inputs can be any sort of float, so that approach isn't reliable.

I've used this in the past when I wanted to avoid a mask with for u in np.unique, or reaching for pandas or itertools.groupby:

np.split(col2, np.where(np.diff(col1))[0]+1)
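To see why that one-liner works, here is a small step-by-step sketch using the data from the question. The intermediate names (diffs, boundaries, groups) are only illustrative, and it assumes col1 is already ordered so that equal values sit next to each other.

```python
import numpy as np

# Data from the question; col1 is already sorted.
col1 = np.array([0, 0, 0, 1, 1, 2])
col2 = np.array([3, 4, 5, 9, 10, 7])

# Difference between neighbouring keys: nonzero wherever the key changes.
diffs = np.diff(col1)

# Indices where a change happens, shifted by one so that each index
# points at the first row of the next group.
boundaries = np.where(diffs)[0] + 1

# Cut col2 at those positions.
groups = np.split(col2, boundaries)

print(diffs)       # [0 0 1 0 1]
print(boundaries)  # [3 5]
print(groups)      # [array([3, 4, 5]), array([ 9, 10]), array([7])]
```

Because np.where treats any nonzero difference as a boundary, the same steps apply to float keys, as long as equal keys are exactly equal.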

Works for floats in col1, as long as equal values are adjacent (col1 is sorted here):

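import numpy as np

# 4 random floats, each repeated 2 or 3 times, then sorted so duplicates are adjacent
col1 = np.sort(np.repeat(np.random.rand(4), np.random.randint(2,4,4)))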
col2 = np.arange(len(col1))

col1
#array([ 0.39855008,  0.39855008,  0.84331316,  0.84331316,  0.94124952,
#        0.94124952,  0.94124952,  0.9480605 ,  0.9480605 ,  0.9480605 ])

np.split(col2, np.where(np.diff(col1))[0]+1)
#[array([0, 1]), array([2, 3]), array([4, 5, 6]), array([7, 8, 9])]
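If col1 is not sorted, equal values are no longer adjacent and the diff trick would split them into separate groups. A minimal sketch of one way around that, assuming non-empty 1-D inputs of equal length: sort both columns by col1 first, then split as before. The helper name group_by_first_column is hypothetical, not something from NumPy.

```python
import numpy as np

def group_by_first_column(col1, col2):
    """Hypothetical helper: group col2 by the values in col1.

    Assumes both inputs are non-empty, 1-D and of the same length.
    """
    col1 = np.asarray(col1)
    col2 = np.asarray(col2)

    # A stable sort keeps rows with the same key in their original order.
    order = np.argsort(col1, kind="stable")
    sorted_keys = col1[order]
    sorted_vals = col2[order]

    # Same idea as above: positions where the key changes.
    boundaries = np.where(np.diff(sorted_keys))[0] + 1

    # First key of each group, taken at index 0 and at every boundary.
    keys = sorted_keys[np.r_[0, boundaries]]
    groups = np.split(sorted_vals, boundaries)
    return keys, groups

keys, groups = group_by_first_column([1.5, 0.2, 1.5, 0.2], [10, 20, 30, 40])
for k, g in zip(keys, groups):
    print(k, g)
```

Grouping here is by exact equality, so floats that differ only by rounding error would end up in different groups.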

