choose from one dataframe based on another dataframe

Question

I have two dataframes. The first looks like this:

      point    sector 
1        1          4
2        2          5
3        3          2
4        4          1
5        5          5
6        6          1
7        7          4
8        8          3
10      10          5
11      11          2
12      12          1
13      13          3
14      14          1
15      15          4
16      16          3
17      17          2
18      18          1
19      19          1
20      20          1
21   alt 1          2
22   alt 3          3
23   alt 2          5

The second looks like this, where each entry is the sector I want the point to come from:

p1  p2  p3  p4          
1   2   3   4
1   2   3   5
1   2   4   5
1   3   4   5
2   3   4   5

I want to create a third dataframe in which each sector entry is replaced by a randomly selected point from the first dataframe that belongs to that sector.

For example:

        p1 p2 p3 p4
lane 1: 12 3  8  7

As you can see, the points in lane 1 have the sectors listed in row 1 of the second dataframe (1, 2, 3, 4). I have been trying to use df.loc, but is there a better way?

Answer 1

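For each row of df2, look up the matching rows in the first dataframe and pick one point per sector at random. Note that df.loc[r] selects by index label, so this assumes df is indexed by sector (for example via df = df.set_index('sector')):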

df2.apply(lambda r: df.loc[r].groupby(level=0).point.apply(np.random.choice).values, axis=1)
Out[132]: 
      p1     p2     p3     p4
0      4     11  alt 3      1
1      6     11     13  alt 2
2      4     17      7  alt 2
3     19  alt 3     15      5
4  alt 1     13      7     10
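
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It is not the exact one-liner above: it first groups the points by sector and then maps every cell of the second dataframe to a random point from that sector. The names points, lanes, by_sector and pick, and the seed value, are my own assumptions for illustration; the data is copied from the question.

```python
import numpy as np
import pandas as pd

# First dataframe from the question: point labels and their sectors.
points = pd.DataFrame({
    "point": ["1", "2", "3", "4", "5", "6", "7", "8", "10", "11", "12",
              "13", "14", "15", "16", "17", "18", "19", "20",
              "alt 1", "alt 3", "alt 2"],
    "sector": [4, 5, 2, 1, 5, 1, 4, 3, 5, 2, 1,
               3, 1, 4, 3, 2, 1, 1, 1,
               2, 3, 5],
})

# Second dataframe from the question: each entry is a required sector.
lanes = pd.DataFrame(
    [[1, 2, 3, 4],
     [1, 2, 3, 5],
     [1, 2, 4, 5],
     [1, 3, 4, 5],
     [2, 3, 4, 5]],
    columns=["p1", "p2", "p3", "p4"],
)

# Seeded generator so the run is repeatable (the seed is arbitrary).
rng = np.random.default_rng(0)

# Series mapping each sector to the list of points that belong to it.
by_sector = points.groupby("sector")["point"].apply(list)

def pick(sector):
    """Return one randomly chosen point from the given sector."""
    return rng.choice(by_sector.loc[sector])

# Replace every sector entry with a random point from that sector.
result = lanes.apply(lambda col: col.map(pick))
print(result)
```

Because each cell is mapped on its own, this does not depend on the sectors in a row being sorted, and the result keeps the p1..p4 column names. The same point can still appear in more than one lane; if that is not wanted, the chosen points would have to be removed from by_sector as they are drawn.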
