Collecting cells of a pandas df that are listed in another pandas df (with the same index)

Question

Consider the following example (the two elements of interest are final_df and pivot_df; the rest of the code only constructs these two df's):

import numpy
import pandas

numpy.random.seed(0)
# Two columns: 'key' (random floats rounded to 2 decimals) and 'val' (random 0/1)
input_df = pandas.concat([pandas.Series(numpy.round(numpy.random.random_sample(10), 2)),
                          pandas.Series(numpy.random.randint(0, 2, 10))], axis = 1)
input_df.columns = ['key', 'val']

# One column per key; forward-fill each column, then take the running sum
pivot_df = input_df.pivot(columns = 'key', values = 'val')\
                   .ffill()\
                   .cumsum()

# Replace every non-null cell with its column label; null cells become NaN
index_df = pivot_df.notnull()\
                   .multiply(pivot_df.columns, axis = 1)\
                   .replace({0.0: numpy.nan})\
                   .values

# Keep the three smallest labels per row (NaN sorts last), in ascending order
final_df = numpy.delete(numpy.partition(index_df, 3, axis = 1),
                        numpy.s_[3:index_df.shape[1]], axis = 1)
final_df.sort(axis = 1)
final_df = pandas.DataFrame(final_df)
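
The partition/delete/sort step that builds final_df can be hard to read at first, so here is a small sketch of what it does on a toy array. The helper name smallest_k_per_row and the toy values are made up for illustration, and the sketch assumes numpy.partition orders NaN after every number, the same way numpy.sort does.

```python
import numpy as np

def smallest_k_per_row(arr, k):
    """Return the k smallest entries of each row in ascending order (NaN last)."""
    # After partitioning at k, the first k slots of each row hold the k
    # smallest entries, in no particular order.
    part = np.partition(arr, k, axis=1)[:, :k]
    # Sort those k entries so each row is ascending.
    return np.sort(part, axis=1)

# Hypothetical toy input: column labels where a cell is "present", NaN elsewhere
toy = np.array([[np.nan, 0.72, 0.55, np.nan, 0.60],
                [np.nan, 0.55, np.nan, np.nan, np.nan]])

result = smallest_k_per_row(toy, 3)
print(result)
```

For arrays this small, a full np.sort(toy, axis=1)[:, :3] should give the same rows; partitioning first only saves work when there are many columns.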

final_df has the same number of rows as pivot_df. I want to use these two to construct a third df: bingo_df.

bingo_df should have the same dimensions as final_df. The cells of bingo_df should then contain:

Whenever the entry (row = i, col = j) of final_df is numpy.nan, the entry (i, j) of bingo_df should be numpy.nan as well.
Otherwise (whenever the entry (i, j) of final_df is not numpy.nan), the entry (i, j) of bingo_df should be the value at cell [i, final_df[i, j].value] of pivot_df (in fact, final_df[i, j].value is either the name of a column of pivot_df or numpy.nan).
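
To make the two rules concrete, here is a plain loop that applies them cell by cell. It is only a reference sketch, not an efficient solution: the function name build_bingo and the toy frames are hypothetical, and it assumes final_df and pivot_df share the same row index.

```python
import numpy as np
import pandas as pd

def build_bingo(final_df, pivot_df):
    """Apply the two rules cell by cell (slow, but easy to check by hand)."""
    # Start from all-NaN so rule 1 (NaN stays NaN) holds by default.
    bingo = pd.DataFrame(np.nan, index=final_df.index, columns=final_df.columns)
    for i in final_df.index:
        for j in final_df.columns:
            key = final_df.at[i, j]
            if pd.notna(key):
                # Rule 2: key is a column label of pivot_df; read row i there.
                bingo.at[i, j] = pivot_df.at[i, key]
    return bingo

# Hypothetical toy frames, not the ones built in the question
toy_pivot = pd.DataFrame({0.55: [0.0, 0.0], 0.72: [np.nan, 1.0]})
toy_final = pd.DataFrame([[0.55, np.nan], [0.55, 0.72]])

toy_bingo = build_bingo(toy_final, toy_pivot)
print(toy_bingo)
```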

Expected output:

The first row of final_df is

0.55, nan, nan

So I'm expecting the first row of bingo_df to be:

0.0, nan, nan

because the value in cell (row = 0, col = 0.55) of pivot_df is 0 (and the two subsequent numpy.nan in the first row of final_df should also be numpy.nan in bingo_df).

Similarly, the second row of final_df is

0.55, 0.72, nan

So I'm expecting the second row of bingo_df to be:

0.0, 1.0, nan

because the value in cell (row = 1, col = 0.55) of pivot_df is 0.0 and the value in cell (row = 1, col = 0.72) of pivot_df is 1.0.

Answer 1

IIUC, you can use lookup (with pandas imported as pd):

s = final_df.stack()
pd.Series(pivot_df.lookup(s.index.get_level_values(0), s), index=s.index).unstack()
Out[87]: 
     0    1    2
0  0.0  NaN  NaN
1  0.0  1.0  NaN
2  0.0  1.0  2.0
3  0.0  0.0  2.0
4  0.0  0.0  0.0
5  0.0  0.0  0.0
6  0.0  1.0  0.0
7  0.0  2.0  0.0
8  0.0  3.0  0.0
9  0.0  0.0  4.0
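
DataFrame.lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so the snippet above will not run on recent versions. The following is a rough sketch of the same idea using Index.get_indexer and NumPy integer indexing instead. The function name lookup_cells and the toy frames are hypothetical, and it assumes the rows of final_df and pivot_df line up positionally and that pivot_df has unique column labels.

```python
import numpy as np
import pandas as pd

def lookup_cells(final_df, pivot_df):
    """For each non-NaN label in final_df, fetch pivot_df's value in that row."""
    keys = final_df.to_numpy(dtype=float)
    mask = ~np.isnan(keys)
    rows, cols = np.nonzero(mask)                       # positions of the labels
    col_pos = pivot_df.columns.get_indexer(keys[mask])  # label -> column position
    if (col_pos < 0).any():
        raise KeyError("final_df contains a label that is not a pivot_df column")
    out = np.full(keys.shape, np.nan)                   # NaN keys stay NaN
    out[rows, cols] = pivot_df.to_numpy()[rows, col_pos]
    return pd.DataFrame(out, index=final_df.index, columns=final_df.columns)

# Hypothetical toy frames, not the ones built in the question
toy_pivot = pd.DataFrame({0.55: [0.0, 0.0, 0.0],
                          0.72: [np.nan, 1.0, 1.0],
                          0.60: [np.nan, np.nan, 2.0]})
toy_final = pd.DataFrame([[0.55, np.nan, np.nan],
                          [0.55, 0.72, np.nan],
                          [0.55, 0.60, 0.72]])

toy_bingo = lookup_cells(toy_final, toy_pivot)
print(toy_bingo)
```

The explicit check on col_pos matters because get_indexer returns -1 for unknown labels, which NumPy would otherwise silently treat as "last column".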

Solution 1
3, accepted, 2018-09-02 19:54:40
