Iterate over rows in pandas dataframe. If blanks exist before a specific column, move all column values over

I am trying to iterate over all rows in a pandas dataframe and shift the values in the leftmost columns of each row to the right until all the non-null values in that row are adjacent. How far the values move depends on the number of empty columns between the first null value and the cutoff column.

In this case I am trying to 'close the gap' by shifting the values in the leftmost columns over so that they end in column 'ddd', touching the specific cutoff column 'eee'. The corresponding 'abc' rows should help to visualize the problem.

Column 'eee' and the columns to the right of 'eee' should not be touched or moved.

import pandas as pd

# Input data; '' marks a blank cell
df = pd.DataFrame({
    'aaa': ['a', 'a', 'a', 'a', 'a', 'a'],
    'bbb': ['', 'b', 'b', 'b', '', 'b'],
    'ccc': ['', '', 'c', 'c', '', 'c'],
    'ddd': ['', '', '', 'd', '', ''],
    'eee': ['b', 'c', 'd', 'e', 'b', 'd'],
    'fff': ['c', 'd', 'e', 'f', 'c', 'e'],
    'ggg': ['d', 'e', 'f', 'g', 'd', 'f']
})

In rows 1 and 5: 'a' would be moved over 3 column positions to column 'ddd'.

In row 2: ['a','b'] would be moved over 2 column positions to columns ['ccc', 'ddd'] respectively.

etc.
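To make the per-row movement concrete, here is a minimal sketch that counts the blank cells to the left of 'eee' in each row. The names left_cols and shifts are illustrative and not part of the original question, and the sketch assumes blanks are stored as empty strings, as in the sample data.

```python
import pandas as pd

df = pd.DataFrame({
    'aaa': ['a', 'a', 'a', 'a', 'a', 'a'],
    'bbb': ['', 'b', 'b', 'b', '', 'b'],
    'ccc': ['', '', 'c', 'c', '', 'c'],
    'ddd': ['', '', '', 'd', '', ''],
    'eee': ['b', 'c', 'd', 'e', 'b', 'd'],
    'fff': ['c', 'd', 'e', 'f', 'c', 'e'],
    'ggg': ['d', 'e', 'f', 'g', 'd', 'f'],
})

# Columns strictly to the left of the cutoff column 'eee'.
left_cols = df.columns[:df.columns.get_loc('eee')]

# Number of blank cells per row among those columns; this is how far
# each row's values have to move to the right.
shifts = (df[left_cols] == '').sum(axis=1)
print(shifts.tolist())  # [3, 2, 1, 0, 3, 1]
```

Note that the printed list is indexed from 0, so the question's "row 1" corresponds to the first entry.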

finalOutput = {
    'aaa': ['', '', '', 'a', '', ''],
    'bbb': ['', '', 'a', 'b', '', 'a'],
    'ccc': ['', 'a', 'b', 'c', '', 'b'],
    'ddd': ['a', 'b', 'c', 'd', 'a', 'c'],
    'eee': ['b', 'c', 'd', 'e', 'b', 'd'],
    'fff': ['c', 'd', 'e', 'f', 'c', 'e'],
    'ggg': ['d', 'e', 'f', 'g', 'd', 'f']
}

You can do this:

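from collections import Counter
import numpy as np

# columns to the left of the cutoff column 'eee'
keep_cols = df.columns[0:df.columns.get_loc('eee')]
# roll each row to the right by its number of blank cells
df.loc[:,keep_cols] = [np.roll(v, Counter(v)['']) for v in df[keep_cols].values]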

print(df):

  aaa bbb ccc ddd eee fff ggg
0               a   b   c   d
1           a   b   c   d   e
2       a   b   c   d   e   f
3   a   b   c   d   e   f   g
4               a   b   c   d
5       a   b   c   d   e   f

Explanation:

  • You want to consider only the columns to the left of 'eee', so those columns are selected and stored in keep_cols.

  • Next, each row has to be shifted to the right by some amount. For the shift itself I used numpy's roll. But by how much? The shift equals the number of blank values in the row, which I counted with Counter from collections.
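The two steps above can be written out as a self-contained sketch, one row at a time, and checked against the expected output from the question. The names final_output, rolled and n_blank are illustrative. The sketch assumes that the blanks in each row sit immediately to the left of 'eee' (as in the sample data), because np.roll wraps values around the end of the array rather than compacting them.

```python
from collections import Counter

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'aaa': ['a', 'a', 'a', 'a', 'a', 'a'],
    'bbb': ['', 'b', 'b', 'b', '', 'b'],
    'ccc': ['', '', 'c', 'c', '', 'c'],
    'ddd': ['', '', '', 'd', '', ''],
    'eee': ['b', 'c', 'd', 'e', 'b', 'd'],
    'fff': ['c', 'd', 'e', 'f', 'c', 'e'],
    'ggg': ['d', 'e', 'f', 'g', 'd', 'f'],
})

# Expected result, copied from the question.
final_output = {
    'aaa': ['', '', '', 'a', '', ''],
    'bbb': ['', '', 'a', 'b', '', 'a'],
    'ccc': ['', 'a', 'b', 'c', '', 'b'],
    'ddd': ['a', 'b', 'c', 'd', 'a', 'c'],
    'eee': ['b', 'c', 'd', 'e', 'b', 'd'],
    'fff': ['c', 'd', 'e', 'f', 'c', 'e'],
    'ggg': ['d', 'e', 'f', 'g', 'd', 'f'],
}

# Step 1: only the columns to the left of 'eee' may move.
keep_cols = df.columns[:df.columns.get_loc('eee')]

# Step 2: roll each row to the right by its number of blank cells.
rolled = []
for row in df[keep_cols].to_numpy():
    n_blank = Counter(row)['']  # blanks in this row, left of 'eee'
    rolled.append(np.roll(row, n_blank))

# Write the shifted values back; 'eee' and later columns are untouched.
df.loc[:, keep_cols] = np.array(rolled)

print(df.to_dict(orient='list') == final_output)  # True
```

If a row could contain a blank between two values (for example 'a', '', 'b', ''), rolling by the blank count would not close that gap, so a different approach would be needed for such data.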
