
Iterate over rows in pandas dataframe. If blanks exist before a specific column, move all column values over

I am trying to iterate over all rows in a pandas DataFrame and shift the values in the leftmost columns of each row to the right until all the non-null values in that row are adjacent. How far the values move depends on the number of empty columns between the first empty value and the cutoff column.

In this case I want to 'close the gap' by moving the values in the leftmost columns right so that they end in column 'ddd', directly beside the cutoff column 'eee'. The matching 'abc' rows in the example below should help to visualize the problem.

Column 'eee' and the columns to the right of 'eee' should not be touched or moved.

import pandas as pd

df = pd.DataFrame({
    'aaa': ['a', 'a', 'a', 'a', 'a', 'a'],
    'bbb': ['', 'b', 'b', 'b', '', 'b'],
    'ccc': ['', '', 'c', 'c', '', 'c'],
    'ddd': ['', '', '', 'd', '', ''],
    'eee': ['b', 'c', 'd', 'e', 'b', 'd'],
    'fff': ['c', 'd', 'e', 'f', 'c', 'e'],
    'ggg': ['d', 'e', 'f', 'g', 'd', 'f']
})

In rows 1 and 5 (counting rows from 1): 'a' would be moved over 3 column positions to column 'ddd'.

In row 2: ['a', 'b'] would be moved over 2 column positions to columns ['ccc', 'ddd'] respectively.

etc.

finalOutput = {
    'aaa': ['', '', '', 'a', '', ''],
    'bbb': ['', '', 'a', 'b', '', 'a'],
    'ccc': ['', 'a', 'b', 'c', '', 'b'],
    'ddd': ['a', 'b', 'c', 'd', 'a', 'c'],
    'eee': ['b', 'c', 'd', 'e', 'b', 'd'],
    'fff': ['c', 'd', 'e', 'f', 'c', 'e'],
    'ggg': ['d', 'e', 'f', 'g', 'd', 'f']
}

You can do this:

from collections import Counter
import numpy as np

keep_cols = df.columns[0:df.columns.get_loc('eee')]
df.loc[:,keep_cols] = [np.roll(v, Counter(v)['']) for v in df[keep_cols].values]

print(df) then shows:

  aaa bbb ccc ddd eee fff ggg
0               a   b   c   d
1           a   b   c   d   e
2       a   b   c   d   e   f
3   a   b   c   d   e   f   g
4               a   b   c   d
5       a   b   c   d   e   f

Explanation:

  • Only the columns to the left of 'eee' should change, so those column labels are selected and stored in keep_cols.

  • Each row then has to be shifted to the right by some amount. The shift itself is done with numpy's roll, and the amount is the number of blank values in that row, counted with Counter from collections. This works here because the blanks in each row all come after its values, so rolling by the blank count wraps the blanks around to the front.
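To make the two steps above concrete, here is a minimal end-to-end sketch that first rolls a single row and then applies the same idea to the whole frame. The helper name close_gap and its cutoff parameter are hypothetical, introduced only for illustration. It assumes blanks are empty strings and that, within the left-hand columns, every row's blanks sit to the right of its values, as in the question's data.

```python
from collections import Counter

import numpy as np
import pandas as pd

# Single-row illustration: two blanks, so the row is rolled right by 2.
row = np.array(['a', 'b', '', ''], dtype=object)
rolled_row = np.roll(row, Counter(row)[''])


def close_gap(df, cutoff):
    """Right-align the values in the columns left of `cutoff`.

    Hypothetical helper. Assumes blanks are '' and that each row's blanks
    all come after its values within the left-hand columns.
    """
    out = df.copy()
    # Labels of every column strictly to the left of the cutoff column.
    left_cols = list(out.columns[:out.columns.get_loc(cutoff)])
    # Roll each row right by its own number of blanks.
    rolled = [np.roll(v, Counter(v)['']) for v in out[left_cols].to_numpy()]
    out[left_cols] = np.array(rolled, dtype=object)
    return out


df = pd.DataFrame({
    'aaa': ['a', 'a', 'a', 'a', 'a', 'a'],
    'bbb': ['', 'b', 'b', 'b', '', 'b'],
    'ccc': ['', '', 'c', 'c', '', 'c'],
    'ddd': ['', '', '', 'd', '', ''],
    'eee': ['b', 'c', 'd', 'e', 'b', 'd'],
    'fff': ['c', 'd', 'e', 'f', 'c', 'e'],
    'ggg': ['d', 'e', 'f', 'g', 'd', 'f'],
})

result = close_gap(df, 'eee')
print(result)
```

Returning a copy leaves the original frame untouched, which makes it easy to compare before and after. If a row could have a blank between two values, rolling by the blank count would no longer close the gap, and a different approach would be needed.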
