I have a pandas DataFrame like this:
|user_id|value|No|
|:-:|:-:|:-:|
|id1|100|1|
|id1|200|2|
|id1|250|3|
|id2|NaN|1|
|id2|100|2|
|id3|400|1|
|id3|NaN|2|
|id3|200|3|
|id4|NaN|1|
|id4|NaN|2|
|id4|300|3|
Then I want the following dataset:
|user_id|value|No|NewNo|
|:-:|:-:|:-:|:-:|
|id1|100|1|1|
|id1|200|2|2|
|id1|250|3|3|
|id2|100|2|1|
|id3|400|1|1|
|id3|NaN|2|2|
|id3|200|3|3|
|id4|300|3|1|
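In other words, for each user_id I want to drop the leading rows whose value is NaN, so that the first remaining row of each group has a non-NaN value. NaN values that appear later in a group should be kept. Thank you.
You can group by `user_id` and forward-fill the `value` column. After the forward fill, the only values that are still null are the leading nulls of each group, because they have nothing before them to fill from. Filter those rows out: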
df2 = df[df.groupby('user_id').value.ffill().apply(pd.notnull)].copy()
# .copy() makes df2 an independent DataFrame rather than a slice of df,
# so the `NewNo` column can be assigned in the next and final step
# without triggering a SettingWithCopyWarning
# df2 outputs:
user_id value No
0 'id1' 100.0 1
1 'id1' 200.0 2
2 'id1' 250.0 3
4 'id2' 100.0 2
5 'id3' 400.0 1
6 'id3' NaN 2
7 'id3' 200.0 3
10 'id4' 300.0 3
Generate the `NewNo` column by ranking `No` within each group:
df2['NewNo'] = df2.groupby('user_id').No.rank()
# df2 outputs:
user_id value No NewNo
0 'id1' 100.0 1 1.0
1 'id1' 200.0 2 2.0
2 'id1' 250.0 3 3.0
4 'id2' 100.0 2 1.0
5 'id3' 400.0 1 1.0
6 'id3' NaN 2 2.0
7 'id3' 200.0 3 3.0
10 'id4' 300.0 3 1.0
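For anyone who wants to run the two steps above end to end, here is a minimal self-contained sketch. The sample frame is reconstructed from the table in the question. Passing `method="first"` and casting to `int` are my own additions, not part of the answer's code; they are one way to get integer labels like those in the desired output instead of floats.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question's table.
df = pd.DataFrame({
    "user_id": ["id1", "id1", "id1", "id2", "id2",
                "id3", "id3", "id3", "id4", "id4", "id4"],
    "value": [100, 200, 250, np.nan, 100,
              400, np.nan, 200, np.nan, np.nan, 300],
    "No": [1, 2, 3, 1, 2, 1, 2, 3, 1, 2, 3],
})

# After a groupwise forward fill, only each group's leading NaNs remain null.
keep = df.groupby("user_id")["value"].ffill().notna()
df2 = df[keep].copy()

# Assumption: integer labels are wanted. method="first" breaks ties by order
# of appearance, so every rank is a whole number and the cast is safe.
df2["NewNo"] = df2.groupby("user_id")["No"].rank(method="first").astype(int)
print(df2)
```

The filter uses the forward-filled series only as a mask, so the `NaN` in the middle of `id3` is still present in `df2`.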
groupby + first_valid_index + cumcount
You can calculate indices for first non-null values by group, then use Boolean indexing:
# use transform to align groupwise first_valid_index with dataframe
firsts = df.groupby('user_id')['value'].transform(pd.Series.first_valid_index)
# apply Boolean filter; .copy() avoids a SettingWithCopyWarning when adding NewNo
res = df[df.index >= firsts].copy()
# use groupby + cumcount to add groupwise labels
res['NewNo'] = res.groupby('user_id').cumcount() + 1
print(res)
user_id value No NewNo
0 id1 100.0 1 1
1 id1 200.0 2 2
2 id1 250.0 3 3
4 id2 100.0 2 1
5 id3 400.0 1 1
6 id3 NaN 2 2
7 id3 200.0 3 3
10 id4 300.0 3 1
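Note that the comparison `df.index >= firsts` relies on the index labels increasing with row order, which holds for the default `RangeIndex`. If that cannot be assumed, a possible variant (a sketch of a different technique, not the code from this answer) is to keep a running count of non-null values per group and keep the rows where that count is positive. The sample frame is reconstructed from the question, and the name `seen` is purely illustrative.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question's table.
df = pd.DataFrame({
    "user_id": ["id1", "id1", "id1", "id2", "id2",
                "id3", "id3", "id3", "id4", "id4", "id4"],
    "value": [100, 200, 250, np.nan, 100,
              400, np.nan, 200, np.nan, np.nan, 300],
    "No": [1, 2, 3, 1, 2, 1, 2, 3, 1, 2, 3],
})

# Running count of non-null values within each group. It stays at 0 until
# the group's first valid value is reached, regardless of the index labels.
seen = df["value"].notna().astype(int).groupby(df["user_id"]).cumsum()

# Keep rows at or after the first valid value, then number them per group.
res = df[seen > 0].copy()
res["NewNo"] = res.groupby("user_id").cumcount() + 1
print(res)
```

This depends only on row order within each group, so it should behave the same after the index has been reset or replaced.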