Apply a function via apply/lambda to a DataFrame with a specific condition on another column
I have a dataframe that looks like this:
import pandas as pd

p = {'parentId': ['071cb2c2-d1be-4154-b6c7-a29728357ef3', 'a061e7d7-95d2-4812-87c1-24ec24fc2dd2', 'Highest Level', '071cb2c2-d1be-4154-b6c7-a29728357ef3'],
     'id_x': ['a061e7d7-95d2-4812-87c1-24ec24fc2dd2', 'd2b62e36-b243-43ac-8e45-ed3f269d50b2', '071cb2c2-d1be-4154-b6c7-a29728357ef3', 'a0e97b37-b9a1-4304-9769-b8c48cd9f184'],
     'type': ['c', 'c', 'c', 'r']}
df = pd.DataFrame(data = p)
df
| parentId | id_x | type |
| ------------------------------------ | ------------------------------------ | ------ |
| 071cb2c2-d1be-4154-b6c7-a29728357ef3 | a061e7d7-95d2-4812-87c1-24ec24fc2dd2 | c |
| a061e7d7-95d2-4812-87c1-24ec24fc2dd2 | d2b62e36-b243-43ac-8e45-ed3f269d50b2 | c |
| Highest Level | 071cb2c2-d1be-4154-b6c7-a29728357ef3 | c |
| 071cb2c2-d1be-4154-b6c7-a29728357ef3 | a0e97b37-b9a1-4304-9769-b8c48cd9f184 | r |
I created a function that counts how many parentId values match a given id_x.
def node_counter(id_x, parent_ID):
    # Count how many entries in parent_ID are equal to id_x
    counter = 0
    for child in parent_ID:
        if child == id_x:
            counter += 1
    return counter
df['Amount'] = df.apply(lambda x: node_counter(x['id_x'], df['parentId']), axis=1)
df
| parentId | id_x | type | Amount |
| ------------------------------------ | ------------------------------------ | ---- | ------ |
| 071cb2c2-d1be-4154-b6c7-a29728357ef3 | a061e7d7-95d2-4812-87c1-24ec24fc2dd2 | c | 1 |
| a061e7d7-95d2-4812-87c1-24ec24fc2dd2 | d2b62e36-b243-43ac-8e45-ed3f269d50b2 | c | 0 |
| Highest Level | 071cb2c2-d1be-4154-b6c7-a29728357ef3 | c | 2 |
| 071cb2c2-d1be-4154-b6c7-a29728357ef3 | a0e97b37-b9a1-4304-9769-b8c48cd9f184 | r | 0 |
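As a side note, the same Amount column can probably be produced without a row-wise `apply`. The following is only a sketch, not the method used in the post: it assumes the ids are abbreviated to their first eight characters for readability, and it swaps the `node_counter` loop for `value_counts` plus `map`.

```python
import pandas as pd

# Same structure as the dataframe above; ids shortened to 8 characters (assumption).
df = pd.DataFrame({
    'parentId': ['071cb2c2', 'a061e7d7', 'Highest Level', '071cb2c2'],
    'id_x': ['a061e7d7', 'd2b62e36', '071cb2c2', 'a0e97b37'],
    'type': ['c', 'c', 'c', 'r'],
})

# How often each value occurs in parentId, i.e. how many children each id has.
child_counts = df['parentId'].value_counts()

# Look up each row's id_x in those counts; ids that never occur as a parent get 0.
df['Amount'] = df['id_x'].map(child_counts).fillna(0).astype(int)

print(df['Amount'].tolist())  # [1, 0, 2, 0]
```

On larger frames this avoids scanning the whole parentId column once per row, which is what the loop inside `apply` does.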
Now I want to create a new column, Amount c, with the same function, but have it count only rows whose type is c (or, for a separate column, r).
The result should look like this:
| parentId | id_x | type | Amount | Amount c |
| ------------------------------------ | ------------------------------------ | ---- | ------ | -------- |
| 071cb2c2-d1be-4154-b6c7-a29728357ef3 | a061e7d7-95d2-4812-87c1-24ec24fc2dd2 | c | 1 | 1 |
| a061e7d7-95d2-4812-87c1-24ec24fc2dd2 | d2b62e36-b243-43ac-8e45-ed3f269d50b2 | c | 0 | 0 |
| Highest Level | 071cb2c2-d1be-4154-b6c7-a29728357ef3 | c | 2 | 1 |
| 071cb2c2-d1be-4154-b6c7-a29728357ef3 | a0e97b37-b9a1-4304-9769-b8c48cd9f184 | r | 0 | 0 |
or, for r:
| parentId | id_x | type | Amount | Amount r |
| ------------------------------------ | ------------------------------------ | ---- | ------ | -------- |
| 071cb2c2-d1be-4154-b6c7-a29728357ef3 | a061e7d7-95d2-4812-87c1-24ec24fc2dd2 | c | 1 | 0 |
| a061e7d7-95d2-4812-87c1-24ec24fc2dd2 | d2b62e36-b243-43ac-8e45-ed3f269d50b2 | c | 0 | 0 |
| Highest Level | 071cb2c2-d1be-4154-b6c7-a29728357ef3 | c | 2 | 1 |
| 071cb2c2-d1be-4154-b6c7-a29728357ef3 | a0e97b37-b9a1-4304-9769-b8c48cd9f184 | r | 0 | 0 |
I tried the following, but got the wrong result:
df['Amount C'] = df.apply(lambda x: node_counter(x['id_x'], df['parentId']) if (x['type'] == 'c') else 0, axis=1)
df
| parentId | id_x | type | Amount | Amount C |
| ------------------------------------ | ------------------------------------ | ---- | ------ | -------- |
| 071cb2c2-d1be-4154-b6c7-a29728357ef3 | a061e7d7-95d2-4812-87c1-24ec24fc2dd2 | c | 1 | 1 |
| a061e7d7-95d2-4812-87c1-24ec24fc2dd2 | d2b62e36-b243-43ac-8e45-ed3f269d50b2 | c | 0 | 0 |
| Highest Level | 071cb2c2-d1be-4154-b6c7-a29728357ef3 | c | 2 | 2 |
| 071cb2c2-d1be-4154-b6c7-a29728357ef3 | a0e97b37-b9a1-4304-9769-b8c48cd9f184 | r | 0 | 0 |
How do I correctly apply the if condition in lambda/apply?
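One way to read the wrong result: the `if` in the attempt tests the type of the row being evaluated, while the desired output depends on the type of the rows being counted. The sketch below contrasts the two on the same data. The ids are abbreviated to eight characters and the names `row_filtered`, `child_filtered`, and `c_parents` are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'parentId': ['071cb2c2', 'a061e7d7', 'Highest Level', '071cb2c2'],
    'id_x': ['a061e7d7', 'd2b62e36', '071cb2c2', 'a0e97b37'],
    'type': ['c', 'c', 'c', 'r'],
})

def node_counter(id_x, parent_ids):
    """Count how many entries of parent_ids are equal to id_x."""
    return sum(1 for parent in parent_ids if parent == id_x)

# Row-level condition: only decides whether THIS row gets a count at all.
# Children of every type are still counted.
row_filtered = df.apply(
    lambda row: node_counter(row['id_x'], df['parentId']) if row['type'] == 'c' else 0,
    axis=1,
)

# Child-level condition: restricts which rows are counted as children.
c_parents = df.loc[df['type'] == 'c', 'parentId']
child_filtered = df.apply(lambda row: node_counter(row['id_x'], c_parents), axis=1)

print(row_filtered.tolist())    # [1, 0, 2, 0]  (the unwanted result)
print(child_filtered.tolist())  # [1, 0, 1, 0]  (the desired Amount c)
```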
One solution is to set a default value of 0 and then use apply on a slice of the dataframe:
df['Amount C'] = 0 # set default value 0
mask_type = df['type'] == 'c' # build index mask
df.loc[mask_type, 'Amount C'] = df.loc[mask_type].apply(lambda x: node_counter(x['id_x'], df['parentId']), axis=1)
I also had to apply the index mask to parentId in the function call, and then it works.
df['Amount C'] = 0 # set default value 0
mask_type = df['type'] == 'c' # build index mask
df.loc[mask_type,'Amount C'] = df.loc[mask_type].apply(lambda x: node_counter(x['id_x'], df.loc[mask_type,'parentId']), axis=1)
| parentId | id_x | type | Amount | Amount C |
| ------------------------------------ | ------------------------------------ | ---- | ------ | -------- |
| 071cb2c2-d1be-4154-b6c7-a29728357ef3 | a061e7d7-95d2-4812-87c1-24ec24fc2dd2 | c | 1 | 1 |
| a061e7d7-95d2-4812-87c1-24ec24fc2dd2 | d2b62e36-b243-43ac-8e45-ed3f269d50b2 | c | 0 | 0 |
| Highest Level | 071cb2c2-d1be-4154-b6c7-a29728357ef3 | c | 2 | 1 |
| 071cb2c2-d1be-4154-b6c7-a29728357ef3 | a0e97b37-b9a1-4304-9769-b8c48cd9f184 | r | 0 | 0 |
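For completeness, here is a hedged sketch of a reusable variant that produces both the c and the r column. `count_children_of_type` is a hypothetical helper, not part of pandas or of the original answer, and the ids are again abbreviated to eight characters. It filters only the children being counted, not the rows receiving the count.

```python
import pandas as pd

df = pd.DataFrame({
    'parentId': ['071cb2c2', 'a061e7d7', 'Highest Level', '071cb2c2'],
    'id_x': ['a061e7d7', 'd2b62e36', '071cb2c2', 'a0e97b37'],
    'type': ['c', 'c', 'c', 'r'],
})

def count_children_of_type(frame, child_type):
    """For each id_x, count the rows of the given type that name it as parentId."""
    # Keep only the parentId values of rows with the requested type, then tally them.
    counts = frame.loc[frame['type'] == child_type, 'parentId'].value_counts()
    # Ids with no matching children are missing from counts and become 0.
    return frame['id_x'].map(counts).fillna(0).astype(int)

df['Amount c'] = count_children_of_type(df, 'c')
df['Amount r'] = count_children_of_type(df, 'r')

print(df['Amount c'].tolist())  # [1, 0, 1, 0]
print(df['Amount r'].tolist())  # [0, 0, 1, 0]
```

The row mask is deliberately left out here. In the expected Amount r table, the row that receives the count of 1 has type c itself, so masking the rows by `type == 'r'` as well would leave that row at the default 0.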