If list of lists values are present in Pandas dataframe column replace them with values from another Pandas column
I have a list of lists containing word tokens of the following kind:
[['java_developer'],
['ETL', 'database_administrator'],
...
['web-developer', 'c#', 'ms_sql']]
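I also have a key-value pandas DataFrame, where the first column holds the keys and the second column holds the values. For example: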
Key Value
0 java_developer java
1 web-developer web
2 database_administrator database
3 ETL ETL
4 ms_sql database
... ... ...
100 c# c#
I would like to get a list of the following kind:
[['java'],
['ETL', 'database'],
...
['web', 'c#', 'database']]
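How can I implement this?
Build a dictionary from the DataFrame and look each token up with the dictionary's get method, which returns a default value such as None for keys that have no match:
# added 'val' to the last sublist to demonstrate a token with no match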
L = [['java_developer'],
['ETL', 'database_administrator'],
['web-developer', 'c#', 'ms_sql', 'val']]
#create dictionary from DataFrame
d = df.set_index('Key')['Value'].to_dict()
print (d)
{'java_developer': 'java', 'web-developer': 'web',
'database_administrator': 'database', 'ETL': 'ETL',
'ms_sql': 'database', 'c#': 'c#'}
#in nested list comprehension replace by dict
L1 = [[d.get(y, None) for y in x] for x in L]
print (L1)
[['java'], ['ETL', 'database'], ['web', 'c#', 'database', None]]
Or add a filter to drop the values that have no match:
L1 = [[d.get(y) for y in x if y in d] for x in L]
print (L1)
[['java'], ['ETL', 'database'], ['web', 'c#', 'database']]
Or keep the original value when it is not present in the dictionary:
L1 = [[d.get(y, y) for y in x] for x in L]
print (L1)
[['java'], ['ETL', 'database'], ['web', 'c#', 'database', 'val']]
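The three variants above can be put together in one self-contained sketch. The DataFrame below is rebuilt by hand from the sample rows shown in the question (the elided rows are left out), and the helper name replace_tokens and its on_missing parameter are hypothetical, introduced here only for illustration:

```python
import pandas as pd

# Sample data reconstructed from the question; elided rows are omitted.
df = pd.DataFrame({
    "Key": ["java_developer", "web-developer", "database_administrator",
            "ETL", "ms_sql", "c#"],
    "Value": ["java", "web", "database", "ETL", "database", "c#"],
})

# 'val' has no entry in df, so it exercises the missing-key handling.
L = [["java_developer"],
     ["ETL", "database_administrator"],
     ["web-developer", "c#", "ms_sql", "val"]]

# Build the lookup dictionary once, outside the loops.
mapping = df.set_index("Key")["Value"].to_dict()


def replace_tokens(nested, mapping, on_missing="none"):
    """Hypothetical helper: map every token of a list of lists through mapping.

    on_missing="none" -> unmatched tokens become None
    on_missing="drop" -> unmatched tokens are removed
    on_missing="keep" -> unmatched tokens are left unchanged
    """
    if on_missing == "none":
        return [[mapping.get(tok) for tok in sub] for sub in nested]
    if on_missing == "drop":
        return [[mapping[tok] for tok in sub if tok in mapping] for sub in nested]
    if on_missing == "keep":
        return [[mapping.get(tok, tok) for tok in sub] for sub in nested]
    raise ValueError(f"unknown on_missing value: {on_missing!r}")


print(replace_tokens(L, mapping, "none"))
print(replace_tokens(L, mapping, "drop"))
print(replace_tokens(L, mapping, "keep"))
```

Converting the two columns to a plain dictionary first keeps each lookup cheap compared with querying the DataFrame once per token, which is presumably why the answer takes that route.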