How to create columns in a pandas DataFrame with .apply and a user-defined function
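I am trying to create several columns in a pandas DataFrame at once, where each column name is a key in a dictionary and the function returns 1 if any of the values for that key are present.
My DataFrame has 3 columns: jp_ref, jp_title and jp_description. Essentially, I am searching jp_description for the related words assigned to each key, and filling the column for that key with 1s and 0s depending on whether any of those values appear in jp_description.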
import pandas as pd

jp_title = ['software developer', 'operations analyst', 'it project manager']
jp_ref = ['j01', 'j02', 'j03']
jp_description = ['software developer with java and sql experience',
                  'operations analyst with ms in operations research, statistics or related field. sql experience desired.',
                  'it project manager with javascript working knowledge']
myDict = {'jp_title': jp_title, 'jp_ref': jp_ref, 'jp_description': jp_description}
data = pd.DataFrame(myDict)
technologies = {'java': ['java', 'jdbc', 'jms', 'jconsole', 'jprobe', 'jax', 'jax-rs', 'kotlin', 'jdk'],
                'javascript': ['javascript', 'js', 'node', 'node.js', 'mustache.js', 'handlebar.js', 'express', 'angular',
                               'angular.js', 'react.js', 'angularjs', 'jquery', 'backbone.js', 'd3'],
                'sql': ['sql', 'mysql', 'sqlite', 't-sql', 'postgre', 'postgresql', 'db', 'etl']}
def term_search(doc, tech):
    for term in technologies[tech]:
        if term in doc:
            return 1
        else:
            return 0

for tech in technologies:
    data[tech] = data.apply(term_search(data['jp_description'], tech))
I get the following error, which I don't understand:
TypeError: ("'int' object is not callable", 'occurred at index jp_ref')
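Your logic is wrong: you iterate over the list in a loop but return 0 or 1 after the first iteration, so the jp_description value is never compared against the full list. The TypeError itself comes from calling term_search(...) directly inside data.apply(...), which hands apply the function's integer return value where it expects a function.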
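The early return can be seen in isolation with a minimal sketch. The shortened term list, the sample string and the function names first_term_only and any_term are made up for illustration and are not part of the question's code.

```python
# Shortened, illustrative term list (not the full dict from the question).
technologies = {'java': ['java', 'jdbc', 'jms']}

def first_term_only(doc, tech):
    # Same shape as the question's function: the else branch returns
    # on the first term that does not match, so later terms are never tried.
    for term in technologies[tech]:
        if term in doc:
            return 1
        else:
            return 0

def any_term(doc, tech):
    # Returns 0 only after every term has been checked.
    return int(any(term in doc for term in technologies[tech]))

doc = 'experience with jdbc and jms'
print(first_term_only(doc, 'java'))  # 0, because 'java' is checked first and misses
print(any_term(doc, 'java'))         # 1, because 'jdbc' is found
```

Note that both versions use substring matching, so 'java' would also match inside 'javascript'.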
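Split jp_description into words and check for elements in common with the list in the technologies dict. If a common element exists, one of the terms was found, so return 1, otherwise 0.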
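def term_search(doc, tech):
    # Split the description into words and intersect them with the term list.
    doc = doc.split(" ")
    common_elem = list(set(doc).intersection(technologies[tech]))
    # Any shared element means a listed term occurs in the description.
    if len(common_elem) > 0:
        return 1
    return 0

for tech in technologies:
    data[tech] = data['jp_description'].apply(lambda x: term_search(x, tech))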
   jp_title            jp_ref  jp_description          java  javascript  sql
0  software developer  j01     software developer....  1     0           1
1  operations analyst  j02     operations analyst ..   0     0           1
2  it project manager  j03     it project manager...   0     1           0
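One limitation of splitting on spaces is that punctuation stays attached to the words, so a description ending in 'sql.' would not match 'sql'. A possible variation, shown here only as a sketch, strips punctuation from the ends of each token before comparing. The function name term_search_tokens, the shortened term lists and the set of stripped characters are assumptions for illustration.

```python
# Shortened, illustrative term lists (not the full dict from the question).
technologies = {
    'java': ['java', 'jdbc', 'jms'],
    'javascript': ['javascript', 'js', 'node.js'],
    'sql': ['sql', 'mysql', 'postgresql'],
}

def term_search_tokens(doc, tech):
    # Split on whitespace, then strip punctuation from both ends of each
    # token, so 'sql.' becomes 'sql' while 'node.js' keeps its inner dot.
    tokens = {tok.strip('.,;:!?()') for tok in doc.lower().split()}
    # 1 if at least one token is in the term list for this technology.
    return int(not tokens.isdisjoint(technologies[tech]))

print(term_search_tokens('experience with java, sql.', 'sql'))         # 1
print(term_search_tokens('experience with java, sql.', 'javascript'))  # 0
```

It can be used in the same loop as above, for example data[tech] = data['jp_description'].apply(lambda x: term_search_tokens(x, tech)).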