
How to extract elements from a list of dictionaries of dictionaries

  • I have the list of dictionaries below, which comes from Elasticsearch

  • I need to extract some of the elements and add them to a list

    searchtest = [{'_index': 'courses',
      '_type': 'classroom',
      '_id': '6',
      '_score': 1.0,
      '_source': {'name': 'Cost Accounting 400',
       'room': 'E7',
       'professor': {'name': 'Bill Cage',
        'department': 'accounting',
        'facutly_type': 'full-time',
        'email': 'cageb@onuni.com'},
       'students_enrolled': 31,
       'course_publish_date': '2014-12-31',
       'course_description': 'Cst Act 400 is an advanced course from the business school taken by final year accounting majors that covers the subject of business incurred costs and how to record them in financial statements'}},
     {'_index': 'courses',
      '_type': 'classroom',
      '_id': '7',
      '_score': 1.0,
      '_source': {'name': 'Computer Internals 250',
       'room': 'C8',
       'professor': {'name': 'Gregg Payne',
        'department': 'engineering',
        'facutly_type': 'part-time',
        'email': 'payneg@onuni.com'},
       'students_enrolled': 33,
       'course_publish_date': '2012-08-20',
       'course_description': 'cpt Int 250 gives students an integrated and rigorous picture of applied computer science, as it comes to play in the construction of a simple yet powerful computer system. '}},
     {'_index': 'courses',
      '_type': 'classroom',
      '_id': '8',
      '_score': 1.0,
      '_source': {'name': 'Accounting Info Systems 350',
       'room': 'E3',
       'professor': {'name': 'Bill Cage',
        'department': 'accounting',
        'facutly_type': 'full-time',
        'email': 'cageb@onuni.com'},
       'students_enrolled': 19,
       'course_publish_date': '2014-05-15',
       'course_description': 'Act Sys 350 is an advanced course providing students a practical understanding of an accounting system in database technology. Students will use MS Access to build a transaction ledger system'}},
     {'_index': 'courses',
      '_type': 'classroom',
      '_id': '9',
      '_score': 1.0,
      '_source': {'name': 'Tax Accounting 200',
       'room': 'E7',
       'professor': {'name': 'Thomas Baszo',
        'department': 'finance',
        'facutly_type': 'part-time',
        'email': 'baszot@onuni.com'},
       'students_enrolled': 17,
       'course_publish_date': '2016-06-15',
       'course_description': 'Tax Act 200 is an intermediate course covering various aspects of tax law'}},
     {'_index': 'courses',
      '_type': 'classroom',
      '_id': '10',
      '_score': 1.0,
      '_source': {'name': 'Capital Markets 350',
       'room': 'E3',
       'professor': {'name': 'Thomas Baszo',
        'department': 'finance',
        'facutly_type': 'part-time',
        'email': 'baszot@onuni.com'},
       'students_enrolled': 13,
       'course_publish_date': '2016-01-11',
       'course_description': 'This is an advanced course teaching crucial topics related to raising capital and bonds, shares and other long-term equity and debt financial instrucments'}},
     {'_index': 'courses',
      '_type': 'classroom',
      '_id': '5',
      '_score': 1.0,
      '_source': {'name': 'Theatre 410',
       'room': 'T18',
       'professor': {'name': 'Sebastian Hern',
        'department': 'art',
        'facutly_type': 'part-time',
        'email': ''},
       'students_enrolled': 47,
       'course_publish_date': '2013-01-27',
       'course_description': 'Tht 410 is an advanced elective course disecting the various plays written by shakespere during the 16th century'}},
     {'_index': 'courses',
      '_type': 'classroom',
      '_id': '1',
      '_score': 1.0,
      '_source': {'name': 'Accounting 101',
       'room': 'E3',
       'professor': {'name': 'Thomas Baszo',
        'department': 'finance',
        'facutly_type': 'part-time',
        'email': 'baszot@onuni.com'},
       'students_enrolled': 27,
       'course_publish_date': '2015-01-19',
       'course_description': 'Act 101 is a course from the business school on the introduction to accounting that teaches students how to read and compose basic financial statements'}},
     {'_index': 'courses',
      '_type': 'classroom',
      '_id': '2',
      '_score': 1.0,
      '_source': {'name': 'Marketing 101',
       'room': 'E4',
       'professor': {'name': 'William Smith',
        'department': 'finance',
        'facutly_type': 'part-time',
        'email': 'wills@onuni.com'},
       'students_enrolled': 18,
       'course_publish_date': '2015-06-21',
       'course_description': 'Mkt 101 is a course from the business school on the introduction to marketing that teaches students the fundamentals of market analysis, customer retention and online advertisements'}},
     {'_index': 'courses',
      '_type': 'classroom',
      '_id': '3',
      '_score': 1.0,
      '_source': {'name': 'Anthropology 230',
       'room': 'G11',
       'professor': {'name': 'Devin Cranford',
        'department': 'history',
        'facutly_type': 'full-time',
        'email': 'devinc@onuni.com'},
       'students_enrolled': 22,
       'course_publish_date': '2013-08-27',
       'course_description': 'Ant 230 is an intermediate course on human societies and cultures and their development. A focus on the Mayans civilization is rooted in this course'}},
     {'_index': 'courses',
      '_type': 'classroom',
      '_id': '4',
      '_score': 1.0,
      '_source': {'name': 'Computer Science 101',
       'room': 'C12',
       'professor': {'name': 'Gregg Payne',
        'department': 'engineering',
        'facutly_type': 'full-time',
        'email': 'payneg@onuni.com'},
       'students_enrolled': 33,
       'course_publish_date': '2013-08-27',
       'course_description': 'CS 101 is a first year computer science introduction teaching fundamental data structures and alogirthms using python. '}}]

My code is below:

import json
import pprint


def details(searchtest):
    response = []
    for each in searchtest:
        course = {
            'name': each['_source']['name'],
            'proffesor':[]}
        
        for prof in each.get('professor', []):
            course['proffesor'].append(prof['_source']['name'])
            course['proffesor'].append(prof['_source']['department'])
        response.append(course)
    return response

if __name__ == "__main__":
    import pprint
    pp = pprint.PrettyPrinter(4)
    pp.pprint(details(searchtest['hits']['hits']))

My output:

[   {'name': 'Cost Accounting 400', 'proffesor': []},
    {'name': 'Computer Internals 250', 'proffesor': []},
    {'name': 'Accounting Info Systems 350', 'proffesor': []},
    {'name': 'Tax Accounting 200', 'proffesor': []},
    {'name': 'Capital Markets 350', 'proffesor': []},
    {'name': 'Theatre 410', 'proffesor': []},
    {'name': 'Accounting 101', 'proffesor': []},
    {'name': 'Marketing 101', 'proffesor': []},
    {'name': 'Anthropology 230', 'proffesor': []},
    {'name': 'Computer Science 101', 'proffesor': []}]
  • The proffesor list in my output comes back empty. It has to be filled with professor.name and professor.department.

The expected output should contain a list with each professor's name and department.

The line causing the problem is below:

for prof in each.get('professor', []):

In [53]: response
Out[53]: []

In [54]: for search in searchtest:
    ...:     response.append({'name':search["_source"]['name'],'professor':search["_source"]["professor"]})
    ...:

In [55]: response
Out[55]:
[{'name': 'Cost Accounting 400',
  'professor': {'name': 'Bill Cage',
   'department': 'accounting',
   'facutly_type': 'full-time',
   'email': 'cageb@onuni.com'}},
 {'name': 'Computer Internals 250',
  'professor': {'name': 'Gregg Payne',
   'department': 'engineering',
   'facutly_type': 'part-time',
   'email': 'payneg@onuni.com'}},
 {'name': 'Accounting Info Systems 350',
  'professor': {'name': 'Bill Cage',
   'department': 'accounting',
   'facutly_type': 'full-time',
   'email': 'cageb@onuni.com'}},
 {'name': 'Tax Accounting 200',
  'professor': {'name': 'Thomas Baszo',
   'department': 'finance',
   'facutly_type': 'part-time',
   'email': 'baszot@onuni.com'}},
 {'name': 'Capital Markets 350',
  'professor': {'name': 'Thomas Baszo',
   'department': 'finance',
   'facutly_type': 'part-time',
   'email': 'baszot@onuni.com'}},
 {'name': 'Theatre 410',
  'professor': {'name': 'Sebastian Hern',
   'department': 'art',
   'facutly_type': 'part-time',
   'email': ''}},
 {'name': 'Accounting 101',
  'professor': {'name': 'Thomas Baszo',
   'department': 'finance',
   'facutly_type': 'part-time',
   'email': 'baszot@onuni.com'}},
 {'name': 'Marketing 101',
  'professor': {'name': 'William Smith',
   'department': 'finance',
   'facutly_type': 'part-time',
   'email': 'wills@onuni.com'}},
 {'name': 'Anthropology 230',
  'professor': {'name': 'Devin Cranford',
   'department': 'history',
   'facutly_type': 'full-time',
   'email': 'devinc@onuni.com'}},
 {'name': 'Computer Science 101',
  'professor': {'name': 'Gregg Payne',
   'department': 'engineering',
   'facutly_type': 'full-time',
   'email': 'payneg@onuni.com'}}]

Or, since you want a list:

In [57]: response = []

In [58]: for search in searchtest:
    ...:     response.append({'name':search["_source"]['name'],'professor':[search["_source"]["professor"]["name"],search["_source"]["professor"]["department"]]})
    ...:

In [59]: response
Out[59]:
[{'name': 'Cost Accounting 400', 'professor': ['Bill Cage', 'accounting']},
 {'name': 'Computer Internals 250',
  'professor': ['Gregg Payne', 'engineering']},
 {'name': 'Accounting Info Systems 350',
  'professor': ['Bill Cage', 'accounting']},
 {'name': 'Tax Accounting 200', 'professor': ['Thomas Baszo', 'finance']},
 {'name': 'Capital Markets 350', 'professor': ['Thomas Baszo', 'finance']},
 {'name': 'Theatre 410', 'professor': ['Sebastian Hern', 'art']},
 {'name': 'Accounting 101', 'professor': ['Thomas Baszo', 'finance']},
 {'name': 'Marketing 101', 'professor': ['William Smith', 'finance']},
 {'name': 'Anthropology 230', 'professor': ['Devin Cranford', 'history']},
 {'name': 'Computer Science 101', 'professor': ['Gregg Payne', 'engineering']}]

Using your code: you don't have to loop through professor. It's a dictionary, so you can access its values directly by key.

In [64]: import json
    ...: import pprint
    ...: def details(searchtest):
    ...:     response = []
    ...:     for each in searchtest:
    ...:         course = {
    ...:             'name': each['_source']['name'],
    ...:             'proffesor':[]}
    ...:         prof = each['_source']['professor']
    ...:         course['proffesor'].append(prof['name'])
    ...:         course['proffesor'].append(prof['department'])
    ...:         response.append(course)
    ...:     return response
    ...:
    ...:

In [65]: details(searchtest)
Out[65]:
[{'name': 'Cost Accounting 400', 'proffesor': ['Bill Cage', 'accounting']},
 {'name': 'Computer Internals 250',
  'proffesor': ['Gregg Payne', 'engineering']},
 {'name': 'Accounting Info Systems 350',
  'proffesor': ['Bill Cage', 'accounting']},
 {'name': 'Tax Accounting 200', 'proffesor': ['Thomas Baszo', 'finance']},
 {'name': 'Capital Markets 350', 'proffesor': ['Thomas Baszo', 'finance']},
 {'name': 'Theatre 410', 'proffesor': ['Sebastian Hern', 'art']},
 {'name': 'Accounting 101', 'proffesor': ['Thomas Baszo', 'finance']},
 {'name': 'Marketing 101', 'proffesor': ['William Smith', 'finance']},
 {'name': 'Anthropology 230', 'proffesor': ['Devin Cranford', 'history']},
 {'name': 'Computer Science 101', 'proffesor': ['Gregg Payne', 'engineering']}]
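
To see why the original loop produced empty lists, here is a minimal sketch using one hit trimmed down from the data in the question. The variable names hit, top_level, prof and keys are only illustrative.

```python
# One hit shaped like the entries in the question (trimmed for brevity).
hit = {
    '_id': '6',
    '_source': {
        'name': 'Cost Accounting 400',
        'professor': {'name': 'Bill Cage', 'department': 'accounting'},
    },
}

# 'professor' is not a top-level key of the hit, so .get() falls back to
# the default [] and a for loop over it never executes its body.
top_level = hit.get('professor', [])
print(top_level)

# The professor record sits under '_source' and is a dict, not a list.
prof = hit['_source']['professor']
print(prof['name'], prof['department'])

# Iterating over a dict yields its keys (plain strings), so indexing the
# loop variable with ['_source'] could not work even with the right path.
keys = list(prof)
print(keys)
```

So two things were off in the original loop: the lookup was one level too shallow, and the value it was looking for is a single dictionary rather than a list of dictionaries.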

I added missing-key checks (using .get() with defaults) for every item in the response. This could be done more efficiently, since the _source item is looked up 3 times and the professor item 2 times.

def details(searchtest):
    response = []
    for each in searchtest:
        course = {
            "name": each.get("_source", {}).get("name", ""),
            "professor": [
                each.get("_source", {}).get("professor", {}).get("name", ""),
                each.get("_source", {}).get("professor", {}).get("department", "")
            ]
        }
        response.append(course)
    return response

if __name__ == "__main__":
    import pprint
    pp = pprint.PrettyPrinter(4)
    pp.pprint(details(searchtest))

Output:

[   {'name': 'Cost Accounting 400', 'professor': ['Bill Cage', 'accounting']},
    {   'name': 'Computer Internals 250',
        'professor': ['Gregg Payne', 'engineering']},
    {   'name': 'Accounting Info Systems 350',
        'professor': ['Bill Cage', 'accounting']},
    {'name': 'Tax Accounting 200', 'professor': ['Thomas Baszo', 'finance']},
    {'name': 'Capital Markets 350', 'professor': ['Thomas Baszo', 'finance']},
    {'name': 'Theatre 410', 'professor': ['Sebastian Hern', 'art']},
    {'name': 'Accounting 101', 'professor': ['Thomas Baszo', 'finance']},
    {'name': 'Marketing 101', 'professor': ['William Smith', 'finance']},
    {'name': 'Anthropology 230', 'professor': ['Devin Cranford', 'history']},
    {   'name': 'Computer Science 101',
        'professor': ['Gregg Payne', 'engineering']}]
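
The repeated lookups mentioned above can be avoided by fetching _source and professor once per hit. The following is a sketch of that idea, not a drop-in replacement: the `or {}` fallback is an addition of mine that also covers a key that is present but set to None, and the sample entries marked hypothetical are not part of the question's data.

```python
def details(hits):
    """Return the course name plus [professor name, department] per hit."""
    response = []
    for hit in hits:
        # Look up each nested dict once; `or {}` (an assumption beyond the
        # original answer) also handles a value that is explicitly None.
        source = hit.get("_source") or {}
        professor = source.get("professor") or {}
        response.append({
            "name": source.get("name", ""),
            "professor": [
                professor.get("name", ""),
                professor.get("department", ""),
            ],
        })
    return response


sample = [
    {"_source": {"name": "Theatre 410",
                 "professor": {"name": "Sebastian Hern", "department": "art"}}},
    {"_source": {"name": "No Professor 000", "professor": None}},  # hypothetical
    {},  # hypothetical hit without a _source key
]
result = details(sample)
print(result)
```

With well-formed hits like the ones in the question, this should give the same result as the version above; the difference only shows up when a key is missing or holds None.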
