How to filter a dictionary value (within another dictionary)

I will try to explain this as best I can, so apologies in advance for the long post.

Firstly, I have an API here ( http://dev.c0l.in:5984/income_statements/_all_docs ), and within this dictionary there are 5000 other dictionaries that I can access through their ID ( http://dev.c0l.in:5984/income_statements/30e901a7b7d8e98328dcd77c369b6ad7 ).

So far I've created a programme that sorts through these dictionaries and only prints out (to CSV) the dictionaries related to a user-input sector (e.g. healthcare).

However, I wanted to be able to implement a filter search, so that the programme will only print statements that are above or below a user-input value. For example: only retrieve data for a (user-input) field such as closing stock, and only for companies at or below (<=) a closing stock value of 40,000.

My problem is that I'm not sure how to do this.

I understand how to get the user input and how to access the dictionary within the dictionary, but I have no idea how to filter above or below a user-input value.

Here is a copy of my code; any pointers would be appreciated!

import urllib #Imports the url - library module (older the urllib2 but has some useful decodes if needed)
import urllib2 #Imports the Url- Library module (Most recently updated + used)
import csv #Imports the commands that allows for csv writing/reading
import json #Imports the ability to read/use Json data
import time #Imports the time module - allows the developer to examine benchmarks (How long did it take to fetch data)
import os


income_csv = csv.writer(open("Income Statement_ext.csv", "wb")) #This creates a CSV file and writes functions to it
financial_csv = csv.writer(open("Statement of financial position_ext.csv", "wb"))

#The two csv 'writers' create the headers for the information within the CSV file before the information from the api is added to it
financial_csv.writerow([
    ('Company name'),
    ('Non Current Assets'),
    ('Current Assets'),
    ('Equity'),
    ('Non-Current Assets'),
    ('Current Liabilities')])

income_csv.writerow([
    ('Company name'),
    ('Sales'),
    ('Opening Stock'),
    ('Purchases'),
    ('Closing Stock'),
    ('Expenses'),
    ('Interest payable'),
    ('Interest receivable')])

income_url = "http://dev.c0l.in:5984/income_statements/_all_docs"
income_request = urllib2.urlopen(income_url).read()
income_response = json.loads(income_request)
#defines the income url

financial_url = "http://dev.c0l.in:5984/financial_positions/_all_docs"
financial_request = urllib2.urlopen(financial_url).read()
financial_response = json.loads(financial_request)
#defines the financial postion url
count = 0
#sets the count for documents printed to 0
def income_statement_fn():
    global count #allows for the count to be kept globally
    print ("(Type help if you would like to see the available choices)")
    income_user_input = raw_input("Which sector would you like to iterate through in Income Statement?: ").lower()# Asks the user which sector within the chosen statement he/she would like to examine
    if income_user_input == "help":
        print ("Available sectors are: ")
        print ("Technology")
        print ("Healthcare")
        print ("Industrial goods")
        print ("Financial")
        print ("Utilities")
        print ("Basic materials")
        print ("Services") 
        income_statement_fn()

    elif income_user_input == "technology" or income_user_input == "healthcare" or income_user_input == "industrial goods" or income_user_input == "financial" or income_user_input == "utilities" or income_user_input == "basic materials" or income_user_input == "services":
        print 'Starting...' # I use this print to set a milestone (if it prints this, everything before it has worked without error)
        start = time.clock()
        start
        for item in income_response['rows']:
            is_url = "http://dev.c0l.in:5984/income_statements/" + item['id'] #This combines the api with the array's ID's allowing us to access every document automatically
            is_request = urllib2.urlopen(is_url).read() #Opens is_url and reads the data
            is_response = json.loads(is_request) #loads the data in json format
            if is_response.get ('sector') == income_user_input: #matches the sector the user inputed - allows us to access that dictionary
                income_csv.writerow([
                 is_response['company']['name'],
                 is_response['company']['sales'],
                 is_response['company']['opening_stock'],
                 is_response['company']['purchases'],
                 is_response['company']['closing_stock'],
                 is_response['company']['expenses'],
                 is_response['company']['interest_payable'],
                 is_response['company']['interest_receivable']]) # The lines of code above write the chosen fields to the csv file
            count +=1
            print ("filtering statements") + ("( "+" %s "+" )") % count
        start
        print start
        restart_fn()
    else:
        print ("Invalid input!")
        income_statement_fn()





def financial_statement_fn(): # Within this function is the code required to fetch information related to the financial position statement
    global count # Allows for the count to be kept globally (outside the function)
    print ("(Type help if you would like to see the available choices)")
    financial_user_input = raw_input("Which sector would you like to iterate through in financial statements?: ").lower()
    if financial_user_input == "help":
        print ("Available sectors are: ")
        print ("Technology")
        print ("Healthcare")
        print ("Industrial goods")
        print ("Financial")
        print ("Utilities")
        print ("Basic materials")
        print ("Services")
        financial_statement_fn()

    elif financial_user_input == "technology" or financial_user_input == "healthcare" or financial_user_input == "industrial goods" or financial_user_input == "financial" or financial_user_input == "utilities" or financial_user_input == "basic materials" or financial_user_input == "services":
        print 'Starting'
        for item in financial_response['rows']:
            fs_url = "http://dev.c0l.in:5984/financial_positions/" + item['id']#This combines the api with the array's ID's allowing us to access every document automatically
            fs_request = urllib2.urlopen(fs_url).read()
            fs_response = json.loads(fs_request)
            if fs_response.get ('sector') == financial_user_input:
                financial_csv.writerow([
                    fs_response['company']['name'],
                    fs_response['company']['non_current_assets'],
                    fs_response['company']['current_assets'],
                    fs_response['company']['equity'],
                    fs_response['company']['non_current_liabilities'],
                    fs_response['company']['current_liabilities']])
                count +=1
                print ("printing statements") + ("( "+" %s "+" )") % count
        print ("---------------------------------------------------------------------")
        print ("finished fetching data")
        print ("---------------------------------------------------------------------")
        restart_fn()

    else:
        print ("Invalid Input!")
        financial_statement_fn()


def launch_fn():
    print ("Please type 'help' if you would like to examine all available options")
    launch_user_input = raw_input("Welcome, Which statement would you like to examine?: ").lower()
    if launch_user_input == "income" or launch_user_input == "income statement":
        income_statement_fn()
    elif launch_user_input == "financial" or launch_user_input == "financial statement":
        financial_statement_fn()
    elif launch_user_input == "help" :
        print ("You can use the following commands on this menu: ")
        print ("---------------------------------------------------------------------")
        print ("Income or Income statement")
        print ("Will allow you to retrieve data relating to financial Income statements")
        print ("---------------------------------------------------------------------")
        print ("Financial or Financial statement")
        print ("Will allow you to retrieve data relating to the statement of financial position")
        print ("---------------------------------------------------------------------")
        launch_fn()
    else:
        print ("If you would like to look at the available options please type help")
        launch_fn()

def restart_fn():
    restart_prompt = raw_input("Would you like to examine another statement?: ").lower()
    if restart_prompt == 'y' or restart_prompt == 'yes':
        launch_fn()
        count = 0
    elif restart_prompt == 'n' or restart_prompt == 'no':
        raise SystemExit("Shutting down....")

def restart_api_down_fn():
    print ("Type 'y' or 'yes' to continue, 'n' or 'no' to exit or 'r' or 'reconnect' to test servers again")
    restart_prompt_api = raw_input("Would you like to continue anyway?: ").lower()
    if restart_prompt_api == 'r' or restart_prompt_api == 'reconnect' or restart_prompt_api == 'test':
        api_status_fn()
        count = 0
    elif restart_prompt_api == 'n' or restart_prompt_api == 'no':
        raise SystemExit("Shutting down....")
    elif restart_prompt_api == 'y' or restart_prompt_api == 'yes':
        print (" Continuing... Programme performance may be severely affected")
        launch_fn()
    else:
        print ("Invalid input...")
        restart_api_down_fn()

def api_status_fn():
    hostname_income = "http://dev.c0l.in:5984/income_statements" 
    response_income = os.system("ping -c 1 " + hostname_income)
    hostname_financial = "http://dev.c0l.in:5984/financial_positions"
    response_financial = os.system("ping -c 1 " + hostname_financial)
    global count
    count = 0

    if response_income == 0:
        print hostname_income, 'is up!'
        count +=1
    else:
        print hostname_income, 'is experiencing connection issues!'        

    if response_financial == 0:
        print hostname_financial, 'is up!'
        count +=1

    else:
        print hostname_financial, 'is experiencing connection issues!'

    if count == 2:
        launch_fn()

    elif count == 0:
        restart_api_down_fn() # Code only for UNIX SYSTEMS?

#def api_status_fn():
 #   hostname = "http://dev.c0l.in:5984/income_statements"
  #  ping = urllib.urlopen(hostname).getcode()
   # if ping == "200":
     #   print 'oh no!'
# add filtering & sorting







api_status_fn()

Please let me know if you need any additional explanations.

Cheers!

I would say that your code is quite confused, and you may have more luck with it if you try to break it down a little. I will try to make some suggestions towards the end of this answer.

Fundamentally, you need to filter the specific results that you get. Looking at your code I can see the following:

elif financial_user_input == "technology" or financial_user_input == "healthcare" or financial_user_input == "industrial goods" or financial_user_input == "financial" or financial_user_input == "utilities" or financial_user_input == "basic materials" or financial_user_input == "services":
    print 'Starting'
    for item in financial_response['rows']:
        fs_url = "http://dev.c0l.in:5984/financial_positions/" + item['id']#This combines the api with the array's ID's allowing us to access every document automatically
        fs_request = urllib2.urlopen(fs_url).read()
        fs_response = json.loads(fs_request)
        if fs_response.get ('sector') == financial_user_input:

This code mixes up the following responsibilities:

  • Validating user input
  • Requesting records
  • Filtering records

If you split out these responsibilities into separate methods then you will find that your code is easier to reason about. Also, as I will shortly show, splitting things up in this way allows you to recombine the different parts to customise the way in which the records are filtered, etc.
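As a rough illustration of pulling out the first of those responsibilities, sector validation could live in its own small function. The names `VALID_SECTORS` and `normalise_sector` are hypothetical and not part of the original programme; the sector list is taken from the help text in the question.

```python
# Sectors listed in the question's help text.
VALID_SECTORS = frozenset([
    "technology", "healthcare", "industrial goods", "financial",
    "utilities", "basic materials", "services",
])

def normalise_sector(raw_text):
    """Return the cleaned-up sector name, or None if it is not recognised."""
    sector = raw_text.strip().lower()
    return sector if sector in VALID_SECTORS else None

print(normalise_sector("  Healthcare "))  # healthcare
print(normalise_sector("banking"))        # None
```

With validation isolated like this, the long chain of `or` comparisons in the question collapses to a single membership check, and the request and filter steps no longer need to know about user input at all.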

If it gets split up a little:

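def _get_single_record(id):
    """ Request an individual record.
        This does not filter """
    # Assumes urllib2 and json are imported, as in the question's code.
    # closing_stock appears in the income statement documents there.
    url = "http://dev.c0l.in:5984/income_statements/" + id
    return json.loads(urllib2.urlopen(url).read())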

def _record_matches_sector(record, sector):
    """ Determine if the record provided matches the sector """
    return record['sector'] == sector

def _record_meets_closing_stock_limit(record, limit):
    """ Determine if the record provided has a
        closing stock of at least limit """
    return record['company']['closing_stock'] >= limit

def _get_all_filtered_records(ids, sector, limit):
    """ Return all financial position records that
        match the sector and closing stock limit """
    record_generator = (_get_single_record(id) for id in ids)
    return (
        record for record in record_generator
        if _record_matches_sector(record, sector)
        and _record_meets_closing_stock_limit(record, limit)
    )

This obviously just returns a generator which yields the records that match your sector and limit. You can add more tests and so on, but updating the code to test for each of these is still quite manual. What you need is a way to apply some selectable tests to the record_generator and return the results that match.

This is quite trivial in Python because Python treats functions as first-class objects (meaning you can assign them to variables) and you can create custom functions quickly using lambdas. This means you can restate _get_all_filtered_records as:

def _make_limit_test(limit):
    """ This returns a function which accepts records that meet the limit """
    return lambda record: record['company']['closing_stock'] >= limit

def _make_sector_test(sector):
    """ This returns a function which accepts records that match the sector """
    return lambda record: record['sector'] == sector

def _filter_records_by_tests(ids, tests):
    """ Returns all the records that pass all the tests """
    # Fetch lazily, so each record is only requested when it is needed
    record_generator = (_get_single_record(id) for id in ids)
    for record in record_generator:
        if all(test(record) for test in tests):
            yield record

You can then build the list of tests to pass by asking the user. This would be a sufficient demo just to verify that this approach works:

def demo_filtering_by_healthcare_and_40k(ids):
    tests = [_make_sector_test("healthcare"), _make_limit_test(40000)]
    return _filter_records_by_tests(ids, tests)
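To cover the "above or below" part of the question, the comparison itself can be chosen at run time. The following is only a sketch: `make_closing_stock_test`, `make_sector_test`, `filter_records` and the sample records are made up for illustration, and the `record['company']['closing_stock']` layout is assumed from the CSV-writing code in the question. It filters an in-memory list rather than calling the API.

```python
import operator

# Map the user's choice of direction to a comparison function.
COMPARISONS = {"above": operator.ge, "below": operator.le}

def make_closing_stock_test(direction, limit):
    """Return a test accepting records at or above/below the limit."""
    compare = COMPARISONS[direction]
    return lambda record: compare(record["company"]["closing_stock"], limit)

def make_sector_test(sector):
    """Return a test accepting records in the given sector."""
    return lambda record: record.get("sector") == sector

def filter_records(records, tests):
    """Yield every record that passes all of the tests."""
    for record in records:
        if all(test(record) for test in tests):
            yield record

# Made-up sample data standing in for documents fetched from the API.
sample_records = [
    {"sector": "healthcare", "company": {"name": "A", "closing_stock": 35000}},
    {"sector": "healthcare", "company": {"name": "B", "closing_stock": 52000}},
    {"sector": "utilities", "company": {"name": "C", "closing_stock": 10000}},
]

tests = [make_sector_test("healthcare"), make_closing_stock_test("below", 40000)]
matches = list(filter_records(sample_records, tests))
print([record["company"]["name"] for record in matches])  # ['A']
```

In the real programme, the direction and the limit would come from the user's input, with the limit converted to a number (for example via `float()`) before it is compared.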

As you can see, my method names are quite long and the methods are quite short. This is really a matter of personal style, but I find that doing it that way makes it obvious what a method does and allows you to quickly comprehend the code to verify that it matches the name.

So to wrap this up, you are requesting records from the remote API. You can filter these by using list comprehensions (or generator expressions, as used above). List comprehensions are extremely powerful and allow you to take source data, transform it, and filter it. It would help you a lot to read about them.
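For instance, a single list comprehension can both filter records and turn them into rows ready to be passed to a CSV writer. This is a small sketch with invented sample data; the field layout is assumed from the question's code.

```python
# Invented sample data standing in for decoded API documents.
records = [
    {"sector": "healthcare", "company": {"name": "A", "closing_stock": 35000}},
    {"sector": "healthcare", "company": {"name": "B", "closing_stock": 52000}},
    {"sector": "services", "company": {"name": "C", "closing_stock": 20000}},
]

limit = 40000

# Filter (the `if` clause) and transform (the leading expression) in one step.
rows = [
    [record["company"]["name"], record["company"]["closing_stock"]]
    for record in records
    if record["company"]["closing_stock"] <= limit
]

print(rows)  # [['A', 35000], ['C', 20000]]
```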
