
Beginner trying to debug Python code to call the FCC API on net neutrality comments

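Below is some code provided by Jeffrey Fossett: https://github.com/Fossj117/fossj117.github.io/blob/master/_code/2017-05-13-fcc-filings/final/fcc_filings_with_public_api.py

I have just set up my Python environment and am trying to run this code (in my copy, my API key is inserted in the code).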

'''
Quick script to scrape FCC filings about docket 17-108 with Python,
using the FCC's public API: https://www.fcc.gov/ecfs/public-api-docs.html
'''

import requests 
import pandas as pd

def get_filings(endpoint, offset, n_records, proceeding, api_key): 
    ''' 
    Gets FCC filings about given proceeding from endpoint, starting 
    at offset and collecting n_records (breaks if n_records too large)
    '''

    print "Trying to get filings {} to {}...".format(str(offset), str(offset + n_records))

    payload = {'limit':n_records, 'proceedings.name': proceeding, 'offset':offset, 'api_key': api_key, "sort": "date_submission,ASC"}

    r = requests.get(endpoint, params = payload)
    filings = r.json()['filings']

    print "...got {}, returned {} filings".format(r.reason, len(filings))

    return filings


def clean_data(filings): 
    ''' 
    Clean up the raw scraped data for analysis
    '''

    df = pd.DataFrame(filings)

    df_filtered = df[['id_submission', 'contact_email', 'date_submission', 'date_received', 'date_disseminated','text_data', 'addressentity']]

    # Extract geo data 
    df_filtered['city'] = df_filtered.addressentity.apply(lambda x: x['city'] if 'city' in x.keys() else None)
    df_filtered['state'] = df_filtered.addressentity.apply(lambda x: x['state'] if 'state' in x.keys() else None)
    df_filtered['zip_code'] = df_filtered.addressentity.apply(lambda x: x['zip_code'] if 'zip_code' in x.keys() else None)

    df_clean = df_filtered.drop(['addressentity'], axis = 1)

    return df_clean

if __name__ == '__main__': 

    # static params
    PROCEEDING = '17-108'
    ENDPOINT = 'https://publicapi.fcc.gov/ecfs/filings'

    API_KEY = "" # Your API Key Here 

    # initialize
    OFFSET = 0
    N_RECORDS = 10000 # larger than this seems to break the API

    filings = []

    # Main Loop
    while True: 

        new_filings = get_filings(ENDPOINT, OFFSET, N_RECORDS, PROCEEDING, API_KEY)

        if new_filings: 

            filings += new_filings
            OFFSET += N_RECORDS

        else: 

            break 

    # clean the data up & write it to a file for analysis
    df_clean = clean_data(filings)
    df_clean.to_csv('raw_data_pub_api_sorted_5_14_2AM.csv', encoding = 'utf-8')

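When I run this code, I get the following output:

  File "query.py", line 16
    print "Trying to get filings {} to {}...".format(str(offset), str(offset + n_records))
                                            ^
SyntaxError: invalid syntax

I'm guessing there is a syntax error either in the print command or in the static params I'm calling (since that's where it breaks)? That said, I'm a bit at a loss. Any help would be greatly appreciated.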

Which version of Python are you using? In Python 3, print is a function, so use print("whatever you wanna print") instead of print "whatever you wanna print".
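As a small illustration of that difference, the sketch below reports which interpreter is running and prints the script's first progress message using the function-call form, which is valid on Python 3 and also works on Python 2 for a single argument. The helper name `format_progress` is hypothetical and not part of the original script.

```python
import sys


def format_progress(offset, n_records):
    """Build the progress message that the script prints before each request."""
    return "Trying to get filings {} to {}...".format(offset, offset + n_records)


if __name__ == "__main__":
    # Show which interpreter is actually executing the script.
    print("Running on Python {}.{}".format(*sys.version_info[:2]))
    # Parentheses make this a function call, as Python 3 requires.
    print(format_progress(0, 10000))
```

Running `python --version` in the same environment is another quick way to confirm which interpreter is in use.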

You just need to wrap the arguments of the print statements in parentheses, like so:

import requests 
import pandas as pd

def get_filings(endpoint, offset, n_records, proceeding, api_key): 
    ''' 
    Gets FCC filings about given proceeding from endpoint, starting 
    at offset and collecting n_records (breaks if n_records too large)
    '''

    print("Trying to get filings {} to {}...".format(str(offset), str(offset + n_records)))

    payload = {'limit':n_records, 'proceedings.name': proceeding, 'offset':offset, 'api_key': api_key, "sort": "date_submission,ASC"}

    r = requests.get(endpoint, params = payload)
    filings = r.json()['filings']

    print("...got {}, returned {} filings".format(r.reason, len(filings)))

    return filings


def clean_data(filings): 
    ''' 
    Clean up the raw scraped data for analysis
    '''

    df = pd.DataFrame(filings)

    # .copy() gives an independent frame, so the column assignments below
    # do not trigger pandas' SettingWithCopyWarning
    df_filtered = df[['id_submission', 'contact_email', 'date_submission', 'date_received', 'date_disseminated','text_data', 'addressentity']].copy()

    # Extract geo data; dict.get returns None when the key is missing
    df_filtered['city'] = df_filtered.addressentity.apply(lambda x: x.get('city'))
    df_filtered['state'] = df_filtered.addressentity.apply(lambda x: x.get('state'))
    df_filtered['zip_code'] = df_filtered.addressentity.apply(lambda x: x.get('zip_code'))

    df_clean = df_filtered.drop(['addressentity'], axis = 1)

    return df_clean

if __name__ == '__main__': 
    # static params
    PROCEEDING = '17-108'
    ENDPOINT = 'https://publicapi.fcc.gov/ecfs/filings'

    API_KEY = "" # Your API Key Here 

    # initialize
    OFFSET = 0
    N_RECORDS = 10000 # larger than this seems to break the API

    filings = []

    # Main Loop
    while True: 

        new_filings = get_filings(ENDPOINT, OFFSET, N_RECORDS, PROCEEDING, API_KEY)

        if new_filings: 
            filings += new_filings
            OFFSET += N_RECORDS
        else: 
            break 

    # clean the data up & write it to a file for analysis
    df_clean = clean_data(filings)
    df_clean.to_csv('raw_data_pub_api_sorted_5_14_2AM.csv', encoding = 'utf-8')

This worked for me, although my computer ran out of memory after receiving about 160,000 filings :)
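One possible way around the memory problem is to write each page of results to disk as it arrives, rather than accumulating every filing in a list before building a DataFrame. The following is only a minimal sketch under stated assumptions: `stream_filings_to_csv` and `fetch_page` are hypothetical names, each page is assumed to be a list of flat dicts, and the demo uses fake data in place of the real API so it runs without a key or network access.

```python
import csv
import os
import tempfile


def stream_filings_to_csv(fetch_page, path, fieldnames, page_size=1000):
    """Fetch pages until one comes back empty, writing each page to a CSV.

    fetch_page(offset, limit) must return a list of dicts. It stands in for
    a call such as get_filings(ENDPOINT, offset, limit, PROCEEDING, API_KEY).
    Returns the total number of rows written.
    """
    total = 0
    offset = 0
    with open(path, "w", newline="", encoding="utf-8") as handle:
        # extrasaction="ignore" drops keys that are not listed in fieldnames.
        writer = csv.DictWriter(handle, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        while True:
            page = fetch_page(offset, page_size)
            if not page:
                break
            writer.writerows(page)  # only the current page is held in memory
            total += len(page)
            offset += page_size
    return total


if __name__ == "__main__":
    # Fake data source (an assumption for the demo, not real FCC data).
    fake_rows = [{"id_submission": str(i), "text_data": "comment %d" % i} for i in range(25)]

    def fake_fetch(offset, limit):
        return fake_rows[offset:offset + limit]

    demo_path = os.path.join(tempfile.mkdtemp(), "filings_demo.csv")
    written = stream_filings_to_csv(fake_fetch, demo_path, ["id_submission", "text_data"], page_size=10)
    print("wrote {} rows to {}".format(written, demo_path))
```

Nested fields such as `addressentity` would still need to be flattened into plain columns before writing; that step is left out here to keep the sketch short.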
