
Creating a Data Pipeline to BigQuery Using Cloud Functions and Cloud Scheduler

I am trying to build a data pipeline that downloads the data from this website and pushes it to a BigQuery table.

def OH_Data_Pipeline(trigger='Yes'):
    if trigger=='Yes':
        import pandas as pd
        import pandas_gbq
        import datetime
        schema=[{'name': 'SOS_VOTERID', 'type': 'STRING'},{'name': 'COUNTY_NUMBER', 'type': 'STRING'}, {'name': 'COUNTY_ID', 'type': 'INT64'}, {'name': 'LAST_NAME', 'type': 'STRING'}, {'name': 'FIRST_NAME', 'type': 'STRING'}, {'name': 'MIDDLE_NAME', 'type': 'STRING'}, {'name': 'SUFFIX', 'type': 'STRING'}, {'name': 'DATE_OF_BIRTH', 'type': 'DATE'}, 
            {'name': 'REGISTRATION_DATE', 'type': 'DATE'}, {'name': 'VOTER_STATUS', 'type': 'STRING'}, 
            {'name': 'PARTY_AFFILIATION', 'type': 'STRING'}, {'name': 'RESIDENTIAL_ADDRESS1', 'type': 'STRING'}, 
            {'name': 'RESIDENTIAL_SECONDARY_ADDR', 'type': 'STRING'}, {'name': 'RESIDENTIAL_CITY', 'type': 'STRING'}, 
            {'name': 'RESIDENTIAL_STATE', 'type': 'STRING'}, {'name': 'RESIDENTIAL_ZIP', 'type': 'STRING'}, 
            {'name': 'RESIDENTIAL_ZIP_PLUS4', 'type': 'STRING'}, {'name': 'RESIDENTIAL_COUNTRY', 'type': 'STRING'}, 
            {'name': 'RESIDENTIAL_POSTALCODE', 'type': 'STRING'}, {'name': 'MAILING_ADDRESS1', 'type': 'STRING'}, 
            {'name': 'MAILING_SECONDARY_ADDRESS', 'type': 'STRING'}, {'name': 'MAILING_CITY', 'type': 'STRING'}, 
            {'name': 'MAILING_STATE', 'type': 'STRING'}, {'name': 'MAILING_ZIP', 'type': 'STRING'}, 
            {'name': 'MAILING_ZIP_PLUS4', 'type': 'STRING'}, {'name': 'MAILING_COUNTRY', 'type': 'STRING'}, 
            {'name': 'MAILING_POSTAL_CODE', 'type': 'STRING'}, {'name': 'CAREER_CENTER', 'type': 'STRING'}, 
            {'name': 'CITY', 'type': 'STRING'}, {'name': 'CITY_SCHOOL_DISTRICT', 'type': 'STRING'}, 
            {'name': 'COUNTY_COURT_DISTRICT', 'type': 'STRING'}, {'name': 'CONGRESSIONAL_DISTRICT', 'type': 'STRING'}, 
            {'name': 'COURT_OF_APPEALS', 'type': 'STRING'}, {'name': 'EDU_SERVICE_CENTER_DISTRICT', 'type': 'STRING'}, 
            {'name': 'EXEMPTED_VILL_SCHOOL_DISTRICT', 'type': 'STRING'}, {'name': 'LIBRARY', 'type': 'STRING'}, 
            {'name': 'LOCAL_SCHOOL_DISTRICT', 'type': 'STRING'}, {'name': 'MUNICIPAL_COURT_DISTRICT', 'type': 'STRING'}, 
            {'name': 'PRECINCT_NAME', 'type': 'STRING'}, {'name': 'PRECINCT_CODE', 'type': 'STRING'}, 
            {'name': 'STATE_BOARD_OF_EDUCATION', 'type': 'STRING'}, {'name': 'STATE_REPRESENTATIVE_DISTRICT', 'type': 'STRING'}, 
            {'name': 'STATE_SENATE_DISTRICT', 'type': 'STRING'}, {'name': 'TOWNSHIP', 'type': 'STRING'}, 
            {'name': 'VILLAGE', 'type': 'STRING'}, {'name': 'WARD', 'type': 'STRING'}, 
            {'name': 'PRIMARY_03_07_2000', 'type': 'STRING'}, {'name': 'GENERAL_11_07_2000', 'type': 'INT64'}, 
            {'name': 'SPECIAL_05_08_2001', 'type': 'STRING'}, {'name': 'GENERAL_11_06_2001', 'type': 'INT64'}, 
            {'name': 'PRIMARY_05_07_2002', 'type': 'STRING'}, {'name': 'GENERAL_11_05_2002', 'type': 'INT64'}, 
            {'name': 'SPECIAL_05_06_2003', 'type': 'STRING'}, {'name': 'GENERAL_11_04_2003', 'type': 'INT64'}, 
            {'name': 'PRIMARY_03_02_2004', 'type': 'STRING'}, {'name': 'GENERAL_11_02_2004', 'type': 'INT64'}, 
            {'name': 'SPECIAL_02_08_2005', 'type': 'STRING'}, {'name': 'PRIMARY_05_03_2005', 'type': 'STRING'}, 
            {'name': 'PRIMARY_09_13_2005', 'type': 'STRING'}, {'name': 'GENERAL_11_08_2005', 'type': 'INT64'}, 
            {'name': 'SPECIAL_02_07_2006', 'type': 'STRING'}, {'name': 'PRIMARY_05_02_2006', 'type': 'STRING'}, 
            {'name': 'GENERAL_11_07_2006', 'type': 'INT64'}, {'name': 'PRIMARY_05_08_2007', 'type': 'STRING'}, 
            {'name': 'PRIMARY_09_11_2007', 'type': 'STRING'}, {'name': 'GENERAL_11_06_2007', 'type': 'INT64'}, 
            {'name': 'PRIMARY_11_06_2007', 'type': 'STRING'}, {'name': 'GENERAL_12_11_2007', 'type': 'INT64'}, 
            {'name': 'PRIMARY_03_04_2008', 'type': 'STRING'}, {'name': 'PRIMARY_10_14_2008', 'type': 'STRING'}, 
            {'name': 'GENERAL_11_04_2008', 'type': 'INT64'}, {'name': 'GENERAL_11_18_2008', 'type': 'INT64'}, 
            {'name': 'PRIMARY_05_05_2009', 'type': 'STRING'}, {'name': 'PRIMARY_09_08_2009', 'type': 'STRING'}, 
            {'name': 'PRIMARY_09_15_2009', 'type': 'STRING'}, {'name': 'PRIMARY_09_29_2009', 'type': 'STRING'}, 
            {'name': 'GENERAL_11_03_2009', 'type': 'INT64'}, {'name': 'PRIMARY_05_04_2010', 'type': 'STRING'}, 
            {'name': 'PRIMARY_07_13_2010', 'type': 'STRING'}, {'name': 'PRIMARY_09_07_2010', 'type': 'STRING'}, 
            {'name': 'GENERAL_11_02_2010', 'type': 'INT64'}, {'name': 'PRIMARY_05_03_2011', 'type': 'STRING'}, 
            {'name': 'PRIMARY_09_13_2011', 'type': 'STRING'}, {'name': 'GENERAL_11_08_2011', 'type': 'INT64'}, 
            {'name': 'PRIMARY_03_06_2012', 'type': 'STRING'}, {'name': 'GENERAL_11_06_2012', 'type': 'INT64'}, 
            {'name': 'PRIMARY_05_07_2013', 'type': 'STRING'}, {'name': 'PRIMARY_09_10_2013', 'type': 'STRING'}, 
            {'name': 'PRIMARY_10_01_2013', 'type': 'STRING'}, {'name': 'GENERAL_11_05_2013', 'type': 'INT64'}, 
            {'name': 'PRIMARY_05_06_2014', 'type': 'STRING'}, {'name': 'GENERAL_11_04_2014', 'type': 'INT64'}, 
            {'name': 'PRIMARY_05_05_2015', 'type': 'STRING'}, {'name': 'PRIMARY_09_15_2015', 'type': 'STRING'}, 
            {'name': 'GENERAL_11_03_2015', 'type': 'INT64'}, {'name': 'PRIMARY_03_15_2016', 'type': 'STRING'}, 
            {'name': 'GENERAL_06_07_2016', 'type': 'INT64'}, {'name': 'PRIMARY_09_13_2016', 'type': 'STRING'}, 
            {'name': 'GENERAL_11_08_2016', 'type': 'INT64'}, {'name': 'PRIMARY_05_02_2017', 'type': 'STRING'}, 
            {'name': 'PRIMARY_09_12_2017', 'type': 'STRING'}, {'name': 'GENERAL_11_07_2017', 'type': 'INT64'}, 
            {'name': 'PRIMARY_05_08_2018', 'type': 'STRING'}, {'name': 'GENERAL_08_07_2018', 'type': 'INT64'}, 
            {'name': 'GENERAL_11_06_2018', 'type': 'INT64'}, {'name': 'PRIMARY_05_07_2019', 'type': 'STRING'}, 
            {'name': 'PRIMARY_09_10_2019', 'type': 'STRING'}, {'name': 'GENERAL_11_05_2019', 'type': 'INT64'}]
        prim_list = ['PRIMARY-03/07/2000', 'SPECIAL-05/08/2001', 'PRIMARY-05/07/2002', 'SPECIAL-05/06/2003', 'PRIMARY-03/02/2004', 
                'SPECIAL-02/08/2005', 'PRIMARY-05/03/2005', 'PRIMARY-09/13/2005', 'SPECIAL-02/07/2006', 'PRIMARY-05/02/2006', 
                'PRIMARY-05/08/2007', 'PRIMARY-09/11/2007', 'PRIMARY-11/06/2007', 'PRIMARY-03/04/2008', 'PRIMARY-10/14/2008', 
                'PRIMARY-05/05/2009', 'PRIMARY-09/08/2009', 'PRIMARY-09/15/2009', 'PRIMARY-09/29/2009', 'PRIMARY-05/04/2010', 
                'PRIMARY-07/13/2010', 'PRIMARY-09/07/2010', 'PRIMARY-05/03/2011', 'PRIMARY-09/13/2011', 'PRIMARY-03/06/2012', 
                'PRIMARY-05/07/2013', 'PRIMARY-09/10/2013', 'PRIMARY-10/01/2013', 'PRIMARY-05/06/2014', 'PRIMARY-05/05/2015', 
                'PRIMARY-09/15/2015', 'PRIMARY-03/15/2016', 'PRIMARY-09/13/2016', 'PRIMARY-05/02/2017', 'PRIMARY-09/12/2017', 
                'PRIMARY-05/08/2018', 'PRIMARY-05/07/2019', 'PRIMARY-09/10/2019']
        prim_list = [f.replace('-', '_').replace('/', '_') for f in prim_list]
        gen_list = ['GENERAL-11/07/2000', 'GENERAL-11/06/2001', 'GENERAL-11/05/2002', 'GENERAL-11/04/2003', 'GENERAL-11/02/2004', 
               'GENERAL-11/08/2005', 'GENERAL-11/07/2006', 'GENERAL-11/06/2007', 'GENERAL-12/11/2007', 'GENERAL-11/04/2008', 
               'GENERAL-11/18/2008', 'GENERAL-11/03/2009', 'GENERAL-11/02/2010', 'GENERAL-11/08/2011', 'GENERAL-11/06/2012', 
               'GENERAL-11/05/2013', 'GENERAL-11/04/2014', 'GENERAL-11/03/2015', 'GENERAL-06/07/2016', 'GENERAL-11/08/2016', 
               'GENERAL-11/07/2017', 'GENERAL-08/07/2018', 'GENERAL-11/06/2018', 'GENERAL-11/05/2019']
        gen_list = [f.replace('-', '_').replace('/', '_') for f in gen_list]
        party_list = ['PARTY_AFFILIATION']
        df=[pd.read_csv('https://www6.sos.state.oh.us/ords/f?p=VOTERFTP:DOWNLOAD::FILE:NO:2:P2_PRODUCT_NUMBER:{}'.format(88+f), encoding='Latin1', low_memory=False) for f in range(1, 17)]
        df=pd.concat(df)
        df.columns = [f.replace('-', '_').replace('/', '_') for f in df.columns]
        df['birth_year'] = df['DATE_OF_BIRTH'].map(lambda x: str(x)[:-6]).astype(int)
        now = datetime.datetime.now()  # current date, used to derive each voter's age
        df['Age'] = now.year - df['birth_year']
        for f in prim_list:
            df.loc[df[f]=='D', f]='Democrat'
            df.loc[df[f]=='R', f]='Republican'
            df.loc[df[f]=='G', f]='Green'
            df.loc[df[f]=='E', f]='Reform'
            df.loc[df[f]=='L', f]='Libertarian'
            df.loc[df[f]=='C', f]='Constitution'
            df.loc[df[f]=='N', f]='Natural Law'
            df.loc[df[f]=='S', f]='Socialist'
            df.loc[df[f]=='X', f]='Without Affiliation'
            df.loc[(df[f]=='') | (df[f].isnull()==True) | (df[f]==0), f]='Not Voted'
        for f in party_list:
            df.loc[df[f]=='D', f]='Democrat'
            df.loc[df[f]=='R', f]='Republican'
            df.loc[df[f]=='G', f]='Green'
            df.loc[df[f]=='E', f]='Reform'
            df.loc[df[f]=='L', f]='Libertarian'
            df.loc[df[f]=='C', f]='Constitution'
            df.loc[df[f]=='N', f]='Natural Law'
            df.loc[df[f]=='S', f]='Socialist'
            df.loc[df[f]=='X', f]='Unaffiliated'
            df.loc[(df[f]=='') | (df[f].isnull()==True) | (df[f]==0), f]='Unaffiliated'
        for g in gen_list:  
            df.loc[(df[g]!='') & (df[g].isnull()!=True) & (df[g]!=0) & (df[g]!='NaN'), g]=1
            df.loc[(df[g]=='') | (df[g].isnull()==True) | (df[g]==0) | (df[g]=='NaN'), g]=0
        df[gen_list]=df[gen_list].astype(int)
        df[prim_list]=df[prim_list].astype(str)
        df[party_list]=df[party_list].astype(str)
        df.to_gbq(destination_table='Voterfile.OH_Voterfile', project_id='oh-data-pipeline', if_exists='replace', table_schema=schema, reauth=False)
    else:
        pass
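
The column renaming and the two recoding loops above follow one lookup pattern. Here is a minimal pure-Python sketch of that pattern, without pandas. The names `PARTY_CODES`, `normalize_column`, `is_blank`, and `recode` are illustrative assumptions and do not appear in the script itself.

```python
import math

# Single-letter codes and their labels, as used in the loops above.
PARTY_CODES = {
    'D': 'Democrat', 'R': 'Republican', 'G': 'Green', 'E': 'Reform',
    'L': 'Libertarian', 'C': 'Constitution', 'N': 'Natural Law', 'S': 'Socialist',
}


def normalize_column(name):
    """Turn a header such as 'PRIMARY-03/07/2000' into an underscore-only name."""
    return name.replace('-', '_').replace('/', '_')


def is_blank(value):
    """True for None, NaN, empty string, or 0: the cases the loops treat as missing."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return value == '' or value == 0


def recode(value, x_label, blank_label):
    """Map one raw cell to its label; unknown codes pass through unchanged."""
    if is_blank(value):
        return blank_label
    if value == 'X':
        return x_label
    return PARTY_CODES.get(value, value)


print(normalize_column('PRIMARY-03/07/2000'))
print(recode('D', 'Without Affiliation', 'Not Voted'))
print(recode(float('nan'), 'Without Affiliation', 'Not Voted'))
print(recode('X', 'Unaffiliated', 'Unaffiliated'))
```

The primary-election columns would use `recode(value, 'Without Affiliation', 'Not Voted')`, and `PARTY_AFFILIATION` would use `recode(value, 'Unaffiliated', 'Unaffiliated')`.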

The problem is that after I define the function in Cloud Functions and run it from Cloud Scheduler, the job reports that the function ran, but no data shows up in BigQuery.
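
One thing that may be worth checking, assuming the function is deployed with an HTTP trigger: an HTTP-triggered Python function is called with a Flask request object as its argument, so `trigger` would hold that object rather than `'Yes'`. The sketch below reproduces only the control flow. `original_style`, `run_pipeline`, and `oh_data_pipeline` are hypothetical names, and `run_pipeline` stands in for the real download and load steps.

```python
def original_style(trigger='Yes'):
    """Same control flow as OH_Data_Pipeline, with the pipeline body elided."""
    if trigger == 'Yes':
        return 'pipeline ran'
    else:
        pass


def run_pipeline():
    """Hypothetical stand-in for the download, transform, and to_gbq steps."""
    return 'pipeline ran'


def oh_data_pipeline(request):
    """HTTP entry point; `request` is the request object passed by the platform."""
    try:
        message = run_pipeline()
    except Exception as exc:
        # A non-2xx status lets the scheduler record the attempt as failed.
        return (f'pipeline failed: {exc!r}', 500)
    return (message, 200)


fake_request = object()  # placeholder for the request object, for illustration only
print(original_style(fake_request))    # the 'Yes' branch is skipped
print(oh_data_pipeline(fake_request))
```

With this shape, the work no longer depends on the value of the first argument, and a failure inside the pipeline is reported as an error status instead of being silently skipped.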


Here are the logs as well:

[
{ "insertId": "1idtfdbg5drzu63", "jsonPayload": { "targetType": "HTTP", "url": "https://us-central1-oh-data-pipeline.cloudfunctions.net/OH_Data_Pipeline", "@type": "type.googleapis.com/google.cloud.scheduler.logging.AttemptFinished", "jobName": "projects/oh-data-pipeline/locations/us-east4/jobs/OH_Voterfile_Data_Loader" }, "httpRequest": { "status": 200 }, "resource": { "type": "cloud_scheduler_job", "labels": { "project_id": "oh-data-pipeline", "location": "us-east4", "job_id": "OH_Voterfile_Data_Loader" } }, "timestamp": "2020-01-01T21:12:39.949108697Z", "severity": "INFO", "logName": "projects/oh-data-pipeline/logs/cloudscheduler.googleapis.com%2Fexecutions", "receiveTimestamp": "2020-01-01T21:12:39.949108697Z" },
{ "insertId": "k9f9cjg5ds4bft", "jsonPayload": { "jobName": "projects/oh-data-pipeline/locations/us-east4/jobs/OH_Voterfile_Data_Loader", "targetType": "HTTP", "url": "https://us-central1-oh-data-pipeline.cloudfunctions.net/OH_Data_Pipeline", "@type": "type.googleapis.com/google.cloud.scheduler.logging.AttemptStarted", "scheduledTime": "2020-01-06T05:00:00.271618Z" }, "resource": { "type": "cloud_scheduler_job", "labels": { "job_id": "OH_Voterfile_Data_Loader", "project_id": "oh-data-pipeline", "location": "us-east4" } }, "timestamp": "2020-01-01T21:12:39.823311702Z", "severity": "INFO", "logName": "projects/oh-data-pipeline/logs/cloudscheduler.googleapis.com%2Fexecutions", "receiveTimestamp": "2020-01-01T21:12:39.823311702Z" },
{ "insertId": "1xnnrrug5g0c2qj", "jsonPayload": { "targetType": "HTTP", "url": "https://us-central1-oh-data-pipeline.cloudfunctions.net/OH_Data_Pipeline", "@type": "type.googleapis.com/google.cloud.scheduler.logging.AttemptFinished", "jobName": "projects/oh-data-pipeline/locations/us-east4/jobs/OH_Voterfile_Data_Loader" }, "httpRequest": { "status": 200 }, "resource": { "type": "cloud_scheduler_job", "labels": { "job_id": "OH_Voterfile_Data_Loader", "project_id": "oh-data-pipeline", "location": "us-east4" } }, "timestamp": "2020-01-01T21:12:37.290359769Z", "severity": "INFO", "logName": "projects/oh-data-pipeline/logs/cloudscheduler.googleapis.com%2Fexecutions", "receiveTimestamp": "2020-01-01T21:12:37.290359769Z" },
{ "insertId": "sv8ssdg5e3blni", "jsonPayload": { "targetType": "HTTP", "url": "https://us-central1-oh-data-pipeline.cloudfunctions.net/OH_Data_Pipeline", "@type": "type.googleapis.com/google.cloud.scheduler.logging.AttemptStarted", "scheduledTime": "2020-01-06T05:00:00.183767Z", "jobName": "projects/oh-data-pipeline/locations/us-east4/jobs/OH_Voterfile_Data_Loader" }, "resource": { "type": "cloud_scheduler_job", "labels": { "job_id": "OH_Voterfile_Data_Loader", "project_id": "oh-data-pipeline", "location": "us-east4" } }, "timestamp": "2020-01-01T21:12:36.916739031Z", "severity": "INFO", "logName": "projects/oh-data-pipeline/logs/cloudscheduler.googleapis.com%2Fexecutions", "receiveTimestamp": "2020-01-01T21:12:36.916739031Z" },
{ "insertId": "7i1kgtfutdv2s", "jsonPayload": { "jobName": "projects/oh-data-pipeline/locations/us-east4/jobs/OH_Voterfile_Data_Loader", "targetType": "HTTP", "url": "https://us-central1-oh-data-pipeline.cloudfunctions.net/OH_Data_Pipeline", "@type": "type.googleapis.com/google.cloud.scheduler.logging.AttemptFinished" }, "httpRequest": { "status": 200 }, "resource": { "type": "cloud_scheduler_job", "labels": { "project_id": "oh-data-pipeline", "location": "us-east4", "job_id": "OH_Voterfile_Data_Loader" } }, "timestamp": "2020-01-01T19:37:07.201347795Z", "severity": "INFO", "logName": "projects/oh-data-pipeline/logs/cloudscheduler.googleapis.com%2Fexecutions", "receiveTimestamp": "2020-01-01T19:37:07.201347795Z" },
{ "insertId": "19io9oog5fvqy42", "jsonPayload": { "@type": "type.googleapis.com/google.cloud.scheduler.logging.AttemptStarted", "scheduledTime": "2020-01-06T05:00:00Z", "jobName": "projects/oh-data-pipeline/locations/us-east4/jobs/OH_Voterfile_Data_Loader", "targetType": "HTTP", "url": "https://us-central1-oh-data-pipeline.cloudfunctions.net/OH_Data_Pipeline" }, "resource": { "type": "cloud_scheduler_job", "labels": { "location": "us-east4", "job_id": "OH_Voterfile_Data_Loader", "project_id": "oh-data-pipeline" } }, "timestamp": "2020-01-01T19:37:07.092810676Z", "severity": "INFO", "logName": "projects/oh-data-pipeline/logs/cloudscheduler.googleapis.com%2Fexecutions", "receiveTimestamp": "2020-01-01T19:37:07.092810676Z" },
{ "insertId": "1t7pz9vg5e70eo5", "jsonPayload": { "jobName": "projects/oh-data-pipeline/locations/us-east4/jobs/OH_Voterfile_Data_Loader", "targetType": "HTTP", "url": "https://us-central1-oh-data-pipeline.cloudfunctions.net/OH_Data_Pipeline", "@type": "type.googleapis.com/google.cloud.scheduler.logging.AttemptFinished" }, "httpRequest": { "status": 200 }, "resource": { "type": "cloud_scheduler_job", "labels": { "job_id": "OH_Voterfile_Data_Loader", "project_id": "oh-data-pipeline", "location": "us-east4" } }, "timestamp": "2020-01-01T17:30:00.396767720Z", "severity": "INFO", "logName": "projects/oh-data-pipeline/logs/cloudscheduler.googleapis.com%2Fexecutions", "receiveTimestamp": "2020-01-01T17:30:00.396767720Z" },
{ "insertId": "1p23vr0g59sba7d", "jsonPayload": { "targetType": "HTTP", "url": "https://us-central1-oh-data-pipeline.cloudfunctions.net/OH_Data_Pipeline", "@type": "type.googleapis.com/google.cloud.scheduler.logging.AttemptStarted", "scheduledTime": "2020-01-01T17:30:00.250018Z", "jobName": "projects/oh-data-pipeline/locations/us-east4/jobs/OH_Voterfile_Data_Loader" }, "resource": { "type": "cloud_scheduler_job", "labels": { "location": "us-east4", "job_id": "OH_Voterfile_Data_Loader", "project_id": "oh-data-pipeline" } }, "timestamp": "2020-01-01T17:30:00.267802278Z", "severity": "INFO", "logName": "projects/oh-data-pipeline/logs/cloudscheduler.googleapis.com%2Fexecutions", "receiveTimestamp": "2020-01-01T17:30:00.267802278Z" },
{ "insertId": "1yi5eng4p1lgiq", "jsonPayload": { "@type": "type.googleapis.com/google.cloud.scheduler.logging.AttemptFinished", "jobName": "projects/oh-data-pipeline/locations/us-east4/jobs/OH_Voterfile_Data_Loader", "targetType": "HTTP", "url": "https://us-central1-oh-data-pipeline.cloudfunctions.net/OH_Data_Pipeline" }, "httpRequest": { "status": 200 }, "resource": { "type": "cloud_scheduler_job", "labels": { "job_id": "OH_Voterfile_Data_Loader", "project_id": "oh-data-pipeline", "location": "us-east4" } }, "timestamp": "2020-01-01T17:26:15.268636308Z", "severity": "INFO", "logName": "projects/oh-data-pipeline/logs/cloudscheduler.googleapis.com%2Fexecutions", "receiveTimestamp": "2020-01-01T17:26:15.268636308Z" },
{ "insertId": "1u1dz02g41np17v", "jsonPayload": { "jobName": "projects/oh-data-pipeline/locations/us-east4/jobs/OH_Voterfile_Data_Loader", "targetType": "HTTP", "url": "https://us-central1-oh-data-pipeline.cloudfunctions.net/OH_Data_Pipeline", "@type": "type.googleapis.com/google.cloud.scheduler.logging.AttemptStarted", "scheduledTime": "2020-01-01T17:30:00.369545Z" }, "resource": { "type": "cloud_scheduler_job", "labels": { "job_id": "OH_Voterfile_Data_Loader", "project_id": "oh-data-pipeline", "location": "us-east4" } }, "timestamp": "2020-01-01T17:26:15.133041426Z", "severity": "INFO", "logName": "projects/oh-data-pipeline/logs/cloudscheduler.googleapis.com%2Fexecutions", "receiveTimestamp": "2020-01-01T17:26:15.133041426Z" },
{ "insertId": "1gzxg1lg4qi1i28", "jsonPayload": { "@type": "type.googleapis.com/google.cloud.scheduler.logging.AttemptFinished", "jobName": "projects/oh-data-pipeline/locations/us-east4/jobs/OH_Voterfile_Data_Loader", "targetType": "HTTP", "url": "https://us-central1-oh-data-pipeline.cloudfunctions.net/OH_Data_Pipeline" }, "httpRequest": { "status": 200 }, "resource": { "type": "cloud_scheduler_job", "labels": { "job_id": "OH_Voterfile_Data_Loader", "project_id": "oh-data-pipeline", "location": "us-east4" } }, "timestamp": "2020-01-01T17:22:41.388248918Z", "severity": "INFO", "logName": "projects/oh-data-pipeline/logs/cloudscheduler.googleapis.com%2Fexecutions", "receiveTimestamp": "2020-01-01T17:22:41.388248918Z" },
{ "insertId": "1es7ag9g5bdguh5", "jsonPayload": { "targetType": "HTTP", "url": "https://us-central1-oh-data-pipeline.cloudfunctions.net/OH_Data_Pipeline", "@type": "type.googleapis.com/google.cloud.scheduler.logging.AttemptStarted", "scheduledTime": "2020-01-01T17:30:00.257483Z", "jobName": "projects/oh-data-pipeline/locations/us-east4/jobs/OH_Voterfile_Data_Loader" }, "resource": { "type": "cloud_scheduler_job", "labels": { "project_id": "oh-data-pipeline", "location": "us-east4", "job_id": "OH_Voterfile_Data_Loader" } }, "timestamp": "2020-01-01T17:22:41.268121872Z", "severity": "INFO", "logName": "projects/oh-data-pipeline/logs/cloudscheduler.googleapis.com%2Fexecutions", "receiveTimestamp": "2020-01-01T17:22:41.268121872Z" }
]
"cloud_scheduler_job", "labels": { "job_id": "OH_Voterfile_Data_Loader", "project_id": "oh-data-pipeline", "location": "us-east4" } }, "timestamp": "2020-01-01T17:26:15.268636308Z", "severity": "INFO", "logName": "projects/oh-data-pipeline/logs/cloudscheduler.googleapis.com%2Fexecutions", "receiveTimestamp": "2020-01-01T17:26:15.268636308Z" }, { "insertId": "1u1dz02g41np17v", "jsonPayload": { "jobName": "projects/oh-data-pipeline/locations/us-east4/jobs/OH_Voterfile_Data_Loader", "targetType": "HTTP", "url": "https://us-central1-oh-data-pipeline.cloudfunctions.net/OH_Data_Pipeline", "@type": "type.googleapis.com/google.cloud.scheduler.logging.AttemptStarted", "scheduledTime": "2020-01-01T17:30:00.369545Z" }, "resource": { "type": "cloud_scheduler_job", "labels": { "job_id": "OH_Voterfile_Data_Loader", "project_id": "oh-data-pipeline", "location": "us-east4" } }, "timestamp": "2020-01-01T17:26:15.133041426Z", "severity": "INFO", "logName": "projects/oh-data-pipeline/logs/cloudscheduler.googleapis.com%2Fexecutions", "receiveTimestamp": "2020-01-01T17:26:15.133041426Z" }, { "insertId": "1gzxg1lg4qi1i28", "jsonPayload": { "@type": "type.googleapis.com/google.cloud.scheduler.logging.AttemptFinished", "jobName": "projects/oh-data-pipeline/locations/us-east4/jobs/OH_Voterfile_Data_Loader", "targetType": "HTTP", "url": "https://us-central1-oh-data-pipeline.cloudfunctions.net/OH_Data_Pipeline" }, "httpRequest": { "status": 200 }, "resource": { "type": "cloud_scheduler_job", "labels": { "job_id": "OH_Voterfile_Data_Loader", "project_id": "oh-data-pipeline", "location": "us-east4" } }, "timestamp": "2020-01-01T17:22:41.388248918Z", "severity": "INFO", "logName": "projects/oh-data-pipeline/logs/cloudscheduler.googleapis.com%2Fexecutions", "receiveTimestamp": "2020-01-01T17:22:41.388248918Z" }, { "insertId": "1es7ag9g5bdguh5", "jsonPayload": { "targetType": "HTTP", "url": "https://us-central1-oh-data-pipeline.cloudfunctions.net/OH_Data_Pipeline", "@type": 
"type.googleapis.com/google.cloud.scheduler.logging.AttemptStarted", "scheduledTime": "2020-01-01T17:30:00.257483Z", "jobName": "projects/oh-data-pipeline/locations/us-east4/jobs/OH_Voterfile_Data_Loader" }, "resource": { "type": "cloud_scheduler_job", "labels": { "project_id": "oh-data-pipeline", "location": "us-east4", "job_id": "OH_Voterfile_Data_Loader" } }, "timestamp": "2020-01-01T17:22:41.268121872Z", "severity": "INFO", "logName": "projects/oh-data-pipeline/logs/cloudscheduler.googleapis.com%2Fexecutions", "receiveTimestamp": "2020-01-01T17:22:41.268121872Z" } ]

Can you please help me figure out why this is not working?

When I tested your code, I received the following error message: OH_Data_Pipeline() takes from 0 to 1 positional arguments but 2 were given

You should modify your function definition to follow the sample code (I'm not sure what trigger is used for, so for now I'm just hardcoding it to 'Yes'):

def OH_Data_Pipeline(event, context):
    trigger='Yes'
    ...
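
The error is easy to reproduce locally. The sketch below assumes, as the message suggests, that the runtime calls the entry point with two positional arguments; `old_entry_point` and `new_entry_point` are hypothetical stand-ins for the two signatures, not names from the original code.

```python
def old_entry_point(trigger='Yes'):
    """Original signature: accepts at most one positional argument."""
    return trigger


def new_entry_point(event, context):
    """Signature from the fix above: accepts both arguments the runtime passes."""
    trigger = 'Yes'
    return trigger


# Stand-ins for the two values the runtime passes (hypothetical contents).
event, context = {}, None

try:
    old_entry_point(event, context)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

result = new_entry_point(event, context)
print(error_message)
print(result)
```

If the function is deployed with an HTTP trigger instead (the Cloud Scheduler logs in the question show an HTTP target), the runtime passes a single request object, so the entry point would take one parameter rather than two.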

Also, make sure you have a requirements.txt file that lists the third-party libraries the function needs (datetime is part of the Python standard library, so it does not belong in the file):

pandas
pandas-gbq

After making all of these changes, I then received this error:

Error: function terminated. Recommended action: inspect logs for termination reason. Details:
<urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1051)>

There seems to be an issue with the SSL certificate for the domain you are trying to access. You will need to experiment with how read_csv() gets its data so that it can work with the domain despite the problem with its certificates.
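
One possible way to keep certificate verification switched on is to download the file yourself with a TLS context that trusts the missing issuer, and then hand the text to read_csv(). This is only a sketch under assumptions: `build_tls_context`, `fetch_csv_buffer` and the CA bundle path are hypothetical, and it presumes you can obtain the issuer chain that the server fails to send.

```python
import io
import ssl
import urllib.request


def build_tls_context(ca_bundle_path=None):
    """Return a TLS context that verifies certificates.

    ca_bundle_path is a hypothetical PEM file holding the issuer chain;
    None falls back to the system trust store.
    """
    return ssl.create_default_context(cafile=ca_bundle_path)


def fetch_csv_buffer(url, context, encoding='latin-1'):
    """Download url with verification on and wrap the text for read_csv()."""
    with urllib.request.urlopen(url, context=context) as response:
        return io.StringIO(response.read().decode(encoding))


context = build_tls_context()
# Usage (not executed here; it needs network access and pandas):
#   buffer = fetch_csv_buffer(url, build_tls_context('/path/to/issuer-chain.pem'))
#   df = pd.read_csv(buffer, low_memory=False)
```

The default context requires a valid certificate and checks the host name, so a failure here still signals a real problem rather than being silently ignored.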

I changed your code a little so that it runs both on my machine and in Cloud Functions.

def main(arg):
    import pandas as pd
    import pandas_gbq
    import datetime, io, requests
    from google.oauth2 import service_account
    schema=[{'name': 'SOS_VOTERID', 'type': 'STRING'},{'name': 'COUNTY_NUMBER', 'type': 'STRING'}, {'name': 'COUNTY_ID', 'type': 'INT64'}, {'name': 'LAST_NAME', 'type': 'STRING'}, {'name': 'FIRST_NAME', 'type': 'STRING'}, {'name': 'MIDDLE_NAME', 'type': 'STRING'}, {'name': 'SUFFIX', 'type': 'STRING'}, {'name': 'DATE_OF_BIRTH', 'type': 'DATE'}, 
        {'name': 'REGISTRATION_DATE', 'type': 'DATE'}, {'name': 'VOTER_STATUS', 'type': 'STRING'}, 
        {'name': 'PARTY_AFFILIATION', 'type': 'STRING'}, {'name': 'RESIDENTIAL_ADDRESS1', 'type': 'STRING'}, 
        {'name': 'RESIDENTIAL_SECONDARY_ADDR', 'type': 'STRING'}, {'name': 'RESIDENTIAL_CITY', 'type': 'STRING'}, 
        {'name': 'RESIDENTIAL_STATE', 'type': 'STRING'}, {'name': 'RESIDENTIAL_ZIP', 'type': 'STRING'}, 
        {'name': 'RESIDENTIAL_ZIP_PLUS4', 'type': 'STRING'}, {'name': 'RESIDENTIAL_COUNTRY', 'type': 'STRING'}, 
        {'name': 'RESIDENTIAL_POSTALCODE', 'type': 'STRING'}, {'name': 'MAILING_ADDRESS1', 'type': 'STRING'}, 
        {'name': 'MAILING_SECONDARY_ADDRESS', 'type': 'STRING'}, {'name': 'MAILING_CITY', 'type': 'STRING'}, 
        {'name': 'MAILING_STATE', 'type': 'STRING'}, {'name': 'MAILING_ZIP', 'type': 'STRING'}, 
        {'name': 'MAILING_ZIP_PLUS4', 'type': 'STRING'}, {'name': 'MAILING_COUNTRY', 'type': 'STRING'}, 
        {'name': 'MAILING_POSTAL_CODE', 'type': 'STRING'}, {'name': 'CAREER_CENTER', 'type': 'STRING'}, 
        {'name': 'CITY', 'type': 'STRING'}, {'name': 'CITY_SCHOOL_DISTRICT', 'type': 'STRING'}, 
        {'name': 'COUNTY_COURT_DISTRICT', 'type': 'STRING'}, {'name': 'CONGRESSIONAL_DISTRICT', 'type': 'STRING'}, 
        {'name': 'COURT_OF_APPEALS', 'type': 'STRING'}, {'name': 'EDU_SERVICE_CENTER_DISTRICT', 'type': 'STRING'}, 
        {'name': 'EXEMPTED_VILL_SCHOOL_DISTRICT', 'type': 'STRING'}, {'name': 'LIBRARY', 'type': 'STRING'}, 
        {'name': 'LOCAL_SCHOOL_DISTRICT', 'type': 'STRING'}, {'name': 'MUNICIPAL_COURT_DISTRICT', 'type': 'STRING'}, 
        {'name': 'PRECINCT_NAME', 'type': 'STRING'}, {'name': 'PRECINCT_CODE', 'type': 'STRING'}, 
        {'name': 'STATE_BOARD_OF_EDUCATION', 'type': 'STRING'}, {'name': 'STATE_REPRESENTATIVE_DISTRICT', 'type': 'STRING'}, 
        {'name': 'STATE_SENATE_DISTRICT', 'type': 'STRING'}, {'name': 'TOWNSHIP', 'type': 'STRING'}, 
        {'name': 'VILLAGE', 'type': 'STRING'}, {'name': 'WARD', 'type': 'STRING'}, 
        {'name': 'PRIMARY_03_07_2000', 'type': 'STRING'}, {'name': 'GENERAL_11_07_2000', 'type': 'INT64'}, 
        {'name': 'SPECIAL_05_08_2001', 'type': 'STRING'}, {'name': 'GENERAL_11_06_2001', 'type': 'INT64'}, 
        {'name': 'PRIMARY_05_07_2002', 'type': 'STRING'}, {'name': 'GENERAL_11_05_2002', 'type': 'INT64'}, 
        {'name': 'SPECIAL_05_06_2003', 'type': 'STRING'}, {'name': 'GENERAL_11_04_2003', 'type': 'INT64'}, 
        {'name': 'PRIMARY_03_02_2004', 'type': 'STRING'}, {'name': 'GENERAL_11_02_2004', 'type': 'INT64'}, 
        {'name': 'SPECIAL_02_08_2005', 'type': 'STRING'}, {'name': 'PRIMARY_05_03_2005', 'type': 'STRING'}, 
        {'name': 'PRIMARY_09_13_2005', 'type': 'STRING'}, {'name': 'GENERAL_11_08_2005', 'type': 'INT64'}, 
        {'name': 'SPECIAL_02_07_2006', 'type': 'STRING'}, {'name': 'PRIMARY_05_02_2006', 'type': 'STRING'}, 
        {'name': 'GENERAL_11_07_2006', 'type': 'INT64'}, {'name': 'PRIMARY_05_08_2007', 'type': 'STRING'}, 
        {'name': 'PRIMARY_09_11_2007', 'type': 'STRING'}, {'name': 'GENERAL_11_06_2007', 'type': 'INT64'}, 
        {'name': 'PRIMARY_11_06_2007', 'type': 'STRING'}, {'name': 'GENERAL_12_11_2007', 'type': 'INT64'}, 
        {'name': 'PRIMARY_03_04_2008', 'type': 'STRING'}, {'name': 'PRIMARY_10_14_2008', 'type': 'STRING'}, 
        {'name': 'GENERAL_11_04_2008', 'type': 'INT64'}, {'name': 'GENERAL_11_18_2008', 'type': 'INT64'}, 
        {'name': 'PRIMARY_05_05_2009', 'type': 'STRING'}, {'name': 'PRIMARY_09_08_2009', 'type': 'STRING'}, 
        {'name': 'PRIMARY_09_15_2009', 'type': 'STRING'}, {'name': 'PRIMARY_09_29_2009', 'type': 'STRING'}, 
        {'name': 'GENERAL_11_03_2009', 'type': 'INT64'}, {'name': 'PRIMARY_05_04_2010', 'type': 'STRING'}, 
        {'name': 'PRIMARY_07_13_2010', 'type': 'STRING'}, {'name': 'PRIMARY_09_07_2010', 'type': 'STRING'}, 
        {'name': 'GENERAL_11_02_2010', 'type': 'INT64'}, {'name': 'PRIMARY_05_03_2011', 'type': 'STRING'}, 
        {'name': 'PRIMARY_09_13_2011', 'type': 'STRING'}, {'name': 'GENERAL_11_08_2011', 'type': 'INT64'}, 
        {'name': 'PRIMARY_03_06_2012', 'type': 'STRING'}, {'name': 'GENERAL_11_06_2012', 'type': 'INT64'}, 
        {'name': 'PRIMARY_05_07_2013', 'type': 'STRING'}, {'name': 'PRIMARY_09_10_2013', 'type': 'STRING'}, 
        {'name': 'PRIMARY_10_01_2013', 'type': 'STRING'}, {'name': 'GENERAL_11_05_2013', 'type': 'INT64'}, 
        {'name': 'PRIMARY_05_06_2014', 'type': 'STRING'}, {'name': 'GENERAL_11_04_2014', 'type': 'INT64'}, 
        {'name': 'PRIMARY_05_05_2015', 'type': 'STRING'}, {'name': 'PRIMARY_09_15_2015', 'type': 'STRING'}, 
        {'name': 'GENERAL_11_03_2015', 'type': 'INT64'}, {'name': 'PRIMARY_03_15_2016', 'type': 'STRING'}, 
        {'name': 'GENERAL_06_07_2016', 'type': 'INT64'}, {'name': 'PRIMARY_09_13_2016', 'type': 'STRING'}, 
        {'name': 'GENERAL_11_08_2016', 'type': 'INT64'}, {'name': 'PRIMARY_05_02_2017', 'type': 'STRING'}, 
        {'name': 'PRIMARY_09_12_2017', 'type': 'STRING'}, {'name': 'GENERAL_11_07_2017', 'type': 'INT64'}, 
        {'name': 'PRIMARY_05_08_2018', 'type': 'STRING'}, {'name': 'GENERAL_08_07_2018', 'type': 'INT64'}, 
        {'name': 'GENERAL_11_06_2018', 'type': 'INT64'}, {'name': 'PRIMARY_05_07_2019', 'type': 'STRING'}, 
        {'name': 'PRIMARY_09_10_2019', 'type': 'STRING'}, {'name': 'GENERAL_11_05_2019', 'type': 'INT64'}]
    prim_list = ['PRIMARY-03/07/2000', 'SPECIAL-05/08/2001', 'PRIMARY-05/07/2002', 'SPECIAL-05/06/2003', 'PRIMARY-03/02/2004', 
            'SPECIAL-02/08/2005', 'PRIMARY-05/03/2005', 'PRIMARY-09/13/2005', 'SPECIAL-02/07/2006', 'PRIMARY-05/02/2006', 
            'PRIMARY-05/08/2007', 'PRIMARY-09/11/2007', 'PRIMARY-11/06/2007', 'PRIMARY-03/04/2008', 'PRIMARY-10/14/2008', 
            'PRIMARY-05/05/2009', 'PRIMARY-09/08/2009', 'PRIMARY-09/15/2009', 'PRIMARY-09/29/2009', 'PRIMARY-05/04/2010', 
            'PRIMARY-07/13/2010', 'PRIMARY-09/07/2010', 'PRIMARY-05/03/2011', 'PRIMARY-09/13/2011', 'PRIMARY-03/06/2012', 
            'PRIMARY-05/07/2013', 'PRIMARY-09/10/2013', 'PRIMARY-10/01/2013', 'PRIMARY-05/06/2014', 'PRIMARY-05/05/2015', 
            'PRIMARY-09/15/2015', 'PRIMARY-03/15/2016', 'PRIMARY-09/13/2016', 'PRIMARY-05/02/2017', 'PRIMARY-09/12/2017', 
            'PRIMARY-05/08/2018', 'PRIMARY-05/07/2019', 'PRIMARY-09/10/2019']
    prim_list = [f.replace('-', '_').replace('/', '_') for f in prim_list]
    gen_list = ['GENERAL-11/07/2000', 'GENERAL-11/06/2001', 'GENERAL-11/05/2002', 'GENERAL-11/04/2003', 'GENERAL-11/02/2004', 
           'GENERAL-11/08/2005', 'GENERAL-11/07/2006', 'GENERAL-11/06/2007', 'GENERAL-12/11/2007', 'GENERAL-11/04/2008', 
           'GENERAL-11/18/2008', 'GENERAL-11/03/2009', 'GENERAL-11/02/2010', 'GENERAL-11/08/2011', 'GENERAL-11/06/2012', 
           'GENERAL-11/05/2013', 'GENERAL-11/04/2014', 'GENERAL-11/03/2015', 'GENERAL-06/07/2016', 'GENERAL-11/08/2016', 
           'GENERAL-11/07/2017', 'GENERAL-08/07/2018', 'GENERAL-11/06/2018', 'GENERAL-11/05/2019']
    gen_list = [f.replace('-', '_').replace('/', '_') for f in gen_list]
    party_list = ['PARTY_AFFILIATION']
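    # Download with requests so certificate verification can be skipped (verify=False, testing only).
    # range(1, 2) fetches a single file to keep the test small; the original loop below used range(1, 17).
    df= [pd.read_csv(io.StringIO(str(requests.get('https://www6.sos.state.oh.us/ords/f?p=VOTERFTP:DOWNLOAD::FILE:NO:2:P2_PRODUCT_NUMBER:{}'.format(88+f), verify=False).text)),encoding='Latin1', low_memory=False) for f in range(1, 2) ]
    # Original call, which fails with CERTIFICATE_VERIFY_FAILED:
    #df=[pd.read_csv('https://www6.sos.state.oh.us/ords/f?p=VOTERFTP:DOWNLOAD::FILE:NO:2:P2_PRODUCT_NUMBER:{}'.format(88+f), encoding='Latin1', low_memory=False) for f in range(1, 17)]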
    df=pd.concat(df)
    df.columns = [f.replace('-', '_').replace('/', '_') for f in df.columns]
    df['birth_year'] = df['DATE_OF_BIRTH'].map(lambda x: str(x)[:-6]).astype(int)
    df['Age'] = datetime.datetime.now().year - df['birth_year']
    for f in prim_list:
        df.loc[df[f]=='D', f]='Democrat'
        df.loc[df[f]=='R', f]='Republican'
        df.loc[df[f]=='G', f]='Green'
        df.loc[df[f]=='E', f]='Reform'
        df.loc[df[f]=='L', f]='Libertarian'
        df.loc[df[f]=='C', f]='Constitution'
        df.loc[df[f]=='N', f]='Natural Law'
        df.loc[df[f]=='S', f]='Socialist'
        df.loc[df[f]=='X', f]='Without Affiliation'
        df.loc[(df[f]=='') | (df[f].isnull()==True) | (df[f]==0), f]='Not Voted'
    for f in party_list:
        df.loc[df[f]=='D', f]='Democrat'
        df.loc[df[f]=='R', f]='Republican'
        df.loc[df[f]=='G', f]='Green'
        df.loc[df[f]=='E', f]='Reform'
        df.loc[df[f]=='L', f]='Libertarian'
        df.loc[df[f]=='C', f]='Constitution'
        df.loc[df[f]=='N', f]='Natural Law'
        df.loc[df[f]=='S', f]='Socialist'
        df.loc[df[f]=='X', f]='Unaffiliated'
        df.loc[(df[f]=='') | (df[f].isnull()==True) | (df[f]==0), f]='Unaffiliated'
    for g in gen_list:  
        df.loc[(df[g]!='') & (df[g].isnull()!=True) & (df[g]!=0) & (df[g]!='NaN'), g]=1
        df.loc[(df[g]=='') | (df[g].isnull()==True) | (df[g]==0) | (df[g]=='NaN'), g]=0
    df[gen_list]=df[gen_list].astype(int)
    df[prim_list]=df[prim_list].astype(str)
    df[party_list]=df[party_list].astype(str)
    df.to_gbq(destination_table='Voterfile.OH_Voterfile', project_id='astute-acolyte-260912', if_exists='replace', table_schema=schema, reauth=False)
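
The two loops that translate party codes repeat the same assignments and differ only in the labels used for 'X' and for blank values. As a hedged alternative, not the code that was actually run, the translation could live in one lookup function; `PARTY_NAMES` and `decode_party` are hypothetical names.

```python
PARTY_NAMES = {
    'D': 'Democrat', 'R': 'Republican', 'G': 'Green', 'E': 'Reform',
    'L': 'Libertarian', 'C': 'Constitution', 'N': 'Natural Law', 'S': 'Socialist',
}


def decode_party(code, x_label, blank_label):
    """Translate one party code, mirroring the df.loc assignments above."""
    # Missing values: None, NaN (the only value unequal to itself), '' and 0.
    if code is None or code != code or code == '' or code == 0:
        return blank_label
    if code == 'X':
        return x_label
    # Unknown codes pass through unchanged, as in the original loops.
    return PARTY_NAMES.get(code, code)


primary_values = [decode_party(c, 'Without Affiliation', 'Not Voted')
                  for c in ['D', 'X', '', float('nan'), 'Z']]
party_values = [decode_party(c, 'Unaffiliated', 'Unaffiliated')
                for c in ['R', 'X', None]]
print(primary_values)
print(party_values)
```

With pandas, the function can be applied per column, for example `df[f] = df[f].map(lambda c: decode_party(c, 'Without Affiliation', 'Not Voted'))` for each `f` in `prim_list`.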

Note that verify=False is ONLY for testing purposes. It must not be used in production. After running your code on my machine and in Cloud Functions, I realized two things:

  1. Your code takes a long time to run because it needs to download and process the files. Given that, it should not be deployed in Cloud Functions, because Cloud Functions has a maximum timeout of 9 minutes, as the Cloud Functions documentation states.
  2. To perform all these downloads and transformations, your code uses a lot of memory. I tried running it on Cloud Functions with the maximum possible amount of memory (2GB), and it still hit the memory limit.

You could try using a VM in Compute Engine for that. In this case you could also use Cloud Scheduler to turn your VM on and off at exactly the times you want. There is a tutorial for that in the Google Cloud documentation.

To create a VM in Compute Engine, you can follow the tutorial "Creating an instance from a public image".

Keep in mind that when using VMs in Compute Engine, you pay for processing while the VM is turned on, and for the storage of the VM's disk whether the VM is turned on or off. After creating the VM, you can access it through the console. You should prepare the environment for your code in the VM the same way you did on your local machine.

Once your environment is configured and your code is ready to deploy, you can use crontab to have the system run your script on a schedule.
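
For cron, the script needs to run unattended and report failures. The following is a minimal sketch of such a wrapper, not the actual pipeline: `run_pipeline` is a hypothetical placeholder for the download, transform and to_gbq() steps, and the crontab line in the comment uses a made-up schedule and made-up paths.

```python
import logging

# Hypothetical crontab entry (every Monday at 05:00, paths are placeholders):
#   0 5 * * 1 /usr/bin/python3 /home/user/oh_pipeline.py >> /home/user/oh_pipeline.log 2>&1


def run_pipeline():
    """Placeholder for the download, transform and to_gbq() steps."""
    return 0  # a real run would return the number of rows loaded


def main(job=run_pipeline):
    """Run the job once and return an exit code for cron to report."""
    logging.basicConfig(level=logging.INFO,
                        format='%(asctime)s %(levelname)s %(message)s')
    try:
        rows = job()
    except Exception:
        # Log the full traceback so the failure shows up in the cron log.
        logging.exception('Pipeline run failed')
        return 1
    logging.info('Pipeline run finished, %s rows loaded', rows)
    return 0


exit_code = main()
# In a standalone script, pass the code to the shell: sys.exit(exit_code)
```

Returning a non-zero exit code on failure lets cron, or any monitoring wrapped around it, tell a failed run from a successful one.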

Now for the last step: configure Cloud Scheduler to turn your VM on and off at the right times. The Cloud Scheduler tutorial mentioned above covers this. You should schedule your VM to turn on a few minutes before the time you define in the crontab.
