
Passing command line argument to Presto Query

I am a newbie to Python. I want to pass command-line arguments to my Presto query, which is inside a function, and then write the result to a CSV file. But when I try to run it in the terminal, I get:

Traceback (most recent call last):
  File "function2.py", line 3, in <module>
    from pyhive import presto
ModuleNotFoundError: No module named 'pyhive'

The pyhive requirement is already satisfied. Please find my code below:

from sys import argv
import argparse
from pyhive import presto
import prestodb
import csv
import sys


import pandas as pd

connection = presto.connect(host='xyz',port=8889,username='test')
cur = connection.cursor()
print('Connection Established')


def func1(object,start,end):
    object = argv[1]
    start = argv[2]
    end = argv[3]
    result = cur.execute("""
    with map_date as 
    (
     SELECT 
     object, 
     epoch,
     timestamp,
     date,
     map_agg(name, value) as map_values
    from hive.schema.test1
    where object = '${object}' 
    and (epoch >= '${start}' and epoch <= '${end}')
    and name in ('x','y')
    GROUP BY object,epoch,timestamp,date
    order by timestamp asc
    )
    SELECT
      epoch
    , timestamp
    , CASE WHEN element_at(map_values, 'x') IS NOT NULL THEN map_values['x'] ELSE NULL END AS x
    , CASE WHEN element_at(map_values, 'y') IS NOT NULL THEN map_values['y'] ELSE NULL END AS y
    , object
    , date AS date
    from map_date
    """)
rows = cur.fetchall()
print('Query Finished')     #Returns the list with one entry for each record
fp = open('/Users/xyz/Desktop/Python/function.csv', 'w')
print('File Created')
myFile = csv.writer(fp)
colnames = [desc[0] for desc in cur.description]     #store the headers in variable called 'colnames'
myFile.writerow(colnames)    #write the header to the file
myFile.writerows(rows)
fp.close()

func1(object,start,end)

cur.close()
connection.close()

How can I pass the command line argument to my Presto query which is written inside a function? Any help is much appreciated. Thank you in advance!

I will only describe how to pass command line arguments to the function and the query.


If you define the function

def func1(object, start, end):
    # code

then you have to pass the values as variables, and you have to use sys.argv outside the function

connection = presto.connect(host='xyz', port=8889, username='test')  # PEP8: spaces after commas
cur = connection.cursor()
print('Connection Established')

object_ = sys.argv[1]   # PEP8: there is class `object` so I add `_` to create different name
start = sys.argv[2]
end = sys.argv[3]

func1(object_, start, end)

cur.close()
connection.close()

You don't have to use the same names outside the function

args1 = sys.argv[1]
args2 = sys.argv[2]
args3 = sys.argv[3]

func1(args1, args2, args3)

and you can even do

func1(sys.argv[1], sys.argv[2], sys.argv[3])

because when you run this line, Python looks up the definition def func1(object, start, end):, creates local variables named object, start, end inside func1, and assigns the external values to these local variables

object=object_, start=start, end=end

or

object=args1, start=args2, end=args3

or

object=sys.argv[1], start=sys.argv[2], end=sys.argv[3]
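The binding described above can be checked with a small stand-alone sketch. It does not connect to Presto. The list `argv` only stands in for `sys.argv`, and the values in it (`sensor1` and the two epochs) are made up for illustration.

```python
def func1(object_, start, end):
    # Return the local variables so the binding is visible to the caller.
    return {"object_": object_, "start": start, "end": end}


# Hypothetical stand-in for sys.argv; element 0 is the script name.
argv = ["function2.py", "sensor1", "1600000000", "1600003600"]

# Positional call: values are assigned to the parameters in order.
by_position = func1(argv[1], argv[2], argv[3])

# Keyword call: the order no longer matters, the names decide.
by_keyword = func1(end=argv[3], object_=argv[1], start=argv[2])

print(by_position)
print(by_position == by_keyword)
```

Note that values taken from `sys.argv` are always strings, so the epochs arrive as `'1600000000'`, not as integers.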

It would be good to also pass cur explicitly to the function

def func1(cur, object_, start, end):
    # code

and

func1(cur, sys.argv[1], sys.argv[2], sys.argv[3])

I don't know what you are trying to do in the SQL query, but Python uses {start} (without $) to put a value in a string (Bash uses ${start}), and it needs the prefix f to create an f-string: f"""... {start} ....""". Without f you have to use normal string formatting: """... {start} ....""".format(start=start)
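To make the difference concrete, here is a minimal sketch that builds only the WHERE fragment in three ways. The epoch values are made up, and nothing is sent to Presto.

```python
start = "1600000000"  # hypothetical sample values
end = "1600003600"

# Plain string: Python leaves Bash-style ${start} untouched.
plain = "epoch >= '${start}' AND epoch <= '${end}'"

# f-string: {start} and {end} are replaced when the string is created.
f_string = f"epoch >= '{start}' AND epoch <= '{end}'"

# str.format(): same result without the f prefix.
formatted = "epoch >= '{start}' AND epoch <= '{end}'".format(start=start, end=end)

print(plain)
print(f_string)
print(formatted)
```

The first string still contains the literal text `${start}`, which is what the query in the question sends to the server. Keep in mind that both formatting methods paste the values into the SQL text as-is, so they are only reasonable for trusted input.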


import sys
import csv
from pyhive import presto

# --- functions ----

def func1(cur, object_, start, end):  # PEP8: spaces after commas
    
    # Python uses `{start} {end}`, Bash uses `${start} ${end}`

    # String needs prefix `f` to use `{start} {end}` in f-string
    # or you have to use `"{start} {end}".format(start=value1, end=value2)`
    
    result = cur.execute(f"""
    WITH map_date AS 
    (
      SELECT 
        object, 
        epoch,
        timestamp,
        date,
        map_agg(name, value) AS map_values
      FROM hive.schema.test1
      WHERE object = '{object_}' 
        AND (epoch >= '{start}' AND epoch <= '{end}')
        AND name IN ('x','y')
      GROUP BY object,epoch,timestamp,date
      ORDER BY timestamp asc
    )
    SELECT
      epoch,
      timestamp,
      CASE WHEN element_at(map_values, 'x') IS NOT NULL THEN map_values['x'] ELSE NULL END AS x,
      CASE WHEN element_at(map_values, 'y') IS NOT NULL THEN map_values['y'] ELSE NULL END AS y,
      object,
      date AS date
    FROM map_date
    """)

    rows = cur.fetchall()  # returns a list with one entry for each record
    colnames = [desc[0] for desc in cur.description]  # store the headers in variable called 'colnames'

    print('Query Finished')

    fp = open('/Users/xyz/Desktop/Python/function.csv', 'w')
    
    my_file = csv.writer(fp)   # PEP8: lower_case_names for variables
    my_file.writerow(colnames)  # write the header to the file
    my_file.writerows(rows)
    
    fp.close()

    print('File Created')

# --- main ---

connection = presto.connect(host='xyz', port=8889, username='test')  # PEP8: spaces after commas
cur = connection.cursor()
print('Connection Established')

#object_ = sys.argv[1]   # PEP8: there is class `object` so I add `_` to create different name
#start = sys.argv[2]
#end = sys.argv[3]
#func1(cur, object_, start, end)

func1(cur, sys.argv[1], sys.argv[2], sys.argv[3])

cur.close()
connection.close()
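The CSV part of func1 can also be written as a small helper and tested without a database. This is only a sketch: `write_csv` is a name I made up, the column names and the row are sample data, and `io.StringIO` stands in for the real file.

```python
import csv
import io


def write_csv(fp, colnames, rows):
    # Write the header first, then one line per record.
    writer = csv.writer(fp)
    writer.writerow(colnames)
    writer.writerows(rows)


# Hypothetical sample data in the shape that cur.description / cur.fetchall() give.
colnames = ["epoch", "timestamp", "x", "y"]
rows = [(1600000000, "2020-09-13 12:26:40", 1.5, 2.5)]

# In-memory file object, so the example does not touch the disk.
buffer = io.StringIO(newline="")
write_csv(buffer, colnames, rows)

print(buffer.getvalue())
```

With a real file you could use `with open(path, 'w', newline='') as fp:` and call `write_csv(fp, colnames, rows)` inside the block. The `with` statement closes the file even if writing fails, and `newline=''` is what the csv module documentation recommends when opening files for a writer.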

If you plan to use argparse

parser = argparse.ArgumentParser()

parser.add_argument('-o', '--object', help='object to search')
parser.add_argument('-s', '--start',  help='epoch start')
parser.add_argument('-e', '--end',    help='epoch end')

args = parser.parse_args()

and then

func1(cur, args.object, args.start, args.end)

import argparse

# ... imports and functions ...

# --- main ---

parser = argparse.ArgumentParser()
parser.add_argument('-o', '--object', help='object to search')
parser.add_argument('-s', '--start',  help='epoch start')
parser.add_argument('-e', '--end',    help='epoch end')
#parser.add_argument('-D', '--debug', action='store_true', help='debug (display extra info)')
args = parser.parse_args()

#if args.debug:
#    print(args)

connection = presto.connect(host='xyz', port=8889, username='test')  # PEP8: spaces after commas
cur = connection.cursor()
print('Connection Established')

func1(cur, args.object, args.start, args.end)

cur.close()
connection.close()
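One possible refinement, shown here only as a sketch: with the parser above, an option that is not given on the command line becomes None, and that None would end up in the query text. Adding `required=True` makes argparse stop with a usage message instead. The helper name `build_parser` and the sample values are my own, and `parse_args()` gets an explicit list here so the example runs without a real command line.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    # required=True: argparse exits with an error if the option is missing.
    parser.add_argument('-o', '--object', required=True, help='object to search')
    parser.add_argument('-s', '--start', required=True, help='epoch start')
    parser.add_argument('-e', '--end', required=True, help='epoch end')
    return parser


# An explicit list replaces sys.argv[1:] for this demonstration (made-up values).
args = build_parser().parse_args(['-o', 'sensor1', '-s', '1600000000', '-e', '1600003600'])

print(args.object, args.start, args.end)
```

In the real script you would call `parse_args()` without arguments, so that it reads `sys.argv[1:]`, and then run something like `python function2.py -o sensor1 -s 1600000000 -e 1600003600`.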
