API Request to Big Query Table

I'm looking for a clean way to reshape a JSON file using Python.

JSON REQUEST:

{
    "reports": [{
            "data": {
                "rows": [{
                        "metrics": [{
                                "values": ["27.8", "4", "4", "6.95", "1.0", "0.0", "3.8834951456310676"]
                            }
                        ],
                        "dimensions": ["TEST1", "20180725"]
                    }, {
                        "metrics": [{
                                "values": ["75.0", "12", "12", "6.25", "1.0", "0.0", "3.4782608695652173"]
                            }
                        ],
                        "dimensions": ["TEST2", "20180725"]
                    }
                ],
                "maximums": [{
                        "values": ["1665.0", "140", "126", "65.0", "3.0", "0.0", "50.0"]
                    }
                ],
                "minimums": [{
                        "values": ["0.0", "0", "0", "0.0", "0.0", "0.0", "0.0"]
                    }
                ],
                "isDataGolden": true,
                "totals": [{
                        "values": ["27045.99", "3274", "2831", "8.260839951130116", "1.1564818085482163", "0.0", "4.949387227049424"]
                    }
                ],
                "rowCount": 358
            },
            "columnHeader": {
                "dimensions": ["ga:productName", "ga:date"],
                "metricHeader": {
                    "metricHeaderEntries": [{
                            "type": "CURRENCY",
                            "name": "ga:itemRevenue"
                        }, {
                            "type": "INTEGER",
                            "name": "ga:itemQuantity"
                        }, {
                            "type": "INTEGER",
                            "name": "ga:uniquePurchases"
                        }, {
                            "type": "CURRENCY",
                            "name": "ga:revenuePerItem"
                        }, {
                            "type": "FLOAT",
                            "name": "ga:itemsPerPurchase"
                        }, {
                            "type": "CURRENCY",
                            "name": "ga:productRefundAmount"
                        }, {
                            "type": "PERCENT",
                            "name": "ga:buyToDetailRate"
                        }
                    ]
                }
            }
        }
    ]
}

LOOKING FOR:

The values in "metrics", keyed by the names in "dimensions" and "metricHeaderEntries".

What is a clean way to modify the report (or recreate it) so that I end up with one object per row, like this:

LINE1 - {"ga:productName": "NAME","ga:date": "NAME","ga:itemRevenue": "value1", "ga:itemQuantity": "value2", ... }
LINE2 - {"ga:productName": "NAME","ga:date": "NAME","ga:itemRevenue": "value1", "ga:itemQuantity": "value2", ... }

EDIT1:

{
"ga:productName": "NAME", #from dimension 
"ga:date": "NAME", #from dimension 
"ga:itemRevenue": "value1", #from metricHeaderEntries 
"ga:itemQuantity": "value2", #from metricHeaderEntries 
... 
}
{
"ga:productName": "NAME2", #from dimension 
"ga:date": "NAME2", #from dimension 
"ga:itemRevenue": "value3", #from metricHeaderEntries 
"ga:itemQuantity": "value4", #from metricHeaderEntries 
... 
}

The values map this way:

"metrics": [{"values": ["27.8", "4", "4", "6.95", "1.0", "0.0", "3.8834951456310676"]}], #headers in metricHeaderEntries
"dimensions": ["TEST1", "20180725"] #headers in dimensions

Or something similar (I'm not interested in totals and so on).

I'm looking for a solution, sample, or explanation of how to do this in a form that BigQuery (BQ) will accept.

EXTRA:

I understand how to get data out of a JSON response with indexing like:

response[][][]

But this case is too tricky for me.

SAMPLE:

Ideally, this is how the table should look: [image]

This is the sample Google offers for printing this data (but I need to convert it to the format I explained above):

def print_response(response):
  # Print every dimension and metric of every row, labelled with its header.
  for report in response.get('reports', []):
    columnHeader = report.get('columnHeader', {})
    dimensionHeaders = columnHeader.get('dimensions', [])
    metricHeaders = columnHeader.get('metricHeader', {}).get('metricHeaderEntries', [])

    for row in report.get('data', {}).get('rows', []):
      dimensions = row.get('dimensions', [])
      dateRangeValues = row.get('metrics', [])

      for header, dimension in zip(dimensionHeaders, dimensions):
        print(header + ': ' + dimension)

      for i, values in enumerate(dateRangeValues):
        print('Date range: ' + str(i))
        for metricHeader, value in zip(metricHeaders, values.get('values')):
          print(metricHeader.get('name') + ': ' + value)

Your edit still doesn't match your JSON: under "columnHeader", "dimensions" is a list of names, not key:value pairs.

"dimensions": [
          "ga:productName",
          "ga:date"
        ],

This means there is no value to take from there, so your example is not correct. In "metricHeaderEntries" you have the following:

"metricHeader": {
          "metricHeaderEntries": [
            {
              "type": "CURRENCY",
              "name": "ga:itemRevenue"
            },
            {
              "type": "INTEGER",
              "name": "ga:itemQuantity"
            },
            {
              "type": "INTEGER",
              "name": "ga:uniquePurchases"
            },
            {
              "type": "CURRENCY",
              "name": "ga:revenuePerItem"
            },
            {
              "type": "FLOAT",
              "name": "ga:itemsPerPurchase"
            },
            {
              "type": "CURRENCY",
              "name": "ga:productRefundAmount"
            },
            {
              "type": "PERCENT",
              "name": "ga:buyToDetailRate"
            }
          ]
        }

So even this case doesn't match your example, because "metricHeaderEntries" holds only the names and types of "ga:itemRevenue", "ga:itemQuantity" and so on, not the values you show in your example.

In any case, you can walk through the parsed JSON the same way as a Python dictionary: select elements by key where the node is a dictionary and by index where it is a list.
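As a minimal sketch of that key-versus-index navigation, the snippet below parses a trimmed copy of the response from the question (only one row and two metric values are kept, purely for illustration) and reads one dimension and one metric value. The variable names other than a are illustrative.

```python
import json

# A trimmed copy of the response from the question, for demonstration only.
raw = """
{"reports": [{"data": {"rows": [
    {"metrics": [{"values": ["27.8", "4"]}],
     "dimensions": ["TEST1", "20180725"]}
]}}]}
"""
a = json.loads(raw)  # dictionaries become dict, arrays become list

# "reports" is a list, "data" is a dict, "rows" is a list of dicts.
first_row = a["reports"][0]["data"]["rows"][0]

product_name = first_row["dimensions"][0]            # list -> index
item_revenue = first_row["metrics"][0]["values"][0]  # dict -> key, list -> index

print(product_name, item_revenue)
```

Each pair of brackets takes either a string key or an integer index, depending on whether the node at that level is a dictionary or a list.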

If I get some time, I'll try to solve your problem using the values from your example, even though the nodes you pointed to are not the correct ones.

ANSWER:

I solved your problem. I hard-coded the key names instead of taking them from the original JSON, just so you can see how it works. Please let me know if this is what you were expecting:

new_list = []

# a is your JSON, read into a Python dictionary
rows = a["reports"][0]["data"]["rows"]  # get to the "rows" key
for row in rows:  # iterate over the rows to pick the values needed for each line
    dict_line = {}  # create a dictionary for each line
    dict_line["ga:productName"] = row["dimensions"][0]  # first dimension is the product name
    dict_line["ga:date"] = row["dimensions"][1]  # second dimension is the date
    values = row["metrics"][0]["values"]  # metric values for this row
    # the values come in the same order as the names in metricHeaderEntries
    dict_line["ga:itemRevenue"] = values[0]
    dict_line["ga:itemQuantity"] = values[1]
    dict_line["ga:uniquePurchases"] = values[2]
    dict_line["ga:revenuePerItem"] = values[3]
    dict_line["ga:itemsPerPurchase"] = values[4]
    dict_line["ga:productRefundAmount"] = values[5]
    dict_line["ga:buyToDetailRate"] = values[6]
    new_list.append(dict_line)

print(new_list)

This is the result (pretty-printed; key order may differ):

[
  {
    "ga:productName": "TEST1",
    "ga:itemRevenue": "27.8",
    "ga:uniquePurchases": "4",
    "ga:date": "20180725",
    "ga:revenuePerItem": "6.95",
    "ga:productRefundAmount": "0.0",
    "ga:itemQuantity": "4",
    "ga:itemsPerPurchase": "1.0",
    "ga:buyToDetailRate": "3.4782608695652173"
  }
]
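If you would rather not hard-code the key names, a sketch along the following lines takes them from "columnHeader" instead and writes one JSON object per line, which is the newline-delimited JSON layout that BigQuery load jobs accept. The function names flatten_report and to_ndjson are hypothetical, the response is a trimmed copy of the one in the question, and replacing ":" with "_" rests on the assumption that your table's column names may not contain a colon; adjust that to your own schema.

```python
import json
import re


def flatten_report(report):
    """Return one flat dict per row, keyed by the names in columnHeader."""
    header = report.get("columnHeader", {})
    dim_names = header.get("dimensions", [])
    metric_names = [
        entry["name"]
        for entry in header.get("metricHeader", {}).get("metricHeaderEntries", [])
    ]
    records = []
    for row in report.get("data", {}).get("rows", []):
        record = dict(zip(dim_names, row.get("dimensions", [])))
        # Only the first date range is used, as in the hard-coded version.
        record.update(zip(metric_names, row["metrics"][0]["values"]))
        records.append(record)
    return records


def to_ndjson(records):
    """Serialize records as newline-delimited JSON with sanitized keys."""
    lines = []
    for record in records:
        # Assumption: ":" is not allowed in column names, so every
        # character that is not a letter, digit or "_" becomes "_".
        clean = {re.sub(r"\W", "_", key): value for key, value in record.items()}
        lines.append(json.dumps(clean))
    return "\n".join(lines)


# A trimmed copy of the response from the question, for demonstration only.
response = {
    "reports": [{
        "data": {"rows": [
            {"metrics": [{"values": ["27.8", "4"]}],
             "dimensions": ["TEST1", "20180725"]},
            {"metrics": [{"values": ["75.0", "12"]}],
             "dimensions": ["TEST2", "20180725"]},
        ]},
        "columnHeader": {
            "dimensions": ["ga:productName", "ga:date"],
            "metricHeader": {"metricHeaderEntries": [
                {"type": "CURRENCY", "name": "ga:itemRevenue"},
                {"type": "INTEGER", "name": "ga:itemQuantity"},
            ]},
        },
    }]
}

records = flatten_report(response["reports"][0])
print(to_ndjson(records))
```

All values stay strings here, exactly as the API returns them. If the table columns are numeric, the "type" field of each metricHeaderEntries item could be used to convert the values before writing them out.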
