
My query doesn't work when I add a sum() to it?

I'm trying to get a query to show results using PHP, GAE and BigQuery. The query runs and shows the correct data in my application if I do a generic select * from table, so I know everything else in my code is fine. However, as soon as I include a sum() in the query, my data no longer loads when I deploy the app. I'm not sure what I'm missing here. The column is a numeric data type, if that helps. I've played with every part of the query, and it seems that as soon as I add sum(), all data vanishes from my application.

...    
<?php
$client = new Google_Client();
$client->useApplicationDefaultCredentials();
$client->addScope(Google_Service_Bigquery::BIGQUERY);
$bigquery = new Google_Service_Bigquery($client);
$projectId = 'projectName';
$query1 = 'SELECT column_1, SUM(column_2) as `trade_price` from [dataset.tableName] group by column_1 order by trade_price desc limit 10';
$request = new Google_Service_Bigquery_QueryRequest();
$str = '';

$request->setQuery($query1);

$response = $bigquery->jobs->query($projectId, $request);
$rows = $response->getRows();
...

Thanks to anyone who can help me out!

EDIT: I need the query to be sorted from largest sum to smallest. I've tested sum() on column_1, which is an int, and that's fine. So I'm starting to think my issue is that column_2 is numeric. However, the query works correctly in the console, so maybe PHP and sum() on a numeric column don't play nicely together? How would I resolve this?

Column trade_price should be invalid in the ORDER BY clause because it is not contained in either an aggregate function or the GROUP BY clause.

Please try below query.

SELECT column_1, SUM(column_2) as `trade_price` from [dataset.tableName] group by column_1 order by SUM(column_2) desc limit 10

Try this!

SELECT column_1, (SELECT SUM(column_2) FROM [dataset.tableName] WHERE Id = tbl.Id) as `trade_price` from [dataset.tableName] tbl group by column_1 order by trade_price desc limit 10

This is the query you are working with:

SELECT column_1, SUM(column_2) as `trade_price` 
FROM [dataset.tableName] 
GROUP BY column_1 
ORDER BY trade_price desc;

You specify that this works okay:

SELECT column_1, SUM(column_2) as `trade_price` 
FROM [dataset.tableName] 
GROUP BY column_1 ;

The issue is the ORDER BY -- and in particular the number of values in column_1. I suspect this is because BigQuery implements ORDER BY on a single node; at least, I know this used to be the case.

If you can filter the result to remove much of the data, then that should help performance. For instance:

HAVING SUM(column_2) > 1000

or:

HAVING COUNT(*) >= 10

I even think the MAX() window function can help. So assuming the largest values are 10% or more of the maximum:

SELECT column_1, SUM(column_2) as `trade_price` 
FROM [dataset.tableName] 
GROUP BY column_1 
QUALIFY trade_price >= 0.1 * MAX(trade_price) OVER ()
ORDER BY trade_price desc;
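
If a slow ORDER BY really is the cause, it may be worth checking on the PHP side whether the job had actually finished before the rows were read. A synchronous jobs.query call returns with jobComplete set to false and no rows when the query outlasts the request timeout (10 seconds by default), which would look exactly like data vanishing. The following is only a sketch under that assumption, not a confirmed diagnosis. The helper name fetchCompletedRows is made up for illustration; $jobs is expected to behave like $bigquery->jobs from the question, and $request is the Google_Service_Bigquery_QueryRequest with the query already set.

```php
<?php
// Sketch only: fetchCompletedRows is a hypothetical helper, not part of the Google client.
// $jobs should behave like $bigquery->jobs; $request is the prepared query request.
function fetchCompletedRows($jobs, string $projectId, $request): array
{
    $response = $jobs->query($projectId, $request);

    if (!$response->getJobComplete()) {
        // The request returned before the query finished, so no rows came back.
        // Ask for the results of the same job instead of treating this as "no data".
        $jobId = $response->getJobReference()->getJobId();
        $response = $jobs->getQueryResults($projectId, $jobId);
    }

    if (!$response->getJobComplete()) {
        // Still running after the second wait: report it rather than return nothing.
        throw new RuntimeException('Query job has not finished yet');
    }

    // getRows() can be empty when the query matched nothing.
    return $response->getRows() ?: [];
}
```

With the variables from the question this would be called as $rows = fetchCompletedRows($bigquery->jobs, $projectId, $request);. If the exception fires, or the rows only appear after the second call, the query is simply taking longer than the request waits, and filtering as suggested above (or raising the timeout with $request->setTimeoutMs()) is the thing to look at.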


 