Select all columns except one when joining two tables in AWS Athena

I want to join two large tables with many columns using Presto SQL syntax in AWS Athena. My code is pretty simple:

select
    * 
from TableA as A
left join TableB as B
on A.key_id = B.key_id
;

After joining, the primary key column (key_id) appears twice. Both tables have more than 100 columns, and the join takes a very long time. How can I fix it so that the key_id column appears only once in the final result?

P.S. AWS Athena does not support the except command, unlike Google BigQuery.

This would be a nice feature, but it is not part of standard SQL. In standard SQL, the EXCEPT keyword is a set-based operation (i.e. it filters rows, not columns).
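To illustrate the set-based meaning, here is a minimal sketch of EXCEPT used as a row filter, reusing the tables from the question. It should return the key_id values that occur in TableA but not in TableB; it does not drop any column.

```sql
-- EXCEPT compares the rows of two queries; it is not a column filter.
select key_id from TableA
except
select key_id from TableB
;
```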

In Athena, as with standard SQL, you will have to specify the columns you want to include. The argument for this is that it is lower maintenance; in fact, best practice is to always explicitly state the columns you want, never leaving this to "whatever columns exist". This helps ensure your queries don't change behaviour if or when your table structure changes.
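As a rough sketch of what that looks like for the query in the question: the names col_a1, col_a2, col_b1 and col_b2 are hypothetical placeholders for the real columns, which the question does not list. The key is selected once, from TableA.

```sql
select
    A.key_id,   -- taken from TableA only, so it appears once
    A.col_a1,   -- hypothetical column names; replace with the real ones
    A.col_a2,
    B.col_b1,
    B.col_b2
from TableA as A
left join TableB as B
on A.key_id = B.key_id
;
```

If every column of TableA is wanted, writing A.* in place of the listed A columns should also work, so that only the TableB columns need to be spelled out.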

Some SQL dialects have features like this. I understand Oracle has this too. But to my knowledge Athena (/ PrestoSQL / Trino) does not.
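With more than 100 columns, typing the list by hand is tedious, so one possible workaround is to let Athena generate it from information_schema.columns and paste the result into the join query. This is only a sketch under assumptions: 'my_database' is a placeholder for the real database name, the table name is assumed to be stored in lowercase as 'tableb', and the engine version is assumed to support order by inside array_agg.

```sql
-- Builds a string such as "B.col_b1, B.col_b2, ..." without key_id.
select
    array_join(
        array_agg(concat('B.', column_name) order by ordinal_position),
        ', '
    ) as column_list
from information_schema.columns
where table_schema = 'my_database'   -- placeholder database name
  and table_name = 'tableb'          -- assumed lowercase table name
  and column_name <> 'key_id'
;
```

The generated list still ends up written explicitly in the final query, so the advice about stating columns explicitly continues to hold.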
