
Error in creating table with column name containing dot (.) in Amazon Athena even after escaping the dot with backticks(`)

As per https://docs.aws.amazon.com/athena/latest/ug/tables-databases-columns-names.html ,

Special characters

Special characters other than underscore (_) are not supported. For more information, see the Apache Hive LanguageManual DDL documentation.

Important

Although you may succeed in creating table, view, database, or column names that contain special characters other than underscore by enclosing them in backtick (`) characters, subsequent DDL or DML queries that reference them can fail.

So I tried to create a table from a JSON file stored in an S3 bucket. One of the keys in the JSON contains multiple dots (.), which, according to the documentation quoted above, should be fine if I escape it with backticks (`).

CREATE EXTERNAL TABLE json_table (
id string,
version string,
com`.`org`.`dto`.`Customer string )
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES ('ignore.malformed.json' = 'true')
LOCATION 's3://narendra-damodardas-modi-test-data/';

But it is giving the following error:

line 1:8: no viable alternative at input 'create external' (service: amazonathena; status code: 400; error code: invalidrequestexception; request id: ef586f31-2515-4faa-a9fe-3a0e418235d2)

Now, you may say that, per the link above, it is obvious this will not work. But when I do the same thing with a crawler in AWS Glue, everything works fine and I can see the column with dots in its name.

As per https://docs.aws.amazon.com/athena/latest/ug/understanding-tables-databases-and-the-data-catalog.html ,

Regardless of how the tables are created, the tables creation process registers the dataset with Athena. This registration occurs in the AWS Glue Data Catalog and enables Athena to run queries on the data.

So Athena uses AWS Glue behind the scenes. If Glue's crawler can add a column for a JSON key containing dots (.), why can't an Athena query do the same?

Maybe I am missing something. If anyone has run into this in the past and found a way around it, please enlighten me. And if what I am trying to do is impossible, please say so, so that I do not keep wasting my time.

You need to put backticks around the whole column name, not just around the special characters. The following should work:

CREATE EXTERNAL TABLE json_table (
  `id` string,
  `version` string,
  `com.org.dto.Customer` string
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
  'ignore.malformed.json' = 'true'
)
LOCATION 's3://narendra-damodardas-modi-test-data/';

In general, I'd advise surrounding all column names with backticks.
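One related point once the table exists: backticks are the quoting style for DDL statements, while SELECT queries in Athena quote identifiers with double quotes. A minimal sketch of querying the dotted column, assuming the json_table definition above was created successfully and that the SerDe maps the JSON key to the column as expected (not verified here):

```sql
-- DML (SELECT) uses double quotes, not backticks, around special column names.
-- Assumes the json_table from the CREATE EXTERNAL TABLE statement above.
SELECT
  id,
  version,
  "com.org.dto.customer" AS customer  -- column names are stored in lowercase
FROM json_table
LIMIT 10;
```

If this query fails on the dotted column, that would be consistent with the documentation's warning that later DDL or DML statements referencing such names can fail.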

Also, if your AWS Glue crawler runs fine on similar data, you can look up the schema it created with SHOW CREATE TABLE.
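As a rough illustration of that last suggestion, the statement below prints the DDL for an existing table so you can see how the crawler quoted the dotted column. The table name crawled_json_table is hypothetical; substitute whatever name the crawler registered in the Data Catalog.

```sql
-- Show the full CREATE TABLE statement for a crawler-created table.
-- "crawled_json_table" is a placeholder name, not one from the question.
SHOW CREATE TABLE crawled_json_table;
```

You can then copy the column definitions from that output into your own CREATE EXTERNAL TABLE statement.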


 