(
ResponseRgBasketId STRING,
RawStandardisedLoadDateTime TIMESTAMP,
InfoMartLoadDateTime TIMESTAMP,
Operaame STRING,
RequestTimestamp TIMESTAMP,
RequestSiteId STRING,
RequestSalePointId STRING,
RequestdTypeId STRING,
RequeetValue DECIMAL(10,2),
ResponsegTimestamp TIMESTAMP,
RequessageId STRING,
RequestBasketId STRING,
ResponsesageId STRING,
RequestTransmitAttempt INT,
ResponseCode STRING,
RequestasketItems INT,
ResponseFinancialTimestamp TIMESTAMP,
RequeketJsonString STRING,
LoyaltyId STRING
)
USING DELTA
PARTITIONED BY (RequestTimestamp)
TBLPROPERTIES
(
delta.deletedFileRetentionDuration = "interval 1 seconds",
delta.autoOptimize.optimizeWrite = true
)
It has been partitioned by RequestTimestamp
(2020-12-12T07:39:35.000+0000 ), but it has the format as below. Could I change the format to different format to something like 2020-12-34
in partition by?
Short answer: No regexp or other transformation is possible in PARTITIONED BY. The only solution is to apply substr(timestamp, 1, 10) during/before load. See also this answer: https://stackoverflow.com/a/64171676/2700344
Long answer:
No regexp is possible in PARTITIONED BY. No functions are allowed in table DDL, only type can be specified. Type in column specification works as constraint and at the same time can cause implicit type conversion. For example if you are loading strings into dates, it will be casted implicitly if possible and loaded into null default partition if not possible to cast. Also if you are loading BIGINT, it will be silently truncated to INT, as a result you will see corrupted data and duplicates.
Does the same implicit cast work with partitioned by? Let,s see:
DROP TABLE IF EXISTS test_partition;
CREATE TABLE IF NOT EXISTS test_partition (Id int)
partitioned by (dt date) --Hope timestamp will be truncated to DATE
;
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table test_partition partition(dt)
select 1 as id, current_timestamp as dt;
show partitions test_partition;
Result (We expect timestamp truncated to DATE...):
dt=2021-03-24 10%3A19%3A19.985
No, it does not work. Tested the same with varchar(10) column with strings like yours. See short answer.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.