
jq 1.5: Timestamp/date transformation produces a huge file

I am using jq 1.5 in a Windows 10 PowerShell environment to transform JSON files and import them into an MS SQL database. The original JSON file is around 1.1 MB. I stored the file here: JSON origin file. I use the following jq command to transform the data:

[.legs[] | {Legid: .legId, Farecode: .fareBasisCode, Travelduration: .travelDuration, Traveldistance: .totalTravelDistance, Distanceunit: .totalTravelDistanceUnits, Refundable: .isRefundable , Nonstop: .isNonStop, Departure_Airport: .segments[].departureAirportName, Departure_Code: .segments[].departureAirportCode, Arrival_Airport: .segments[].arrivalAirportName, Arrival_Code: .segments[].arrivalAirportCode, Departure_Time: .segments[].departureTimeEpochSeconds, Arrival_Time: .segments[].arrivalTimeEpochSeconds, Airline: .segments[].airlineName, Airline_Code: .segments[].airlineCode, Flight_Number: .segments[].flightNumber, Equipment: .segments[].equipmentDescription}]

That command produces the following file: transformed file. Now I need to convert the UNIX timestamps to dates, so I modified the command:

[.legs[] | {Legid: .legId, Farecode: .fareBasisCode, Travelduration: .travelDuration, Traveldistance: .totalTravelDistance, Distanceunit: .totalTravelDistanceUnits, Refundable: .isRefundable , Nonstop: .isNonStop, Departure_Airport: .segments[].departureAirportName, Departure_Code: .segments[].departureAirportCode, Arrival_Airport: .segments[].arrivalAirportName, Arrival_Code: .segments[].arrivalAirportCode, Departure_Time: .segments[].departureTimeEpochSeconds, Arrival_Time: .segments[].arrivalTimeEpochSeconds, Airline: .segments[].airlineName, Airline_Code: .segments[].airlineCode, Flight_Number: .segments[].flightNumber, Equipment: .segments[].equipmentDescription}] | .[].Departure_Time |= todate | .[].Arrival_Time |= todate

The transformed file without the date transformation is around 3 MB. After the date transformation, the file is around 40 MB. I think I have a logical error in my command, but I can't find it. Any tips?

Regards, Timo

Your use of iteration ( .segments[] ) causes multiplicative behavior: the object construction emits one object for every combination of the values its fields produce. Ten of your fields iterate over .segments[] , so in each of the four cases in which .segments|length is 2, you get a local expansion into 2^10 = 1024 objects.
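The effect is easy to reproduce on a tiny input. The sketch below assumes a POSIX shell (quoting differs in PowerShell) and uses a made-up leg with shortened, hypothetical field names ( code , time ); it only counts how many objects the construction emits.

```shell
# Hypothetical two-segment leg; "code" and "time" are illustrative field names,
# not fields from the original file.
leg='{"segments":[{"code":"FRA","time":1},{"code":"JFK","time":2}]}'

# Each field iterates .segments[] independently, so jq emits every
# combination of the two fields: 2 * 2 = 4 objects.
count=$(printf '%s' "$leg" | jq '[{Code: .segments[].code, Time: .segments[].time}] | length')
echo "$count"
```

This prints 4 rather than 2. With ten such fields instead of two, the same leg would yield 2^10 objects.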

In situations such as this, it makes sense to check the code against a small but well-chosen subset of the data (or, perhaps more easily, an artificial dataset).

Perhaps what you intended is something more like:

[ .legs[] | range(0; .segments|length) as $i | .... ]
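As a rough illustration of that pattern (not necessarily the filter the answer had in mind), the sketch below produces one object per segment and converts the timestamp with todate inside the object. It assumes a POSIX shell, a made-up sample with a single two-segment leg, and only three of the original output fields.

```shell
# Hypothetical sample: one leg with two segments, epoch times chosen for readability.
data='{"legs":[{"legId":"L1","segments":[
  {"departureAirportCode":"FRA","departureTimeEpochSeconds":0},
  {"departureAirportCode":"JFK","departureTimeEpochSeconds":86400}]}]}'

# $i selects exactly one segment per output object, so the fields no longer
# iterate independently and no cross product arises.
rows=$(printf '%s' "$data" | jq -c '
  [ .legs[]
    | range(0; .segments | length) as $i
    | { Legid: .legId,
        Departure_Code: .segments[$i].departureAirportCode,
        Departure_Time: (.segments[$i].departureTimeEpochSeconds | todate) } ]')
echo "$rows"
```

The result has two objects for the two-segment leg, one per segment, and the remaining fields from the question can be added in the same way by indexing with $i.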
