JQ 1.5: Timestamp / date transformation produces a huge file

I am using jq 1.5 in a Windows 10 PowerShell environment to transform JSON files and import them into an MS SQL database. The original JSON file is around 1.1 MB. I stored the file here: Json origin file. I use the following jq command to transform the data:

[.legs[] | {Legid: .legId, Farecode: .fareBasisCode, Travelduration: .travelDuration, Traveldistance: .totalTravelDistance, Distanceunit: .totalTravelDistanceUnits, Refundable: .isRefundable , Nonstop: .isNonStop, Departure_Airport: .segments[].departureAirportName, Departure_Code: .segments[].departureAirportCode, Arrival_Airport: .segments[].arrivalAirportName, Arrival_Code: .segments[].arrivalAirportCode, Departure_Time: .segments[].departureTimeEpochSeconds, Arrival_Time: .segments[].arrivalTimeEpochSeconds, Airline: .segments[].airlineName, Airline_Code: .segments[].airlineCode, Flight_Number: .segments[].flightNumber, Equipment: .segments[].equipmentDescription}]

That command produces the following file: transformed file. Now I have to transform the UNIX timestamps to dates, so I modified the command:

[.legs[] | {Legid: .legId, Farecode: .fareBasisCode, Travelduration: .travelDuration, Traveldistance: .totalTravelDistance, Distanceunit: .totalTravelDistanceUnits, Refundable: .isRefundable , Nonstop: .isNonStop, Departure_Airport: .segments[].departureAirportName, Departure_Code: .segments[].departureAirportCode, Arrival_Airport: .segments[].arrivalAirportName, Arrival_Code: .segments[].arrivalAirportCode, Departure_Time: .segments[].departureTimeEpochSeconds, Arrival_Time: .segments[].arrivalTimeEpochSeconds, Airline: .segments[].airlineName, Airline_Code: .segments[].airlineCode, Flight_Number: .segments[].flightNumber, Equipment: .segments[].equipmentDescription}] | .[].Departure_Time |= todate | .[].Arrival_Time |= todate

The transformed file without the date transformation is around 3 MB. After the date transformation, the file is around 40 MB. I think I have a logical error in my command, but I can't find it. Any tips?

Regards, Timo

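Your use of iteration ( .segments[] ) causes multiplicative behavior: each .segments[] inside the object constructor iterates independently, so jq emits one object for every combination of values rather than one object per segment. Ten of your fields use .segments[], so whenever .segments|length is 2 you get 2^10 = 1024 objects instead of 2. In your case there are four cases in which .segments|length is 2, so you get that 2^10 expansion locally, four times.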

In situations such as this, it would make sense to use a small but well-chosen subset of the data (or maybe more easily, an artificial dataset) to check the code.
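The multiplicative behavior is easy to reproduce on a tiny artificial dataset. The sketch below is only an illustration: the field names x and y and the sample values are made up, and the quoting is for a POSIX shell (in PowerShell it is probably simpler to put the filter in a file and run it with jq -f). With two fields that each use .segments[], a two-segment leg yields 2^2 = 4 objects; iterating over the segments once yields 2.

```shell
# Artificial input: one leg with two segments (x and y are hypothetical fields).
input='{"legs":[{"legId":"a","segments":[{"x":1,"y":10},{"x":2,"y":20}]}]}'

# Two independent .segments[] iterations: jq builds every combination.
product=$(printf '%s' "$input" | jq '[.legs[] | {X: .segments[].x, Y: .segments[].y}] | length')

# A single iteration over the segments: one object per segment.
once=$(printf '%s' "$input" | jq '[.legs[] | .segments[] | {X: .x, Y: .y}] | length')

echo "$product"   # prints 4
echo "$once"      # prints 2
```

Comparing the length of the result with the number of segments in the input, as done here, is a quick sanity check before running the filter on the full file.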

Perhaps what you intended is something more like:

[ .legs[] | range(0; .segments|length) as $i | .... ]
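One possible way to fill in that outline is sketched below. It is an assumption about what was intended, not the answerer's actual code: it keeps only a few of the fields from the question, uses invented sample values, and converts the timestamps with todate inside the object constructor instead of in a separate pass. The quoting is again for a POSIX shell.

```shell
# Artificial input using field names from the question; the values are invented.
input='{"legs":[{"legId":"a","segments":[
  {"departureAirportCode":"FRA","arrivalAirportCode":"LHR",
   "departureTimeEpochSeconds":0,"arrivalTimeEpochSeconds":3600},
  {"departureAirportCode":"LHR","arrivalAirportCode":"JFK",
   "departureTimeEpochSeconds":7200,"arrivalTimeEpochSeconds":86400}]}]}'

# $i selects the same segment for every field, so each leg produces
# exactly one object per segment instead of one per combination.
result=$(printf '%s' "$input" | jq -c '
  [ .legs[]
    | range(0; .segments|length) as $i
    | { Legid: .legId,
        Departure_Code: .segments[$i].departureAirportCode,
        Arrival_Code: .segments[$i].arrivalAirportCode,
        Departure_Time: (.segments[$i].departureTimeEpochSeconds | todate),
        Arrival_Time: (.segments[$i].arrivalTimeEpochSeconds | todate) } ]')

echo "$result"
```

For this input the result is an array of two objects, one per segment, with the timestamps rendered as ISO 8601 strings such as 1970-01-01T00:00:00Z. The remaining fields from the question can be added in the same .segments[$i] style.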

