JQ 1.5: Timestamp / date transformation produces huge file
I am using jq 1.5 in a Windows 10 PowerShell environment to transform JSON files and import them into an MS SQL database. The original JSON file is around 1.1 MB. I stored the file here: JSON origin file. I use the following jq command to transform the data:
[.legs[] | {Legid: .legId, Farecode: .fareBasisCode, Travelduration: .travelDuration, Traveldistance: .totalTravelDistance, Distanceunit: .totalTravelDistanceUnits, Refundable: .isRefundable , Nonstop: .isNonStop, Departure_Airport: .segments[].departureAirportName, Departure_Code: .segments[].departureAirportCode, Arrival_Airport: .segments[].arrivalAirportName, Arrival_Code: .segments[].arrivalAirportCode, Departure_Time: .segments[].departureTimeEpochSeconds, Arrival_Time: .segments[].arrivalTimeEpochSeconds, Airline: .segments[].airlineName, Airline_Code: .segments[].airlineCode, Flight_Number: .segments[].flightNumber, Equipment: .segments[].equipmentDescription}]
That command produces the following file: transformed file. Now I have to convert the UNIX timestamps to dates, so I modified the command:
[.legs[] | {Legid: .legId, Farecode: .fareBasisCode, Travelduration: .travelDuration, Traveldistance: .totalTravelDistance, Distanceunit: .totalTravelDistanceUnits, Refundable: .isRefundable , Nonstop: .isNonStop, Departure_Airport: .segments[].departureAirportName, Departure_Code: .segments[].departureAirportCode, Arrival_Airport: .segments[].arrivalAirportName, Arrival_Code: .segments[].arrivalAirportCode, Departure_Time: .segments[].departureTimeEpochSeconds, Arrival_Time: .segments[].arrivalTimeEpochSeconds, Airline: .segments[].airlineName, Airline_Code: .segments[].airlineCode, Flight_Number: .segments[].flightNumber, Equipment: .segments[].equipmentDescription}] | .[].Departure_Time |= todate | .[].Arrival_Time |= todate
The transformed file without the date transformation is around 3 MB. After the date transformation, the file is around 40 MB. I think I have a logical error in my command, but I can't find it. Any tips?
Regards, Timo
Your use of iteration ( .segments[] ) causes multiplicative behavior: each .segments[] inside the object constructor iterates independently, so jq emits one object for every combination of values. The constructor contains ten such expressions, and since there are four cases in which .segments|length is 2, you get a 2^10 expansion locally, four times.
In situations such as this, it makes sense to check the code against a small but well-chosen subset of the data (or, perhaps more easily, an artificial dataset).
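As a rough illustration of such a check, the sketch below feeds jq a made-up leg with two segments and counts the objects produced by a cut-down version of the question's filter. The sample data and the variable names sample and count are assumptions for illustration only. The commands are written for a POSIX shell; quoting differs under PowerShell, where saving the filter to a file and passing it with jq -f may be easier.

```shell
# Hypothetical miniature input: one leg with two segments
# (field names follow the question, values are invented).
sample='{"legs":[{"legId":"L1","segments":[
  {"airlineCode":"AA","flightNumber":"10"},
  {"airlineCode":"BB","flightNumber":"20"}]}]}'

# Two independent .segments[] iterations inside one object constructor:
# jq emits every combination, so 2 segments give 2 * 2 = 4 objects.
count=$(printf '%s' "$sample" | jq '[.legs[] | {Legid: .legId, Airline_Code: .segments[].airlineCode, Flight_Number: .segments[].flightNumber}] | length')

echo "$count"   # prints 4
```

With ten such iterations instead of two, the same leg would yield 2^10 = 1024 objects, which matches the growth described above.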
Perhaps what you intended is something more like:
[ .legs[] | range(0; .segments|length) as $i | .... ]
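One possible way to fill in the elided part is sketched below on a made-up two-segment leg. It is only an assumption about what the body might look like: each field is read from .segments[$i], so every segment contributes exactly one object, and todate is applied while the object is built. Only a few of the question's fields are shown, and the sample data and the variable names sample and result are hypothetical.

```shell
# Hypothetical miniature input: one leg with two segments.
sample='{"legs":[{"legId":"L1","segments":[
  {"airlineCode":"AA","flightNumber":"10","departureTimeEpochSeconds":0},
  {"airlineCode":"BB","flightNumber":"20","departureTimeEpochSeconds":86400}]}]}'

# Bind the segment index once per leg, then index with $i everywhere,
# so the fields of one output object all come from the same segment.
result=$(printf '%s' "$sample" | jq -c '[.legs[] | range(0; .segments|length) as $i | {Legid: .legId, Airline_Code: .segments[$i].airlineCode, Flight_Number: .segments[$i].flightNumber, Departure_Time: (.segments[$i].departureTimeEpochSeconds | todate)}]')

echo "$result"   # an array of two objects, one per segment
```

The remaining fields from the question would presumably follow the same .segments[$i] pattern, with Arrival_Time converted the same way as Departure_Time.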