
How can I define a schema for an array using JsonLoader?

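I am using the Elephant Bird project to load a JSON file into Pig, but I am not sure how to define the schema at load time. I could not find any documentation that covers this.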

Data:

{"id":22522,"name":"Product1","colors":["Red","Blue"],"sizes":["S","M"]}
{"id":22523,"name":"Product2","colors":["White","Blue"],"sizes":["M"]}

Code:

feed = LOAD '$INPUT' USING com.twitter.elephantbird.pig.load.JsonLoader() AS products_json;

extracted_products = FOREACH feed GENERATE
    products_json#'id' AS id,
    products_json#'name' AS name,
    products_json#'colors' AS colors,
    products_json#'sizes' AS sizes;

describe extracted_products;

Result:

extracted_products: {id: chararray,name: bytearray,colors: bytearray,sizes: bytearray}

How can I give these fields the correct schema (int, string, array, array), and how can I flatten the array elements into rows?

Thanks in advance.

To convert the JSON arrays into bags of single-field tuples, declare the type of each field in the AS clause:

feed = LOAD '$INPUT' USING com.twitter.elephantbird.pig.load.JsonLoader() AS products_json;

extracted_products = FOREACH feed GENERATE
    products_json#'id' AS id:chararray,
    products_json#'name' AS name:chararray,
    products_json#'colors' AS colors:{t:(i:chararray)},
    products_json#'sizes' AS sizes:{t:(i:chararray)};
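
The question asks for id as an integer, while the schema above types it as chararray. A minimal sketch of one way to get an int is an explicit cast on the map lookup. The relation name typed_products and the map[] declaration on the LOAD are illustrative assumptions, not part of the original answer. Depending on the Elephant Bird version, nested arrays may only arrive as bags when the loader is constructed with the '-nestedLoad' option, so check that against the version you use.

```pig
-- Sketch only: declare the loaded field as an untyped map, then cast each lookup.
feed = LOAD '$INPUT'
    USING com.twitter.elephantbird.pig.load.JsonLoader()
    AS (products_json: map[]);

-- typed_products is a hypothetical relation name used for illustration.
typed_products = FOREACH feed GENERATE
    (int) products_json#'id' AS id,
    (chararray) products_json#'name' AS name;

-- Check the resulting schema before building on it.
DESCRIBE typed_products;
```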

Then flatten the bag so that each element becomes its own row:

flattened = FOREACH extracted_products GENERATE id, FLATTEN(colors);
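
Since the data has two arrays per product, it may help to see what happens when both are flattened. The sketch below assumes extracted_products is defined as above, with colors and sizes typed as bags. The relation names by_color and by_color_and_size and the field names color and size are hypothetical.

```pig
-- One row per (id, color); the AS clause names the flattened field.
by_color = FOREACH extracted_products GENERATE
    id,
    FLATTEN(colors) AS color;

-- Two FLATTENs in one GENERATE produce the cross product of the two bags,
-- so a product with 2 colors and 2 sizes yields 4 rows.
by_color_and_size = FOREACH extracted_products GENERATE
    id,
    name,
    FLATTEN(colors) AS color,
    FLATTEN(sizes) AS size;

DUMP by_color_and_size;
```

If the cross product is not wanted, flatten each array in its own FOREACH statement and keep the two results as separate relations.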
