
Apache Pig: Facing issue while handling datatype in Pig

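I have been facing an issue while handling the datatype of the field qty and doing a SUM on that same field. The code is below. I cast qty to double but still get the error shown below. Can someone help me understand the issue and, if possible, suggest a solution?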

A_test1 = load'EXT_OO_IMP' USING PigStorage('\u0001') AS (it: chararray,loc: chararray,qty: chararray,scheddate: chararray,udc_cta_no: chararray,udc_imp_pack_qty: chararray,udc_imp_ready_dt: chararray,udc_imp_ref_no: chararray,udc_ord_sys_cd: chararray,udc_source: chararray,udc_sply_typ: chararray,udc_vend_pack_id: chararray,udc_purch_stg: chararray,srs_pack_flow_indicator_cd: chararray,it_type_cd: chararray,source_owner_cd: chararray,nks_id: chararray,alloc_replen_cd: chararray);

----- ext_oo_import: {it: chararray,loc: chararray,qty: chararray,scheddate: chararray,udc_cta_no: chararray,udc_imp_pack_qty: chararray,udc_imp_ready_dt: chararray,udc_imp_ref_no: chararray,udc_ord_sys_cd: chararray,udc_source: chararray,udc_sply_typ: chararray,udc_vend_pack_id: chararray,udc_purch_stg: chararray,srs_pack_flow_indicator_cd: chararray,it_type_cd: chararray,source_owner_cd: chararray,nks_id: chararray,alloc_replen_cd: chararray}

----- ##############  ##############  ##############

import_on_order = 
        FOREACH A_test1
        GENERATE
            loc,
            it,
            nks_id,
            (double)(qty is NULL ? 0 : qty) as qty:double,
            scheddate,
            ' ' AS order_source,
            ' ' AS chs_it_type_cd;

describe import_on_order;

----- import_on_order: {loc: chararray,it: chararray,nks_id: chararray,qty: int,scheddate: chararray,order_source: chararray,chs_it_type_cd: chararray}


grp_import_on_order = GROUP import_on_order BY (loc,it,nks_id,scheddate,order_source,chs_it_type_cd);


describe grp_import_on_order;

----- grp_import_on_order: {group: (loc: chararray,it: chararray,nks_id: chararray,scheddate: chararray,order_source: chararray,chs_it_type_cd: chararray),import_on_order: {(loc: chararray,it: chararray,nks_id: chararray,qty: int,scheddate: chararray,order_source: chararray,chs_it_type_cd: chararray)}}



------------------------------- STORE TO FILE ---------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------

work__idrp_import_on_order =
                            FOREACH grp_import_on_order 
                            GENERATE    group.loc AS loc,
                                        group.it AS it,
                                        group.nks_id AS nks_id,
                                        SUM(import_on_order.qty) AS qty,
                                        group.scheddate AS scheddate,
                                        group.order_source AS order_source,
                                        group.chs_it_type_cd AS chs_it_type_cd;

describe work__idrp_import_on_order;

----- work__idrp_import_on_order: {loc: chararray,it: chararray,nks_id: chararray,qty: int,scheddate: chararray,order_source: chararray,chs_it_type_cd: chararray}

import_on_order_rp = 
        FOREACH ext_oo_import
        GENERATE
            it AS chs_it,
            loc AS chs_loc,
            (qty is NULL ? 0 : qty) as qty:double,
            scheddate AS current_due_dt, 
            ' ' AS order_source,
            'V' AS source_type_cd,
            udc_sply_typ AS sply_typ,
            udc_ord_sys_cd AS ord_sys_cd;

2019-01-31 09:03:30,819 [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 0: Exception while executing (Name: grp_import_on_order: Local Rearrange[tuple]{tuple}(false) - scope-1095 Operator Key: scope-1095): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: work__idrp_import_on_order: New For Each(false,false)[bag] - scope-1078 Operator Key: scope-1078): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: Pre Combiner Local Rearrange[tuple]{Unknown} - scope-1097 Operator Key: scope-1097): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: import_on_order: New For Each(false,false,false,false,false,false,false)[bag] - scope-977 Operator Key: scope-977): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: ext_oo_import: New For Each(false,false,false,false,false)[bag] - scope-957 Operator Key: scope-957): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: New For Each(false,false,false,false,false)[bag] - scope-945 Operator Key: scope-945): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [POCast (Name: Cast[double] - scope-926 Operator Key: scope-926) children: [[POProject (Name: Project[chararray][2] - scope-925 Operator Key: scope-925) children: null at []]] at []]: java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.String
Details at logfile: /logs/hdidrp/pig/pig_1548942743751.log
2019-01-31 09:03:30,849 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2019-01-31 09:03:31,012 [main] WARN org.apache.pig.PigServer - Encountered Warning IMPLICIT_CAST_TO_DOUBLE 1 time(s).
import_on_order_rp: {shc_item: chararray,shc_loc: chararray,qty: double,current_due_dt: chararray,order_source: chararray,source_type_cd: chararray,sply_typ: chararray,ord_sys_cd: chararray}
2019-01-31 09:03:31,179 [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 0: Exception while executing (Name: grp_import_on_order: Local Rearrange[tuple]{tuple}(false) - scope-1095 Operator Key: scope-1095): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: work__idrp_import_on_order: New For Each(false,false)[bag] - scope-1078 Operator Key: scope-1078): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: Pre Combiner Local Rearrange[tuple]{Unknown} - scope-1097 Operator Key: scope-1097): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: import_on_order: New For Each(false,false,false,false,false,false,false)[bag] - scope-977 Operator Key: scope-977): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: ext_oo_import: New For Each(false,false,false,false,false)[bag] - scope-957 Operator Key: scope-957): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: New For Each(false,false,false,false,false)[bag] - scope-945 Operator Key: scope-945): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [POCast (Name: Cast[double] - scope-926 Operator Key: scope-926) children: [[POProject (Name: Project[chararray][2] - scope-925 Operator Key: scope-925) children: null at []]] at []]: java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.String

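After looking through your code, I could not find a problem in the first statements: you load the data and run the transformation steps. In the last step, though, why do you transform the original all-chararray relation again? The exception is thrown while that relation is being processed.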

import_on_order_rp = 
        FOREACH ext_oo_import
        GENERATE
            it AS chs_it,
            loc AS chs_loc,
            (qty is NULL ? 0 : qty) as qty:double,
            scheddate AS current_due_dt, 
            ' ' AS order_source,
            'V' AS source_type_cd,
            udc_sply_typ AS sply_typ,
            udc_ord_sys_cd AS ord_sys_cd;

Check whether this statement is correct.
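One possible way to avoid the mixed types, offered as a sketch and not a confirmed fix: the bincond `(qty is NULL ? 0 : qty)` pairs the int literal `0` with a chararray `qty`, which would be consistent with the `java.lang.Integer cannot be cast to java.lang.String` message. Casting `qty` inside the bincond and using a double literal keeps both branches double. The file name `sample_oo_imp`, the trimmed four-field schema, and the `sample_*` aliases are hypothetical and used only for illustration.

```pig
-- Hypothetical trimmed-down input: only the fields needed for the illustration.
sample_in = LOAD 'sample_oo_imp' USING PigStorage('\u0001')
            AS (it: chararray, loc: chararray, qty: chararray, scheddate: chararray);

-- Cast qty inside the bincond and use a double literal as the default,
-- so both branches of the bincond are already double.
sample_fixed = FOREACH sample_in GENERATE
                   loc,
                   it,
                   (qty IS NULL ? 0.0 : (double)qty) AS qty:double,
                   scheddate;

sample_grp = GROUP sample_fixed BY (loc, it, scheddate);

-- SUM over a bag of doubles yields a double.
sample_sum = FOREACH sample_grp GENERATE
                 FLATTEN(group) AS (loc, it, scheddate),
                 SUM(sample_fixed.qty) AS qty;

DESCRIBE sample_sum;
```

If this matches the cause, the same change would apply to every statement in the script that uses the `(qty is NULL ? 0 : qty)` pattern, including import_on_order_rp.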
