Apache Pig: Facing issue while handling datatype in Pig

I have been facing an issue while handling the datatype of the field qty and doing a SUM on that field. Below is the code. I converted qty to double but I am still getting the error shown below. Can someone please help me understand this issue and, if possible, suggest a solution?

A_test1 = LOAD 'EXT_OO_IMP' USING PigStorage('\u0001') AS (it: chararray,loc: chararray,qty: chararray,scheddate: chararray,udc_cta_no: chararray,udc_imp_pack_qty: chararray,udc_imp_ready_dt: chararray,udc_imp_ref_no: chararray,udc_ord_sys_cd: chararray,udc_source: chararray,udc_sply_typ: chararray,udc_vend_pack_id: chararray,udc_purch_stg: chararray,srs_pack_flow_indicator_cd: chararray,it_type_cd: chararray,source_owner_cd: chararray,nks_id: chararray,alloc_replen_cd: chararray);

----- ext_oo_import: {it: chararray,loc: chararray,qty: chararray,scheddate: chararray,udc_cta_no: chararray,udc_imp_pack_qty: chararray,udc_imp_ready_dt: chararray,udc_imp_ref_no: chararray,udc_ord_sys_cd: chararray,udc_source: chararray,udc_sply_typ: chararray,udc_vend_pack_id: chararray,udc_purch_stg: chararray,srs_pack_flow_indicator_cd: chararray,it_type_cd: chararray,source_owner_cd: chararray,nks_id: chararray,alloc_replen_cd: chararray}

----- ##############  ##############  ##############

import_on_order = 
        FOREACH A_test1
        GENERATE
            loc,
            it,
            nks_id,
            (double)(qty is NULL ? 0 : qty) as qty:double,
            scheddate,
            ' ' AS order_source,
            ' ' AS chs_it_type_cd;

describe import_on_order;

----- import_on_order: {loc: chararray,it: chararray,nks_id: chararray,qty: int,scheddate: chararray,order_source: chararray,chs_it_type_cd: chararray}


grp_import_on_order = GROUP import_on_order BY (loc,it,nks_id,scheddate,order_source,chs_it_type_cd);


describe grp_import_on_order;

----- grp_import_on_order: {group: (loc: chararray,it: chararray,nks_id: chararray,scheddate: chararray,order_source: chararray,chs_it_type_cd: chararray),import_on_order: {(loc: chararray,it: chararray,nks_id: chararray,qty: int,scheddate: chararray,order_source: chararray,chs_it_type_cd: chararray)}}



------------------------------- STORE TO FILE ---------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------

work__idrp_import_on_order =
                            FOREACH grp_import_on_order 
                            GENERATE    group.loc AS loc,
                                        group.it AS it,
                                        group.nks_id AS nks_id,
                                        SUM(import_on_order.qty) AS qty,
                                        group.scheddate AS scheddate,
                                        group.order_source AS order_source,
                                        group.chs_it_type_cd AS chs_it_type_cd;

describe work__idrp_import_on_order;

----- work__idrp_import_on_order: {loc: chararray,it: chararray,nks_id: chararray,qty: int,scheddate: chararray,order_source: chararray,chs_it_type_cd: chararray}

import_on_order_rp = 
        FOREACH ext_oo_import
        GENERATE
            it AS chs_it,
            loc AS chs_loc,
            (qty is NULL ? 0 : qty) as qty:double,
            scheddate AS current_due_dt, 
            ' ' AS order_source,
            'V' AS source_type_cd,
            udc_sply_typ AS sply_typ,
            udc_ord_sys_cd AS ord_sys_cd;

2019-01-31 09:03:30,819 [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 0: Exception while executing (Name: grp_import_on_order: Local Rearrange[tuple]{tuple}(false) - scope-1095 Operator Key: scope-1095): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: work__idrp_import_on_order: New For Each(false,false)[bag] - scope-1078 Operator Key: scope-1078): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: Pre Combiner Local Rearrange[tuple]{Unknown} - scope-1097 Operator Key: scope-1097): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: import_on_order: New For Each(false,false,false,false,false,false,false)[bag] - scope-977 Operator Key: scope-977): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: ext_oo_import: New For Each(false,false,false,false,false)[bag] - scope-957 Operator Key: scope-957): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: New For Each(false,false,false,false,false)[bag] - scope-945 Operator Key: scope-945): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [POCast (Name: Cast[double] - scope-926 Operator Key: scope-926) children: [[POProject (Name: Project[chararray][2] - scope-925 Operator Key: scope-925) children: null at []]] at []]: java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.String
Details at logfile: /logs/hdidrp/pig/pig_1548942743751.log
2019-01-31 09:03:30,849 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2019-01-31 09:03:31,012 [main] WARN org.apache.pig.PigServer - Encountered Warning IMPLICIT_CAST_TO_DOUBLE 1 time(s).
import_on_order_rp: {shc_item: chararray,shc_loc: chararray,qty: double,current_due_dt: chararray,order_source: chararray,source_type_cd: chararray,sply_typ: chararray,ord_sys_cd: chararray}
2019-01-31 09:03:31,179 [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 0: Exception while executing (Name: grp_import_on_order: Local Rearrange[tuple]{tuple}(false) - scope-1095 Operator Key: scope-1095): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: work__idrp_import_on_order: New For Each(false,false)[bag] - scope-1078 Operator Key: scope-1078): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: Pre Combiner Local Rearrange[tuple]{Unknown} - scope-1097 Operator Key: scope-1097): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: import_on_order: New For Each(false,false,false,false,false,false,false)[bag] - scope-977 Operator Key: scope-977): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: ext_oo_import: New For Each(false,false,false,false,false)[bag] - scope-957 Operator Key: scope-957): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: New For Each(false,false,false,false,false)[bag] - scope-945 Operator Key: scope-945): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [POCast (Name: Cast[double] - scope-926 Operator Key: scope-926) children: [[POProject (Name: Project[chararray][2] - scope-925 Operator Key: scope-925) children: null at []]] at []]: java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.String

After looking at the code, I can't quite follow it: in the first statement you load the data and then run several transformation steps, but in the last statement you transform the first data set again. That data set is of string (chararray) type, and working on it is what raises the exception.

import_on_order_rp = 
        FOREACH ext_oo_import
        GENERATE
            it AS chs_it,
            loc AS chs_loc,
            (qty is NULL ? 0 : qty) as qty:double,
            scheddate AS current_due_dt, 
            ' ' AS order_source,
            'V' AS source_type_cd,
            udc_sply_typ AS sply_typ,
            udc_ord_sys_cd AS ord_sys_cd;

See if that is the case.
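
If it is, one possible way to rewrite that statement is sketched below. This is an assumption based on the stack trace, not a confirmed fix. The ClassCastException (java.lang.Integer cannot be cast to java.lang.String) suggests that a value declared as chararray holds an Integer at runtime, which could happen when the conditional returns the int literal 0 for NULL rows and the chararray qty otherwise. Casting qty explicitly and using a double literal keeps both branches of the conditional the same type. The relation name ext_oo_import is taken from the question (where the load is aliased A_test1), and the 0.0 default is my own choice.

```pig
-- Sketch only: schema copied from the question's LOAD; the alias ext_oo_import
-- is assumed to be the loaded relation.
ext_oo_import = LOAD 'EXT_OO_IMP' USING PigStorage('\u0001') AS (
    it: chararray, loc: chararray, qty: chararray, scheddate: chararray,
    udc_cta_no: chararray, udc_imp_pack_qty: chararray, udc_imp_ready_dt: chararray,
    udc_imp_ref_no: chararray, udc_ord_sys_cd: chararray, udc_source: chararray,
    udc_sply_typ: chararray, udc_vend_pack_id: chararray, udc_purch_stg: chararray,
    srs_pack_flow_indicator_cd: chararray, it_type_cd: chararray,
    source_owner_cd: chararray, nks_id: chararray, alloc_replen_cd: chararray);

import_on_order_rp =
        FOREACH ext_oo_import
        GENERATE
            it AS chs_it,
            loc AS chs_loc,
            -- Cast the chararray first and default to a double literal (assumed 0.0),
            -- so both branches of the conditional are double.
            (qty IS NULL ? 0.0 : (double)qty) AS qty:double,
            scheddate AS current_due_dt,
            ' ' AS order_source,
            'V' AS source_type_cd,
            udc_sply_typ AS sply_typ,
            udc_ord_sys_cd AS ord_sys_cd;

DESCRIBE import_on_order_rp;
```

The same pattern would apply to the qty expression in import_on_order if that statement turns out to be the one failing.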
