
Unable to get this Hadoop / Cascading code to run without exception? New to Both

Can someone explain why this code throws an exception when I attempt to run the job on the local file system? I am new to Hadoop/Cascading and am trying to read multiple columns per line and count the frequency of the values in those columns (without differentiating which column a value came from).

I'm at a loss. If someone can explain to me where my thought process went astray, I would greatly appreciate it. I'm more concerned with learning the technology and its library than producing working code.

Here's the source code: http://pastebin.com/USTUJ5Da

Here's the stack trace:

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
pipe0+pipe1+pipe2+pipe3+pipe4+pipe5+pipe6
Exception in thread "main" cascading.flow.planner.PlannerException: could not build flow from assembly: [[pipe0+pipe1+pipe2+pipe...][com.wellcentive.DQ1a.main(DQ1a.java:51)] merged streams must declare the same field names, in the same order, expected: [{7}:'PROC_NUM', 'DX 1', 'DX 3', 'DX 4', 'DX 5', 'DX 6', 'codes'] found: [{7}:'PROC_NUM', 'DX 2', 'DX 3', 'DX 4', 'DX 5', 'DX 6', 'codes']]
    at cascading.flow.planner.FlowPlanner.handleExceptionDuringPlanning(FlowPlanner.java:576)
    at cascading.flow.hadoop.planner.HadoopPlanner.buildFlow(HadoopPlanner.java:263)
    at cascading.flow.hadoop.planner.HadoopPlanner.buildFlow(HadoopPlanner.java:80)
    at cascading.flow.FlowConnector.connect(FlowConnector.java:459)
    at cascading.flow.FlowConnector.connect(FlowConnector.java:450)
    at cascading.flow.FlowConnector.connect(FlowConnector.java:426)
    at cascading.flow.FlowConnector.connect(FlowConnector.java:275)
    at cascading.flow.FlowConnector.connect(FlowConnector.java:257)
    at com.wellcentive.DQ1a.main(DQ1a.java:68)
Caused by: cascading.pipe.OperatorException: [pipe0+pipe1+pipe2+pipe...][com.wellcentive.DQ1a.main(DQ1a.java:51)] merged streams must declare the same field names, in the same order, expected: [{7}:'PROC_NUM', 'DX 1', 'DX 3', 'DX 4', 'DX 5', 'DX 6', 'codes'] found: [{7}:'PROC_NUM', 'DX 2', 'DX 3', 'DX 4', 'DX 5', 'DX 6', 'codes']
    at cascading.pipe.Splice.resolveDeclared(Splice.java:1276)
    at cascading.pipe.Splice.outgoingScopeFor(Splice.java:992)
    at cascading.flow.planner.ElementGraph.resolveFields(ElementGraph.java:628)
    at cascading.flow.planner.ElementGraph.resolveFields(ElementGraph.java:610)
    at cascading.flow.hadoop.planner.HadoopPlanner.buildFlow(HadoopPlanner.java:248)
    ... 7 more

In Cascading, when you merge streams, every incoming pipe must declare the same field names, in the same order.

Here the conflict is between the field 'DX 1' in one stream and 'DX 2' in the other. Rename them to a common field name before the merge.
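Based on the field names in the stack trace, here is a minimal sketch of one way to line the branches up before the merge (Cascading 2.x). The class name, the branch names, the shared 'code' field, and the 'count' field are my own assumptions, not taken from the pastebin:

import cascading.pipe.Merge;
import cascading.pipe.Pipe;
import cascading.pipe.assembly.CountBy;
import cascading.pipe.assembly.Rename;
import cascading.pipe.assembly.Retain;
import cascading.tuple.Fields;

public class CodeFrequencyAssembly {

    // Isolate one diagnosis column and rename it to a shared field name so that
    // every branch declares exactly the same fields, in the same order.
    static Pipe codeBranch(Pipe source, String column) {
        Pipe branch = new Pipe("branch-" + column, source);
        branch = new Retain(branch, new Fields(column));                   // keep only this column
        return new Rename(branch, new Fields(column), new Fields("code")); // unify the field name
    }

    // Merge all branches and count how often each value occurs,
    // regardless of which DX column it originally came from.
    static Pipe buildAssembly(Pipe source) {
        String[] columns = { "DX 1", "DX 2", "DX 3", "DX 4", "DX 5", "DX 6" };

        Pipe[] branches = new Pipe[columns.length];
        for (int i = 0; i < columns.length; i++)
            branches[i] = codeBranch(source, columns[i]);

        // Merge only accepts streams with identical field declarations -- here each is {"code"}.
        Pipe merged = new Merge("all-codes", branches);

        return new CountBy(merged, new Fields("code"), new Fields("count"));
    }
}

The returned pipe can then be connected to the source and sink taps with a FlowConnector as in the original code; the key point is that every branch entering the Merge declares the single field 'code'.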
