
Spark MLlib predict error with map

I have a linear regression model, model, and an RDD of LabeledPoint, regPoints.

I am able to predict the first sample:

scala> model.predict(regPoints.first.features)
15/02/12 16:17:56 INFO SparkContext: Starting job: first at <console>:61
15/02/12 16:17:56 INFO DAGScheduler: Got job 154 (first at <console>:61) with 1 output partitions (allowLocal=true)
15/02/12 16:17:56 INFO DAGScheduler: Final stage: Stage 154(first at <console>:61)
15/02/12 16:17:56 INFO DAGScheduler: Parents of final stage: List()
15/02/12 16:17:56 INFO DAGScheduler: Missing parents: List()
15/02/12 16:17:56 INFO DAGScheduler: Submitting Stage 154 (MappedRDD[32] at map at <console>:54), which has no missing parents
15/02/12 16:17:56 INFO MemoryStore: ensureFreeSpace(4104) called with curMem=88811, maxMem=278302556
15/02/12 16:17:56 INFO MemoryStore: Block broadcast_286 stored as values in memory (estimated size 4.0 KB, free 265.3 MB)
15/02/12 16:17:56 INFO MemoryStore: ensureFreeSpace(2720) called with curMem=92915, maxMem=278302556
15/02/12 16:17:56 INFO MemoryStore: Block broadcast_286_piece0 stored as bytes in memory (estimated size 2.7 KB, free 265.3 MB)
15/02/12 16:17:56 INFO BlockManagerInfo: Added broadcast_286_piece0 in memory on localhost:53178 (size: 2.7 KB, free: 265.4 MB)
15/02/12 16:17:56 INFO BlockManagerMaster: Updated info of block broadcast_286_piece0
15/02/12 16:17:56 INFO SparkContext: Created broadcast 286 from broadcast at DAGScheduler.scala:838
15/02/12 16:17:56 INFO DAGScheduler: Submitting 1 missing tasks from Stage 154 (MappedRDD[32] at map at <console>:54)
15/02/12 16:17:56 INFO TaskSchedulerImpl: Adding task set 154.0 with 1 tasks
15/02/12 16:17:56 INFO TaskSetManager: Starting task 0.0 in stage 154.0 (TID 289, localhost, PROCESS_LOCAL, 1742 bytes)
15/02/12 16:17:56 INFO Executor: Running task 0.0 in stage 154.0 (TID 289)
15/02/12 16:17:56 INFO HadoopRDD: Input split: file:/home/donbeo/Documents/dataset/spark_sample_data/sinx_over_x.txt:0+2543
15/02/12 16:17:56 INFO HadoopRDD: Input split: file:/home/donbeo/Documents/dataset/spark_sample_data/sinx_over_x.txt:0+2543
15/02/12 16:17:56 INFO Executor: Finished task 0.0 in stage 154.0 (TID 289). 2018 bytes result sent to driver
15/02/12 16:17:56 INFO TaskSetManager: Finished task 0.0 in stage 154.0 (TID 289) in 4 ms on localhost (1/1)
15/02/12 16:17:56 INFO TaskSchedulerImpl: Removed TaskSet 154.0, whose tasks have all completed, from pool 
15/02/12 16:17:56 INFO DAGScheduler: Stage 154 (first at <console>:61) finished in 0.004 s
15/02/12 16:17:56 INFO DAGScheduler: Job 154 finished: first at <console>:61, took 0.009231 s
res30: Double = -6.866178341568849E-16

However, I get an error when I try to use map to predict all the samples.

scala> model
res26: org.apache.spark.mllib.regression.LinearRegressionModel = (weights=[-4.00245512323736E-15,-7.110058964543731E-15,2.0790436644401968E-15,1.7497510523275056E-15,6.593638326021273E-15], intercept=0.0)

scala> regPoints
res27: org.apache.spark.rdd.RDD[org.apache.spark.mllib.regression.LabeledPoint] = MappedRDD[32] at map at <console>:54

scala> val y_predicted = regPoints map (point => model.predict(point.features))
15/02/12 16:14:45 INFO BlockManager: Removing broadcast 285
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_285_piece0
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_285_piece0 of size 4234 dropped from memory (free 278119436)
15/02/12 16:14:45 INFO BlockManagerInfo: Removed broadcast_285_piece0 on localhost:53178 in memory (size: 4.1 KB, free: 265.4 MB)
15/02/12 16:14:45 INFO BlockManagerMaster: Updated info of block broadcast_285_piece0
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_285
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_285 of size 6456 dropped from memory (free 278125892)
15/02/12 16:14:45 INFO ContextCleaner: Cleaned broadcast 285
15/02/12 16:14:45 INFO BlockManager: Removing broadcast 284
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_284_piece0
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_284_piece0 of size 163 dropped from memory (free 278126055)
15/02/12 16:14:45 INFO BlockManagerInfo: Removed broadcast_284_piece0 on localhost:53178 in memory (size: 163.0 B, free: 265.4 MB)
15/02/12 16:14:45 INFO BlockManagerMaster: Updated info of block broadcast_284_piece0
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_284
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_284 of size 96 dropped from memory (free 278126151)
15/02/12 16:14:45 INFO ContextCleaner: Cleaned broadcast 284
15/02/12 16:14:45 INFO BlockManager: Removing broadcast 283
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_283_piece0
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_283_piece0 of size 4236 dropped from memory (free 278130387)
15/02/12 16:14:45 INFO BlockManagerInfo: Removed broadcast_283_piece0 on localhost:53178 in memory (size: 4.1 KB, free: 265.4 MB)
15/02/12 16:14:45 INFO BlockManagerMaster: Updated info of block broadcast_283_piece0
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_283
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_283 of size 6456 dropped from memory (free 278136843)
15/02/12 16:14:45 INFO ContextCleaner: Cleaned broadcast 283
15/02/12 16:14:45 INFO BlockManager: Removing broadcast 282
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_282
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_282 of size 96 dropped from memory (free 278136939)
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_282_piece0
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_282_piece0 of size 163 dropped from memory (free 278137102)
15/02/12 16:14:45 INFO BlockManagerInfo: Removed broadcast_282_piece0 on localhost:53178 in memory (size: 163.0 B, free: 265.4 MB)
15/02/12 16:14:45 INFO BlockManagerMaster: Updated info of block broadcast_282_piece0
15/02/12 16:14:45 INFO ContextCleaner: Cleaned broadcast 282
15/02/12 16:14:45 INFO BlockManager: Removing broadcast 281
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_281_piece0
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_281_piece0 of size 4233 dropped from memory (free 278141335)
15/02/12 16:14:45 INFO BlockManagerInfo: Removed broadcast_281_piece0 on localhost:53178 in memory (size: 4.1 KB, free: 265.4 MB)
15/02/12 16:14:45 INFO BlockManagerMaster: Updated info of block broadcast_281_piece0
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_281
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_281 of size 6456 dropped from memory (free 278147791)
15/02/12 16:14:45 INFO ContextCleaner: Cleaned broadcast 281
15/02/12 16:14:45 INFO BlockManager: Removing broadcast 280
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_280
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_280 of size 96 dropped from memory (free 278147887)
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_280_piece0
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_280_piece0 of size 163 dropped from memory (free 278148050)
15/02/12 16:14:45 INFO BlockManagerInfo: Removed broadcast_280_piece0 on localhost:53178 in memory (size: 163.0 B, free: 265.4 MB)
15/02/12 16:14:45 INFO BlockManagerMaster: Updated info of block broadcast_280_piece0
15/02/12 16:14:45 INFO ContextCleaner: Cleaned broadcast 280
15/02/12 16:14:45 INFO BlockManager: Removing broadcast 279
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_279_piece0
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_279_piece0 of size 4233 dropped from memory (free 278152283)
15/02/12 16:14:45 INFO BlockManagerInfo: Removed broadcast_279_piece0 on localhost:53178 in memory (size: 4.1 KB, free: 265.4 MB)
15/02/12 16:14:45 INFO BlockManagerMaster: Updated info of block broadcast_279_piece0
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_279
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_279 of size 6456 dropped from memory (free 278158739)
15/02/12 16:14:45 INFO ContextCleaner: Cleaned broadcast 279
15/02/12 16:14:45 INFO BlockManager: Removing broadcast 278
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_278_piece0
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_278_piece0 of size 163 dropped from memory (free 278158902)
15/02/12 16:14:45 INFO BlockManagerInfo: Removed broadcast_278_piece0 on localhost:53178 in memory (size: 163.0 B, free: 265.4 MB)
15/02/12 16:14:45 INFO BlockManagerMaster: Updated info of block broadcast_278_piece0
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_278
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_278 of size 96 dropped from memory (free 278158998)
15/02/12 16:14:45 INFO ContextCleaner: Cleaned broadcast 278
15/02/12 16:14:45 INFO BlockManager: Removing broadcast 277
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_277
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_277 of size 6456 dropped from memory (free 278165454)
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_277_piece0
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_277_piece0 of size 4233 dropped from memory (free 278169687)
15/02/12 16:14:45 INFO BlockManagerInfo: Removed broadcast_277_piece0 on localhost:53178 in memory (size: 4.1 KB, free: 265.4 MB)
15/02/12 16:14:45 INFO BlockManagerMaster: Updated info of block broadcast_277_piece0
15/02/12 16:14:45 INFO ContextCleaner: Cleaned broadcast 277
15/02/12 16:14:45 INFO BlockManager: Removing broadcast 276
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_276_piece0
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_276_piece0 of size 163 dropped from memory (free 278169850)
15/02/12 16:14:45 INFO BlockManagerInfo: Removed broadcast_276_piece0 on localhost:53178 in memory (size: 163.0 B, free: 265.4 MB)
15/02/12 16:14:45 INFO BlockManagerMaster: Updated info of block broadcast_276_piece0
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_276
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_276 of size 96 dropped from memory (free 278169946)
15/02/12 16:14:45 INFO ContextCleaner: Cleaned broadcast 276
15/02/12 16:14:45 INFO BlockManager: Removing broadcast 275
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_275_piece0
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_275_piece0 of size 4233 dropped from memory (free 278174179)
15/02/12 16:14:45 INFO BlockManagerInfo: Removed broadcast_275_piece0 on localhost:53178 in memory (size: 4.1 KB, free: 265.4 MB)
15/02/12 16:14:45 INFO BlockManagerMaster: Updated info of block broadcast_275_piece0
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_275
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_275 of size 6456 dropped from memory (free 278180635)
15/02/12 16:14:45 INFO ContextCleaner: Cleaned broadcast 275
15/02/12 16:14:45 INFO BlockManager: Removing broadcast 274
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_274
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_274 of size 96 dropped from memory (free 278180731)
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_274_piece0
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_274_piece0 of size 163 dropped from memory (free 278180894)
15/02/12 16:14:45 INFO BlockManagerInfo: Removed broadcast_274_piece0 on localhost:53178 in memory (size: 163.0 B, free: 265.4 MB)
15/02/12 16:14:45 INFO BlockManagerMaster: Updated info of block broadcast_274_piece0
15/02/12 16:14:45 INFO ContextCleaner: Cleaned broadcast 274
15/02/12 16:14:45 INFO BlockManager: Removing broadcast 273
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_273_piece0
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_273_piece0 of size 4235 dropped from memory (free 278185129)
15/02/12 16:14:45 INFO BlockManagerInfo: Removed broadcast_273_piece0 on localhost:53178 in memory (size: 4.1 KB, free: 265.4 MB)
15/02/12 16:14:45 INFO BlockManagerMaster: Updated info of block broadcast_273_piece0
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_273
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_273 of size 6456 dropped from memory (free 278191585)
15/02/12 16:14:45 INFO ContextCleaner: Cleaned broadcast 273
15/02/12 16:14:45 INFO BlockManager: Removing broadcast 272
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_272
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_272 of size 96 dropped from memory (free 278191681)
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_272_piece0
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_272_piece0 of size 163 dropped from memory (free 278191844)
15/02/12 16:14:45 INFO BlockManagerInfo: Removed broadcast_272_piece0 on localhost:53178 in memory (size: 163.0 B, free: 265.4 MB)
15/02/12 16:14:45 INFO BlockManagerMaster: Updated info of block broadcast_272_piece0
15/02/12 16:14:45 INFO ContextCleaner: Cleaned broadcast 272
15/02/12 16:14:45 INFO BlockManager: Removing broadcast 271
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_271_piece0
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_271_piece0 of size 4232 dropped from memory (free 278196076)
15/02/12 16:14:45 INFO BlockManagerInfo: Removed broadcast_271_piece0 on localhost:53178 in memory (size: 4.1 KB, free: 265.4 MB)
15/02/12 16:14:45 INFO BlockManagerMaster: Updated info of block broadcast_271_piece0
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_271
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_271 of size 6456 dropped from memory (free 278202532)
15/02/12 16:14:45 INFO ContextCleaner: Cleaned broadcast 271
15/02/12 16:14:45 INFO BlockManager: Removing broadcast 270
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_270
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_270 of size 96 dropped from memory (free 278202628)
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_270_piece0
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_270_piece0 of size 163 dropped from memory (free 278202791)
15/02/12 16:14:45 INFO BlockManagerInfo: Removed broadcast_270_piece0 on localhost:53178 in memory (size: 163.0 B, free: 265.4 MB)
15/02/12 16:14:45 INFO BlockManagerMaster: Updated info of block broadcast_270_piece0
15/02/12 16:14:45 INFO ContextCleaner: Cleaned broadcast 270
15/02/12 16:14:45 INFO BlockManager: Removing broadcast 269
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_269_piece0
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_269_piece0 of size 4239 dropped from memory (free 278207030)
15/02/12 16:14:45 INFO BlockManagerInfo: Removed broadcast_269_piece0 on localhost:53178 in memory (size: 4.1 KB, free: 265.4 MB)
15/02/12 16:14:45 INFO BlockManagerMaster: Updated info of block broadcast_269_piece0
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_269
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_269 of size 6456 dropped from memory (free 278213486)
15/02/12 16:14:45 INFO ContextCleaner: Cleaned broadcast 269
15/02/12 16:14:45 INFO BlockManager: Removing broadcast 268
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_268
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_268 of size 96 dropped from memory (free 278213582)
15/02/12 16:14:45 INFO BlockManager: Removing block broadcast_268_piece0
15/02/12 16:14:45 INFO MemoryStore: Block broadcast_268_piece0 of size 163 dropped from memory (free 278213745)
15/02/12 16:14:45 INFO BlockManagerInfo: Removed broadcast_268_piece0 on localhost:53178 in memory (size: 163.0 B, free: 265.4 MB)
15/02/12 16:14:45 INFO BlockManagerMaster: Updated info of block broadcast_268_piece0
15/02/12 16:14:45 INFO ContextCleaner: Cleaned broadcast 268
org.apache.spark.SparkException: Task not serializable
    at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:166)
    at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:158)
    at org.apache.spark.SparkContext.clean(SparkContext.scala:1478)
    at org.apache.spark.rdd.RDD.map(RDD.scala:288)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:60)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:65)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:67)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:69)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:71)
    at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:73)
    at $iwC$$iwC$$iwC$$iwC.<init>(<console>:75)
    at $iwC$$iwC$$iwC.<init>(<console>:77)
    at $iwC$$iwC.<init>(<console>:79)
    at $iwC.<init>(<console>:81)
    at <init>(<console>:83)
    at .<init>(<console>:87)
    at .<clinit>(<console>)
    at .<init>(<console>:7)
    at .<clinit>(<console>)
    at $print(<console>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:852)
    at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1125)
    at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:674)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:705)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:669)
    at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:828)
    at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:873)
    at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:785)
    at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:628)
    at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:636)
    at org.apache.spark.repl.SparkILoop.loop(SparkILoop.scala:641)
    at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:968)
    at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:916)
    at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:916)
    at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
    at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:916)
    at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1011)
    at org.apache.spark.repl.Main$.main(Main.scala:31)
    at org.apache.spark.repl.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.NotSerializableException: breeze.stats.distributions.Rand$
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1183)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
    at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:42)
    at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:73)
    at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:164)
    ... 49 more


scala> 

How can I solve this?

EDIT: This is the full code

/* elm.scala */
import org.apache.spark.SparkContext 
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import breeze.linalg.linspace
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.mllib.linalg.distributed.RowMatrix
import org.apache.spark.mllib.random._
import org.apache.spark.rdd.RDD
import breeze._
import org.apache.spark.mllib.linalg.{Matrix, Matrices, Vectors, Vector}
import org.apache.commons.math3.random.RandomDataGenerator
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.mllib.regression.LinearRegressionWithSGD


val n_nodes = 20



val data = sc.textFile("/home/donbeo/Documents/dataset/spark_sample_data/sinx_over_x.txt")

//data.first = res5: String = -1.000000000000000000e+01 -3.748532789558167710e-02



val x = new RowMatrix(data map (line => {
  val l = line.split(' ').map(x => x.toDouble)
  Vectors.dense(l.tail)
}))

val y = data map (line => {
  val l = line.split(' ').map (x => x.toDouble)
  Vectors.dense(l.head)
})


val n = x.numRows.toInt
val p = x.numCols.toInt

val u = breeze.stats.distributions.Uniform(-1,1)
val v = u.samplesVector(p*n_nodes).toArray
val w = Matrices.dense(p, n_nodes, v)




val xw = x.multiply(w)

val h = xw.rows map (r => {
  val rb = breeze.linalg.Vector(r.toArray) map (e => breeze.numerics.exp(-e*e))
  Vectors.dense(rb.toArray)
})


val d = y.zip(h)

val regPoints = d map (line => {
  val (ye, xe) = line
  LabeledPoint(ye.apply(0), xe)
})


val numIterations = 100
val model = LinearRegressionWithSGD.train(regPoints, numIterations)

val y_predicted = regPoints map (point => model.predict(point.features))

EDIT 2: The code seems to work if it is written as a Scala class and packaged into a jar file with sbt assembly. So the problem is probably related to a dependency in the console.

Try using collect() to bring the points to the driver application and predict there, rather than serializing the model across the workers.

regPoints.collect().map(point => model.predict(point.features))
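
Note that collect() pulls every point onto the driver, so it only suits data that fits in driver memory. As for why the map fails in the console but not in a jar, one plausible reading of the trace (not confirmed in this thread) is that the closure drags in the shell's wrapper objects, and one of them holds the breeze Uniform value u, which in turn references the non-serializable Rand object. The sketch below reproduces that capture pattern with plain Java serialization and no Spark. ClosureCaptureDemo, Wrapper, NotSerializableThing and isSerializable are hypothetical names made up for the illustration.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

object ClosureCaptureDemo {
  // Hypothetical stand-in for a non-serializable value such as breeze's Rand object.
  class NotSerializableThing

  // Hypothetical stand-in for a shell line wrapper: Serializable itself,
  // but holding an unrelated field that is not.
  class Wrapper extends Serializable {
    val unrelated = new NotSerializableThing
    val weight = 2.0

    // Reads the field `weight`, so the function object captures `this`
    // and drags `unrelated` into serialization.
    def capturing: Double => Double = x => x * weight

    // Copies the field into a local val first, so only a Double is captured.
    def localCopy: Double => Double = {
      val w = weight
      x => x * w
    }
  }

  // Attempts Java serialization, the same mechanism (JavaSerializer) that
  // appears in the stack trace above.
  def isSerializable(obj: AnyRef): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(obj)
      true
    } catch {
      case _: NotSerializableException => false
    }

  def main(args: Array[String]): Unit = {
    val w = new Wrapper
    println(isSerializable(w.capturing)) // closure that captures the whole wrapper
    println(isSerializable(w.localCopy)) // closure that captures only a Double
  }
}
```

If that reading is right, two things may be worth trying in the shell while keeping the prediction distributed: declaring the sampler as @transient val u = ..., or calling the RDD overload model.predict(regPoints.map(_.features)), whose closure does not refer to any value defined in the console.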
