
Error while running the standalone app example in Python using Spark

I am just getting started with Spark and am running it in standalone mode on an Amazon EC2 instance. I was trying the examples in the documentation, and while going through the one called Simple App I keep getting this error: NameError: name 'numAs' is not defined

from pyspark import SparkContext

logFile = "$YOUR_SPARK_HOME/README.md"  # Should be some file on your system
sc = SparkContext("local", "Simple App")
logData = sc.textFile(logFile).cache()

numAs = logData.filter(lambda s: 'a' in s).count()
numBs = logData.filter(lambda s: 'b' in s).count()

print "Lines with a: %i, lines with b: %i" % (numAs, numBs)

How do I use an editor with Spark instead of this interactive Python shell? Why do I keep getting this error?

Thanks for any help/guidance.

Put all your Python code in a .py file, then submit the .py file as shown below:

# Run a Python application on a Spark Standalone cluster
./bin/spark-submit \
  --master spark://207.184.161.138:7077 \
  examples/src/main/python/pi.py \
  1000
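
As a rough sketch of what such a .py file could look like for the Simple App from the question (the file name simple_app.py, the helper contains, and the command-line argument are assumptions made for illustration, not part of the original example):

```python
# simple_app.py (hypothetical file name)
import sys


def contains(letter):
    """Return a predicate that is True for lines containing `letter`."""
    return lambda line: letter in line


def main(log_file):
    # Imported here so the module can be loaded without Spark installed.
    from pyspark import SparkContext

    # No master is hard-coded; spark-submit supplies it through --master.
    sc = SparkContext(appName="Simple App")
    try:
        log_data = sc.textFile(log_file).cache()
        num_as = log_data.filter(contains("a")).count()
        num_bs = log_data.filter(contains("b")).count()
        print("Lines with a: %i, lines with b: %i" % (num_as, num_bs))
    finally:
        sc.stop()


if __name__ == "__main__":
    # The path is taken from the command line instead of the placeholder
    # "$YOUR_SPARK_HOME/README.md": Python does not expand shell variables
    # inside string literals, so that placeholder is not a real path.
    if len(sys.argv) == 2:
        main(sys.argv[1])
    else:
        print("Usage: spark-submit simple_app.py <path-to-text-file>")
```

It would then be submitted the same way as pi.py above, with simple_app.py in place of the example script and the path to a real text file as its argument. A NameError on numAs usually means the line that assigns it failed earlier, for example because the file path did not exist, so checking the path first is a reasonable starting point.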

For details, see Submitting Applications in the Spark documentation.

Also try the examples that ship with Spark, such as the pi.py script used above; they are really helpful.
