I am trying to read data from Kinesis in PySpark using KinesisUtils.createStream,
but I get the following error:
Spark Streaming's Kinesis libraries not found in class path. Try one of the following.
1. Include the Kinesis library and its dependencies with in the
spark-submit command as
$ bin/spark-submit --packages org.apache.spark:spark-streaming-kinesis-asl:2.4.4 ...
2. Download the JAR of the artifact from Maven Central http://search.maven.org/,
Group Id = org.apache.spark, Artifact Id = spark-streaming-kinesis-asl-assembly, Version = 2.4.4.
Then, include the jar in the spark-submit command as
$ bin/spark-submit --jars <spark-streaming-kinesis-asl-assembly.jar> ...
________________________________________________________________________________________________
Traceback (most recent call last):
File "/Users/ahmad.muhammad/Desktop/kinesis-reader.py", line 8, in <module>
kinesisStream = KinesisUtils.createStream(ssc,'Ahmad-Kineses','twitter-stream','https://kinesis.us-east-1.amazonaws.com/','us-east-1',InitialPositionInStream.TRIM_HORIZON,20)
File "/Users/Ahmad.Muhammad/opt/apache-spark/spark-2.4.4-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/streaming/kinesis.py", line 84, in createStream
TypeError: 'JavaPackage' object is not callable
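The TypeError at the end of the traceback ('JavaPackage' object is not callable) means the Kinesis classes were never loaded into the JVM, which is what the first message is telling you. Assuming you are running PySpark on your local machine with plain python rather than spark-submit, you can load the package through an environment variable. Note that the artifact name carries a Scala suffix, and the package version should match your Spark version (2.4.4 according to your traceback; the prebuilt 2.4.4 binaries use Scala 2.11). In your terminal, before starting the script, try:
export PYSPARK_SUBMIT_ARGS="--master local[2] --packages org.apache.spark:spark-streaming-kinesis-asl_2.11:2.4.4 pyspark-shell"
There must be no spaces around the = sign, and the value needs quotes because it contains spaces and brackets. If you launch the script with spark-submit instead, pass the same --packages option on the command line. Hopefully this solves your problem.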
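If you would rather keep the setting in the script than in the terminal, a minimal sketch could look like the following. It assumes Spark 2.4.4 built for Scala 2.11 (taken from the path in the traceback); the helper name kinesis_submit_args and the two version constants are made up for illustration and are not part of PySpark. The variable has to be set before the SparkContext is created, because that is when PySpark launches the JVM and reads it.

```python
import os

# Assumptions: Spark 2.4.4 with Scala 2.11, as suggested by the traceback path.
# Adjust both values to match your own installation.
SPARK_VERSION = "2.4.4"
SCALA_VERSION = "2.11"


def kinesis_submit_args(spark_version=SPARK_VERSION,
                        scala_version=SCALA_VERSION,
                        master="local[2]"):
    """Build the PYSPARK_SUBMIT_ARGS value that pulls in the Kinesis package.

    Hypothetical helper for illustration only.
    """
    package = "org.apache.spark:spark-streaming-kinesis-asl_{}:{}".format(
        scala_version, spark_version
    )
    # "pyspark-shell" must come last; PySpark expects it as the final token.
    return "--master {} --packages {} pyspark-shell".format(master, package)


# Set this before creating the SparkContext / StreamingContext.
os.environ["PYSPARK_SUBMIT_ARGS"] = kinesis_submit_args()
```

After this runs, create the SparkContext and StreamingContext as usual and call KinesisUtils.createStream as in your script.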