I am getting an error while running the PySpark code below.
The Python code creates an RDD from a list.
I need to use the map function on the RDD.
from pyspark import SparkContext
print("PySpark imported successfully")
sc=SparkContext('local[2]','BasicExample')
print("Spark Context object created")
l=range(0,100)
print("type of data entered:-",type(l))
lRDD=sc.parallelize(l)
print(type(lRDD))
# collect() is an action: it returns all elements of the RDD to the driver as a list
print(lRDD.collect())
print(lRDD.first())
# map() is a lazy transformation: nothing is computed until an action such as collect() runs
lRDD_map=lRDD.map(lambda x: x+13)
print(lRDD_map.collect())
lRDD_map_2=lRDD.map(lambda y: y>=23)
print(lRDD_map_2.collect())
lRDD_map_3=lRDD.map(lambda z: z **10)
print(lRDD_map_3.collect())
#Distinct
print("Distinct",lRDD.distinct().collect())
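As a sanity check on what the three lambdas above should return, here is a minimal sketch in plain Python with no Spark involved. It uses the built-in map() and list() as stand-ins for RDD.map() and collect(), which is an assumption made only for illustration; the variable names are hypothetical and do not come from the post. It does not reproduce the error, which is not shown in the question.

```python
# Plain-Python sketch (no Spark needed) of what each lambda in the post computes.
# Built-in map()/list() stand in for RDD.map()/collect(); this is an illustration only.
data = range(0, 100)

# Built-in map() is lazy as well: nothing is computed until the iterator is consumed.
plus_13 = map(lambda x: x + 13, data)
at_least_23 = map(lambda y: y >= 23, data)
tenth_power = map(lambda z: z ** 10, data)

# list() plays the role that collect() plays for an RDD.
plus_13_list = list(plus_13)
at_least_23_list = list(at_least_23)
tenth_power_list = list(tenth_power)

print(plus_13_list[:3])          # [13, 14, 15]
print(at_least_23_list[22:24])   # [False, True]
print(tenth_power_list[:3])      # [0, 1, 1024]

# set() stands in for distinct(); the input has no duplicates, so all 100 values remain.
distinct_values = sorted(set(data))
print(len(distinct_values))      # 100
```

Note that mapping with y >= 23 produces one boolean per element rather than dropping elements. If the intent is to keep only the values that are at least 23, RDD.filter() with the same lambda is the usual choice.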