Need help with Python code for creating an RDD from a list

I get an error when I run the PySpark code below.

The code creates an RDD from a Python list.

I need to apply the map function to the RDD.

from pyspark import SparkContext
print("PySpark imported successfully")

sc=SparkContext('local[2]','BasicExample')
print("Spark Context object created")

l=range(0,100)

print("type of data entered:-",type(l))

lRDD=sc.parallelize(l)

print(type(lRDD))

# collect() is an action: it returns all elements of the RDD to the driver as a list
print(lRDD.collect())

print(lRDD.first())

# map() is a lazy transformation: nothing is computed until an action such as collect() runs
lRDD_map=lRDD.map(lambda x: x+13)
print(lRDD_map.collect())

lRDD_map_2=lRDD.map(lambda y: y>=23)
print(lRDD_map_2.collect())

lRDD_map_3=lRDD.map(lambda z: z **10)
print(lRDD_map_3.collect())

# distinct() returns a new RDD with duplicate elements removed
print("Distinct",lRDD.distinct().collect())
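
To check what each map call should return, here is a minimal plain-Python sketch of the same element-wise logic, with no Spark involved. The variable names (data, plus_13, at_least_23, tenth_power, unique) are made up for illustration and do not appear in the code above. Note that mapping a comparison such as y >= 23 yields one boolean per element; it does not drop elements. If the intent is to keep only the matching values, RDD.filter is probably what is wanted instead.

```python
# Plain-Python sketch (no Spark): mirrors the lambdas passed to map() above.
data = list(range(0, 100))

plus_13 = [x + 13 for x in data]        # same logic as lambda x: x + 13
at_least_23 = [y >= 23 for y in data]   # same logic as lambda y: y >= 23 (booleans, not a filter)
tenth_power = [z ** 10 for z in data]   # same logic as lambda z: z ** 10
unique = sorted(set(data))              # same elements as distinct(); Spark does not guarantee order

print(plus_13[:3])          # [13, 14, 15]
print(at_least_23[22:24])   # [False, True]
print(tenth_power[:3])      # [0, 1, 1024]
print(len(unique))          # 100
```

If the Spark output of collect() differs from these lists, the problem is likely in the environment or setup rather than in the lambdas themselves.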
