I am using PySpark for a word count problem. The lines of code that load the text file from my HDFS are:
file="hdfs://localhost:50070/user/hduser/input/sample.txt"
input=sc.textFile(file)
When I execute the program I get the following error:
py4j.protocol.Py4JJavaError: An error occurred while calling o25.collect. : java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.; Host Details : local host is: "quickstart.cloudera/127.0.0.1"; destination host is: "localhost":50070;
How can I fix this? I am stuck.
Try running input=sc.textFile(file)
where file="/user/hduser/input/sample.txt"
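You don't need the hdfs://localhost:50070/ prefix. In a default Hadoop 2.x setup, port 50070 is the NameNode web UI (HTTP) port, not the RPC port that HDFS clients talk to, which is why the response cannot be parsed as a protobuf message. A path without a scheme is resolved against the default filesystem from fs.defaultFS in core-site.xml, assuming Spark picks up your Hadoop configuration.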
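As a rough illustration of the fix, here is a minimal sketch that drops the scheme and authority when the URI points at the web UI port, leaving a bare path for Spark to resolve. The helper name strip_web_ui_authority and the WEB_UI_PORTS set are hypothetical, made up for this example, and the sketch assumes 50070 is the NameNode web UI port as in a default Hadoop 2.x install. The Spark part is left as comments because it needs a live SparkContext sc like the one in the question.

```python
from urllib.parse import urlparse

# Assumption: 50070 is the NameNode web UI port (Hadoop 2.x default),
# so it cannot serve HDFS RPC requests.
WEB_UI_PORTS = {50070}


def strip_web_ui_authority(uri):
    """Return only the path if the URI targets a web UI port; else return it unchanged."""
    parsed = urlparse(uri)
    if parsed.scheme == "hdfs" and parsed.port in WEB_UI_PORTS:
        return parsed.path
    return uri


file = strip_web_ui_authority("hdfs://localhost:50070/user/hduser/input/sample.txt")
print(file)  # /user/hduser/input/sample.txt

# With a live SparkContext `sc`, the word count could then look like this:
# counts = (sc.textFile(file)
#             .flatMap(lambda line: line.split())
#             .map(lambda word: (word, 1))
#             .reduceByKey(lambda a, b: a + b))
# print(counts.collect())
```

If you would rather keep a full URI, use the host and port from fs.defaultFS in your own core-site.xml instead of the web UI port.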