I am newbie to spark and learning spark with java on ubuntu 18.0 with no explicit clusters. I have data.csv file saved at local file system in java/main/resources folder.
while executing below code,
SparkSession sparkSession = SparkSession.builder()
.appName("sparksql").master("local[*]")
.getOrCreate();
Dataset<Row> dataset = sparkSession.read()
.option("header", true)
.csv("/media/home/work/sparksamples/src/main/resources/exams/test.csv");
below error is coming :
20/11/23 16:07:46 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/DistributedFileSystem
at org.apache.spark.sql.execution.datasources.InMemoryFileIndex$.listLeafFiles(InMemoryFileIndex.scala:316)
at org.apache.spark.sql.execution.datasources.InMemoryFileIndex$.$anonfun$bulkListLeafFiles$1(InMemoryFileIndex.scala:195)
at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
Could 20/11/23 16:07:46 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/DistributedFileSystem
at org.apache.spark.sql.execution.datasources.InMemoryFileIndex$.listLeafFiles(InMemoryFileIndex.scala:316)
at org.apache.spark.sql.execution.datasources.InMemoryFileIndex$.$anonfun$bulkListLeafFiles$1(InMemoryFileIndex.scala:195)
at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
How to load file from local file system without using hdfs in Ubuntu?
It was due to missing hadoop-client jar in latest version - 3.3.
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>3.3.0</version>
</dependency>
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.