I'm using Spark with Java. I receive messages from a Kafka topic, each containing the path of a zip file, and I want to extract that zip file to HDFS.
I already have code that reads the messages from Kafka with Spark Structured Streaming.
How can I extract the files to HDFS?
I'm using ZipFile from net.lingala.zip4j.core.ZipFile as follows:
ZipFile zipFile = new ZipFile(pathFromKafka);
zipFile.extractAll("?");//What should I write here?
ZipFile works on local file system paths, so it cannot extract directly to HDFS. You can extract the files to the local file system first and then copy them into HDFS:
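// required imports
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
// inside some class or method ...
Configuration conf = new Configuration();
conf.set("fs.defaultFS", <hdfs write endpoint>);
FileSystem fs = FileSystem.get(conf);
// copyFromLocalFile takes Path arguments: the local source first, then the HDFS destination
fs.copyFromLocalFile(new Path(<src>), new Path(<dst>));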
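For the local extraction step, here is a minimal sketch, assuming you are free to use the JDK's built-in java.util.zip instead of zip4j (with zip4j, passing a local directory to extractAll should serve the same purpose). The class name LocalZipExtractor and the method extractToLocalDir are hypothetical names chosen for illustration; they are not part of Spark, Hadoop, or zip4j.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper class, not part of any library mentioned in the question.
final class LocalZipExtractor {

    // Unpacks the archive at zipPath into targetDir on the local file system
    // and returns the regular files that were written.
    static List<Path> extractToLocalDir(Path zipPath, Path targetDir) throws IOException {
        List<Path> extracted = new ArrayList<>();
        Path root = targetDir.toAbsolutePath().normalize();
        Files.createDirectories(root);

        try (InputStream in = Files.newInputStream(zipPath);
             ZipInputStream zis = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                Path out = root.resolve(entry.getName()).normalize();
                // Reject entries such as "../x" that would land outside targetDir.
                if (!out.startsWith(root)) {
                    throw new IOException("Entry escapes target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    // Reads up to the end of the current entry; does not close zis.
                    Files.copy(zis, out, StandardCopyOption.REPLACE_EXISTING);
                    extracted.add(out);
                }
                zis.closeEntry();
            }
        }
        return extracted;
    }
}
```

The directory passed as targetDir (or the individual files returned) can then be handed to fs.copyFromLocalFile as the local source. Keep in mind that in a Spark job this code runs on whichever executor processes the message, so the zip path has to be readable from that machine.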