
Extract zip file to HDFS using Java

I'm using Spark with Java. I receive a message from a Kafka topic that contains the path of a zip file, and I want to extract that zip file to HDFS.

I already have code that reads the messages from Kafka with Spark Structured Streaming.

How can I extract the files to HDFS?

I'm using ZipFile from net.lingala.zip4j.core.ZipFile as follows:

ZipFile zipFile = new ZipFile(pathFromKafka);
zipFile.extractAll("?"); // What should I write here?

ZipFile doesn't allow you to extract files directly to HDFS. You can extract the files to the local file system and then put them into HDFS:

// Required imports
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Inside some class or method ...
Configuration conf = new Configuration();
conf.set("fs.defaultFS", <hdfs write endpoint>);
FileSystem fs = FileSystem.get(conf);
// <src> is the local Path of the extracted files, <dst> is the target Path in HDFS
fs.copyFromLocalFile(<src>, <dst>);
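
The local extraction step that comes before the copy can be sketched as follows. This is only an illustration under some assumptions: it uses the JDK's java.util.zip instead of zip4j so that it runs without extra dependencies, and the class name LocalUnzip and the method extractAll are hypothetical helpers, not part of zip4j or Hadoop. It unpacks the archive into a local directory and rejects entries that would land outside that directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper for illustration; not part of zip4j or Hadoop.
final class LocalUnzip {

    // Extracts every entry of zipPath under targetDir and returns the files written.
    static List<Path> extractAll(Path zipPath, Path targetDir) throws IOException {
        List<Path> written = new ArrayList<>();
        Path root = targetDir.toAbsolutePath().normalize();
        Files.createDirectories(root);
        try (ZipInputStream in = new ZipInputStream(Files.newInputStream(zipPath))) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                Path out = root.resolve(entry.getName()).normalize();
                // Reject entries such as "../x" that would escape the target directory.
                if (!out.startsWith(root)) {
                    throw new IOException("Entry outside target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    // Copies the bytes of the current entry only; the zip stream stays open.
                    Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
                    written.add(out);
                }
            }
        }
        return written;
    }
}
```

Once the files are on the local file system, the directory (or each returned file) can be passed as the source of the fs.copyFromLocalFile call, wrapped in an org.apache.hadoop.fs.Path. Note that java.nio.file.Path and org.apache.hadoop.fs.Path are different classes, so one of them has to be referenced by its fully qualified name if both are used in the same file.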
