
Reading a zip file using the Java API from Clojure

I'm trying to rewrite the following snippet in Clojure, but everything I come up with is ugly. Can someone suggest a more elegant solution?

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class ZipFileRdrExp {

  public static void main(String[] args) {

    try {

      FileInputStream fis = new FileInputStream("C:\\MyZip.zip");
      ZipInputStream zis = new ZipInputStream(fis);
      ZipEntry ze;
      while((ze=zis.getNextEntry())!=null){
        System.out.println(ze.getName());
        zis.closeEntry();
      }

      zis.close();

    } catch (FileNotFoundException e) {
      e.printStackTrace();
    } catch (IOException e) {
      e.printStackTrace();
    }
  }
}

Here is my ugly attempt, with a duplicated call to getNextEntry:

(ns app.core
  (:import
  (java.io FileInputStream FileNotFoundException IOException File)
  (java.util.zip ZipInputStream ZipEntry)))


(defn- read-zip [zip-file]
  (let [fis (FileInputStream. zip-file)
        zis (ZipInputStream. fis)]
    (loop [ze (.getNextEntry zis)]
      (when ze
        (println (.getName ze))
        (.closeEntry zis)
        (recur (.getNextEntry zis))))
    (.close zis)))

I would go with something like the following:

(defn entries [zipfile]
 (lazy-seq
  (if-let [entry (.getNextEntry zipfile)]
   (cons entry (entries zipfile)))))

(defn walkzip [fileName]
 (with-open [z (ZipInputStream. (FileInputStream. fileName))]
  (doseq [e (entries z)]
   (println (.getName e))
   (.closeEntry z))))

EDIT: the above code was eventually tested and corrected.

EDIT: the following works as expected and is much more concise, even though it uses a different Java API (ZipFile instead of ZipInputStream):

(defn entries [zipfile]
  (enumeration-seq (.entries zipfile)))

(defn walkzip [fileName]
  (with-open [z (java.util.zip.ZipFile. fileName)]
    (doseq [e (entries z)]
      (println (.getName e)))))

This is a simpler example:

(defn filenames-in-zip [filename]
  (let [z (java.util.zip.ZipFile. filename)]
    (map #(.getName %) (enumeration-seq (.entries z)))))

This is similar to the code above, but it does not use with-open, so the ZipFile is not closed explicitly. Because map is lazy, the returned sequence is only read from the ZipFile when it is consumed. The function returns a sequence of data that you can then print or, better yet, format. It's better for the function that extracts the data to just return data than to bury the side effect of printing inside it.
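If leaving the ZipFile open is a concern, one possible variant wraps it in with-open and uses mapv so that all names are realized before the file is closed. This is only a sketch, not part of the answer above, and the name filenames-in-zip-eager is hypothetical.

```clojure
(import '(java.util.zip ZipFile))

;; Hypothetical eager variant of filenames-in-zip.
;; mapv builds the whole vector of names before with-open closes the ZipFile,
;; so nothing lazy escapes and reads from a closed file later.
(defn filenames-in-zip-eager [filename]
  (with-open [z (ZipFile. filename)]
    (mapv #(.getName %) (enumeration-seq (.entries z)))))
```

The trade-off is that the whole list is built up front, which should be fine for a list of names but gives up the laziness of the original.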

If you want to print the contents, you can use pprint (from clojure.pprint):

(require '[clojure.pprint :refer [pprint]])
(pprint (filenames-in-zip "my.zip"))

and it will give you a nice list.

This is similar to skuro's answer that uses ZipInputStream, but with a slightly terser definition of entries.

(defn entries [zip-stream]
  (take-while #(not (nil? %))
              (repeatedly #(.getNextEntry zip-stream))))

(defn walkzip [fileName]
  (with-open [z (ZipInputStream. (FileInputStream. fileName))]
    (doseq [e (entries z)]
      (println (.getName e))
      (.closeEntry z))))

Or, if you want to actually extract the files, you need another helper function for copying. I've used clojure.java.io to shorten the code, but the same thing can be done without it.

(require '[clojure.java.io :as io])

(defn entries [zip-stream]
  (take-while #(not (nil? %))
              (repeatedly #(.getNextEntry zip-stream))))

(defn copy-file [zip-stream filename]
  ;; io/output-stream opens the destination file; with-open closes it afterwards
  (with-open [out-file (io/output-stream filename)]
    (let [buff-size 4096
          buffer (byte-array buff-size)]
      ;; .read returns -1 once the current entry has been fully read
      (loop [len (.read zip-stream buffer)]
        (when (> len 0)
          (.write out-file buffer 0 len)
          (recur (.read zip-stream buffer)))))))

(defn extract-stream [zip-stream to-folder]
  (let [extract-entry (fn [zip-entry]
                        (when (not (.isDirectory zip-entry))
                          (let [to-file (io/file to-folder (.getName zip-entry))
                                parent-file (io/file (.getParent to-file))]
                            ;; create missing parent directories before writing
                            (.mkdirs parent-file)
                            (copy-file zip-stream to-file))))]
    (->> zip-stream
         entries
         (map extract-entry)
         dorun)))

This is effectively equivalent to simply unzipping the file with an unzip utility. The beauty of it is that since the entries are in a lazy seq, you can filter or drop or take to your heart's (or requirements') content. Well, I'm pretty sure you can. Haven't really tried it yet :)
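For what it's worth, filtering should work as long as each entry is consumed before the next one is requested, which is the case for the non-chunked lazy seqs used here. Below is a minimal sketch under that assumption. The names zip-entries, slurp-entry and read-matching are hypothetical, and instead of writing files it returns a map from entry name to entry text, assuming UTF-8 content.

```clojure
(import '(java.io ByteArrayOutputStream FileInputStream)
        '(java.util.zip ZipInputStream))
(require '[clojure.java.io :as io])

;; Lazy seq of entries; each step advances the stream to the next entry.
(defn zip-entries [^ZipInputStream zip-stream]
  (take-while some? (repeatedly #(.getNextEntry zip-stream))))

;; Reads the rest of the current entry as a UTF-8 string (assumed encoding).
;; io/copy stops at the end of the current entry and does not close the stream.
(defn slurp-entry [^ZipInputStream zip-stream]
  (let [out (ByteArrayOutputStream.)]
    (io/copy zip-stream out)
    (.toString out "UTF-8")))

;; Returns {entry-name content} for the entries whose name satisfies pred.
;; into realizes everything while the stream is still open.
(defn read-matching [file-name pred]
  (with-open [z (ZipInputStream. (FileInputStream. file-name))]
    (->> (zip-entries z)
         (filter #(pred (.getName %)))
         (map (fn [e] [(.getName e) (slurp-entry z)]))
         (into {}))))
```

A call such as (read-matching "my.zip" #(.endsWith ^String % ".txt")) would then read only the .txt entries; skipped entries are simply passed over by getNextEntry.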

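Also note: you MUST process the seq inside the function where you open the zip stream! The seq is lazy and reads from the stream as it is consumed, so once with-open has closed the stream there is nothing left to read.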
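To illustrate that last point, here is a small sketch contrasting a version that realizes the seq while the stream is open with one that lets a lazy seq escape. All three names (lazy-entries, entry-names, entry-names-broken) are hypothetical and only serve this example.

```clojure
(import '(java.io FileInputStream)
        '(java.util.zip ZipInputStream))

;; Lazy seq of entries, read from the stream on demand.
(defn lazy-entries [^ZipInputStream zip-stream]
  (take-while some? (repeatedly #(.getNextEntry zip-stream))))

;; Works: mapv realizes every name before with-open closes the stream.
(defn entry-names [file-name]
  (with-open [z (ZipInputStream. (FileInputStream. file-name))]
    (mapv #(.getName %) (lazy-entries z))))

;; Counter-example: map is lazy, so nothing is read until the caller
;; consumes the result, and by then the stream is already closed.
;; Consuming the returned seq throws an IOException.
(defn entry-names-broken [file-name]
  (with-open [z (ZipInputStream. (FileInputStream. file-name))]
    (map #(.getName %) (lazy-entries z))))
```

Wrapping the lazy pipeline in doall, or using an eager function such as mapv or into, is usually enough to keep all the reading inside with-open.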

Clojure-Contrib has the libraries io and jar, which make the code shorter:

(require 'clojure.contrib.jar
         'clojure.contrib.io)

(import [java.util.jar JarFile])

(defn- read-zip [zip-file]
  (clojure.contrib.jar/filenames-in-jar (JarFile. (clojure.contrib.io/file zip-file))))

Caveat: the function filenames-in-jar does not list directory entries in the zip file, only the names of actual files.
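The monolithic clojure-contrib was split up after Clojure 1.2, so these namespaces may not be available in a current project. A roughly equivalent listing can be sketched with plain Java interop. This assumes the intended behavior is "file names only, directory entries skipped", as described in the caveat, and the name file-names-in-zip is hypothetical.

```clojure
(import '(java.util.zip ZipFile ZipEntry))

;; Hypothetical stand-in for the contrib helper:
;; lists the names of regular file entries, skipping directory entries.
;; zip-file may be a path string or a java.io.File (str turns both into a path).
(defn file-names-in-zip [zip-file]
  (with-open [z (ZipFile. (str zip-file))]
    (->> (enumeration-seq (.entries z))
         (remove #(.isDirectory ^ZipEntry %))
         (mapv #(.getName ^ZipEntry %)))))
```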
