
Web retrieval to lazy string error in Clojure

I intend to retrieve a big web page and tried this in Clojure:

(defn fetch-url [url]
  "Retrieves the web page specified by the url."
  (with-open [the-stream (.openStream (java.net.URL. url))]
    (let [reader (new BufferedReader (new InputStreamReader the-stream))]
     (repeatedly (str (.read reader))))))

It gives me the following error when I call first, or any other seq function, on the result of the function:

java.lang.RuntimeException: java.lang.ClassCastException: java.lang.String cannot be cast to clojure.lang.IFn (repl-1:2)

Also, using line-seq doesn't work because (I think) the page takes too long to load.

I wanted to create a lazy string because simply creating a string gives me an out-of-heap-space error. How can I accomplish this?

The immediate problem is that repeatedly expects a function, and you're giving it the result of (str (....)), which is a String. To make Clojure happy, you need to wrap the call to str in fn:

(repeatedly (fn [] (str (.read reader))))
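
A minimal sketch of the corrected call, assuming a StringReader in place of the network stream so that it runs offline; char-strings is a hypothetical helper name, not something from the question. Note that .read returns an integer character code, and -1 at end of stream, so the sketch stops at -1 and converts each code with char before calling str:

```clojure
(import '(java.io BufferedReader StringReader))

(defn char-strings
  "Lazy seq of one-character strings read from rdr, ending at end of stream."
  [^java.io.Reader rdr]
  (->> (repeatedly (fn [] (.read rdr)))   ; repeatedly needs a zero-arg function
       (take-while (complement neg?))     ; .read returns -1 once the stream is exhausted
       (map (comp str char))))            ; int code -> char -> one-character string

;; The StringReader is a stand-in for the real stream (an assumption of this sketch).
(def result (doall (char-strings (BufferedReader. (StringReader. "hi")))))
```

Without the take-while step the sequence would never end, because .read keeps returning -1 after the last character.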

A better solution is to use slurp or slurp* (the latter is in contrib, IIRC), or at least to look at how it is written.
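
As a small illustration, slurp reads everything from its source into one string. The sketch below uses a StringReader as a stand-in source (an assumption, so that it runs offline); slurp also accepts a URL string. Because it builds the whole string in memory, it will not by itself avoid the heap problem for a very large page:

```clojure
(import '(java.io StringReader))

;; The StringReader is a stand-in for the real page (an assumption of this sketch).
(def page (slurp (StringReader. "<html>hello</html>")))
```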

[edit] There's no such thing as a "lazy string" in Clojure. Clojure strings are just Java strings. Clojure does have lazy sequences, so you can try using them, but you'll have to deal with the stream being closed: with-open closes it when the body returns, which can happen before a lazy sequence has been consumed.
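
One way around the closing-stream problem is to consume the lazy sequence inside with-open. The following is only a sketch: count-lines is a hypothetical name, clojure.java.io/reader is used here for convenience and was not part of the question, and a StringReader stands in for the real page:

```clojure
(require '[clojure.java.io :as io])
(import '(java.io StringReader))

(defn count-lines
  "Counts the lines in source while the reader is still open."
  [source]
  (with-open [rdr (io/reader source)]
    ;; reduce walks the lazy line-seq to the end before with-open closes rdr
    (reduce (fn [n _] (inc n)) 0 (line-seq rdr))))

(count-lines (StringReader. "a\nb\nc"))
;; => 3
```

Returning the unrealized (line-seq rdr) from the with-open body instead would fail later, since the reader is already closed by the time the caller reads from it.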

Alternatively, you can use the following approach (pseudo-code):

(defn process-url [url proc-fn]
  (with-open [the-stream ...]
    (loop [c (.read r)]
      (when-not (neg? c)
        (proc-fn (char c))
        (recur (.read r))))))

This will call the function you pass as the second argument on each character read.
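
A runnable sketch of the pseudo-code above, under some assumptions: process-chars is a hypothetical name, and clojure.java.io/reader (not used in the original) opens the source, so the same function works for a java.net.URL or for an in-memory Reader:

```clojure
(require '[clojure.java.io :as io])
(import '(java.io StringReader))

(defn process-chars
  "Calls proc-fn on every character read from source, one at a time."
  [source proc-fn]
  (with-open [r (io/reader source)]
    (loop [c (.read r)]
      (when-not (neg? c)          ; .read returns -1 at end of stream
        (proc-fn (char c))
        (recur (.read r))))))

;; Usage with a stand-in source: collect the characters into an atom.
(def seen (atom []))
(process-chars (StringReader. "abc") (fn [ch] (swap! seen conj ch)))
```

For a real page, something like (process-chars (java.net.URL. url) proc-fn) should work the same way. Only one character is handled at a time, so the whole page never has to fit in a single string.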
