I am learning Clojure at school and I have an exam coming up, so I was working through a few exercises to make sure I get the hang of it.
I am trying to read a file line by line and split each line wherever there is a ";".
Here is my code so far:
(defn readFile []
  (map (fn [line] (clojure.string/split line #";"))
       (with-open [rdr (reader "C:/Users/Rohil/Documents/work.txt.txt")]
         (doseq [line (line-seq rdr)]
           (clojure.string/split line #";")
           (println line)))))
When I do this, I still get the output:
"I;Am;A;String;"
Am I missing something?
I'm not sure if you need this at school, but since Gary already gave an excellent answer, consider this as a bonus.
You can do elegant transformations on lines of text with transducers. The ingredient you need is something that allows you to treat the lines as a reducible collection and which closes the reader when you're done reducing:
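(import '[java.io BufferedReader]) ; needed for the ^BufferedReader type hint

(defn lines-reducible [^BufferedReader rdr]
  (reify clojure.lang.IReduceInit
    (reduce [this f init]
      (try
        (loop [state init]
          (if (reduced? state)
            @state                       ; early termination: unwrap the reduced value
            (if-let [line (.readLine rdr)]
              (recur (f state line))
              state)))                   ; .readLine returned nil: end of input
        (finally
          (.close rdr))))))              ; always close the reader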
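To try this out without a file on disk, here is a minimal sketch that feeds lines-reducible from an in-memory string. The string-reader helper and the sample strings are assumptions made for illustration; the definition of lines-reducible is repeated only so the snippet runs on its own.

```clojure
(require '[clojure.string :as str])
(import '[java.io BufferedReader StringReader IOException])

;; Same definition as above, repeated so this snippet is self-contained.
(defn lines-reducible [^BufferedReader rdr]
  (reify clojure.lang.IReduceInit
    (reduce [this f init]
      (try
        (loop [state init]
          (if (reduced? state)
            @state
            (if-let [line (.readLine rdr)]
              (recur (f state line))
              state)))
        (finally
          (.close rdr))))))

;; Hypothetical helper: wraps a string in a BufferedReader.
(defn string-reader ^BufferedReader [s]
  (BufferedReader. (StringReader. s)))

;; Reduce over every line, splitting each one on ";".
(def split-words
  (into []
        (mapcat #(str/split % #";"))
        (lines-reducible (string-reader "I;am;a;string\nNext;line;please"))))

;; Stop after the first line by returning a reduced value.
(def rdr (string-reader "a;b\nc;d"))
(def first-line
  (reduce (fn [_ line] (reduced line)) nil (lines-reducible rdr)))

;; The reader was closed in the finally block, so reading again fails.
(def reader-state
  (try
    (.readLine rdr)
    (catch IOException _ :closed)))
```

Here split-words holds all seven pieces, first-line is "a;b", and reader-state is :closed, which shows the reader is released even when the reduction stops early.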
Now you're able to do the following, given the input file work.txt:
I;am;a;string
Next;line;please
Count the length of each 'split':
(require '[clojure.string :as str])
(require '[clojure.java.io :as io])
(into []
      (comp
        (mapcat #(str/split % #";"))
        (map count))
      (lines-reducible (io/reader "/tmp/work.txt")))
;;=> [1 2 1 6 4 4 6]
Sum the length of all 'splits':
(transduce
  (comp
    (mapcat #(str/split % #";"))
    (map count))
  +
  (lines-reducible (io/reader "/tmp/work.txt")))
;;=> 24
Sum the length of all words until we find a word that is longer than 5:
(transduce
  (comp
    (mapcat #(str/split % #";"))
    (map count))
  (fn
    ([] 0)
    ([sum] sum)
    ([sum l]
     (if (> l 5)
       (reduced sum)   ; stop at the first word longer than 5
       (+ sum l))))
  (lines-reducible (io/reader "/tmp/work.txt")))
;;=> 4
or with take-while:
(transduce
  (comp
    (mapcat #(str/split % #";"))
    (map count)
    (take-while #(<= % 5)))   ; keep lengths up to 5, matching the (> l 5) stop condition
  +
  (lines-reducible (io/reader "/tmp/work.txt")))
;;=> 4
Read https://tech.grammarly.com/blog/building-etl-pipelines-with-clojure for more details.
TL;DR: embrace the REPL and embrace immutability.
Your question was "what am I missing?" and to that I'd say you're missing one of the best features of Clojure, the REPL.
Edit: you might also be missing that Clojure uses immutable data structures, so consider this code snippet:
(doseq [x [1 2 3]]
  (inc x)
  (prn x))
This code does not print "2 3 4"; it prints "1 2 3", because x isn't a mutable variable.
During the first iteration (inc x) gets called and returns 2, which gets thrown away because it wasn't passed to anything. Then (prn x) prints the value of x, which is still 1.
Now consider this code snippet:
(doseq [x [1 2 3]] (prn (inc x)))
During the first iteration, inc passes its return value to prn, so 2 is printed.
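The same point can be seen by looking at return values instead of printed output. The sketch below (the var names are chosen only for illustration) contrasts doseq, which exists for side effects and always returns nil, with mapv, which returns a new vector and leaves its input untouched.

```clojure
;; doseq runs its body for side effects; the value of (inc x) is discarded
;; and the doseq form itself evaluates to nil.
(def doseq-result
  (doseq [x [1 2 3]]
    (inc x)))

;; mapv applies inc to each element and returns the results as a new vector.
(def original [1 2 3])
(def incremented (mapv inc original))

;; doseq-result is nil, incremented is [2 3 4],
;; and original is still [1 2 3].
```

So whenever the transformed values are needed later, they have to come back as the return value of something like map or mapv; nothing is changed in place.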
Long example:
I don't want to rob you of the opportunity to solve the problem yourself, so I'll use a different problem as an example.
Given the file "birds.txt" with the data "1chicken\n2duck\n3Larry", you want to write a function that takes a file and returns a sequence of bird names.
Let's break this problem down into smaller chunks. First, let's read the file and split it up into lines:
(slurp "birds.txt") will give us the whole file as a string.
clojure.string/split-lines will give us a collection with each line as an element.
(clojure.string/split-lines (slurp "birds.txt")) gets us ["1chicken" "2duck" "3Larry"].
At this point we could map some function over that collection to strip out the number, like (map #(clojure.string/replace % #"\d" "") birds-collection),
or we could just move that step up the pipeline, to the point where the whole file is still one string.
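As a rough sketch of the first option, mapping over the collection of lines could look like this. The birds-text string is an in-memory stand-in for the file contents (an assumption, so the snippet runs without birds.txt on disk).

```clojure
(require '[clojure.string :as str])

;; Assumed stand-in for (slurp "birds.txt").
(def birds-text "1chicken\n2duck\n3Larry")

;; One element per line.
(def birds-collection (str/split-lines birds-text))

;; Strip every digit from each line; map returns a lazy sequence.
(def bird-names
  (map #(str/replace % #"\d" "") birds-collection))
```

Here birds-collection is ["1chicken" "2duck" "3Larry"] and bird-names is the sequence ("chicken" "duck" "Larry").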
Now that we have all of our pieces, we can put them together in a functional pipeline where the result of one piece feeds into the next.
In Clojure there is a nice macro to make this more readable: the -> macro. It takes the result of one computation and injects it as the first argument to the next, so our pipeline looks like this:
(-> "C:/birds.txt"
    slurp
    (clojure.string/replace #"\d" "")
    clojure.string/split-lines)
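To make the "injects it as the first argument" rule concrete, the sketch below writes the same steps twice: once threaded with -> and once as ordinary nested calls. It starts from an in-memory string (an assumption standing in for the slurp step) so that it runs without a file.

```clojure
(require '[clojure.string :as str])

;; Assumed stand-in for the file contents.
(def birds-text "1chicken\n2duck\n3Larry")

;; Threaded form: each result becomes the first argument of the next form.
(def threaded
  (-> birds-text
      (str/replace #"\d" "")
      str/split-lines))

;; Equivalent nested form, read from the inside out.
(def nested
  (str/split-lines (str/replace birds-text #"\d" "")))
```

Both threaded and nested evaluate to ["chicken" "duck" "Larry"]; the threaded version simply lists the steps in the order they happen.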
A last note on style: Clojure function names conventionally use kebab-case, so readFile should be read-file.
I would keep it simple, and code it like this:
(ns tst.demo.core
  (:use tupelo.test)
  (:require [tupelo.core :as t]
            [clojure.string :as str]))
(def text
"I;am;a;line;
This;is;another;one
Followed;by;this;")
(def tmp-file-name "/tmp/lines.txt")
(dotest
  (spit tmp-file-name text) ; write it to a tmp file
  (let [lines       (str/split-lines (slurp tmp-file-name))
        result      (for [line lines]
                      (for [word (str/split line #";")]
                        (str/trim word)))
        result-flat (flatten result)]
    (is= result
      [["I" "am" "a" "line"]
       ["This" "is" "another" "one"]
       ["Followed" "by" "this"]])
Notice that result is a doubly-nested (2D) matrix of words. The simplest way to undo this is the flatten function to produce result-flat:
    (is= result-flat
      ["I" "am" "a" "line" "This" "is" "another" "one" "Followed" "by" "this"])))
You could also use apply concat, as in:
(is= (apply concat result) result-flat)
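The two approaches agree on a 2D result, but they are not interchangeable in general: flatten removes every level of nesting, while apply concat removes exactly one. A small sketch with made-up data (not taken from the test above):

```clojure
;; Two levels of nesting: both approaches give the same flat sequence.
(def two-d [["I" "am"] ["a" "line"]])
(def flat-a (flatten two-d))
(def flat-b (apply concat two-d))

;; Three levels of nesting: apply concat peels off only the outer level,
;; while flatten keeps going until no sequential values remain.
(def three-d [[["I" "am"]] [["a" "line"]]])
(def one-level (apply concat three-d))
(def all-levels (flatten three-d))
```

Here flat-a and flat-b are both ("I" "am" "a" "line"), one-level is (["I" "am"] ["a" "line"]), and all-levels is ("I" "am" "a" "line"). If the individual items could themselves be collections, apply concat is the more predictable choice.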
If you want to avoid building up a 2D matrix in the first place, you can use a generator function (a la Python) via lazy-gen and yield from the Tupelo library:
(dotest
  (spit tmp-file-name text) ; write it to a tmp file
  (let [lines  (str/split-lines (slurp tmp-file-name))
        result (t/lazy-gen
                 (doseq [line lines]
                   (let [words (str/split line #";")]
                     (doseq [word words]
                       (t/yield (str/trim word))))))]
    (is= result
      ["I" "am" "a" "line" "This" "is" "another" "one" "Followed" "by" "this"])))
In this case, lazy-gen creates the generator function. Notice that for has been replaced with doseq, and the yield function places each word into the output lazy sequence.
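For comparison, a flat sequence can also be produced with core functions only, without Tupelo. This is an alternative sketch, not part of the approach above; it works on an in-memory string (an assumption, so no temp file is needed) and uses mapcat to concatenate the per-line splits as they are produced.

```clojure
(require '[clojure.string :as str])

;; Assumed in-memory sample with the same shape as the text above.
(def sample-text "I;am;a;line;\n  This;is;another;one\n  Followed;by;this;")

(def words
  (->> (str/split-lines sample-text)
       (mapcat #(str/split % #";"))   ; split each line, concatenating the pieces
       (map str/trim)))               ; drop the leading whitespace on each word
```

The value of words is ("I" "am" "a" "line" "This" "is" "another" "one" "Followed" "by" "this"). Note that str/split drops the empty string after a trailing ";", which is why no blank entries appear.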