I am new to Elixir. I know how to solve this in JavaScript, but I am having a hard time converting it to an Elixir solution.
I am trying to take a sentence and output the words that occur more than once:
const str = "This is the state of education? is It?"
const words = str.split(" ");
const data = new Set(
  words
    .map((word) => {
      if (words.filter((value) => value == word).length > 1) {
        return word
      }
    })
    .filter((value) => value != undefined)
)
// Set(1) {'is'}
I have been trying to do this in Elixir but I keep failing at it because values are immutable.
defmodule Test do
  def find_duplicate_words(sentence) do
    words = String.split(String.downcase(sentence))
    ls = [1,2]
    Enum.map(words, fn word ->
      # if "fox" == word do
      [ls | word]
      # end
    end
    )
    IO.puts(ls)
    IO.puts(length(words))
  end
end
sentence = """
This is the state of education? is It?
"""
# returns ["is"] <-- return this
Test.find_duplicate_words(sentence)
Can anyone show me how to do this?
Two possible solutions:
str = "This is the state of education? is It?"
words = String.split(str, " ")
unique_words = Enum.uniq(words)
duplicate_words = words -- unique_words
# ["is"]
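One caveat with the `--` approach: it removes only one occurrence per unique word, so a word that appears three times is left over twice. A minimal sketch, assuming you want each duplicate reported once (the sample sentence and the variable names are made up for illustration):

```elixir
# Hypothetical input in which "is" occurs three times.
words = String.split("is this is what it is", " ")

# `--` removes one occurrence for each element on the right-hand side,
# so "is" is left over twice.
leftovers = words -- Enum.uniq(words)
# leftovers == ["is", "is"]

# Enum.uniq/1 collapses the repeats so each duplicate is reported once.
duplicate_words = Enum.uniq(leftovers)
# duplicate_words == ["is"]
```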
With Enum.frequencies:
str = "This is the state of education? is It?"
str
|> String.split(" ")
|> Enum.frequencies()
|> Enum.filter(fn {_k, v} -> v > 1 end)
# [{"is", 2}]
Neither the original JavaScript solution nor the posted answer gives a correct result in general, because punctuation is not filtered out (and the two-solution answer does not normalize case either).
"This is the state of education? is It?"
|> String.downcase()
|> String.split(~r/[-\P{L}]+/, trim: true) # letters only
|> Enum.group_by(& &1) # or Enum.frequencies/1
|> Enum.filter(&match?({_, [_, _ | _]}, &1)) # 2+ elems
|> Enum.map(&elem(&1, 0))
#⇒ ["is"]
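To see why this matters, here is a small sketch that compares the two ways of splitting on a sentence where the repeats differ only in case or trailing punctuation. The sentence and the variable names are my own, chosen for illustration:

```elixir
# Hypothetical input: the repeats differ only by case and punctuation.
sentence = "Is it? It is."

# Splitting on spaces keeps case and punctuation, so no two tokens match.
naive = sentence |> String.split(" ") |> Enum.frequencies()
# naive == %{"Is" => 1, "It" => 1, "is." => 1, "it?" => 1}

# Downcasing and splitting on runs of non-letters exposes the real repeats.
cleaned =
  sentence
  |> String.downcase()
  |> String.split(~r/[-\P{L}]+/, trim: true)
  |> Enum.frequencies()
# cleaned == %{"is" => 2, "it" => 2}
```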
Here is a variation of Aleksei's answer that is perhaps easier to read and stays closer to the original JavaScript answer. Note his comment about punctuation and his use of a regular expression, which splits more reliably than a simple "split on space".
"This is the state of education? is It?"
|> String.downcase()
|> String.split(~r/[-\P{L}]+/, trim: true)
|> Enum.frequencies()
|> Map.filter(fn {_, cnt_occurrences} -> cnt_occurrences > 1 end)
|> Map.keys()
|> MapSet.new()
Although you could write this without the pipe, it's good to start thinking that way in Elixir because, as you noted, values are immutable (i.e. a variable rebound inside a block or anonymous function does not change the binding outside it, so each step has to pass its result on to the next). Add |> IO.inspect() at any point in the chain to look at the result of the previous operation. For example, the result of Enum.frequencies/1 would show
%{
"education" => 1,
"is" => 2,
"it" => 1,
"of" => 1,
"state" => 1,
"the" => 1,
"this" => 1
}
and then you can see how the Map.filter/2 function pares that down to simply %{"is" => 2}.
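The immutability point can be made concrete with a small sketch. This is only an illustration with made-up variable names, not part of the solution above: rebinding a variable inside an anonymous function has no effect outside it, whereas Enum.reduce/3 hands the accumulated value back as its return value.

```elixir
words = ["this", "is", "is"]

seen = []

Enum.each(words, fn word ->
  # This binds a new, function-local `seen`; the outer one is unchanged.
  seen = [word | seen]
  length(seen)
end)
# seen == []

# Enum.reduce/3 threads the accumulator through each call and returns it.
# Prepending reverses the order of the words.
collected = Enum.reduce(words, [], fn word, acc -> [word | acc] end)
# collected == ["is", "is", "this"]
```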
An approach using tail-call recursion:
defmodule Word do
  def find_duplicate_words(str) do
    current = MapSet.new()
    words = String.split(str, " ")
    do_find_duplicate_words(current, words)
  end

  defp do_find_duplicate_words(current, []) do
    MapSet.to_list(current)
  end

  defp do_find_duplicate_words(current, [head | tail]) do
    current = if head in tail, do: MapSet.put(current, head), else: current
    do_find_duplicate_words(current, tail)
  end
end
The output will be the following.
Word.find_duplicate_words("This is the is state of education? is It?")
["is"]
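Note that `head in tail` scans the rest of the list for every word, so the work grows quadratically with the number of words. If that ever matters, one alternative (a different technique from the recursion above, sketched here under the assumption that splitting on spaces is good enough) is a single pass with Enum.reduce/3 that tracks two sets. The module name WordSketch is hypothetical.

```elixir
defmodule WordSketch do
  # Hypothetical module name, used only for this illustration.
  def find_duplicate_words(str) do
    {_seen, dupes} =
      str
      |> String.split(" ", trim: true)
      |> Enum.reduce({MapSet.new(), MapSet.new()}, fn word, {seen, dupes} ->
        if MapSet.member?(seen, word) do
          # Seen before: record it as a duplicate.
          {seen, MapSet.put(dupes, word)}
        else
          # First sighting: remember it.
          {MapSet.put(seen, word), dupes}
        end
      end)

    MapSet.to_list(dupes)
  end
end

WordSketch.find_duplicate_words("This is the is state of education? is It?")
# ["is"]
```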