What is the JavaScript array.map equivalent in Elixir?

I am new to Elixir. I know how to solve this problem in JavaScript, but I am having a hard time converting it to an Elixir solution.

I am trying to take a sentence, split it into a list of words, and output the words that occur more than once.

const str = "This is the state of education? is It?"

const words = str.split(" ");

const data = new Set(words.map((word) => {
    if (words.filter((value) => value == word).length > 1) {
      return word;
    }
  }).filter((value) => value != undefined)
);

// Set(1) {'is'}

I have been trying to do this in Elixir but I keep failing at it because values are immutable.

defmodule Test do
  def find_duplicate_words(sentence) do
    words = String.split(String.downcase(sentence))
    ls = [1, 2]

    Enum.map(words, fn word ->
      # if "fox" == word do
      [ls | word]
      # end
    end)

    IO.puts(ls)
    IO.puts(length(words))
  end
end
 
sentence = """
This is the state of education? is It?
"""
 
# returns ["is"]  <-- return this
Test.find_duplicate_words(sentence)

Can anyone show me how to do this?

Two possible solutions:

  1. Use the diff of all words and all unique words:
str = "This is the state of education? is It?"

words = String.split(str, " ")
unique_words = Enum.uniq(words)
duplicate_words = words -- unique_words

# ["is"]
  2. If you are also interested in the number of occurrences, use Enum.frequencies/1:
str = "This is the state of education? is It?"

str
|> String.split(" ")
|> Enum.frequencies()
|> Enum.filter(fn {_k, v} -> v > 1 end)

# [{"is", 2}]
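One caveat with the list-difference approach in the first solution: `--` removes only the first occurrence of each unique word, so a word that appears three or more times stays in the result more than once. A minimal sketch, assuming a made-up sentence in which "is" occurs three times:

```elixir
words = String.split("is this is what it is", " ")

# `--` drops one occurrence per unique word, so two copies of "is" remain.
duplicates = words -- Enum.uniq(words)
IO.inspect(duplicates)

# Deduplicating the difference leaves each repeated word exactly once.
IO.inspect(Enum.uniq(duplicates))
```

If only the distinct repeated words are wanted, piping the difference through Enum.uniq/1 (or MapSet.new/1) is probably enough.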

Neither the original JavaScript solution nor the posted answer gives a correct result in general, because punctuation is not filtered out: "is?" and "is" would be counted as different words.

"This is the state of education? is It?"
|> String.downcase()
|> String.split(~r/[-\P{L}]+/, trim: true) # letters only
|> Enum.group_by(& &1) # or Enum.frequencies/1
|> Enum.filter(&match?({_, [_, _ | _]}, &1)) # 2+ elems
|> Enum.map(&elem(&1, 0))
#⇒ ["is"]
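To see the difference that normalization makes, here is a rough comparison on a made-up sentence. The `repeated` helper and the sentence are illustrative assumptions, and the regex is written as `[^\p{L}]+` with the `u` modifier, which is my variation on the pattern above rather than the exact one:

```elixir
# Hypothetical helper: returns the sorted list of words seen more than once.
repeated = fn words ->
  words
  |> Enum.frequencies()
  |> Enum.filter(fn {_word, count} -> count > 1 end)
  |> Enum.map(fn {word, _count} -> word end)
  |> Enum.sort()
end

sentence = "Is it what it is? It is."

# Splitting on spaces keeps case and punctuation, so "Is", "is?" and "is."
# are all treated as different words.
naive = repeated.(String.split(sentence, " "))

# Downcasing and splitting on runs of non-letters merges those variants.
normalized =
  sentence
  |> String.downcase()
  |> String.split(~r/[^\p{L}]+/u, trim: true)
  |> repeated.()

IO.inspect(naive, label: "naive")
IO.inspect(normalized, label: "normalized")
```

The naive split only finds "it", while the normalized version finds both "is" and "it".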

Here is a variation of Aleksei's answer that is perhaps easier to read and maybe cleaves closer to the original JavaScript answer. Note his comment about punctuation and his use of a regular expression, which splits more reliably than the simple "split on space".

"This is the state of education? is It?"
|> String.downcase()
|> String.split(~r/[-\P{L}]+/, trim: true)
|> Enum.frequencies()
|> Map.filter(fn {_, cnt_occurrences} -> cnt_occurrences > 1 end)
|> Map.keys()
|> MapSet.new()

Although you could write this without the pipe operator, it's good to start thinking that way in Elixir because, as you noted, values are immutable: a function such as Enum.map/2 returns a new value instead of changing its input, and a variable rebound inside an anonymous function is not visible outside it, so each result must be piped onward or bound to a name. Add |> IO.inspect() at any point in the chain to look at the result of the previous operation. For example, the result of Enum.frequencies/1 would show

%{
  "education" => 1,
  "is" => 2,
  "it" => 1,
  "of" => 1,
  "state" => 1,
  "the" => 1,
  "this" => 1
}

and then you can see how the Map.filter/2 function pares that down to simply %{"is" => 2}.
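As a sketch of that debugging tip, IO.inspect/2 accepts a `label:` option and returns its input unchanged, so it can be dropped between any two steps. The labels are arbitrary names chosen for this example, and the regex carries an added `u` modifier, which is my assumption rather than part of the answer above:

```elixir
result =
  "This is the state of education? is It?"
  |> String.downcase()
  |> String.split(~r/[^\p{L}]+/u, trim: true)
  |> IO.inspect(label: "words")
  |> Enum.frequencies()
  |> IO.inspect(label: "frequencies")
  |> Map.filter(fn {_word, count} -> count > 1 end)
  |> IO.inspect(label: "repeated")
  |> Map.keys()
```

Removing the IO.inspect lines afterwards does not change `result`. Note that Map.filter/2 needs Elixir 1.13 or later.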

An approach using tail recursion:

defmodule Word do

  def find_duplicate_words(str) do
    current = MapSet.new()
    words = String.split(str, " ")

    do_find_duplicate_words(current, words)
  end

  defp do_find_duplicate_words(current, []) do
    MapSet.to_list(current)
  end

  defp do_find_duplicate_words(current, [head | tail]) do
    current = if head in tail, do: MapSet.put(current, head), else: current
    do_find_duplicate_words(current, tail)
  end

end

The output will be the following.

Word.find_duplicate_words("This is the is state of education? is It?")

["is"]
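The recursive version checks `head in tail` for every word, which rescans the rest of the list each time. A possible alternative, sketched with Enum.reduce/3 and two MapSets as the accumulator, walks the list once. `WordSketch` is a hypothetical module name used only for this illustration:

```elixir
defmodule WordSketch do
  # Returns the words that occur more than once, visiting each word a single time.
  def find_duplicate_words(str) do
    {_seen, duplicates} =
      str
      |> String.split(" ", trim: true)
      |> Enum.reduce({MapSet.new(), MapSet.new()}, fn word, {seen, dups} ->
        if MapSet.member?(seen, word) do
          # Already seen once: record it as a duplicate.
          {seen, MapSet.put(dups, word)}
        else
          # First sighting: remember it.
          {MapSet.put(seen, word), dups}
        end
      end)

    MapSet.to_list(duplicates)
  end
end

IO.inspect(WordSketch.find_duplicate_words("This is the is state of education? is It?"))
```

The accumulator tuple plays the role that a mutable variable would play in JavaScript: each step returns a new `{seen, dups}` pair instead of modifying the old one.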
