Problems scraping a website with Elixir

I'm trying to get a simple Hound test working with my app, and as far as I can tell the error is related to Selenium. This is the code:

In mix.exs:

    defmodule Scraper.Mixfile do
      use Mix.Project

      def project do
        [app: :scraper,
         version: "0.0.1",
         elixir: "~> 1.0",
         build_embedded: Mix.env == :prod,
         start_permanent: Mix.env == :prod,
         deps: deps]
      end

      # Configuration for the OTP application
      #
      # Type `mix help compile.app` for more information
      def application do
        [applications: [:logger, :httpoison, :hound]]
      end

      # Dependencies can be Hex packages:
      #
      #   {:mydep, "~> 0.3.0"}
      #
      # Or git/path repositories:
      #
      #   {:mydep, git: "https://github.com/elixir-lang/mydep.git", tag: "0.1.0"}
      #
      # Type `mix help deps` for more examples and options
      defp deps do
        [
          {:httpoison, "~> 0.7"},
          {:floki, "~> 0.7"},
          {:hound, "~> 0.7"}
        ]
      end
    end

In lib/scraper.ex:

    defmodule Example do
      use Hound.Helpers

      def run do
        Hound.start_session
        IO.inspect "Iniciando"
        navigate_to "http://akash.im"
        IO.inspect page_title()
        Hound.end_session
      end
    end

In config/config.exs:

    # This file is responsible for configuring your application
    # and its dependencies with the aid of the Mix.Config module.
    use Mix.Config

    # This configuration is loaded before any dependency and is restricted
    # to this project. If another project depends on this project, this
    # file won't be loaded nor affect the parent project. For this reason,
    # if you want to provide default values for your application for third-
    # party users, it should be done in your mix.exs file.

    # Sample configuration:
    #
    #     config :logger, :console,
    #       level: :info,
    #       format: "$date $time [$level] $metadata$message\n",
    #       metadata: [:user_id]

    # It is also possible to import configuration files, relative to this
    # directory. For example, you can emulate configuration per environment
    # by uncommenting the line below and defining dev.exs, test.exs and such.
    # Configuration from the imported file will override the ones defined
    # here (which is why it is important to import them last).
    #
    #     import_config "#{Mix.env}.exs"

    # Define how long the application will wait between failed attempts (in milliseconds)
    config :hound, retry_time: 100000

    # Start with selenium driver (default)
    config :hound, driver: "selenium"

Starting a webdriver server:

    java -jar selenium-server-standalone-2.45.0.jar

Running the app:

    /scraper$ iex -S mix
    Erlang/OTP 18 [erts-7.1] [source] [64-bit] [smp:2:2] [async-threads:10] [hipe] [kernel-poll:false]

    Interactive Elixir (1.0.5) - press Ctrl+C to exit (type h() ENTER for help)
    iex(1)> Example.run
    ** (exit) exited in: :gen_server.call(Hound.SessionServer, {:find_or_create_session, #PID<0.148.0>}, 60000)
        ** (EXIT) an exception was raised:
            ** (MatchError) no match of right hand side value: {:error, %HTTPoison.Error{id: nil, reason: :timeout}}
                (hound) lib/hound/request_utils.ex:43: Hound.RequestUtils.send_req/4
                (hound) lib/hound/session_server.ex:22: Hound.SessionServer.handle_call/3
                (stdlib) gen_server.erl:629: :gen_server.try_handle_call/4
                (stdlib) gen_server.erl:661: :gen_server.handle_msg/5
                (stdlib) proc_lib.erl:240: :proc_lib.init_p_do_apply/3

    11:26:13.971 [error] GenServer Hound.SessionServer terminating
    Last message: {:find_or_create_session, #PID<0.148.0>}
    State: #HashDict<[]>
    ** (exit) an exception was raised:
        ** (MatchError) no match of right hand side value: {:error, %HTTPoison.Error{id: nil, reason: :timeout}}
            (hound) lib/hound/request_utils.ex:43: Hound.RequestUtils.send_req/4
            (hound) lib/hound/session_server.ex:22: Hound.SessionServer.handle_call/3
            (stdlib) gen_server.erl:629: :gen_server.try_handle_call/4
            (stdlib) gen_server.erl:661: :gen_server.handle_msg/5
            (stdlib) proc_lib.erl:240: :proc_lib.init_p_do_apply/3
        (stdlib) gen_server.erl:212: :gen_server.call/3
        (scraper) lib/scraper.ex:37: Example.run/0
    iex(1)>

The request timed out, as this line of the output shows:

    ** (MatchError) no match of right hand side value: {:error, %HTTPoison.Error{id: nil, reason: :timeout}}

The stack trace shows where the error was raised:

    (hound) lib/hound/request_utils.ex:43: Hound.RequestUtils.send_req/4

If you open the Hound source, on line 43 of lib/hound/request_utils.ex you will find:

    case type do
      :get ->
        {:ok, resp} = HTTPoison.get(url, headers, @http_options)
      :post ->
        {:ok, resp} = HTTPoison.post(url, body, headers, @http_options)
      :delete ->
        {:ok, resp} = HTTPoison.delete(url, headers, @http_options)

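Each clause pattern-matches the result against `{:ok, resp}`, so the code only handles a successful response. Anything else, such as an `{:error, ...}` tuple, raises a MatchError. In your case HTTPoison returned `{:error, %HTTPoison.Error{reason: :timeout}}`, which crashed Hound.SessionServer.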
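To make the failure mode concrete, here is a minimal sketch of the same matching behaviour without any HTTP involved. `MatchDemo` and `fake_request/1` are hypothetical names standing in for the HTTPoison call; they are not part of Hound or HTTPoison.

```elixir
defmodule MatchDemo do
  # Hypothetical stand-in for an HTTP call: succeeds or times out on demand.
  def fake_request(:ok), do: {:ok, "body"}
  def fake_request(:timeout), do: {:error, %{reason: :timeout}}

  # Strict match, as in the excerpt above: anything other than
  # {:ok, _} raises a MatchError.
  def strict(outcome) do
    {:ok, resp} = fake_request(outcome)
    resp
  end

  # Handles both outcomes explicitly instead of crashing.
  def lenient(outcome) do
    case fake_request(outcome) do
      {:ok, resp} -> {:ok, resp}
      {:error, %{reason: reason}} -> {:error, reason}
    end
  end
end
```

Calling `MatchDemo.strict(:timeout)` raises a MatchError of the same kind as in your trace, while `MatchDemo.lenient(:timeout)` returns `{:error, :timeout}`.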

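Please check that the website is up and reachable when you run the test, then retry. Since the call that failed was `:find_or_create_session`, it is also worth confirming that the Selenium server is running and responding.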
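One way to check reachability from iex before retrying is a plain TCP probe. The sketch below is an illustration only: `Reachability.check/3` is a hypothetical helper, it assumes a recent Elixir (for `String.to_charlist/1`), and it only tells you that something accepts connections on the port, not that a request will complete in time.

```elixir
defmodule Reachability do
  # Returns :ok if a TCP connection to host:port succeeds within
  # timeout_ms, otherwise {:error, reason}.
  def check(host, port, timeout_ms \\ 5_000) do
    opts = [:binary, active: false]

    case :gen_tcp.connect(String.to_charlist(host), port, opts, timeout_ms) do
      {:ok, socket} ->
        # We only wanted to know that the port is open; close right away.
        :gen_tcp.close(socket)
        :ok

      {:error, reason} ->
        {:error, reason}
    end
  end
end
```

For example, `Reachability.check("akash.im", 80)` probes the site from the question, and `Reachability.check("localhost", 4444)` probes a local Selenium server, assuming it listens on its default port 4444.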

