[英]Ruby on Rails “invalid byte sequence in UTF-8” due to bot
I have some errors triggered by a chinese bot: http://www.easou.com/search/spider.html when it scrolls my websites.我有一些由中国机器人触发的错误: http : //www.easou.com/search/spider.html当它滚动我的网站时。
Versions of my applications are all with Ruby 1.9.3 and Rails 3.2.X我的应用程序版本都是 Ruby 1.9.3 和 Rails 3.2.X
Here a stacktrace :这里有一个堆栈跟踪:
An ArgumentError occurred in listings#show:
invalid byte sequence in UTF-8
rack (1.4.5) lib/rack/utils.rb:104:in `normalize_params'
-------------------------------
Request:
-------------------------------
* URL : http://www.my-website.com
* IP address: X.X.X.X
* Parameters: {"action"=>"show", "controller"=>"listings", "id"=>"location-t7-villeurbanne--58"}
* Rails root: /.../releases/20140708150222
* Timestamp : 2014-07-09 02:57:43 +0200
-------------------------------
Backtrace:
-------------------------------
rack (1.4.5) lib/rack/utils.rb:104:in `normalize_params'
rack (1.4.5) lib/rack/utils.rb:96:in `block in parse_nested_query'
rack (1.4.5) lib/rack/utils.rb:93:in `each'
rack (1.4.5) lib/rack/utils.rb:93:in `parse_nested_query'
rack (1.4.5) lib/rack/request.rb:332:in `parse_query'
actionpack (3.2.18) lib/action_dispatch/http/request.rb:275:in `parse_query'
rack (1.4.5) lib/rack/request.rb:209:in `POST'
actionpack (3.2.18) lib/action_dispatch/http/request.rb:237:in `POST'
actionpack (3.2.18) lib/action_dispatch/http/parameters.rb:10:in `parameters'
-------------------------------
Session:
-------------------------------
* session id: nil
* data: {}
-------------------------------
Environment:
-------------------------------
* CONTENT_LENGTH : 514
* CONTENT_TYPE : application/x-www-form-urlencoded
* HTTP_ACCEPT : text/html, application/xml;q=0.9, application/xhtml+xml, image/png, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1
* HTTP_ACCEPT_ENCODING : gzip, deflate
* HTTP_ACCEPT_LANGUAGE : zh;q=0.9,en;q=0.8
* HTTP_CONNECTION : close
* HTTP_HOST : www.my-website.com
* HTTP_REFER : http://www.my-website.com/
* HTTP_USER_AGENT : Mozilla/5.0 (compatible; EasouSpider; +http://www.easou.com/search/spider.html)
* ORIGINAL_FULLPATH : /
* PASSENGER_APP_SPAWNER_IDLE_TIME : -1
* PASSENGER_APP_TYPE : rack
* PASSENGER_CONNECT_PASSWORD : [FILTERED]
* PASSENGER_DEBUGGER : false
* PASSENGER_ENVIRONMENT : production
* PASSENGER_FRAMEWORK_SPAWNER_IDLE_TIME : -1
* PASSENGER_FRIENDLY_ERROR_PAGES : true
* PASSENGER_GROUP :
* PASSENGER_MAX_REQUESTS : 0
* PASSENGER_MIN_INSTANCES : 1
* PASSENGER_SHOW_VERSION_IN_HEADER : true
* PASSENGER_SPAWN_METHOD : smart-lv2
* PASSENGER_USER :
* PASSENGER_USE_GLOBAL_QUEUE : true
* PATH_INFO : /
* QUERY_STRING :
* REMOTE_ADDR : 183.60.212.153
* REMOTE_PORT : 52997
* REQUEST_METHOD : GET
* REQUEST_URI : /
* SCGI : 1
* SCRIPT_NAME :
* SERVER_PORT : 80
* SERVER_PROTOCOL : HTTP/1.1
* SERVER_SOFTWARE : nginx/1.2.6
* UNION_STATION_SUPPORT : false
* _ : _
* action_controller.instance : listings#show
* action_dispatch.backtrace_cleaner : #<Rails::BacktraceCleaner:0x000000056e8660>
* action_dispatch.cookies : #<ActionDispatch::Cookies::CookieJar:0x00000006564e28>
* action_dispatch.logger : #<ActiveSupport::TaggedLogging:0x0000000318aff8>
* action_dispatch.parameter_filter : [:password, /RAW_POST_DATA/, /RAW_POST_DATA/, /RAW_POST_DATA/]
* action_dispatch.remote_ip : 183.60.212.153
* action_dispatch.request.content_type : application/x-www-form-urlencoded
* action_dispatch.request.parameters : {"action"=>"show", "controller"=>"listings", "id"=>"location-t7-villeurbanne--58"}
* action_dispatch.request.path_parameters : {:action=>"show", :controller=>"listings", :id=>"location-t7-villeurbanne--58"}
* action_dispatch.request.query_parameters : {}
* action_dispatch.request.request_parameters : {}
* action_dispatch.request.unsigned_session_cookie: {}
* action_dispatch.request_id : 9f8afbc8ff142f91ddbd9cabee3629f3
* action_dispatch.routes : #<ActionDispatch::Routing::RouteSet:0x0000000339f370>
* action_dispatch.show_detailed_exceptions : false
* action_dispatch.show_exceptions : true
* rack-cache.allow_reload : false
* rack-cache.allow_revalidate : false
* rack-cache.cache_key : Rack::Cache::Key
* rack-cache.default_ttl : 0
* rack-cache.entitystore : rails:/
* rack-cache.ignore_headers : ["Set-Cookie"]
* rack-cache.metastore : rails:/
* rack-cache.private_headers : ["Authorization", "Cookie"]
* rack-cache.storage : #<Rack::Cache::Storage:0x000000039c5768>
* rack-cache.use_native_ttl : false
* rack-cache.verbose : false
* rack.errors : #<IO:0x000000006592a8>
* rack.input : #<PhusionPassenger::Utils::RewindableInput:0x0000000655b3a0>
* rack.multiprocess : true
* rack.multithread : false
* rack.request.cookie_hash : {}
* rack.request.form_hash :
* rack.request.form_input : #<PhusionPassenger::Utils::RewindableInput:0x0000000655b3a0>
* rack.request.form_vars : ���W�"��陷q�B��)���
�F��P Z� 8�� & G\y�P��u�T ed �.�%�mxEAẳ\�d*�Hg� �C賳�lj��� � U 1��]pgt�P�
Ɗ ��c"� ��LX��D���HR�y��p`6�l���lN�P �l�S����`V4y��c����X2� &JO!��*p �l��-�гU��w }g�ԍk�� (� F J�� q�:�5G�Jh�pί����ࡃ] �z�h���� d }�}
* rack.request.query_hash : {}
* rack.request.query_string :
* rack.run_once : false
* rack.session : {}
* rack.session.options : {:path=>"/", :domain=>nil, :expire_after=>nil, :secure=>false, :httponly=>true, :defer=>false, :renew=>false, :coder=>#<Rack::Session::Cookie::Base64::Marshal:0x000000034d4ad8>, :id=>nil}
* rack.url_scheme : http
* rack.version : [1, 0]
As you can see there is no invalid utf-8 in the url but only in the rack.request.form_vars
.如您所见,url 中没有无效的 utf-8,而只有
rack.request.form_vars
。 I have about hundred errors per days, and all similar as this one.我每天大约有一百个错误,并且都与这个相似。
So, I tried to force utf-8 in rack.request.form_vars
with something like this:所以,我试图在
rack.request.form_vars
强制使用 utf-8 ,如下所示:
class RackFormVarsSanitizer
def initialize(app)
@app = app
end
def call(env)
if env["rack.request.form_vars"]
env["rack.request.form_vars"] = env["rack.request.form_vars"].force_encoding('UTF-8')
end
@app.call(env)
end
end
And I call it in my application.rb
:我在我的
application.rb
调用它:
config.middleware.use "RackFormVarsSanitizer"
It doesn't seem to work because I already have errors.它似乎不起作用,因为我已经有错误。 The problem is I can't test in development mode because I don't know how to set
rack.request.form_vars
.问题是我无法在开发模式下进行测试,因为我不知道如何设置
rack.request.form_vars
。
I installed utf8-cleaner
gem but it fixes nothing.我安装了
utf8-cleaner
gem,但它没有修复任何问题。
Somebody have an idea to fix this?有人有解决这个问题的想法吗? or to trigger it in development?
还是在开发中触发它?
So you don't have to piece together the comments in my other reply, this is what I'm doing now – I've seen no errors for 24 hours, so it looks very promising:所以你不必把我另一个回复中的评论拼凑起来,这就是我现在正在做的——我已经 24 小时没有看到错误,所以它看起来很有希望:
Add rack-utf8_sanitizer to your Gemfile:将rack-utf8_sanitizer添加到您的 Gemfile 中:
gem 'rack-utf8_sanitizer'
and run并运行
bundle
Put this middleware in app/middleware/handle_invalid_percent_encoding.rb
and rename the class HandleInvalidPercentEncoding
(because ExceptionApp
is a bit too general).把这个中间件放在
app/middleware/handle_invalid_percent_encoding.rb
并重命名类HandleInvalidPercentEncoding
(因为ExceptionApp
有点太笼统了)。
In the config
block of config/application.rb
do:在
config/application.rb
的config
块中执行:
require "#{Rails.root}/app/middleware/handle_invalid_percent_encoding.rb"
# NOTE: These must be in this order relative to each other.
# HandleInvalidPercentEncoding just raises for encoding errors it doesn't cover,
# so it must run after (= be inserted before) Rack::UTF8Sanitizer.
config.middleware.insert 0, HandleInvalidPercentEncoding
config.middleware.insert 0, Rack::UTF8Sanitizer # from a gem
Deploy.部署。 Done.
完毕。
( app
happens to be the location for middleware in the project I'm working on, but I'd probably prefer lib
. Whatever. Either should work.) (
app
恰好是我正在处理的项目中中间件的位置,但我可能更喜欢lib
。无论如何。两者都应该可行。)
Add this line to your Gemfile
, then run bundle
in your terminal:将此行添加到
Gemfile
,然后在终端中运行bundle
:
gem "handle_invalid_percent_encoding_requests"
This solution is based on Henrik's answer , turned into a Rails Engine gem .这个解决方案基于Henrik 的回答,变成了 Rails Engine gem 。
There is an issue in the gem repo with a link to someone's possible solution – they say it works for them but they're not sure if it's a good solution. gem repo 中存在一个问题,其中包含指向某人可能的解决方案的链接——他们说这对他们有用,但他们不确定这是否是一个好的解决方案。
I've yet to try it, but I think I will.我还没有尝试过,但我想我会的。
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.