
Erlang: How to trigger ECONNRESET on Linux?

I recently ran into heisenbug-like occurrences of ECONNRESET with Erlang on Mac OS X (not my machine!). To isolate the source of the problem, and to understand the error itself better, I wrote the following module to trigger ECONNRESET. Unfortunately, I do not own a Mac myself. On my Linux machine (ArchLinux), the code below does not crash with a badmatch of {error,econnreset}, but with {error,closed} instead.

-module(econnreset).

-export([run/0]).

run() ->
    Pid = spawn_link(fun listen_and_accept/0),
    Pid ! {get_port,self()},
    Port = receive {port,P} -> P end,

    {ok,Socket} = gen_tcp:connect("localhost", Port, [{active,false}]),
    ok = gen_tcp:send(Socket, lists:duplicate(100, $a)),

    %% On the following line, I expected a badmatch w/ {error,econnreset}
    {ok,_} = gen_tcp:recv(Socket, 0, 1000),
    ok = gen_tcp:close(Socket),
    ok.

listen_and_accept() ->
    {ok,LSocket} = gen_tcp:listen(0, [binary,{active,false}]),
    {ok,Port} = inet:port(LSocket),
    receive {get_port,Pid} -> Pid ! {port,Port} end,

    {ok,Socket} = gen_tcp:accept(LSocket),
    {ok,Bin} = gen_tcp:recv(Socket, 1), %=> read only 1 byte to trigger tcp RST
    io:format("Server read ~p~n", [Bin]),
    gen_tcp:close(Socket),
    gen_tcp:close(LSocket).

As I have so far encountered ECONNRESET only on Mac OS X and not on Linux, this comes down to two questions:

  1. Is the code above even capable of producing ECONNRESET by calling recv/2?
  2. Do Mac OS X and Linux behave differently here?

Here are some scenarios in which this can happen.

Attempting to send new data on a socket whose remote endpoint has already started to close its end of the connection can result in ECONNRESET. (Sometimes it results in SIGPIPE being raised instead.)

Normally, this is hard to reproduce because most socket code learns of a closed connection when a recv() call returns 0.
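In Erlang, the "recv() returns 0" case surfaces as {error,closed} from gen_tcp:recv. The following is a minimal sketch of that orderly-close path, not a definitive test: the module name orderly_close_demo and the server_done message are made up for illustration, and the server closes without leaving any unread data behind, so the kernel is expected to send a FIN rather than a RST.

```erlang
-module(orderly_close_demo).

-export([run/0]).

%% Hypothetical demo: the server accepts one connection and closes it
%% immediately, without any unread data pending (orderly close, FIN).
run() ->
    {ok, LSocket} = gen_tcp:listen(0, [binary, {active, false},
                                       {ip, {127,0,0,1}}]),
    {ok, Port} = inet:port(LSocket),
    Parent = self(),
    spawn_link(fun() ->
        {ok, Socket} = gen_tcp:accept(LSocket),
        ok = gen_tcp:close(Socket),
        %% Tell the client side that the close has been issued.
        Parent ! server_done
    end),
    {ok, Client} = gen_tcp:connect({127,0,0,1}, Port,
                                   [binary, {active, false}]),
    receive server_done -> ok end,
    %% The peer closed normally, so this reports the end of the stream.
    Result = gen_tcp:recv(Client, 0, 1000),
    ok = gen_tcp:close(Client),
    ok = gen_tcp:close(LSocket),
    Result.
```

Calling orderly_close_demo:run() from the shell returns the result of the client's recv, which makes it easy to compare against the reset cases discussed here.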

It can also happen if the remote host crashes or loses power. A typical scenario is two hosts that have an active TCP connection between each other. Neither host is using TCP keep-alives, and they only periodically send/recv data between each other. At the moment, assume both hosts are idle and do not have any data to send to each other.

The first host, HOST A, suffers a temporary power outage (i.e. someone pulls the power cord) and reboots. The other host, HOST B, has no awareness of the outage. As such, its TCP state machine thinks the socket is still in the connected state. At some point, HOST B attempts to send data on the socket. An IP packet eventually reaches HOST A, but since the reboot the TCP state machine of HOST A no longer has a connection record for HOST B's IP:port. Hence, the only thing it can do is send back a RST to inform HOST B that the connection is dead.

I've heard, though I could be wrong, that a socket can be forced into the CLOSED state without sending a FIN by using SO_LINGER with a zero value. Hence, subsequent data sent by the other side will result in a RST as the response.
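The SO_LINGER idea can be tried from Erlang along the following lines. This is a sketch under stated assumptions rather than a confirmed fix for the question: the module name linger_reset_demo and the server_done message are hypothetical, the {linger,{true,0}} socket option is used as Erlang's counterpart of SO_LINGER with a zero timeout, and the client sets {show_econnreset,true}, which assumes an OTP release that supports that option. Without show_econnreset, gen_tcp reports a received RST as a normal close, which may be why the original module only ever sees {error,closed}.

```erlang
-module(linger_reset_demo).

-export([run/0]).

%% Hypothetical demo: the server aborts the connection by closing with
%% linger enabled and a zero timeout, which should send RST, not FIN.
run() ->
    {ok, LSocket} = gen_tcp:listen(0, [binary, {active, false},
                                       {ip, {127,0,0,1}}]),
    {ok, Port} = inet:port(LSocket),
    Parent = self(),
    spawn_link(fun() ->
        {ok, Socket} = gen_tcp:accept(LSocket),
        %% Counterpart of SO_LINGER with l_onoff = 1, l_linger = 0.
        ok = inet:setopts(Socket, [{linger, {true, 0}}]),
        ok = gen_tcp:close(Socket),
        Parent ! server_done
    end),
    %% show_econnreset asks gen_tcp to report a reset as econnreset
    %% instead of folding it into a plain close (assumes OTP support).
    {ok, Client} = gen_tcp:connect({127,0,0,1}, Port,
                                   [binary, {active, false},
                                    {show_econnreset, true}]),
    receive server_done -> ok end,
    Result = gen_tcp:recv(Client, 0, 1000),
    ok = gen_tcp:close(Client),
    ok = gen_tcp:close(LSocket),
    %% Return the recv result so the caller can inspect it.
    Result.
```

On Linux I would expect linger_reset_demo:run() to return {error,econnreset}, but the exact reason can depend on the OS and the OTP version, so the function returns the result instead of matching on it.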
