
Getting Connection Reset error from sftp server frequently with parallel threads trying to put file on sftp

I have a piece of multithreaded code that runs 22 threads in parallel, each trying to put files on an SFTP server.

But I keep getting Connection Reset errors intermittently in my logs, and a few of the records fail because of that.

On initial analysis, I found that the SFTP server instance was a t2.small and its CPU utilization was reaching 92%.

Assuming this was the reason, I changed the server to a c5n.xlarge. The error now occurs less frequently, but I still get it at times, even though the maximum CPU utilization only reaches 63%.

I am not able to find anything unusual in the SFTP server logs at /var/log/secure.

Below is the piece of code used to put a file. Every thread creates a new session and closes it.

        Session session = null;
        ChannelSftp channel = null;
        try {
            JSch ssh = new JSch();
            // ssh.setKnownHosts("/path/of/known_hosts/file");
            java.util.Properties config = new java.util.Properties();
            // NOTE: this disables host key verification
            config.put("StrictHostKeyChecking", "no");
            // Use key authentication if it is set, else use password auth
            Object userKey = mpServerDetails.get(SftpFile.SFTP_USERKEY);
            // Check for an empty value with isEmpty(); != "" only compares references
            if (userKey != null && !userKey.toString().isEmpty()) {
                File userKeyFile = new File(userKey.toString());
                if (!userKeyFile.exists()) {
                    throw new NonRetriableException(
                            "Key file " + userKey + " not found.");
                }
                ssh.addIdentity(userKeyFile.getAbsolutePath());
                session = ssh.getSession(mpServerDetails.get(SftpFile.SFTP_USERNAME).toString(),
                        mpServerDetails.get(SftpFile.SFTP_HOSTNAME).toString());
            } else if (mpServerDetails.get(SftpFile.SFTP_PASSWORD) != null) {
                session = ssh.getSession(mpServerDetails.get(SftpFile.SFTP_USERNAME).toString(),
                        mpServerDetails.get(SftpFile.SFTP_HOSTNAME).toString());
                session.setPassword(mpServerDetails.get(SftpFile.SFTP_PASSWORD).toString());
            }
            if (session == null) {
                // Neither a key nor a password was configured, so no session was created
                throw new NonRetriableException("No SFTP key or password configured.");
            }
            session.setConfig(config);
            // connect() throws on failure, so a second connect attempt is not needed
            session.connect();
            channel = (ChannelSftp) session.openChannel("sftp");
            if (channel != null && !channel.isConnected()) {
                logger.warn("**channel is not connected going to connect the sftp channel ** {} ",
                        channel.getSession().isConnected());
                channel.connect();
            }
            channel.put(file.getAbsolutePath(), dest.getConfig().get(TransporterFileConstants.SFTP_DIRECTORY).toString()
                    + File.separatorChar + dest.getFileName(), new SystemOutProgressMonitor());

        }
        catch (NonRetriableException e) {
            // Rethrow as-is instead of wrapping it in another NonRetriableException
            throw e;
        }
        catch (Exception e) {
            // Log once, with the stack trace, then signal that the upload can be retried
            logger.error(
                    "Error occurred while uploading file having name " + dest.getFileName() + " to remote directory:"
                            + dest.getConfig().get(TransporterFileConstants.SFTP_DIRECTORY).toString(),
                    e);
            throw new RetriableException(e);
        }
        finally {
            if (null != channel && channel.isConnected()) {
                try {
                    channel.disconnect();
                }
                catch (Throwable e) {
                    logger.error("Error while disconnecting channel : ", e);
                }
            }
            if (null != session) {
                try {
                    session.disconnect();
                }
                catch (Throwable e) {
                    logger.error("Error while disconnecting session : ", e);
                }
            }
        }

Can someone help me understand why I might be getting this exception?

The SFTP server configuration is:

MaxSessions 50
Capacity - 25 GB
4 core server with 10 GB Ram

A snippet of the error message:

com.jcraft.jsch.JSchException: Session.connect: java.net.SocketException: Connection reset
    at com.jcraft.jsch.Session.connect(Session.java:558) ~[honeybee-engine.jar:na]

If this keeps happening, my data processing will not be consistent.

MaxSessions 50

The SSH server MaxSessions parameter limits the number of "sessions" that can run through a single SSH connection. You're only running one session (the SFTP session) through each connection, so the MaxSessions limit isn't particularly relevant to you.

Your problem may be with the MaxStartups setting:

MaxStartups
Specifies the maximum number of concurrent unauthenticated connections to the SSH daemon. Additional connections will be dropped until authentication succeeds or the LoginGraceTime expires for a connection. The default is 10:30:100....
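To make the default concrete: the sshd_config manual describes the three-part form as start:rate:full, where connections start being refused with probability rate/100 once "start" unauthenticated connections are pending, rising linearly to 100% at "full". The following is only a rough model of that rule, not sshd's actual code; the class and method names are hypothetical and whether the incoming connection itself is counted is glossed over.

```java
/** Illustrative model of the MaxStartups "start:rate:full" random early drop. */
final class MaxStartupsModel {
    private MaxStartupsModel() {}

    /**
     * Approximate percentage chance that a new connection is refused while
     * `unauthenticated` connections are still waiting to authenticate.
     */
    static double dropPercent(int start, int rate, int full, int unauthenticated) {
        if (unauthenticated < start) {
            return 0.0;          // below the threshold nothing is dropped
        }
        if (unauthenticated >= full) {
            return 100.0;        // at or above "full" everything is dropped
        }
        // Linear ramp from rate% at "start" up to 100% at "full"
        return rate + (100.0 - rate) * (unauthenticated - start) / (double) (full - start);
    }

    public static void main(String[] args) {
        // Default 10:30:100, evaluated for a few pending-connection counts
        for (int n : new int[] {5, 10, 22, 50, 100}) {
            System.out.printf("%3d pending -> %.1f%% chance of drop%n",
                    n, dropPercent(10, 30, 100, n));
        }
    }
}
```

Under this model, 22 threads that all connect at the same instant would put each new connection at roughly a 39% risk of being dropped, which would fit intermittent failures.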

Basically, if there are too many clients connected to the server which haven't authenticated yet, the server will drop some of those connections. If your application is opening too many connections to the server at the same time, the server may be dropping some of those connections. The solution here is to raise the value of MaxStartups, or to change your application so that it doesn't open so many connections at once.
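One way to keep the application from opening too many connections at once is to gate only the connect/authenticate step behind a semaphore, so the uploads themselves still run in parallel. This is a minimal sketch, not something taken from the question's code: the class name ConnectThrottle and the permit count are assumptions, and the permit count should stay below the server's MaxStartups "start" value.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

/** Caps how many threads may run the connect/authenticate step at the same time. */
final class ConnectThrottle {
    private final Semaphore permits;

    ConnectThrottle(int maxConcurrentHandshakes) {
        // Fair semaphore: waiting threads get permits in arrival order
        this.permits = new Semaphore(maxConcurrentHandshakes, true);
    }

    /** Runs connectStep while holding a permit; the permit is released even on failure. */
    <T> T connect(Callable<T> connectStep) throws Exception {
        permits.acquire();
        try {
            return connectStep.call();
        } finally {
            permits.release();
        }
    }
}
```

With a shared instance such as `new ConnectThrottle(5)`, each worker thread could wrap its handshake as `throttle.connect(() -> { session.connect(); return session; })` and then open the channel and call put outside the throttle, since authenticated connections no longer count against MaxStartups.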

There is also an operating system limit called the listen backlog. Basically, the operating system will only hold on to a certain number of pending TCP connections. If enough connection attempts come in at the same time, and the ssh server process isn't fast enough at accepting them, then the OS will drop some of the connection requests. The SSH server requests a backlog of 128 connections, but the OS may be capping the backlog at a lower value. If your SSH server is busy enough, you may be running into this limit.
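The backlog is easiest to see from the listening side. The sketch below is purely illustrative and is not how sshd is implemented: it opens a Java listener that requests a given backlog and never calls accept(), then counts how many client connections the OS is still willing to complete and queue. The class name BacklogDemo and the timeout are assumptions.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

final class BacklogDemo {
    private BacklogDemo() {}

    /**
     * Connects up to `clients` sockets to a listener that never calls accept()
     * and returns how many connections were established.
     */
    static int connectWithoutAccept(int requestedBacklog, int clients) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        List<Socket> connected = new ArrayList<>();
        // Port 0 asks the OS for any free port; the backlog value is only a request
        try (ServerSocket server = new ServerSocket(0, requestedBacklog, loopback)) {
            for (int i = 0; i < clients; i++) {
                Socket s = new Socket();
                try {
                    s.connect(new InetSocketAddress(loopback, server.getLocalPort()), 1000);
                    connected.add(s);
                } catch (IOException e) {
                    s.close();   // queue is full (or the attempt timed out): stop here
                    break;
                }
            }
            return connected.size();
        } finally {
            for (Socket s : connected) {
                s.close();
            }
        }
    }
}
```

Calling this with a small requested backlog and a larger client count shows where the OS stops queuing connections; the exact cutoff and the failure mode (timeout versus reset) depend on the operating system and its settings.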
