
How can I improve insert performance in ArangoDB?

My environment is a local machine: Ubuntu 12.04, ArangoDB 2.2.4 or 2.2.3, the Perl driver (ArangoDB), a CPU with 3 cores / 6 threads, and 3 GB of memory.

I used the save method. Each save call issues one HTTP GET and one HTTP POST. The results are as follows:

  1. One Perl process inserting 30000 documents: avg 700 requests/s (350 HTTP_GET and 350 HTTP_POST).
  2. 10 Perl processes inserting 30000 documents: avg 1000 requests/s (500 HTTP_GET and 500 HTTP_POST).

After running for 30 seconds, it reports an HTTP 500 error. I modified the Perl driver (ArangoDB) code to retry the request, so I could finish the test.

ArangoDB's log shows the following when the HTTP 500 error is reported:

2014-10-04T14:46:47Z [26642] DEBUG [./lib/GeneralServer/GeneralServerDispatcher.h:403]   shutdownHandler called, but no handler is known for task

I hoped my program could reach an average of 3000-5000 requests/s with fewer HTTP 500 errors. What improvements can I make? Thanks!

UPDATE on 7/10/2014: my insert sample script follows. I replaced the save method with AQL.

  1. One Perl process inserting 10000 documents: avg 900 requests/s, 1000 HTTP_POST/s (no HTTP 500).
  2. One Perl process inserting 30000 documents: avg 700 requests/s, 700 HTTP_POST/s (HTTP 500 errors occur and need to be retried).

#!/usr/bin/perl

use warnings;
use strict;

use ArangoDB;
use JSON ();    # provides JSON::true, used when creating the collection

my $itdb = ArangoDB->new(
{
    host       => '10.211.55.2',
    port       => 8529,
    keep_alive => 1,
}
);

# Find or create collection
$itdb->create('Node_temp',{isVolatile => JSON::true});
ImpNodes();

sub ImpNodes{

    for(1..30000){
        my $sth = $itdb->query('INSERT {
            "id": "Jony",
            "value": "File",
            "popup": "public",
            "version": "101",
            "machine": "10.20.18.193",
            "text": {
               "Address": ["center","bold","250","100"]
            },
            "menuitem":[
            {
                "value": "New",
                "onclick": "CreateNewDoc",
                "action": "CreateNewDoc"
            },
            {
                "value": "Open",
                "onclick": "OpenNewDoc",
                "action": "OpenNewDoc"
            },
            {
                "value": "Close",
                "onclick": "CloseDoc",
                "action": "CloseDoc"
            },
            {
                "value": "Save",
                "onclick": "SaveDoc",
                "action": "SaveDoc"
            }]
        } in Node_temp');

        my $cursor = $sth->execute({
            do_count => 1,
            batch_size => 10,
        });
    }
}

I also modified ArangoDB-0.08 so that inserts run smoothly. In Connection.pm, the http_post method now contains:

my $retries = 100;    # for testing
for ( 1 .. $retries ) {
    ( undef, $code, $msg, undef, $body ) = $self->{_http_agent}->request(
        %{ $self->{_req_args} },
        method     => 'POST',
        path_query => $path,
        headers    => $headers,
        content    => $data,
    );
    # Stop retrying unless the server answered with a 5xx status.
    last if ( $code < 500 || $code >= 600 );
    print "The return code is 5xx, retry http_post!\n";
    print $code, " : ", $msg, " : ", $body, "\n";
    # Sleep for 3 seconds before the next attempt.
    select( undef, undef, undef, 3 );
}

I strace'd the client program and could verify that a new connection is opened for each request. This causes many system calls to be issued. The strace output looks like this for each request:

17300 socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 3
17300 ioctl(3, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7fffaee760c0) = -1 ENOTTY (Inappropriate ioctl for device)
17300 lseek(3, 0, SEEK_CUR)             = -1 ESPIPE (Illegal seek)
17300 ioctl(3, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7fffaee760c0) = -1 ENOTTY (Inappropriate ioctl for device)
17300 lseek(3, 0, SEEK_CUR)             = -1 ESPIPE (Illegal seek)
17300 fcntl(3, F_SETFD, FD_CLOEXEC)     = 0
17300 setsockopt(3, SOL_TCP, TCP_NODELAY, [1], 4) = 0
17300 fcntl(3, F_GETFL)                 = 0x2 (flags O_RDWR)
17300 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
17300 connect(3, {sa_family=AF_INET, sin_port=htons(8529), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
17300 select(8, NULL, [3], [3], {299, 999526}) = 1 (out [3], left {299, 999524})
17300 write(3, "POST /_api/cursor HTTP/1.1\r\nConnection: Keep-Alive\r\nUser-Agent: Furl::HTTP/3.05\r\nHost: 127.0.0.1\r\nContent-Type: application/json\r\nContent-Length: 1032\r\nHost: 127.0.0.1:8529\r\n\r\n", 176) = 176
17300 write(3, "{\"count\":true,\"query\":\"INSERT {\\n            \\\"id\\\": \\\"Jony\\\",\\n            \\\"value\\\": \\\"File\\\",\\n            \\\"popup\\\": \\\"public\\\",\\n            \\\"version\\\": \\\"101\\\",\\n            \\\"machine\\\": \\\"10.20.18.193\\\",\\n            \\\"text\\\": {\\n               \\"..., 1032) = 1032
17300 read(3, 0x15f0af0, 10240)         = -1 EAGAIN (Resource temporarily unavailable)
--
17300 close(3)                          = 0
17300 rt_sigprocmask(SIG_BLOCK, [PIPE], [], 8) = 0
17300 rt_sigaction(SIGPIPE, {SIG_DFL, [], SA_RESTORER, 0x7faa49b221f0}, {SIG_IGN, [], SA_RESTORER, 0x7faa49b221f0}, 8) = 0
17300 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
17300 rt_sigprocmask(SIG_BLOCK, [PIPE], [], 8) = 0
17300 rt_sigaction(SIGPIPE, {SIG_IGN, [], SA_RESTORER, 0x7faa49b221f0}, {SIG_DFL, [], SA_RESTORER, 0x7faa49b221f0}, 8) = 0
17300 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0

I think you will want to avoid establishing and closing a connection on each request. Doing so also solves the problem of the OS running out of ports.
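To check connection reuse independently of the driver, a minimal sketch along the following lines may help. It bypasses the ArangoDB Perl driver and Furl entirely and uses the core module HTTP::Tiny instead, which is a swapped-in client and not what the question uses. The helper names `cursor_body` and `insert_many`, the base URL, and the shortened document are assumptions made for illustration; the `/_api/cursor` path and the `count` and `query` keys are taken from the strace output above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use HTTP::Tiny;                  # core module; swapped in for the driver/Furl
use JSON::PP qw(encode_json);    # core module

# Build the JSON body for POST /_api/cursor, mirroring the request
# visible in the strace output ("count" and "query" keys).
sub cursor_body {
    my ($aql) = @_;
    return encode_json( { count => JSON::PP::true, query => $aql } );
}

# Send $n identical AQL statements through ONE client object, so the
# client gets the chance to reuse its TCP connection between requests.
# $http only needs a post($url, \%options) method, as HTTP::Tiny provides.
# Returns the number of requests that did not succeed.
sub insert_many {
    my ( $http, $base_url, $aql, $n ) = @_;
    my $body   = cursor_body($aql);
    my $failed = 0;
    for ( 1 .. $n ) {
        my $res = $http->post(
            "$base_url/_api/cursor",
            {
                headers => { 'Content-Type' => 'application/json' },
                content => $body,
            }
        );
        $failed++ unless $res->{success};
    }
    return $failed;
}

# Example invocation (hypothetical URL and document; needs a running server):
#   my $http   = HTTP::Tiny->new( keep_alive => 1 );
#   my $failed = insert_many( $http, 'http://127.0.0.1:8529',
#       'INSERT { "value": "File" } IN Node_temp', 1000 );
#   print "failed requests: $failed\n";
```

Running such a script under strace, as done above, should show whether socket() and connect() still appear once per request or only once at the start.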

To keep the driver from re-opening connections all the time, I had to modify Furl as follows:

In line 526 of Furl/HTTP.pm, Furl checks the HTTP response headers it gets from the server. It reads the Connection header from the response there and compares the header value with the string keep-alive. The problem is that this comparison is case-sensitive. ArangoDB returns a header value of Keep-Alive (mind the caps), so Furl does not recognize it properly.

The following change to Furl/HTTP.pm fixes that:

-    if ($connection_header eq 'keep-alive') {
+    if (lc($connection_header) eq 'keep-alive') {

With this change, the client no longer closes the connection after each request and no longer runs out of ports.
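The effect of the one-line change can be seen in isolation with a small sketch. The helper name `is_keep_alive` is hypothetical and not part of Furl; it only reproduces the comparison from the patch so that the exact-match and case-insensitive checks can be compared side by side.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return 1 when a Connection header value asks to keep the connection
# open, ignoring letter case; return 0 otherwise (including undef).
sub is_keep_alive {
    my ($connection_header) = @_;
    return 0 unless defined $connection_header;
    return lc($connection_header) eq 'keep-alive' ? 1 : 0;
}

# Compare the original exact-match test with the case-insensitive one
# for the lower-case value, the value ArangoDB sends, and "close".
for my $value ( 'keep-alive', 'Keep-Alive', 'close' ) {
    my $exact  = $value eq 'keep-alive' ? 'yes' : 'no';
    my $folded = is_keep_alive($value)  ? 'yes' : 'no';
    printf "%-10s exact=%-3s case-insensitive=%s\n", $value, $exact, $folded;
}
```

Only the case-insensitive check accepts the capitalised Keep-Alive value, which is why the patched client keeps its connection open.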
