
SQL Benchmarks: PHP ActiveRecord ORM vs. MySQL vs. CodeIgniter Active Record vs. Standard PHP

Tests have been updated to be more readable; every test is run inside a 100-iteration loop.

The test query is SELECT * FROM school_courses.

Can anyone provide some outside-the-box feedback on the following:

a) Why does PHP ActiveRecord ORM take roughly 4 seconds to perform the same query, per the results below?

b) Is this a practical benchmark, or more of a hypothetical one, for comparing query methods?

c) Are there other methods (test cases) I should try, or revisions to these, to get a clearer picture?

Results (with PDO & MySQLi)

Iterations: 100

PHP (config file)
Base Time: 5.793571472168E-5
Gross Time: 0.055607080459595
Net Time: 0.055549144744873

PHP ActiveRecord ORM
Base Time: 5.2213668823242E-5
Gross Time: 4.1013090610504
Net Time: 4.1012568473816

MySQL (standard)
Base Time: 5.1975250244141E-5
Gross Time: 0.32771301269531
Net Time: 0.32766103744507

CodeIgniter (Active Record)
Base Time: 5.1975250244141E-5
Gross Time: 0.28282189369202
Net Time: 0.28276991844177

MySQLi
Base Time: 5.1975250244141E-5
Gross Time: 0.20240592956543
Net Time: 0.20235395431519

PDO
Base Time: 5.2928924560547E-5
Gross Time: 0.17662906646729
Net Time: 0.17657613754272
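The timing harness that produced these figures isn't shown in the question; below is a minimal sketch of how the Base, Gross and Net times could be computed with microtime(true). The benchmark() helper and its name are assumptions, not the code actually used.

// A sketch of a possible timing harness (hypothetical helper, not the actual one).
function benchmark(callable $test, $runs = 100)
{
    // Base Time: cost of an empty loop (loop overhead only).
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        // intentionally empty
    }
    $base = microtime(true) - $start;

    // Gross Time: the same loop, running the test on each iteration.
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $test();
    }
    $gross = microtime(true) - $start;

    // Net Time: Gross minus Base.
    return array('base' => $base, 'gross' => $gross, 'net' => $gross - $base);
}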

Tests

// Benchmark tests
$runs = 100;

// PHP (config file)
for ($i = 0; $i < $runs; $i++) {
    $this->view_data['courses'] = course_info();
}

// PHP ActiveRecord ORM
for ($i = 0; $i < $runs; $i++) {
    $this->view_data['courses'] = Course::all();
}

// mysql_* (MySQL standard; deprecated)
for ($i = 0; $i < $runs; $i++) {
    $result = mysql_query('SELECT * FROM school_courses') or die(mysql_error());
    while ($row = mysql_fetch_object($result)) {
        array_push($this->view_data['courses'], $row);
    }
}

// CodeIgniter (Active Record)
for ($i = 0; $i < $runs; $i++) {
    $this->view_data['courses'] = $this->db->get('school_courses');
}

// mysqli_* (MySQLi)
for ($i = 0; $i < $runs; $i++) {
    $res = $mysqli->query('SELECT * FROM school_courses');
    while ($row = $res->fetch_object()) {
        array_push($this->view_data['courses'], $row);
    }
}

// PDO
for ($i = 0; $i < $runs; $i++) {
    foreach($conn->query('SELECT * FROM school_courses') as $row) {
        array_push($this->view_data['courses'], $row);
    }
}

So the reason PHP ActiveRecord ORM introduces so much overhead in this benchmark is that every returned result instantiates a new Model object. Since this is integral to how the ORM library is used, I don't see any reasonable way of changing it without overhauling the entire library.

Here is what I found:

Inside the find_by_sql() method in the Table class, you have:

    $sth = $this->conn->query($sql,$this->process_data($values));

    while (($row = $sth->fetch()))
    {
        $model = new $this->class->name($row,false,true,false);

        if ($readonly)
            $model->readonly();

        if ($collect_attrs_for_includes)
            $attrs[] = $model->attributes();

        $list[] = $model;
    }

Specifically, the dynamic model instantiation new $this->class->name() is responsible for the overhead, weighing in at somewhere around 0.004 seconds per result fetched, let's say.

Take that and multiply it by the number of records (10 records = 0.04 s). Now multiply that by the number of concurrent connections, say 100, and you have a foreseeable bottleneck.

That is four (4) seconds for 100 users (hypothetically speaking) accessing, at the same time, a table containing 10 records.
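As a back-of-envelope check of that estimate (the 0.004 s per-row cost is the rough figure assumed above, not a measured constant):

// Rough estimate only; the per-row cost is an assumption from the profiling above.
$costPerRow  = 0.004; // seconds spent instantiating one Model object
$rows        = 10;    // records in school_courses
$connections = 100;   // simulated concurrent requests

echo $costPerRow * $rows * $connections; // 4 seconds, in line with the ~4.1 s measured above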

Should I be concerned at this point that the number of records being fetched could cause bottlenecking issues, given the way this library instantiates a model class for every record?

Again, all of this may be hypothetical at this point and might never present a problem in the real world, assuming proper use of an ORM. And unless these tests or conclusions are inaccurate, what I'm trying to simulate here is the traffic load of, say, 100, 1,000 and 10,000 active on-site visitors.

In other words, if I never add another course (limit 10), will 10,000 visitors browsing the courses page, for example, lead to a 400-second (6.67-minute) wait before others can load the page? If that's the case, then I will have discovered my own answer (hence this post) and will look into finding another ORM, or resort to refactoring on a case-by-case basis.

Is this the most appropriate way of benchmarking and simulating traffic load?

Additional Resources

How to Apache Stress Test With ab Tool: https://wiki.appnexus.com/display/documentation/How+to+Apache+Stress+Test+With+ab+Tool
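For example, a basic invocation against the courses page could look like the following (the URL and numbers are placeholders); -n is the total number of requests and -c is the concurrency level:

ab -n 1000 -c 100 http://localhost/courses/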

Rewrite recommendation:

I don't want to sound brutal, but you can save yourself a lot of headaches in the future, and keep up with current practices, if you forget everything you know about mysql_*(). By today's standards it is, honestly, trash. Look into mysqli_* or PDO as your database interface.

mysqli_*: http://us2.php.net/manual/en/book.mysqli.php

PDO: http://us2.php.net/manual/en/book.pdo.php
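As a rough illustration of the PDO route for the same query (the DSN, credentials and fetch mode below are assumptions, not taken from the question):

// Minimal PDO sketch: connect once, run the query, fetch plain objects.
$conn = new PDO('mysql:host=localhost;dbname=school;charset=utf8', 'user', 'pass');
$conn->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$stmt    = $conn->query('SELECT * FROM school_courses');
$courses = $stmt->fetchAll(PDO::FETCH_OBJ); // stdClass rows, no per-row Model objects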

Then report back with the benchmarks...

Your simple query isn't really a fair test. ORMs are fine and fairly competitive for simple queries like that. It's the more complex ones (e.g. LEFT JOINs) where ORMs generate inefficient queries and you end up having to bypass them. ORMs will always be slower than raw SQL written by someone who knows SQL. Of course, knowing SQL is the key.

If you are considering ORMs, you really should try Doctrine. I am not a fan of ORMs (at all), but it is the most popular PHP ORM out there.

Bulk inserts are another area where some ORMs and DB abstraction layers trip up. Instead of recognizing that a bulk insert can be used, they do single inserts in a loop. Besides being slow, that will cause table-locking issues on MyISAM. Perhaps add a bulk-insert test, letting each DB layer generate the insert query where possible.
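To illustrate the idea, here is a sketch of a hand-built multi-row INSERT with PDO; the column names and values are made up for the example, and $conn is assumed to be a PDO handle:

// One multi-row INSERT instead of N single-row inserts in a loop.
$rows = array(
    array('Algebra I', 3),
    array('Biology',   4),
    array('History',   3),
);

// Build "(?, ?), (?, ?), (?, ?)" to match the number of rows.
$placeholders = implode(', ', array_fill(0, count($rows), '(?, ?)'));
$stmt = $conn->prepare("INSERT INTO school_courses (name, credits) VALUES $placeholders");

// Flatten the row values into one parameter list for execute().
$params = array();
foreach ($rows as $row) {
    foreach ($row as $value) {
        $params[] = $value;
    }
}
$stmt->execute($params);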

What your testing method does reveal is that, over many iterations, the overhead of each DB access method adds up. I would suggest eliminating the cost of the query itself altogether and just using "SELECT VERSION()" instead, so that only the per-call overhead of each layer is measured.
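In other words, something along these lines would isolate the per-call overhead of the access layer itself (a sketch; $conn is the PDO handle and $runs the iteration count from the tests above):

// Near-zero-cost query, so the loop measures driver/abstraction overhead
// rather than result-set size or query planning.
for ($i = 0; $i < $runs; $i++) {
    $version = $conn->query('SELECT VERSION()')->fetchColumn();
}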
