
How do you manage SQL queries?

At the moment my code (PHP) has too many SQL queries in it, e.g.:

// not a real example, but you get the idea...
$results = $db->GetResults("SELECT * FROM sometable WHERE iUser=$userid");
if ($results) {
    // Do something
}

I am looking into using stored procedures to reduce this and make things a little more robust, but I have some concerns.

I have hundreds of different queries in use around the web site, and many of them are quite similar. How should I manage all these queries when they are removed from their context (the code that uses the results) and placed in a stored procedure on the database?

The best course of action for you will depend on how you are approaching your data access. There are three approaches you can take:

  • Use stored procedures
  • Keep the queries in the code (but put all your queries into functions and fix everything to use PDO for parameters, as mentioned earlier)
  • Use an ORM tool

If all you want to do is get the raw SQL out of your PHP code but keep it relatively unchanged, then stored procedures would be the way to go. The stored procedures vs raw SQL debate is a bit of a holy war, but K. Scott Allen makes an excellent point - albeit a throwaway one - in an article about versioning databases:

Secondly, stored procedures have fallen out of favor in my eyes. I came from the WinDNA school of indoctrination that said stored procedures should be used all the time. Today, I see stored procedures as an API layer for the database. This is good if you need an API layer at the database level, but I see lots of applications incurring the overhead of creating and maintaining an extra API layer they don't need. In those applications stored procedures are more of a burden than a benefit.

I tend to lean towards not using stored procedures. I've worked on projects where the DB has an API exposed through stored procedures, but stored procedures can impose some limitations of their own, and those projects have all, to varying degrees, used dynamically generated raw SQL in code to access the DB.

Having an API layer on the DB gives better delineation of responsibilities between the DB team and the Dev team, at the expense of some of the flexibility you'd have if the query was kept in the code. However, PHP projects are less likely to have teams sizable enough to benefit from this delineation.

Conceptually, you should probably have your database versioned. Practically speaking, however, you're far more likely to have just your code versioned than you are to have your database versioned. You are likely to be changing your queries when you are making changes to your code, but if you are changing the queries in stored procedures stored against the database, then you probably won't be checking those in when you check the code in, and you lose many of the benefits of versioning for a significant area of your application.

Regardless of whether you elect to use stored procedures, though, you should at the very least ensure that each database operation is stored in an independent function rather than being embedded into each of your page's scripts - essentially an API layer for your DB which is maintained and versioned with your code. If you're using stored procedures, this will effectively mean you have two API layers for your DB, one with the code and one with the DB, which you may feel unnecessarily complicates things if your project does not have separate teams. I certainly do.

If the issue is one of code neatness, there are ways to make code with SQL jammed in it more presentable, and the UserManager class shown below is a good way to start - the class only contains queries which relate to the 'user' table, each query has its own method in the class, and the queries are indented into the prepare statements and formatted as you would format them in a stored procedure.

// UserManager.php:

class UserManager
{
    function getUsers()
    {
        $pdo = new PDO(...);
        $stmt = $pdo->prepare('
            SELECT       u.userId as id,
                         u.userName,
                         g.groupId,
                         g.groupName
            FROM         user u
            INNER JOIN   `group` g
            ON           u.groupId = g.groupId
            ORDER BY     u.userName, g.groupName
        ');
        // run the query and return every row as an associative array
        $stmt->execute();
        return $stmt->fetchAll(PDO::FETCH_ASSOC);
    }

    function getUser($id) {
        // db code here
    }
}

// index.php:
require_once("UserManager.php");
$um = new UserManager;
$users = $um->getUsers();
foreach ($users as $user) echo $user['userName'];
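The getUser($id) stub above could be filled in along the following lines. This is only a sketch under stated assumptions: the PDO connection is passed into the constructor (the original creates it inside each method), an in-memory SQLite database stands in for the real one, and the table contents are made up for illustration.

```php
<?php
// Assumption: an in-memory SQLite database stands in for the real connection.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE user (userId INTEGER PRIMARY KEY, userName TEXT)');
$pdo->exec("INSERT INTO user (userId, userName) VALUES (1, 'alice'), (2, 'bob')");

class UserManager
{
    private $pdo;

    // Assumption: the connection is injected rather than created per method.
    public function __construct(PDO $pdo)
    {
        $this->pdo = $pdo;
    }

    // Returns the matching row as an associative array, or null if none exists.
    public function getUser($id)
    {
        $stmt = $this->pdo->prepare('
            SELECT  u.userId AS id,
                    u.userName
            FROM    user u
            WHERE   u.userId = :id
        ');
        $stmt->bindValue(':id', (int) $id, PDO::PARAM_INT);
        $stmt->execute();
        $row = $stmt->fetch(PDO::FETCH_ASSOC);
        return $row === false ? null : $row;
    }
}

$um = new UserManager($pdo);
$user = $um->getUser(2);
echo $user['userName'], "\n";
```

Returning null for a missing row is a design choice made here for the sketch; throwing an exception would be equally reasonable.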

However, if your queries are quite similar but you have huge numbers of permutations in your query conditions, like complicated paging, sorting, filtering, etc., an Object/Relational mapper tool is probably the way to go, although the process of overhauling your existing code to make use of the tool could be quite complicated.

If you decide to investigate ORM tools, you should look at Propel, the ActiveRecord component of Yii, or the king-daddy PHP ORM, Doctrine. Each of these gives you the ability to programmatically build queries to your database with all manner of complicated logic. Doctrine is the most fully featured, allowing you to template your database with things like the Nested Set tree pattern out of the box.

In terms of performance, stored procedures are the fastest, but generally not by much over raw SQL. ORM tools can have a significant performance impact in a number of ways - inefficient or redundant querying, huge file IO while loading the ORM libraries on each request, dynamic SQL generation on each query... all of these things can have an impact, but the use of an ORM tool can drastically increase the power available to you with a much smaller amount of code than creating your own DB layer with manual queries.

Gary Richardson is absolutely right though: if you're going to continue to use SQL in your code, you should always be using PDO's prepared statements to handle the parameters, regardless of whether you're using a query or a stored procedure. PDO performs the sanitisation of the bound input for you.

// optional
$attrs = array(PDO::ATTR_PERSISTENT => true);

// create the PDO object
$pdo = new PDO("mysql:host=localhost;dbname=test", "user", "pass", $attrs);

// also optional, but it makes PDO raise exceptions instead of 
// PHP errors which are far more useful for debugging
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$stmt = $pdo->prepare('INSERT INTO venue(venueName, regionId) VALUES(:venueName, :regionId)');
$stmt->bindValue(":venueName", "test");
$stmt->bindValue(":regionId", 1);

$stmt->execute();

$lastInsertId = $pdo->lastInsertId();
var_dump($lastInsertId);

Caveat: assuming that the ID is 1, the above script will output string(1) "1". PDO->lastInsertId() returns the ID as a string regardless of whether the actual column is an integer or not. This will probably never be a problem for you, as PHP performs casting of strings to integers automatically.

The following will output bool(true):

// regular equality test
var_dump($lastInsertId == 1); 

but if you have code that is expecting the value to be an integer, like is_int or PHP's "is really, truly, 100% equal to" operator:

var_dump(is_int($lastInsertId));
var_dump($lastInsertId === 1);

you could run into some issues.
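One way around the caveat is to cast the ID explicitly as soon as it is read. The sketch below assumes an in-memory SQLite database and a venueId primary key column (neither is in the original) so that it can run without a MySQL server.

```php
<?php
// Assumption: SQLite in memory replaces the MySQL connection used above.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE venue (venueId INTEGER PRIMARY KEY, venueName TEXT, regionId INTEGER)');

$stmt = $pdo->prepare('INSERT INTO venue(venueName, regionId) VALUES(:venueName, :regionId)');
$stmt->execute(array(':venueName' => 'test', ':regionId' => 1));

$rawId   = $pdo->lastInsertId();  // comes back as a string
$venueId = (int) $rawId;          // explicit cast for strict comparisons

var_dump(is_int($rawId));    // bool(false)
var_dump(is_int($venueId));  // bool(true)
var_dump($venueId === 1);    // bool(true)
```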

Edit: Some good discussion on stored procedures here

First up, you should use placeholders in your query instead of interpolating the variables directly. PDO/MySQLi allow you to write your queries like:

SELECT * FROM sometable WHERE iUser = ?

The API will safely substitute the values into the query.
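As a rough illustration of the placeholder query with PDO, the sketch below wraps it in a function. The function name getRowsForUser, the note column, and the in-memory SQLite database are assumptions made so the example runs on its own.

```php
<?php
// Assumption: SQLite in memory, with a made-up 'note' column for sample data.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE sometable (id INTEGER PRIMARY KEY, iUser INTEGER, note TEXT)');
$pdo->exec("INSERT INTO sometable (iUser, note) VALUES (7, 'first'), (7, 'second'), (8, 'other')");

// Hypothetical helper: the value is bound to the ? placeholder, never
// concatenated into the SQL string.
function getRowsForUser(PDO $pdo, $userid)
{
    $stmt = $pdo->prepare('SELECT * FROM sometable WHERE iUser = ?');
    $stmt->execute(array($userid));
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$rows = getRowsForUser($pdo, 7);
echo count($rows), "\n";

// A hostile value is compared as data; the OR clause is never run as SQL.
$hostile = getRowsForUser($pdo, '7 OR 1=1');
echo count($hostile), "\n";
```

How a non-numeric string compares against an integer column differs between database engines, but in every case the bound value stays a value and cannot change the structure of the query.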

I also prefer to have my queries in the code instead of the database. It's a lot easier to work with a revision control system when the queries are with your code.

I have a rule of thumb when working with ORMs: if I'm working with one entity at a time, I'll use the interface. If I'm reporting/working with records in aggregate, I typically write SQL queries to do it. This means there are very few queries in my code.

I'd move all the SQL to a separate Perl module (.pm). Many queries could reuse the same functions, with slightly different parameters.

A common mistake for developers is to dive into ORM libraries, parametrized queries and stored procedures. We then work for months in a row to make the code "better", but it's only "better" in a development kind of way. You're not making any new features!

Use complexity in your code only to address customer needs.

I had to clean up a project with many (duplicate/similar) queries riddled with injection vulnerabilities. The first steps I took were to use placeholders and to label every query with the object/method and source line where the query was created. (Insert the PHP constants __METHOD__ and __LINE__ into a SQL comment line.)

It looked something like this:

-- @Line:151 UserClass::getuser():

SELECT * FROM USERS;

Logging all queries for a short time supplied me with some starting points on which queries to merge. (And where!)
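The labelling step might look roughly like the sketch below. The helper name labelQuery and the method name getUserSql are hypothetical; only the use of __METHOD__ and __LINE__ in a SQL comment comes from the description above.

```php
<?php
// Hypothetical helper: prefixes a query with a SQL comment naming its origin.
function labelQuery($sql, $method, $line)
{
    return "-- @Line:$line $method():\n" . $sql;
}

class UserClass
{
    public function getUserSql()
    {
        // __METHOD__ and __LINE__ are resolved at this call site.
        return labelQuery('SELECT * FROM USERS', __METHOD__, __LINE__);
    }
}

$uc  = new UserClass;
$sql = $uc->getUserSql();
echo $sql, "\n";
```

The constants have to be written at the call site, since inside labelQuery they would report the helper's own name and line.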

Use an ORM package; any half-decent package will allow you to:

  1. Get simple result sets
  2. Keep your complex SQL close to the data model

If you have very complex SQL, then views are also a nice way of making it more presentable to different layers of your application.
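A small sketch of the view idea follows, assuming an in-memory SQLite database and made-up user and usergroup tables: the join is defined once in the database, and the application only queries the view.

```php
<?php
// Assumption: SQLite in memory with illustrative tables and data.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE usergroup (groupId INTEGER PRIMARY KEY, groupName TEXT)');
$pdo->exec('CREATE TABLE user (userId INTEGER PRIMARY KEY, userName TEXT, groupId INTEGER)');
$pdo->exec("INSERT INTO usergroup VALUES (1, 'admins'), (2, 'editors')");
$pdo->exec("INSERT INTO user VALUES (1, 'alice', 1), (2, 'bob', 2), (3, 'carol', 2)");

// The join lives in the database as a view...
$pdo->exec('
    CREATE VIEW user_with_group AS
    SELECT      u.userId, u.userName, g.groupName
    FROM        user u
    INNER JOIN  usergroup g ON u.groupId = g.groupId
');

// ...so the query in the application stays short.
$stmt = $pdo->prepare('SELECT userName FROM user_with_group WHERE groupName = ? ORDER BY userName');
$stmt->execute(array('editors'));
$names = $stmt->fetchAll(PDO::FETCH_COLUMN);
print_r($names);
```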

We were in a similar predicament at one time. We queried a specific table in more than 50 different ways.

What we ended up doing was creating a single Fetch stored procedure that includes a parameter value for the WhereClause. The WhereClause was constructed in a Provider object (we employed the Facade design pattern), where we could scrub it for any SQL injection attacks.

So as far as maintenance goes, it is easy to modify. SQL Server is also quite the chum and caches the execution plans of dynamic queries, so the overall performance is pretty good.

You'll have to determine the performance drawbacks based on your own system and needs, but all in all, this works very well for us.
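The answer does not show its Provider object, so the following is only a guess at the general shape, written in PHP rather than for SQL Server. It swaps in a different scrubbing technique from passing a raw string to a stored procedure: column names and operators are checked against a whitelist, and values are returned separately as bound parameters. The class name WhereClauseProvider and its methods are hypothetical.

```php
<?php
// Hypothetical provider: builds a WHERE clause from whitelisted columns only.
class WhereClauseProvider
{
    private $allowedColumns;
    private $allowedOperators = array('=', '<', '>', '<=', '>=', 'LIKE');

    public function __construct(array $allowedColumns)
    {
        $this->allowedColumns = $allowedColumns;
    }

    // $conditions is a list of array(column, operator, value).
    // Returns array(clause string, list of values to bind).
    public function build(array $conditions)
    {
        $parts  = array();
        $params = array();
        foreach ($conditions as $condition) {
            list($column, $operator, $value) = $condition;
            if (!in_array($column, $this->allowedColumns, true)) {
                throw new InvalidArgumentException("Column not allowed: $column");
            }
            if (!in_array($operator, $this->allowedOperators, true)) {
                throw new InvalidArgumentException("Operator not allowed: $operator");
            }
            $parts[]  = "$column $operator ?";
            $params[] = $value;
        }
        $clause = $parts ? 'WHERE ' . implode(' AND ', $parts) : '';
        return array($clause, $params);
    }
}

$provider = new WhereClauseProvider(array('author', 'status'));
list($where, $params) = $provider->build(array(
    array('author', '=', 'Lewis'),
    array('status', '=', 'published'),
));
echo $where, "\n";  // WHERE author = ? AND status = ?
```

The clause and the parameter list can then be handed to a prepared statement, so the values never become part of the SQL text.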

There are some libraries, such as MDB2 in PEAR, that make querying a bit easier and safer.

Unfortunately, they can be a bit wordy to set up, and you sometimes have to pass them the same info twice. I've used MDB2 in a couple of projects, and I tended to write a thin veneer around it, especially for specifying the types of fields. I generally make an object that knows about a particular table and its columns, and then a helper function in it that fills in field types for me when I call an MDB2 query function.

For instance:

function MakeTableTypes($TableName, $FieldNames)
{
    $Types = array();

    foreach ($FieldNames as $FieldName => $FieldValue)
    {
        $Types[] = $this->Tables[$TableName]['schema'][$FieldName]['type'];
    }

    return $Types;
}

Obviously this object has a map of table names -> schemas that it knows about, and just extracts the types of the fields you specify, and returns a matching type array suitable for use with an MDB2 query.

MDB2 (and similar libraries) then handle the parameter substitution for you, so for update/insert queries, you just build a hash/map from column name to value, and use the 'autoExecute' functions to build and execute the relevant query.

For example:

function UpdateArticle($Article)
{
    // assumed table name; the original snippet left $table_name undefined
    $table_name = 'articles';
    $Types = $this->MakeTableTypes($table_name, $Article);

    $res = $this->MDB2->extended->autoExecute($table_name,
        $Article,
        MDB2_AUTOQUERY_UPDATE,
        'id = '.$this->MDB2->quote($Article['id'], 'integer'),
        $Types);

    return $res;
}

and MDB2 will build the query, escaping everything properly, etc.

I'd recommend measuring performance with MDB2 though, as it pulls in a fair bit of code that might cause you problems if you're not running a PHP accelerator.

As I say, the setup overhead seems daunting at first, but once it's done the queries can be simpler/more symbolic to write and (especially) modify. I think MDB2 should know a bit more about your schema, which would simplify some of the commonly used API calls, but you can reduce the annoyance of this by encapsulating the schema yourself, as I mentioned above, and providing simple accessor functions that generate the arrays MDB2 needs to perform these queries.

Of course you can just do flat SQL queries as a string using the query() function if you want, so you're not forced to switch over to the full 'MDB2 way' - you can try it out piecemeal, and see if you hate it or not.

Use an ORM framework like QCodo - you can easily map your existing database.

This question also has some useful links...

I try to use fairly generic functions and just pass the differences in them. This way you only have one function to handle most of your database SELECTs. Obviously you can create another function to handle all your INSERTs.

e.g.

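function getFromDB($table, $wherefield=null, $whereval=null, $orderby=null) {
    // Assumes a connected PDO instance in the global $pdo; the mysql_*
    // functions used originally were removed in PHP 7.
    global $pdo;

    // Identifiers cannot be bound as parameters, so only pass trusted,
    // hard-coded table, column and ORDER BY strings to this function.
    $q = "SELECT * FROM $table";
    $params = array();
    if ($wherefield !== null) {
        $q .= " WHERE $wherefield = ?";
        $params[] = $whereval;
    }
    if ($orderby !== null) {
        $q .= " ORDER BY ".$orderby;
    }

    // The value is bound rather than interpolated into the query string
    $stmt = $pdo->prepare($q);
    $stmt->execute($params);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}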

This is just off the top of my head, but you get the idea. To use it just pass the function the necessary parameters:

e.g.

$blogposts = getFromDB('myblog', 'author', 'Lewis', 'date DESC');

In this case $blogposts will be an array of arrays which represent each row of the table. Then you can just use a foreach or refer to the array directly:

echo $blogposts[0]['title'];
