
Haskell database connections

Please look at this Scotty app (taken directly from this old answer from 2014):

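{-# LANGUAGE OverloadedStrings #-}

import Web.Scotty
import Database.MongoDB
import qualified Data.Text.Lazy as T
import Control.Monad.IO.Class

-- Run a query against the "nutrition" database over the given pipe.
runQuery :: Pipe -> Query -> IO [Document]
runQuery pipe query = access pipe master "nutrition" (find query >>= rest)

main :: IO ()
main = do
  -- The connection is opened once, at startup, and shared by every request.
  pipe <- connect $ host "127.0.0.1"
  scotty 3000 $ do
    get "/" $ do
      res <- liftIO $ runQuery pipe (select [] "stock_foods")
      text $ T.pack $ show res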

You can see that the database connection (pipe) is created only once, when the web app launches. Subsequently, thousands if not millions of visitors will hit the "/" route simultaneously and read from the database using the same connection (pipe).

I have questions about how to properly use Database.MongoDB:

  1. Is this the proper way of setting things up, as opposed to creating a database connection for every visit to "/"? In the latter case, we could have millions of connections at once. Is that discouraged? What are the advantages and drawbacks of such an approach?
  2. In the app above, what happens if the database connection is lost for some reason and needs to be created again? How would you recover from that?
  3. What about authentication with the auth function? Should auth be called only once after creating the pipe, or should it be called on every hit to "/"?
  4. Some say that I'm supposed to use a pool (Data.Pool). It looks like that would only help limit the number of visitors using the same database connection simultaneously. But why would I want to do that? Doesn't the MongoDB connection have built-in support for simultaneous usage?
  1. Even if you create a connection per client, you won't be able to create too many of them: you will hit the ulimit, and the client that hits it will get a runtime error. A connection per client doesn't make sense because the MongoDB server would spend too much time polling all those connections, and it can only have as many useful workers as the database server has CPUs. One connection is not a bad idea, because a MongoDB connection is designed to send several requests and then wait for the responses. So a single connection can use as many resources as your MongoDB server has, with one limitation: you have only one pipe for writing, and if it closes accidentally you will need to recreate it yourself. So it makes more sense to have a pool of connections. It doesn't need to be big. I had an app that authenticated users and gave them tokens. With 2500 concurrent users per second it had only 3-4 concurrent connections to the database.

Here are the benefits a connection pool gives you:

  • If you hit the pool's connection limit, you wait for the next available connection instead of getting a runtime error. So your app will wait a little bit instead of rejecting your client.

  • The pool recreates connections for you. You can configure it to close excess connections and to create more, up to a certain limit, as you need them. If your connection breaks while you read from it or write to it, you just take another connection from the pool. If you don't return the broken connection to the pool, the pool will create another connection for you.

  2. If the database connection is closed, the MongoDB listener on that connection exits, printing an error message on your terminal, and your app receives an IO error. To handle this error you need to create another connection and try again. Once you start handling this situation, you realize that it's easier to use a database pool, because eventually your solution will closely resemble a connection pool.

  3. I do auth once as part of opening a connection. If you need to authenticate another user later, you can always do it.

  4. Yes, MongoDB handles simultaneous usage, but like I said, it gives you only one pipe to write to, and that soon becomes a bottleneck. If you create at least as many connections as your MongoDB server can afford threads for handling them (the CPU count), they will run at full speed.
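The recovery described above for a closed connection (catch the IO error, open another connection, try again) could be sketched roughly as follows. This is a minimal illustration using only base, not code from the driver: withReconnect and tryIO are hypothetical helper names, the connection is stood in for by a plain Int, and the sketch assumes the failure arrives as an IOException. Depending on the driver version, the failure may instead arrive wrapped in the driver's own exception type, in which case the handler would need to catch that type.

```haskell
import Control.Concurrent.MVar (MVar, newMVar, readMVar, swapMVar)
import Control.Exception (IOException, try)

-- 'try' specialised to IOException, so other exceptions propagate untouched.
tryIO :: IO a -> IO (Either IOException a)
tryIO = try

-- | Hypothetical helper: run an action on the current connection. If it
-- fails with an IOException, open a fresh connection, store it for later
-- callers, and retry exactly once on the fresh connection.
withReconnect :: IO conn -> MVar conn -> (conn -> IO a) -> IO a
withReconnect open ref action = do
  conn <- readMVar ref
  result <- tryIO (action conn)
  case result of
    Right x -> return x
    Left _ -> do
      fresh <- open
      _ <- swapMVar ref fresh
      action fresh

main :: IO ()
main = do
  -- Stand-in for a real pipe: connection 0 is "broken", connection 1 works.
  ref <- newMVar (0 :: Int)
  let open :: IO Int
      open = return 1
      query c
        | c == 0 = ioError (userError "connection closed")
        | otherwise = return ("ok from connection " ++ show c)
  answer <- withReconnect open ref query
  putStrLn answer
```

Note that this sketch does not coordinate concurrent callers: several threads that fail at the same moment would each open a replacement connection. Handling that properly is where the code starts to look like a pool.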

If I missed something, feel free to ask for clarification. Thank you for your question.

What you really want is a database connection pool. Take a look at the code from this other answer.

Instead of auth, you can use withMongoDBPool if your MongoDB server is in secure mode.

Is this the proper way of setting things up? As opposed to creating a database connection for every visit to "/". In this latter case, we could have millions of connections at once. Is that discouraged? What are the advantages and drawbacks of such an approach?

You do not want to open one connection and then use it for everything. The HTTP server you are using, which underpins Scotty, is called Warp. Warp has a multi-core, multi-green-thread design. You are allowed to share the same connection across all threads, since Database.MongoDB says outright that connections are thread-safe, but what will happen is that when one thread is blocked waiting for a response (the MongoDB protocol follows a simple request-response design), all threads in your web service will block. This is unfortunate.

We can instead create a connection on every request. This trivially solves the problem of one thread blocking another, but it brings its own share of problems. The overhead of setting up a TCP connection, while not substantial, is also not zero. Recall that every time we want to open or close a socket we have to jump from user space to the kernel, wait for the kernel to update its internal data structures, and then jump back (a context switch). We also have to deal with the TCP handshake and goodbyes. And under high load we would run out of file descriptors or memory.

It would be nice if we had a solution somewhere in between. The solution should be:

  • Thread-safe
  • Let us put an upper bound on the number of connections so we don't exhaust the finite resources of the operating system
  • Quick
  • Share connections across threads under normal load
  • Create new connections as we experience increased load
  • Allow us to clean up resources (like closing a handle) as connections are deleted under reduced load
  • Hopefully already written and battle-tested by other production systems

It is exactly this problem that resource-pool tackles.
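As a rough sketch of what that could look like, here is the app from the question with the single pipe swapped for a Data.Pool of pipes. The pool sizes (1 stripe, 300 seconds of idle time, at most 10 connections per stripe) are arbitrary placeholders rather than recommendations. The sketch assumes a resource-pool version that still exports createPool; newer releases mark it as deprecated in favour of newPool with a pool configuration.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Monad.IO.Class (liftIO)
import Data.Pool (Pool, createPool, withResource)
import qualified Data.Text.Lazy as T
import Database.MongoDB
import Web.Scotty

-- Borrow a pipe from the pool for the duration of one query.
-- withResource hands the pipe back afterwards, or discards it if the
-- action throws, so a broken pipe is not reused.
runQuery :: Pool Pipe -> Query -> IO [Document]
runQuery pool query =
  withResource pool $ \pipe ->
    access pipe master "nutrition" (find query >>= rest)

main :: IO ()
main = do
  -- Arguments: how to open a pipe, how to close one, number of stripes,
  -- idle time in seconds before an unused pipe is closed, and the
  -- maximum number of pipes per stripe. All numbers are placeholders.
  pool <- createPool (connect (host "127.0.0.1")) close 1 300 10
  scotty 3000 $
    get "/" $ do
      res <- liftIO $ runQuery pool (select [] "stock_foods")
      text $ T.pack $ show res
```

Pipes are opened lazily, so an idle service holds few connections, and a request that arrives while all pipes are in use waits for one to be returned instead of failing.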

Some say that I'm supposed to use a pool (Data.Pool). It looks like that would only help limit the number of visitors using the same database connection simultaneously. But why would I want to do that? Doesn't the MongoDB connection have a built-in support for simultaneous usages?

It is unclear what you mean by simultaneous usages. There is one interpretation I can guess at: you mean something like HTTP/2, which has pipelining built into the protocol.

[Figure: standard picture of pipelining, http://research.worksap.com/wp-content/uploads/2015/08/pipeline.png]

Above we see the client making multiple requests to the server without waiting for a response; the client can then receive the responses back in some order. (Time flows from top to bottom.) MongoDB does not have this. It is a fairly complicated protocol design that is not much better than just asking your clients to use connection pools. And MongoDB is not alone here: the simple request-and-response design is something that Postgres, MySQL, SQL Server, and most other databases have settled on.

And it is true that a connection pool limits the load you can take as a web service before all threads are blocked and your user just sees a loading bar. But this problem would exist in any of the three scenarios (connection pooling, one shared connection, one connection per request)! The computer has finite resources, and at some point something will collapse under sufficient load. The advantage of connection pooling is that it scales gracefully right up until the point it cannot. The correct solution to handling more traffic is to increase the number of computers; we should not avoid pooling simply because of this problem.

In the app above, what happens if the database connection is lost for some reason and needs to be created again? How would you recover from that?

I believe these kinds of what-ifs are outside the scope of Stack Overflow and deserve no better answer than "try it and see." Buuuuuuut, supposing the server terminates the connection, I can take a stab at what might happen: assuming Warp forks a green thread for each request (which I think it does), each thread will get an unchecked IOException when it tries to write to the closed TCP connection. Warp would catch this exception and serve an HTTP 500, hopefully also writing something useful to the logs. With a single-connection model like you have now, you could do something clever (but high in lines of code) where you "reboot" your main function and set up a second connection. Alternatively, here is what I do for hobby projects: should anything odd occur, like a dropped connection, I have my supervisor process (like systemd) watch the logs and restart the web service. Though clearly not a great solution for a production, money-makin' website, it works well enough for small apps.

What about authentication with the auth function? Should the auth function only be called once after creating the pipe, or should it be called on every hit to "/"?

It should be called once after creating the connection. MongoDB authentication is per-connection. You can see an example here of how the db.auth() command mutates the MongoDB server's data structures corresponding to the current client connection.
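A minimal sketch of "authenticate once, right after connecting" might look like the following. connectAuthed is a hypothetical helper name, and the host, database name, user name and password are placeholders; auth is the function from Database.MongoDB that the question asks about, and it returns a Bool indicating success.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Exception (bracketOnError, throwIO)
import Control.Monad (unless)
import Data.Text (Text)
import Database.MongoDB

-- | Hypothetical helper: open a pipe and authenticate it exactly once.
-- If authentication fails or throws, the pipe is closed before the
-- error propagates, so no half-initialised connection leaks.
connectAuthed :: Text -> Text -> IO Pipe
connectAuthed user pass =
  bracketOnError (connect (host "127.0.0.1")) close $ \pipe -> do
    ok <- access pipe master "nutrition" (auth user pass)
    unless ok $ throwIO (userError "MongoDB authentication failed")
    return pipe

main :: IO ()
main = do
  -- Placeholder credentials; every later query on this pipe is
  -- already authenticated and does not call auth again.
  pipe <- connectAuthed "appUser" "secret"
  docs <- access pipe master "nutrition" (find (select [] "stock_foods") >>= rest)
  print (length docs)
  close pipe
```

With a connection pool, a function like this would be passed as the pool's "create" action, so each pipe is authenticated once when the pool opens it.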

