Handling big arrays in PHP

The application I am working on needs to fetch a dataset of at most about 10 MB twice an hour. We use that dataset to display paginated results on the site, and a simple search by one of the object properties should also be possible.

Currently we are considering two different ways to implement this:

1.) Store the JSON dataset in the database or in a file on the file system, then read it and loop over it whenever we need to display results.

2.) Store the JSON dataset in a relational MySQL table, then query the results and loop over them whenever we need to display them.

As I said, the results have to be replaced/refreshed multiple times per hour.

Both ways have cons. I am trying to choose the one that is the lesser evil overall. Reading 10 MB into memory is not a lot; on the other hand, rewriting a table a few times an hour could produce conflicts, in my opinion.

My concern regarding 1.) is how safe the app will be if we read 10 MB into memory all the time. What will happen if multiple users do this at the same time? Is this something to worry about, or is PHP able to handle it in the background?

What do you think would be best for this use case?

Thanks!

When PHP runs on a web server (as it usually does), the server starts new PHP processes on demand to handle concurrent requests. A powerful web server may allow fifty or so PHP processes. If each of them is handling this large dataset, you'll need enough RAM for fifty copies. And you'll need to load that data somehow for each new request. Reading 10 MB from a file is not an overwhelming burden unless you have some sort of parsing to do. But it is a burden.
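To put rough numbers on the "fifty copies" point, here is a minimal sketch that measures how much memory one decoded copy of a dataset takes and multiplies it by a worker count. The generated rows are a hypothetical stand-in for the real dataset (the field names `id`, `name` and `price` are assumptions), and the fifty workers is just the figure mentioned above. The decoded array is typically larger than the JSON text it came from, so it is worth measuring rather than assuming 10 MB per process.

```php
<?php
// Hypothetical stand-in for the real dataset: 50,000 small objects.
$rows = [];
for ($i = 0; $i < 50000; $i++) {
    $rows[] = ['id' => $i, 'name' => "item $i", 'price' => $i % 100];
}
$json = json_encode($rows);
unset($rows); // keep only the JSON text, as if it had just been read from a file

// Measure how much memory the decoded PHP array takes.
$before = memory_get_usage();
$data = json_decode($json, true);
$after = memory_get_usage();

$jsonMb = strlen($json) / 1048576;
$decodedMb = ($after - $before) / 1048576;
$workers = 50; // assumed number of concurrent PHP processes

printf("JSON text:     %.1f MB\n", $jsonMb);
printf("Decoded array: %.1f MB\n", $decodedMb);
printf("Worst case for %d workers: %.0f MB\n", $workers, $decodedMb * $workers);
```

Run it from the command line against a sample of your real data to see what one copy actually costs on your PHP version.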

As it starts to handle each request, PHP gives the program a clean context. PHP is not good at maintaining in-RAM context from one request to the next. You may be able to figure out how to do it, but it's a dodgy solution. If you're running on a server that's shared with other web applications -- especially applications you don't trust -- you should not attempt this; the other applications will have access to your in-RAM data.

You can control the number of concurrent processes with Apache or nginx configuration settings, and restrict it to five or ten copies of PHP. But if you have a lot of incoming requests, those requests will queue up behind the available processes and responses will slow down.

Will this application need to scale up? Will you eventually need a pool of web servers to handle all your requests? If so, the in-RAM solution looks worse.

Does your JSON data look like a big array of objects? Do most of the objects in that array have the same elements as each other? If so, it maps naturally onto a SQL table. You can make a table in which the columns correspond to the elements of your object. Then you can use SQL to avoid touching every row -- every element of the array -- every time you display or update data.
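A minimal sketch of the table approach, under some loud assumptions: the `items` table, its columns (`id`, `name`, `price`) and the helper functions `refreshItems` and `fetchPage` are all hypothetical, and an in-memory SQLite database stands in for MySQL so the example runs on its own (it needs the pdo_sqlite extension and PHP 7.3+). The refresh runs inside one transaction, which is one way to address the worry about conflicts while the table is being rewritten.

```php
<?php
// Sketch only: SQLite in memory stands in for MySQL to keep this self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical schema: one column per property of the JSON objects.
$pdo->exec('CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT NOT NULL, price INTEGER NOT NULL)');
$pdo->exec('CREATE INDEX items_name ON items (name)');

/** Replace the whole table inside one transaction; roll back if anything fails. */
function refreshItems(PDO $pdo, string $json): void
{
    $rows = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    $pdo->beginTransaction();
    try {
        $pdo->exec('DELETE FROM items');
        $insert = $pdo->prepare('INSERT INTO items (id, name, price) VALUES (:id, :name, :price)');
        foreach ($rows as $row) {
            $insert->execute([':id' => $row['id'], ':name' => $row['name'], ':price' => $row['price']]);
        }
        $pdo->commit();
    } catch (Throwable $e) {
        $pdo->rollBack();
        throw $e;
    }
}

/** Fetch one page of rows, optionally restricted to an exact name match. */
function fetchPage(PDO $pdo, int $page, int $perPage, ?string $name = null): array
{
    $sql = 'SELECT id, name, price FROM items';
    if ($name !== null) {
        $sql .= ' WHERE name = :name';
    }
    $sql .= ' ORDER BY id LIMIT :limit OFFSET :offset';

    $stmt = $pdo->prepare($sql);
    if ($name !== null) {
        $stmt->bindValue(':name', $name, PDO::PARAM_STR);
    }
    $stmt->bindValue(':limit', $perPage, PDO::PARAM_INT);
    $stmt->bindValue(':offset', ($page - 1) * $perPage, PDO::PARAM_INT);
    $stmt->execute();
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// Usage with a tiny made-up dataset.
$json = json_encode([
    ['id' => 1, 'name' => 'apple', 'price' => 3],
    ['id' => 2, 'name' => 'banana', 'price' => 1],
    ['id' => 3, 'name' => 'cherry', 'price' => 7],
]);
refreshItems($pdo, $json);
$firstPage = fetchPage($pdo, 1, 2);          // first two rows by id
$matches = fetchPage($pdo, 1, 2, 'cherry');  // only the row named "cherry"
```

Each request now pulls only the rows for the page it displays instead of decoding the whole dataset. On MySQL with InnoDB tables the same pattern should let readers keep seeing the old rows until the commit; note that it uses DELETE rather than TRUNCATE, because TRUNCATE causes an implicit commit in MySQL.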

(The same sort of logic applies to Mongo, Redis, and other ways of storing your data.)
