
How to speed up / break up a process into multiple parts. RSS, cURL, PHP

I'm experimenting with an RSS reader/fetcher I'm writing at the moment. Everything is going smoothly except for one thing: it's terribly slow.

Let me explain:

  1. I fetch the list of RSS feeds from the database.
  2. I iterate over every feed in this list, open it with cURL, and parse it with SimpleXMLElement.
  3. I check the descriptions and titles of these feeds against a given keyword, to see if the item is already in the database or not.
  4. If it's not, I add it to the database (a rough sketch of this loop follows below).
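For illustration, here is a minimal sketch of such a loop. This is not the asker's actual code: the feeds/items tables, their columns, the database credentials, and the keyword value are all hypothetical.

    <?php
    // Hypothetical sketch: fetch each feed with cURL, parse it with SimpleXMLElement,
    // keyword-match title/description, and insert new items into the database.
    $pdo     = new PDO('mysql:host=localhost;dbname=rss', 'user', 'pass');
    $keyword = 'php'; // the given keyword to match against

    $feeds = $pdo->query('SELECT id, url FROM feeds')->fetchAll(PDO::FETCH_ASSOC);

    foreach ($feeds as $feed) {
        // Fetch the raw XML with cURL
        $ch = curl_init($feed['url']);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 10); // keep one slow server from stalling the loop
        $xml = curl_exec($ch);
        curl_close($ch);
        if ($xml === false) {
            continue; // skip unreachable feeds
        }

        try {
            $rss = new SimpleXMLElement($xml);
        } catch (Exception $e) {
            continue; // skip feeds that fail to parse
        }

        foreach ($rss->channel->item as $item) {
            $title = (string) $item->title;
            $desc  = (string) $item->description;
            $link  = (string) $item->link;

            // Only keep items whose title or description matches the keyword
            if (stripos($title, $keyword) === false && stripos($desc, $keyword) === false) {
                continue;
            }

            // Skip items already stored (matching on link as one possible key)
            $stmt = $pdo->prepare('SELECT 1 FROM items WHERE link = ?');
            $stmt->execute([$link]);
            if ($stmt->fetch()) {
                continue;
            }

            $pdo->prepare('INSERT INTO items (feed_id, title, description, link) VALUES (?, ?, ?, ?)')
                ->execute([$feed['id'], $title, $desc, $link]);
        }
    }

Note that a loop like this waits on the network for each feed in sequence, which by itself can account for most of the page load time.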

For now I am looping through 11 feeds, which gives me a page loading time of 18 seconds. This is without updating the database. When some new articles are found, it goes up to 22 seconds (on localhost).

On a live webserver, my guess is that this will be even slower, and may go beyond the execution time limit PHP is set up with.

So my question is: what are your suggestions to improve speed? And if that is not possible, what's the best way to break this down into multiple executions, say 2 feeds at a time? I'd like to keep it all automated; I don't want to click after every 2 feeds.

Hope you guys have some good suggestions for me!

If you want some code examples, let me know and I'll paste some.

Thanks!

I would suggest you use a cron job or a daemon that automatically synchronizes the feeds with your database by running a PHP script. That would remove the delay from the user's perspective. Run it every hour or whatever suits you.
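For example, a crontab entry along these lines would run such a script (the path here is hypothetical) at the top of every hour:

    0 * * * * php /path/to/sync_feeds.php

The web page then only reads whatever the cron script has already stored in the database, so the user never waits on the feeds themselves.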

First, though, you should try to figure out which parts of the process are actually slow; without the code it's hard to tell what could be wrong. (A simple timing sketch follows the list below.)

Possible issues could be:

  • The remote servers (which store the feeds) are slow
  • Your local server's internet connection
  • Your server's hardware
  • And obviously, the code
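One simple way to narrow this down is to time each stage with microtime(). The stage boundaries below are only a sketch of how the loop described in the question might be instrumented:

    <?php
    // Time each stage of a single feed's processing to see where the seconds go
    $t0 = microtime(true);

    $ch = curl_init($url);                          // $url: one feed's address
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $xml = curl_exec($ch);
    curl_close($ch);
    $t1 = microtime(true);                          // after the network fetch

    $rss = new SimpleXMLElement($xml);              // parsing
    $t2 = microtime(true);

    // ... keyword checks and database queries ...
    $t3 = microtime(true);

    printf("fetch: %.2fs, parse: %.2fs, db: %.2fs\n", $t1 - $t0, $t2 - $t1, $t3 - $t2);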

Here are some suggestions.

  • First, separate the data fetching and crunching from displaying web pages to the user. You can do this by putting the fetching and crunching into a script that is executed as a CRON job or that runs as a daemon (i.e. runs continuously).
  • Second, you can set a sensible time limit between feed fetches, so that your script does not have to loop through every feed each time (see the sketch after this list).
  • Third, you should probably look into using a feed-parsing library, like MagpieRSS, rather than SimpleXML.
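As a rough illustration of the second and third points, the cron script could refetch only feeds whose last fetch is older than some interval, and hand the fetching and parsing to MagpieRSS. The feeds table, its last_fetched column, and the one-hour interval are assumptions, not anything from the original post:

    <?php
    require_once 'rss_fetch.inc'; // MagpieRSS

    $pdo = new PDO('mysql:host=localhost;dbname=rss', 'user', 'pass');

    // Only refetch feeds not fetched within the last hour (hypothetical schema)
    $stale = $pdo->query(
        'SELECT id, url FROM feeds
         WHERE last_fetched IS NULL
            OR last_fetched < NOW() - INTERVAL 1 HOUR'
    )->fetchAll(PDO::FETCH_ASSOC);

    foreach ($stale as $feed) {
        $rss = fetch_rss($feed['url']); // MagpieRSS fetches and parses in one call
        if ($rss) {
            foreach ($rss->items as $item) {
                // $item['title'], $item['description'], $item['link'] are available here;
                // the keyword check and insert would go where they did before
            }
        }

        $pdo->prepare('UPDATE feeds SET last_fetched = NOW() WHERE id = ?')
            ->execute([$feed['id']]);
    }

MagpieRSS also offers built-in caching of fetched feeds, which can further cut down on redundant network requests.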
