
How can I manipulate a local database with Perl?

I'm a Perl programmer with some nice scripts that fetch HTTP pages (from a text-file list of URLs) with cURL and save them to a folder.

However, the number of pages to fetch is in the tens of millions. Sometimes the script fails around page 170,000 and I have to restart it manually. On restart it reads each URL, checks whether that page has already been downloaded, and skips it if so. But with a few hundred thousand pages already downloaded, it still takes a few hours to skip back up to where it left off. Obviously, this is not going to pan out in the end.

I've been told that instead of saving to a text file, which is hard to search and modify, I need to use a database. I don't know much about databases; I just messed around with MySQL on a school server a year ago. I just need the ability to add millions of rows with a few static columns, search for and modify a single row quickly, and do all of this locally on a LAN (or on a single computer if that's difficult). And of course, I need to access this database from Perl.

Where should I start? What do I need to download to get a server started on Windows? Which Perl modules should I use? (I'm using an ActiveState distro.)

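There are many kinds of databases, but if you have already decided on an SQL database and want to keep setup simple, you may want to look at SQLite and the DBI / DBD::SQLite modules, which let you use it from Perl.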
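As a rough illustration of the SQLite route, here is a minimal sketch that tracks download status per URL. It assumes DBD::SQLite is installed; the table name `pages`, the columns `url` and `status`, and the helper names are all hypothetical choices for this example, not anything from the question.

```perl
use strict;
use warnings;
use DBI;

# Open (or create) the SQLite file and make sure the table exists.
# Table and column names are assumptions made for this sketch.
sub open_db {
    my ($file) = @_;
    my $dbh = DBI->connect("dbi:SQLite:dbname=$file", '', '',
        { RaiseError => 1, AutoCommit => 1 });
    $dbh->do(<<'SQL');
CREATE TABLE IF NOT EXISTS pages (
    url    TEXT PRIMARY KEY,
    status TEXT NOT NULL DEFAULT 'pending'
)
SQL
    $dbh->do('CREATE INDEX IF NOT EXISTS pages_status ON pages (status)');
    return $dbh;
}

# Insert URLs in a single transaction; duplicates are silently ignored.
sub add_urls {
    my ($dbh, @urls) = @_;
    my $sth = $dbh->prepare('INSERT OR IGNORE INTO pages (url) VALUES (?)');
    $dbh->begin_work;
    $sth->execute($_) for @urls;
    $dbh->commit;
}

# Record that a page has been downloaded.
sub mark_done {
    my ($dbh, $url) = @_;
    $dbh->do(q{UPDATE pages SET status = 'done' WHERE url = ?}, undef, $url);
}

# Return one URL that still needs fetching, or undef when none remain.
sub next_pending {
    my ($dbh) = @_;
    my ($url) = $dbh->selectrow_array(
        q{SELECT url FROM pages WHERE status = 'pending' LIMIT 1});
    return $url;
}
```

A restarted script would call `open_db('pages.db')` (a hypothetical file name) and loop on `next_pending`, calling `mark_done` after each successful fetch, so nothing already downloaded has to be re-checked.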

Since you only need to search on one column, you may wish to consider a key/value store such as Berkeley DB, accessed through either the BerkeleyDB or the DB_File module.

Generally, you can think of these key/value databases as Perl hashes that live on disk rather than in memory. Exact key lookups are very fast. Everything else requires scanning the whole dataset.
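To make the "hash on disk" idea concrete, here is a small sketch using DB_File's tied-hash interface. The helper names `open_seen` and `process_urls` and the fetch callback are hypothetical; the sketch assumes the URL itself is the key and stores the fetch time as the value.

```perl
use strict;
use warnings;
use Fcntl;     # exports O_RDWR and O_CREAT
use DB_File;   # exports $DB_HASH

# Tie a hash to an on-disk Berkeley DB file, creating it if needed.
sub open_seen {
    my ($file) = @_;
    my %seen;
    tie %seen, 'DB_File', $file, O_RDWR | O_CREAT, 0666, $DB_HASH
        or die "Cannot open $file: $!";
    return \%seen;
}

# Call $fetch->($url) for each URL not yet recorded.
# Returns the number of URLs fetched successfully.
sub process_urls {
    my ($seen, $fetch, @urls) = @_;
    my $count = 0;
    for my $url (@urls) {
        next if exists $seen->{$url};   # exact-key lookup, no scan
        next unless $fetch->($url);     # leave failures for the next run
        $seen->{$url} = time;           # remember when it was fetched
        $count++;
    }
    return $count;
}
```

Calling `untie %$seen` when the script finishes flushes and closes the file; on the next run the same keys are still there, so already-fetched URLs are skipped with one lookup each.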

Look into DBI. If you do not like SQL in your programs, try SQL::Abstract.
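For a sense of what SQL::Abstract does, the sketch below builds an UPDATE statement and its bind values from plain Perl data structures. The table `pages` and its columns `url` and `status` are hypothetical, and SQL::Abstract must be installed separately.

```perl
use strict;
use warnings;
use SQL::Abstract;

my $sql = SQL::Abstract->new;

# Build an UPDATE with placeholders instead of writing the SQL by hand.
# Arguments: table name, fields to set, WHERE conditions.
my ($stmt, @bind) = $sql->update(
    'pages',
    { status => 'done' },
    { url    => 'http://example.com/a' },
);

# $stmt holds SQL text with "?" placeholders and @bind holds the values,
# ready to hand to DBI, e.g. $dbh->do($stmt, undef, @bind);
```

The module only generates the statement; executing it is still DBI's job.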
