Parsing DMOZ RDF files using a PHP script

I used the PHP script at

http://sourceforge.net/projects/dmoz2mysql/files/

to download, extract, clean, parse and insert DMOZ data into a MySQL table. I encountered no problems while processing structure.rdf.

But while parsing content.rdf, the script stalls abruptly after inserting 3,200,000 rows, and nothing happens after that. I had to force quit after waiting for about an hour. I am running the script from the Windows command prompt. I don't know PHP, so I'm unable to find the cause of this problem.

Here are a few troubleshooting tips that you might find helpful.

SHOW TABLE STATUS LIKE 'table_name';

This shows information about the table you are filling. The important field is Max_data_length: compare it to Data_length to see whether you are hitting a limit on the amount of data the table is allowed to store. If the two match, you have reached the limit. This is common for MyISAM tables, where the usual limit is 4GB of data, and it can be raised using MAX_ROWS. If you are maxed out, you will need to do one of two things: either use the InnoDB engine for the table, or alter your table:

ALTER TABLE `table_name` MAX_ROWS=1000000000 AVG_ROW_LENGTH=nnn;

Be sure to use the Avg_row_length value (or higher) reported by the SHOW TABLE STATUS query above; it will help you judge what this number should be. Keep in mind that if the table already contains data, this could take some time. Hope it helps.
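
For the other option mentioned above (using InnoDB), a minimal sketch might look like the following. `table_name` is the same placeholder as before, and `dmoz` is only an assumed database name, so substitute whatever the import script actually created. The second statement is just another way to read the same size figures and is mainly useful while the table is still MyISAM.

```sql
-- Convert the table to InnoDB. This rebuilds the table,
-- so it can take a while if rows are already loaded.
ALTER TABLE `table_name` ENGINE=InnoDB;

-- Read the current data size and the table's limit in one row.
-- 'dmoz' is an assumed schema name; replace it with your own.
SELECT TABLE_NAME, ENGINE, TABLE_ROWS, AVG_ROW_LENGTH,
       DATA_LENGTH, MAX_DATA_LENGTH
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'dmoz'
  AND TABLE_NAME = 'table_name';
```

If DATA_LENGTH is nowhere near MAX_DATA_LENGTH when the import stalls, the table size limit is probably not the cause and the problem lies elsewhere.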

Happy Coding!
