
Replacing a flat-file DB with a proper database with record-level editing

I cannot install SQLite on a remote machine, so I have to find a way to store a large amount of data in some kind of database structure.

Example data

key,values...
key,values....
..

There are currently about a million rows in a 20 MB flat file, and every hour I have to read through each record and value in the file and update or add records. Since it is a flat file, I have to rewrite the whole file each time.
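A simplified sketch of that kind of hourly full-rewrite job (the file name and the source of the updates are just placeholders, not my actual code):

#!/usr/bin/perl
use strict;
use warnings;

# Placeholder names: data.txt is the flat file, %updates stands in
# for whatever produces the hourly changes.
my $file    = 'data.txt';
my %updates = ( a628234 => '0.178532683639599' );

# Read every record into memory.
open my $in, '<', $file or die "Cannot read $file: $!";
my %records;
while ( my $line = <$in> ) {
    chomp $line;
    my ( $key, $values ) = split /,/, $line, 2;
    $records{$key} = $values;
}
close $in;

# Apply updates and additions.
$records{$_} = $updates{$_} for keys %updates;

# Rewrite the entire file, even if only one record changed.
open my $out, '>', "$file.tmp" or die "Cannot write $file.tmp: $!";
print {$out} "$_,$records{$_}\n" for keys %records;
close $out;
rename "$file.tmp", $file or die "Cannot replace $file: $!";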

I am looking at the Storable module, but I think it also writes data sequentially. I want to edit only those records which need to be changed.

Reading and updating random records is a requirement. Additions can go anywhere (order is not important).

Can anyone suggest something? How will I know if I can set up a native Berkeley database file on these systems, which are a mixture of Solaris and Linux?
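One quick way to see which DBM back-ends a given perl was built with is simply to try loading each module; a small probe sketch (the module list is just the usual candidates):

#!/usr/bin/perl
use strict;
use warnings;

# Probe which DBM back-ends this perl can actually load.
# DB_File is the Berkeley DB binding; the others are the common
# alternatives that AnyDBM_File can fall back to.
for my $module (qw(DB_File GDBM_File NDBM_File ODBM_File SDBM_File)) {
    if ( eval "require $module; 1" ) {
        print "$module is available\n";
    }
    else {
        print "$module is NOT available\n";
    }
}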

________________finally__________________

Finally, I understood things better (thank you all), and based on your suggestions I used AnyDBM_File. It found NDBM_File (a C library) installed on all the OSes. So far so good.
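A minimal sketch of how the tie through AnyDBM_File looks (the file name 'records' is just an example):

#!/usr/bin/perl
use strict;
use warnings;

use Fcntl;
use AnyDBM_File;   # ties through the first back-end it finds (NDBM_File here)

my %records;
tie %records, 'AnyDBM_File', 'records', O_RDWR | O_CREAT, 0666
    or die "Cannot tie DBM file: $!";

# Update or add one record without touching the others.
$records{a628234} = '0.178532683639599';
print "$records{a628234}\n";

untie %records;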

Just to check how it would play out in the real world, I ran a sample script to add 1 million records (the maximum I think I may ever get in a day; normally it is between 500k and 700k). OMG, it created a 110 GB data file on my disk! And all the records were like:

a628234 = 0.178532683639599

My real-world records are even longer than that. Compare this to the flat file, which holds 700k+ real-life records and is only 15 MB on disk.

I am disappointed with the slowness and bloat of this approach, so for now I think I will pay the price of rewriting the whole file each time an edit is required.

Thanks again for all your help.

As mentioned in the comments, you may use the SDBM_File module. For example:

#!/usr/bin/perl 
use strict;
use warnings;
use v5.14;

use Fcntl;
use SDBM_File;

my $filename = "dbdb";

my %h;

tie %h, 'SDBM_File', $filename, O_RDWR|O_CREAT, 0666
    or die "Error: $!\n";

# Run the next line only once to fill the dbdb file.
# On later runs you may delete it and
# the output will still be "16,40".
$h{$_} = $_ * 2 . "," . $_ * 5  for 1..100;

say $h{8};

untie %h;

Output: 16,40
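To get the record-level editing asked for in the question, you can later reopen the same dbdb file and change or add single keys without rewriting anything else; for example (the new values below are arbitrary), a sketch:

#!/usr/bin/perl
use strict;
use warnings;
use v5.14;

use Fcntl;
use SDBM_File;

my %h;
tie %h, 'SDBM_File', 'dbdb', O_RDWR, 0666
    or die "Error: $!\n";

$h{8}   = '16,41';     # update one existing record in place
$h{101} = '202,505';   # add a new record anywhere

say $h{8};             # prints 16,41

untie %h;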

It depends on what your program logic needs, but one solution is to partition the database based on keys, so that you deal with many smaller files instead of one big file.
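A rough sketch of that idea, hashing each key to one of a fixed number of SDBM files (the partition count and file names are arbitrary choices for illustration):

#!/usr/bin/perl
use strict;
use warnings;

use Fcntl;
use SDBM_File;

# Eight partitions named data_part0 .. data_part7.
my $partitions = 8;

sub partition_for {
    my ($key) = @_;
    # A cheap checksum of the key's bytes decides which file it lives in.
    my $bucket = unpack( '%32C*', $key ) % $partitions;
    tie my %h, 'SDBM_File', "data_part$bucket", O_RDWR | O_CREAT, 0666
        or die "Cannot tie partition $bucket: $!";
    return \%h;
}

my $part = partition_for('a628234');
$part->{a628234} = '0.178532683639599';
print "$part->{a628234}\n";
untie %$part;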
