
Python dictionary / database in memory

I have a file that looks like this:

LastName    FirstName   Age Gender  Height  Weight
Smith   May 20  F   1500    55
Wilder  Harry   25  M   1800    65
Potter  Harry   50  M   1600    66
Lincoln Abram   100 M   1800    55
Reynolds    Mary    55  F   1600    55
Anderson    Jane    40  F   1700    60
Smith   William 42  M   1520    60

I want to be able to search this data in memory, for example to find who has a height of 1800 or who has a last name of Smith, without having to read the file again.

I can read the file using the csv module:

import csv

filename = r'C:\Users\wsteve46\Documents\Python\People.csv'

results = []
resdict = []

with open(filename) as f:
    reader = csv.DictReader(f)
    fieldnames = reader.fieldnames  # column names from the header row
    for row in reader:
        print('Row = ', row)
        results.append(list(row.values()))  # values only, as a list
        resdict.append(row)                 # full row, as a dict

However, resdict is a list, not a dictionary. What is the best way to access this data by key/value?

The easiest way to do this is with pandas:

import pandas as pd
data = pd.read_csv(fn)  # fn is the path to the CSV file
print(data[data.Height == 1800])
print(data[data.LastName == 'Smith'])

You'll have to do more research on your own, but that answers your first question.

Another option is to use sqlite3 with an in-memory database:

import sqlite3

con = sqlite3.connect(':memory:')

cur = con.cursor()    
cur.execute('INSERT ...')
con.commit()

cur.execute('SELECT ... WHERE ...')
rows = cur.fetchall()
for row in rows:
    print(row)

This gives you the full range of SQL queries without any third-party dependencies, since sqlite3 is part of the Python standard library.
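To make the outline above concrete, here is a minimal sketch of one way to fill it in. The table name `people`, its column types, and the inlined comma-separated sample data are assumptions made for illustration; in practice you would open the real CSV file instead of using `io.StringIO`.

```python
import csv
import io
import sqlite3

# Sample data inlined for illustration (assumed comma-separated);
# replace io.StringIO(SAMPLE) with open(filename) for a real file.
SAMPLE = """LastName,FirstName,Age,Gender,Height,Weight
Smith,May,20,F,1500,55
Wilder,Harry,25,M,1800,65
Potter,Harry,50,M,1600,66
Lincoln,Abram,100,M,1800,55
Reynolds,Mary,55,F,1600,55
Anderson,Jane,40,F,1700,60
Smith,William,42,M,1520,60
"""

con = sqlite3.connect(':memory:')
con.row_factory = sqlite3.Row  # lets rows be accessed by column name

# Hypothetical table layout chosen for this sketch.
con.execute(
    'CREATE TABLE people ('
    'LastName TEXT, FirstName TEXT, Age INTEGER, '
    'Gender TEXT, Height INTEGER, Weight INTEGER)'
)

# Each CSV row is a dict, so named placeholders pick out the fields.
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
con.executemany(
    'INSERT INTO people VALUES '
    '(:LastName, :FirstName, :Age, :Gender, :Height, :Weight)',
    rows,
)
con.commit()

# Parameterized queries for the two searches in the question.
tall = con.execute(
    'SELECT FirstName, LastName FROM people WHERE Height = ?', (1800,)
).fetchall()
smiths = con.execute(
    'SELECT FirstName, LastName FROM people WHERE LastName = ?', ('Smith',)
).fetchall()

for row in tall:
    print(row['FirstName'], row['LastName'])
```

The CSV values arrive as strings, but the INTEGER column affinity lets SQLite store them as numbers, so the query can compare `Height` against the integer 1800.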

While you could put the results of the CSV read into a dictionary, that will not directly address your problem: a dictionary is keyed on only one element, and you mention that you might want to search by different elements.

An alternative is to create one dictionary per type of search you plan on executing. This is effectively the same as creating an in-memory index for each type of search. The trick is that the payload on each dictionary key will need to be a list of people, not a single instance of a person.
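The one-dictionary-per-search idea could be sketched as follows. The helper name `build_index` and the inlined comma-separated sample data are assumptions made for illustration; each index maps a field value to the list of matching rows.

```python
import csv
import io
from collections import defaultdict

# Sample data inlined for illustration (assumed comma-separated).
SAMPLE = """LastName,FirstName,Age,Gender,Height,Weight
Smith,May,20,F,1500,55
Wilder,Harry,25,M,1800,65
Potter,Harry,50,M,1600,66
Lincoln,Abram,100,M,1800,55
Reynolds,Mary,55,F,1600,55
Anderson,Jane,40,F,1700,60
Smith,William,42,M,1520,60
"""

people = list(csv.DictReader(io.StringIO(SAMPLE)))


def build_index(rows, field):
    """Map each distinct value of `field` to the list of rows that have it."""
    index = defaultdict(list)
    for row in rows:
        index[row[field]].append(row)
    return index


# One index per kind of search.
by_last_name = build_index(people, 'LastName')
by_height = build_index(people, 'Height')

# csv.DictReader yields strings, so the height key is '1800', not 1800.
# Using .get() avoids adding empty entries for values that are not present.
smiths = by_last_name.get('Smith', [])
height_1800s = by_height.get('1800', [])

print([p['FirstName'] for p in smiths])
print([p['LastName'] for p in height_1800s])
```

Building each index takes one pass over the data, after which every lookup is a single dictionary access rather than a scan of the whole list.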

However, if your total number of people is of a "reasonable" size, you can simply search the list and return a subset of it using a list comprehension. Here are two, one for each search you mention in your question; it's easy to construct others:

smiths = [p for p in resdict if p['LastName'] == 'Smith']

# csv.DictReader yields every value as a string, so compare against '1800'
height_1800s = [p for p in resdict if p['Height'] == '1800']
