简体   繁体   中英

How to write this snippet in Python?

I am learning Python (I have a C/C++ background).

I need to write something practical in Python though, whilst learning. I have the following pseudocode (my first attempt at writing a Python script, since reading about Python yesterday). Hopefully, the snippet details the logic of what I want to do. BTW I am using python 2.6 on Ubuntu Karmic.

Assume the script is invoked as: script_name.py directory_path

import csv, sys, os, glob

# Can I declare that the function accepts a dictionary as first arg?
def getItemValue(item, key, defval)
  return !item.haskey(key) ? defval : item[key]


dirname = sys.argv[1]

# declare some default values here
weight, is_male, default_city_id = 100, true, 1 

# fetch some data from a database table into a nested dictionary, indexed by a string
curr_dict = load_dict_from_db('foo')

#iterate through all the files matching *.csv in the specified folder
for infile in glob.glob( os.path.join(dirname, '*.csv') ):
  #get the file name (without the '.csv' extension)
  code = infile[0:-4]
  # open file, and iterate through the rows of the current file (a CSV file)
  f = open(infile, 'rt')
  try:
    reader = csv.reader(f)
    for row in reader:
      #lookup the id for the code in the dictionary
      id = curr_dict[code]['id']
      name = row['name']
      address1 = row['address1']
      address2 = row['address2']
      city_id = getItemValue(row, 'city_id', default_city_id)

      # insert row to database table

  finally:
    f.close()

I have the following questions:

  1. Is the code written in a Pythonic enough way (is there a better way of implementing it)?

  2. Given a table with a schema like shown below, how may I write a Python function that fetches data from the table and returns is in a dictionary indexed by string (name).

  3. How can I insert the row data into the table (actually I would like to use a transaction if possible, and commit just before the file is closed)

Table schema:

create table demo (id int, name varchar(32), weight float, city_id int);

BTW, my backend database is postgreSQL

[Edit]

Wayne et al:

To clarify, what I want is a set of rows. Each row can be indexed by a key (so that means the rows container is a dictionary (right)?. Ok, now once we have retrieved a row by using the key, I also want to be able to access the 'columns' in the row - meaning that the row data itself is a dictionary. I dont know if Python supports multidimensional array syntax when dealing with dictionaries - but the following statement will help explain how I intend to conceptually use the data returned from the db. A statement like dataset['joe']['weight'] will first fetch the row data indexed by the key 'joe' (which is a dictionary) and then index that dictionary for the key 'weight'. I want to know how to build such a dictionary of dictionaries from the retrieved data in a Pythonic way like you did before.

A simplistic way would be to write something like:

import pyodbc

mydict = {}
cnxn = pyodbc.connect(params)
cursor = cnxn.cursor()
cursor.execute("select user_id, user_name from users"):

for row in cursor:
   mydict[row.id] = row

Is this correct/can it be written in a more pythonic way?

to get the value from the dictionary you need to use .get method of the dict :

>>> d = {1: 2}
>>> d.get(1, 3)
2
>>> d.get(5, 3)
3

This will remove the need for getItemValue function. I wont' comment on the existing syntax since it's clearly alien to Python. Correct syntax for the ternary in Python is:

true_val if true_false_check else false_val
>>> 'a' if False else 'b'
'b'

But as I'm saying below, you don't need it at all.

If you're using Python > 2.6, you should use with statement over the try-finally :

with open(infile) as f:
    reader = csv.reader(f)
    ... etc

Seeing that you want to have row as dictionary, you should be using csv.DictReader and not a simple csv. reader csv. reader . However, it is unnecessary in your case. Your sql query could just be constructed to access the fields of the row dict. In this case you wouldn't need to create separate items city_id , name , etc. To add default city_id to row if it doesn't exist, you could use .setdefault method:

>>> d
{1: 2}
>>> d.setdefault(1, 3)
2
>>> d
{1: 2}
>>> d.setdefault(3, 3)
3
>>> d
{1: 2, 3: 3}

and for id , simply row[id] = curr_dict[code]['id']

When slicing, you could skip 0 :

>>> 'abc.txt'[:-4]
'abc'

Generally, Python's library provide a fetchone , fetchmany , fetchall methods on cursor, which return Row object, that might support dict-like access or return a simple tuple. It will depend on the particular module you're using.

It looks mostly Pythonic enough for me.

The ternary operation should look like this though (I think this will return the result you expect):

return defval if not key in item else item[key]

Yeah, you can pass a dictionary (or any other value) in basically any order. The only difference is if you use the *args, **kwargs (named by convention. Technically you can use any name you want) which expect to be in that order and the last one or two arguments.

For inserting into a DB you can use the odbc module:

import odbc
conn = odbc.odbc('servernamehere')
cursor = conn.cursor()
cursor.execute("INSERT INTO mytable VALUES (42, 'Spam on Eggs', 'Spam on Wheat')")
conn.commit()

You can read up or find plenty of examples on the odbc module - I'm sure there are other modules as well, but that one should work fine for you.

For retrieval you would use

cursor.execute("SELECT * FROM demo")
#Reads one record - returns a tuple
print cursor.fetchone()
#Reads the rest of the records - a list of tuples
print cursor.fetchall()

to make one of those records into a dictionary:

record = cursor.fetchone()
# Removes the 2nd element (at index 1) from the record
mydict[record[1]] = record[:1] + record[2:]

Though that practically screams for a generator expression if you want the whole shebang at once

mydict = dict((record[1], record[:1] + record[2:] for record in cursor.fetchall())

which should give you all of the records packed up neatly in a dictionary, using the name as a key.

HTH

a colon required after def s:

def getItemValue(item, key, defval):
    ...

boolean operators: In python ! -> not ; && -> and and || -> or (see http://docs.python.org/release/2.5.2/lib/boolean.html for boolean operators). There's no ? : ? : operator in python, there is a return (x) if (x) else (x) expression although I personally rarely use it in favour of plain if 's.

booleans/ None : True , False and None have capitals before them.

checking types of arguments: In python, you generally don't declare types of function parameters. You could go eg assert isinstance(item, dict), "dicts must be passed as the first parameter!" in the function although this kind of "strict checking" is often discouraged as it's not always necessary in python.

python keywords: default isn't a reserved python keyword and is acceptable as arguments and variables (just for the reference.)

style guidelines: PEP 8 (the python style guideline) states that module import s should generally only be one per line, though there are some exceptions (I have to admit I often don't follow the import sys and os on separate lines, though I usually follow it otherwise.)

file open modes: rt isn't valid in python 2.x - it will work, though the t will be ignored. See also http://docs.python.org/tutorial/inputoutput.html#reading-and-writing-files . It is valid in python 3 though, so I don't think it it'd hurt if you want to force text mode, raising exceptions on binary characters (use rb if you want to read non-ASCII characters.)

working with dictionaries: Python used to use dict.has_key(key) but you should use key in dict now (which has largely replaced it, see http://docs.python.org/library/stdtypes.html#mapping-types-dict .)

split file extensions: code = infile[0:-4] could be replaced with code = os.path.splitext(infile)[0] (which returns eg ('root', '.ext') with the dot in the extension (see http://docs.python.org/library/os.path.html#os.path.splitext ).

EDIT: removed multiple variable declarations on a single line stuff and added some formatting. Also corrected the rt isn't a valid mode in python when in python 3 it is.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM