Andrew Channels Dexter Pinion

Wherein I write some stuff that you may like to read. Or not, its up to you really.

February 16, 2006

Returning Database Rows as Dicts

A recent poster on the Python DB-Sig mailing list wanted to refer to the individual elements of his fetched rows by name.

By definition this isn't possible with the Python DB-API 2.0 as it specifies that query results be returned as sequences, which most module writer interpret as tuples.

A couple of the usual smart and erudite members of the list provided ways to do this though, and I'm putting them here mainly as a reminder to myself. First, Andy Dustman suggested;

>>> import MySQLdb
>>> db=MySQLdb.connect(db="mysql",read_default_file="~/.my.cnf")
>>> c=db.cursor()
>>> c.execute("select * from user")
6L
>>> for row in c.fetchall():
...     d = dict( [ (c.description[i][0], j) for i,j in enumerate(row) ] )
...     print d
...
{'Drop_priv': 'Y', 'Execute_priv': 'Y', 'Create_routine_priv': 'Y',
'Repl_client_priv': 'Y', 'Create_user_priv': 'Y', 'Create_priv': 'Y',
'References_priv': 'Y', 'max_user_connections': 0L, 'Shutdown_priv':
'Y', 'Grant_priv': 'Y', 'max_updates': 0L, 'max_connections': 0L,
'Show_db_priv': 'Y', 'Reload_priv': 'Y', 'Super_priv': 'Y', 'Host':
'localhost', 'User': 'root', 'Alter_priv': 'Y', 'ssl_cipher':
array('c'), 'Password': 'xxx', 'Delete_priv': 'Y', 'Repl_slave_priv':
'Y', 'Insert_priv': 'Y', 'x509_subject': array('c'), 'ssl_type': '',
'Index_priv': 'Y', 'Create_tmp_table_priv': 'Y', 'x509_issuer':
array('c'), 'Create_view_priv': 'Y', 'Select_priv': 'Y',
'Show_view_priv': 'Y', 'Update_priv': 'Y', 'Lock_tables_priv': 'Y',
'Process_priv': 'Y', 'Alter_routine_priv': 'Y', 'File_priv': 'Y',
'max_questions': 0L}
...

Then Carsten Haese, along much the same lines but perhaps slightly more succinctly, suggested;

class RowObject(object):
  def __init__(self, data, description):
    self.__dict__.update(dict(zip([col[0] for col in description], data)))  

for rowdata in cursor.fetchall():
row = RowObject(rowdata, cursor.description)
# now you can do stuff with row.PID, row.FIRSTNAME, or however the columns
# in the query are named.

Of course a little light Googling shows me that there's nothing new in the world and that this topic is covered quite well in the Python Cookbook, with a version of the code above provided in this recipe and an alternative version using Greg Stein's dtuple module in this recipe. Alternatively you can always use db_row.py.

Posted by Andy Todd at February 16, 2006 10:36 AM

Comments

For MySQLdb, at least, you can always just get a DictCursor instead of the normal Cursor if you want to consistently get dicts instead of tuples. Use db.cursor(DictCursor), where db is your DB connection object and DictCursort can be found in MySQLdb.connections.

Posted by: Manuzhai on February 16, 2006 12:50 PM

With MySQLdb, I use a DictCursor like this:

dbc = _connection.cursor(MySQLdb.cursors.DictCursor)

Where _connection is a connection object returned by MySQLdb.connect().

Posted by: Jochen on February 17, 2006 12:38 AM

why not have both row.NAME and row['NAME']?

class oDict(dict):
  def __init__(self,data, description):
    self.update(dict(zip([col[0] for col in description], data)))
    self.__dict__.update(self)

I usually get a list of dicts from fetchall, like
a=cursor.fetchall()
row = [ oDict(v,cursor.description)) for v in a]

then I can access row[1]['NAME'] or row[1].NAME


Posted by: Wensheng Wang on February 17, 2006 04:05 AM

Here is a little rant about how the fact that DBAPI 2.0 is un-pythonic made people build ORMs :

http://nicolas.lehuen.com/index.php/2005/12/02/79-why-use-an-orm-at-all-anyway

If cursors were iterators returning dictionary, we wouldn't need all this stuff most of the time.

Regards,
Nicolas

Posted by: Nicolas Lehuen on February 17, 2006 06:54 AM

If I remember correctly, the SQL specification requires elements to be returned as tuples instead of named associative lists. This is how

SELECT COUNT(*), MAX(SALES), PERSONID FROM SALES
GROUP BY PERSONID

works.

Posted by: Chui Tey on February 17, 2006 08:47 AM

>> If cursors were iterators returning dictionary, we wouldn't need all this stuff most of the time.<<

i don't think the column indexing method has much effect on ORM use.

also: pyPgSQL seems to return rows as dictionaries by default (?) - does the spec really imply that tuples should be returned instead of dicts?


Posted by: mike on February 17, 2006 09:22 AM

We designed the DBAPI to return raw data, rather than the more complex structures... ON PURPOSE.

You can always do that in Python, and it doesn't take much work. But if you make EVERY database extension writer do it... that's a serious pain.

Also, consider the state of the world in 1996. The database extensions that happened to exist, sucked. And they were radically different. Had way different APIs. Had numerous bugs. etc.

The DBAPI made it much easier for extension writers to create those extensions. If something more was needed, it could be done in Python. BY DESIGN.

You want ORMs? Fine. Build them. Just don't ask the extension writers to do it for you.

Posted by: Greg Stein on March 23, 2006 09:47 AM