June 11, 2008

Inspiration

Filed under: General — Andy Todd @ 9:03 pm

I just read, courtesy of Simon, a great article at the Guardian covering a presentation that Adrian Holovaty gave there last week.

Strangely enough I was talking about this with a colleague this very afternoon. My thesis was that data is generated by applications but should be considered independent of them. Treat your data carefully and you have a treasure trove of information that defines your organisation. Take an application-centric view of the world and you end up with a load of blobs that are only meaningful in the context of your application code. If you can ever find a way to free your information you can find so many different ways of viewing, interpreting and analysing it.

In essence this is the excitement that surrounds mashups and the value that is shown in sites like Chicago Crime and My Society. Expose the data and then marvel at what happens.

As I said in a (remarkably brief) presentation that I gave tonight - I’m Andy and I’m a data manager. If I can ever find a way of making a living taking data, turning it into information and making it available in new and interesting ways I guarantee that I will quit my day job in a heartbeat.

May 23, 2008

OSDC 2008 Call For Papers

Filed under: General — Andy Todd @ 1:37 pm

In case you haven’t seen this elsewhere;

Call for Papers
Open Source Developers’ Conference 2008
2nd - 5th December 2008, Sydney, Australia

The Open Source Developers’ Conference 2008 is a conference run by open source developers, for developers and business people. It covers numerous programming languages across a range of operating systems, and related topics such as business processes, licensing, and strategy. Talks vary from introductory pieces through to the deeply technical. It is a great opportunity to meet, share, and learn with like-minded individuals.

This year, the conference will be held in Sydney, Australia during the first week of December (2nd - 5th). If you are an Open Source maintainer, developer or user, the organising committee would encourage you to submit a talk proposal on open source tools, solutions, languages or technologies you are working with.

For more details and to submit your proposal(s), go to: http://osdc.com.au/2008/papers/cfp.html

If you have any questions or require assistance with your submission, please don’t hesitate to ask!

We recognise the importance of Open Source in providing a medium for collaboration between individuals, researchers, business and government. In recognition of this, and to ensure a high standard of presentations, we intend to peer-review all submitted papers.

OSDC 2008 Sydney (Australia) - Key Program Dates:

30 Jun - Initial proposals (short abstract) due
21 Jul - Proposal acceptance
15 Sep - Accepted paper submissions
13 Oct - Reviews completed
27 Oct - Final paper submission cutoff

For all information, contacts and updates, see the OSDC conference web site at http://osdc.com.au/2008/

Also if you are interested in sponsoring, please see: http://www.osdc.com.au/2008/sponsors/opportunities.html

May 9, 2008

Opening a file in Python

Filed under: General — Andy Todd @ 9:01 pm

I’m sure I read this somewhere recently, but my scratchy memory and command of Google can’t bring it back to me.

Is there a Python idiom for accepting either a file name or a file object as a function parameter?

The closest I can get is this;

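def my_function(file_name_or_object):
    try:
        # A file name opens normally; a file object makes open() raise TypeError
        file_object = open(file_name_or_object)
    except TypeError:
        # Not a name, so assume we were handed an open file object
        file_object = file_name_or_object
    return file_object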

Any improvements on this are more than welcome.
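
One possible improvement, sketched here in Python 3 syntax rather than the Python 2.5 of the snippet above, is to test for a read method instead of relying on open() raising TypeError. The name open_file and the opened_here flag are hypothetical, introduced only for illustration; treat this as a sketch under those assumptions, not an established idiom.

```python
import io


def open_file(file_name_or_object, mode="r"):
    """Return (file_object, opened_here) for a file name or a file-like object.

    opened_here is True only when this function called open(), so the
    caller knows whether it is responsible for closing the file.
    """
    if hasattr(file_name_or_object, "read"):
        # Duck typing: anything with a read method is treated as a file object.
        return file_name_or_object, False
    # Otherwise assume it is a path and let open() raise if it is not.
    return open(file_name_or_object, mode), True


# Usage with an in-memory file object (no file on disk is needed).
stream, opened_here = open_file(io.StringIO("hello"))
print(stream.read(), opened_here)
```

Returning the flag is a design choice: a function that sometimes opens a file and sometimes borrows one should tell its caller which happened, otherwise nobody knows who closes it.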

Trouble Getting a Date

Filed under: database, python — Andy Todd @ 8:04 am

I’m having trouble with dates. This can be summed up in a couple of high level issues;

1. Date support in relational databases is insane, or at best inconsistent.

As far as I can tell the ANSI SQL-92 standard defines date, time, interval and timestamp data types. Which doesn’t help when SQL Server only implements something called ‘datetime’ - at least I think so, have you tried accessing any sort of manual for a Microsoft product online? Blimey, I thought billg had embraced this web thing years ago. Oracle has the ‘date’ data type (which is actually a time stamp) and MySQL, well they’ve gone and outdone everyone by implementing DATETIME, DATE, TIMESTAMP, TIME, and YEAR.

2. The Python DB-API does not cope with date data type ambiguity well.

When it comes to the date question the Python DB-API states (and I quote) “… may use mx.DateTime”, which if you ask me isn’t much of a standard. This needs to change so that all DB-API modules return consistent datetime objects. That is not such a big issue, as datetime has been part of the standard library since, what, Python 2.3?

Sadly even if we fix this it won’t work with SQLite as it doesn’t consistently support data typing. In my experiments, regardless of what sort of date you insert into the database, you get a unicode string back. Don’t believe me? Try this in Python 2.5;

>>> from sqlite3 import dbapi2
>>> db = dbapi2.connect('test_db')
>>> cursor = db.cursor()
>>> cursor.execute('create table date_test (id integer not null primary key autoincrement, sample_date DATE NOT NULL)')
>>> stmt = "INSERT INTO date_test (sample_date) VALUES (?)"
>>> cursor.execute(stmt, (1234, ))
>>> import datetime
>>> cursor.execute(stmt, (datetime.date(2008, 3, 10), ))
>>> cursor.execute(stmt, ('My name is Earl', ))
>>> db.commit()
>>> cursor.execute("SELECT * FROM date_test")
>>> results = cursor.fetchall()
>>> for item in results:
...     print item[1], type(item[1])
1234 <type 'int'>
2008-03-10 <type 'unicode'>
My name is Earl <type 'unicode'>
>>>

But note that it is fine for integers.
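
For what it is worth, the sqlite3 module can be asked to convert values based on the declared column type through the detect_types argument to connect(). Below is a minimal sketch in Python 3 syntax, assuming an in-memory database. The round_trip helper is hypothetical, and the adapter and converter are registered explicitly rather than relying on whatever defaults the module ships with.

```python
import datetime
import sqlite3

# Store dates as ISO 8601 text and turn DATE columns back into date objects.
sqlite3.register_adapter(datetime.date, lambda d: d.isoformat())
sqlite3.register_converter(
    "DATE", lambda raw: datetime.date.fromisoformat(raw.decode("ascii"))
)


def round_trip(value):
    """Insert value into a DATE column and return what comes back out."""
    # PARSE_DECLTYPES makes the module look up a converter by declared type.
    db = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
    try:
        cursor = db.cursor()
        cursor.execute(
            "CREATE TABLE date_test ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, sample_date DATE NOT NULL)"
        )
        cursor.execute(
            "INSERT INTO date_test (sample_date) VALUES (?)", (value,)
        )
        cursor.execute("SELECT sample_date FROM date_test")
        return cursor.fetchone()[0]
    finally:
        db.close()


result = round_trip(datetime.date(2008, 3, 10))
print(result, type(result))
```

This only helps when every value in the column really is a date. The converter is applied to whatever is stored, so a row like 'My name is Earl' would make it raise rather than quietly hand back a string.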

3. The people writing the Python standard library modules are on crack.

Outside of the database world and within the batteries-included Python standard library some modules use datetime, others time and there are even uses of calendar.

O.K. I’ll accept that maybe the module authors aren’t on full strength crack, because the time module just exposes underlying POSIX functions. But the people who wrote those were on something strong and hallucinogenic. I table the following function signatures from section 14.2 of the Python Library Reference 2.5 as an example;

strftime(format[, t])
strptime(string[, format])

This has bitten me twice in the last twenty four hours and frankly I’m not happy.
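
To spell out the trap, here is a small sketch in Python 3 syntax using only the time module. The DATE_FORMAT name and the sample date are just for illustration. The point is that the format string is the first argument to one function and the second argument to the other.

```python
import time

DATE_FORMAT = "%Y-%m-%d"

# strptime takes the string first and the format second...
parsed = time.strptime("2008-05-09", DATE_FORMAT)

# ...while strftime takes the format first and the time tuple second.
formatted = time.strftime(DATE_FORMAT, parsed)

print(parsed.tm_year, formatted)
```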

I appreciate that there are historical reasons for having inconsistent function signatures, but can someone please fix this in Python 3.0? All we need is a single module that can access the underlying system clock and then convert between a number of different representations of that and other epoch driven dates. How hard can it be? As far as I can tell this is not part of the proposed standard library re-organisation. I think it should be.

April 30, 2008

May Sydney Python Meeting

Filed under: python — Andy Todd @ 11:51 am

This Thursday, the 1st of May, 2008 from 6:30pm there will be a social gathering of the Sydney Python Users Group and any individuals interested in discussing Python, Web, Ruby, Perl etc.

Laptops, OLPCs, code review, show and tell etc allowed and encouraged.

We meet in the ground floor area next to P.J. O’Briens Pub internal entrance in the;

Grace Hotel at the corner of York and King Street in Sydney, New South Wales 2000. See you there.

April 27, 2008

Small Administrative Note

Filed under: General — Andy Todd @ 3:28 pm

As the feedback I got on the daily Twitter posts was entirely negative, they are gone. Sorry for that, folks.

It seems that I’ll have to write more original content instead. I will see what I can do.

April 24, 2008

Twitter Updates for 2008-04-24

Filed under: General — Andy Todd @ 11:59 pm
  • Dodging the rain by ducking into cafes
  • @alang I’m there. With bells on.
  • Another skinny cappuccino? Oh alright then.
  • Getting the barbecue ready?
  • Preparing the barbecue to burn a load of meat and fish

Powered by Twitter Tools.

April 23, 2008

Twitter Updates for 2008-04-23

Filed under: General — Andy Todd @ 11:59 pm
  • Off to talk about Feature Driven Development

April 22, 2008

Twitter Updates for 2008-04-22

Filed under: General — Andy Todd @ 11:59 pm
  • Making calls, writing down random thoughts

Whither Relational Databases?

Filed under: database — Andy Todd @ 9:16 pm

Following on from a theme that Simon has been pursuing, here is an interesting piece - How SimpleDB differs from a RDBMS. A thorough analysis of SimpleDB, but I think the extra value here is in the comments. I particularly liked Greg Jorgensen’s submission that programmers just don’t like RDBMS because they take some learning. Whilst I don’t have empirical evidence to back up this supposition I can say that most Java programmers I’ve come across go slightly green if you suggest that they can solve most problems with a SQL statement (and yes, that was meant to be read ironically).

If I can sum up the message of this post and its comments it is that we should be thankful for having different tools available to us, because this isn’t a one size fits all world. Where you’ve got a big list of simple things™ tools like BigTable and SimpleDB work well. Where you’ve got large pieces of unstructured data (sometimes referred to as ‘documents’) you can use CouchDB, and where you have complex, structured data that has to adhere to certain validation and usage rules use a relational database. Each of these will store up to terabytes of information so let’s not even talk about (the myth of) scalability. Choose the right tool for the job and stop insisting that every problem is a nail.

So to answer my question from the title of this post - still around, and still kicking arse.
