Andrew Channels Dexter Pinion

Wherein I write some stuff that you may like to read. Or not, it's up to you really.

July 12, 2005

On the Ubiquity of Excel

Excel is great. It is, as Joel Spolsky keeps telling us, the king of spreadsheets. And I agree with him.

As a matter of course I don't use it at all at home, as I don't have any licenses for office. I do use it at work though, but as with any tool I make sure to use it only where it's applicable.

Like all good software, it gets used in situations where it really shouldn't. The main reason for this in Excel's case is that it's a great way to store semi-structured information. Lists of things are hard to maintain in Word (most people's other weapon of choice), and with Excel you get sorting and filtering thrown in for free. Of course that convenience is only the tip of the iceberg, because Excel doesn't scale well or easily adapt to changing requirements.

There are a couple of cases in point on my current project.

The system test bug (issue/opportunity) list is being maintained in an Excel spreadsheet by the test manager. This is bad in several ways. Only one person can practically look at or amend it at a time, there is little or no access control (anyone can be the 'one' person editing the spreadsheet at any given moment), and because it's in Excel there is no version history.

Luckily they have "solved" the last two problems by mandating that anyone who changes the bug list in any way has to save a new copy with a filename that includes today's date and their initials. Can anyone else see the drawbacks in this approach? I did ask, just to be awkward you understand, what date a certain issue was last updated. Surprisingly enough it took an awfully long time to find out.

The other problem this raises is when the requirements change. Initially the test manager just wanted to know which issues were open and which were closed. It very quickly became apparent that it would be useful to know which problems were being tackled by a developer, which were waiting for re-test, and which weren't bugs so much as change requests. So another column was added to the worksheet, but because it was just another free-text Excel column there was no rhyme or reason to the values that ended up in it, and it manifestly failed to serve its purpose of letting anyone group issues in a like state.
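For what it's worth, Excel can constrain a column like that with its built-in data validation. A minimal sketch in VBA (the sheet name, column and status values here are my own assumptions, not what's actually on the project):

Sub AddStatusValidation()
    ' Constrain an assumed status column to a fixed list of values so every
    ' entry uses the same vocabulary and can be sorted and filtered on.
    Dim statusColumn As Range
    Set statusColumn = Worksheets("Issues").Range("F2:F1000")

    With statusColumn.Validation
        .Delete    ' clear any existing rule first
        .Add Type:=xlValidateList, AlertStyle:=xlValidAlertStop, _
             Formula1:="Open,With Developer,Awaiting Re-test,Change Request,Closed"
        .ErrorMessage = "Please pick a status from the list."
    End With
End Sub

It doesn't fix the access control or version history problems, but it would at least have kept the new column groupable.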

The other major failing of this particular spreadsheet, and it's in the implementation not the tool used, is that they don't link issues to the particular versions of source code that have been changed to resolve them.

Needless to say I didn't want to include my team in this use of inappropriate technology. We're using Subversion for version control and I convinced a colleague to install Trac on the same machine. I'm using its ticketing feature to log and manage all of our system testing issues. It's not perfect, although the features coming in versions 0.9 and 1.0 will get it very, very close, but it's a chauffeur-driven limousine compared to the unicycle that is the Excel solution.

Our other major use of spreadsheets on the project is in the data entry system. We have some ostensibly straightforward data entry requirements, which involve quite large volumes of data. So we've produced some rather nice spreadsheets which download basic information from Oracle, allow the user to add pages and pages of detail and then upload it back into the database.
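I won't reproduce the actual workbooks here, but the download half is the straightforward bit. A rough sketch of that step (the connection string, query and sheet name are invented for illustration):

Sub DownloadBasicInformation()
    ' Pull reference data out of Oracle via ADO and drop it onto a worksheet
    ' for the user to build their data entry around.
    Dim cn As Object, rs As Object

    Set cn = CreateObject("ADODB.Connection")
    cn.Open "Provider=MSDAORA;Data Source=PROJDB;User ID=appuser;Password=secret"

    Set rs = CreateObject("ADODB.Recordset")
    rs.Open "SELECT item_code, description FROM reference_items", cn

    Worksheets("Reference").Range("A2").CopyFromRecordset rs

    rs.Close
    cn.Close
End Sub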

At first glance this requirement just screams spreadsheet, until you start adding some very stringent data integrity and validation rules. Then things start to get a bit scary, because we end up replicating large parts of the (quite complex) data entry screens provided as part of the packaged application.

The problem here is that Excel doesn't lend itself to locked-down data entry. Its strength is that worksheets are free-form and allow you to specify your own formulas and manipulate information as necessary. In an ideal world we would provide a template and then convert the worksheet to a comma-separated file which would be extensively validated as it's uploaded.
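That conversion step is simple enough; a hypothetical sketch (sheet name, range and output path all made up) of dumping the entered data to a comma-separated file for an upload process to validate:

Sub ExportEntriesToCsv()
    ' Write the data entry range out as a comma-separated file, which a
    ' server-side process could then validate as it loads it into Oracle.
    Dim entryRange As Range, dataRow As Range, dataCell As Range
    Dim lineText As String, fileNum As Integer

    Set entryRange = Worksheets("DataEntry").Range("A2:F500")
    fileNum = FreeFile
    Open "C:\uploads\data_entry.csv" For Output As #fileNum

    For Each dataRow In entryRange.Rows
        lineText = ""
        For Each dataCell In dataRow.Cells
            If lineText <> "" Then lineText = lineText & ","
            lineText = lineText & dataCell.Value
        Next dataCell
        Print #fileNum, lineText
    Next dataRow

    Close #fileNum
End Sub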

Sadly the other requirement which mandates the use of Excel in this circumstance is that the data (including all of its complex validation rules) needs to be available to users who aren't connected to the network. So we're stuck with developing a standalone .EXE or giving them a spreadsheet.

So we've produced an enormous amount of code that replicates standard functionality, with all of the risks that entails. But given the skill set we have in the team I don't see a viable alternative. Any suggestions?

Posted by Andy Todd at July 12, 2005 10:14 PM

Comments

I find myself using Excel as a report target for MS Access. Once you get around the Automation coding, and use SQL to lay things out, something like:
excel_range.CopyFromRecordset( rst_report_data )
can be a fast operation.
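A slightly fuller sketch of that kind of routine from the Access side (the query, path and variable names here are invented, not taken from any real project):

Sub ExportReportToExcel()
    ' Lay the data out with a query first, then hand Excel the whole
    ' recordset in one go rather than writing it out cell by cell.
    Dim xl As Object, wb As Object
    Dim rst_report_data As DAO.Recordset

    Set rst_report_data = CurrentDb.OpenRecordset("SELECT * FROM qryReportData")

    Set xl = CreateObject("Excel.Application")
    Set wb = xl.Workbooks.Add
    wb.Worksheets(1).Range("A2").CopyFromRecordset rst_report_data

    wb.SaveAs "C:\reports\monthly_report.xls"
    xl.Quit

    rst_report_data.Close
End Sub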

Posted by: Chris Smith on July 13, 2005 02:57 AM

I developed a sophisticated data collection application in Excel a few years ago for a multinational company that needed to collect financial data from its worldwide divisions. The data had to meet stringent data integrity rules. It was a lot of work, but the results were worth the effort, because it streamlined the processing by keeping the responsibility for correct data with the division entering it, giving the division people immediate feedback if the data wasn't going to pass validation when returned to the corporate office.

Here are the design features I included to make the process as "Excel-like" as possible:

1. I created a function that took two arguments, a boolean value and a string. If the boolean was true, the function returned a space and erased whatever comment was on the cell containing the function; if it was false, the function returned an X and created a comment on that cell containing the string value. (A rough sketch of such a function appears after this list.)

For example, if the function was named TestAndComment, it would be called by a cell containing a formula like:

=TestAndComment(SomeOtherCell=SomeValue,"This value should be SomeValue.")

2. I arranged the data entry into columns of data to mimic the data entry form the users were accustomed to.

3. To the right of each data entry cell column I placed at least 2 columns.

a. The first column would have a formula like the one above in each cell to the right of a data entry cell. These cells were set to display their values in bold red, so each would show either a blank or a red X. All cells in this column were protected to prevent the user from changing the formulas. The boolean expression in each formula would refer to the cell in the column to the right of it.

b. The second column would have a formula that returned either true or false, according to whether the data entry cell (two columns to the left) contained a valid entry or not. This column was hidden.

4. I had a helper spreadsheet full of macros for each template spreadsheet designed as above. The helper spreadsheet had one macro to protect and hide all the columns that needed it, and another to unprotect and unhide everything so I could manipulate the unseen formulas.

5. By using this approach, the user would see an immediate indication of any validation problem (a red X would appear). Hovering the cursor over the red X displayed the comment explaining the problem. This used minimal screen real estate yet gave detailed error messages (as comments).

6. At the top of each worksheet in the workbook I had a formula that "and-ed" all of the boolean values in the test columns. If any value was false, it meant there was at least one error on that worksheet. If so, I displayed a red "Alert!" at the top of the sheet. I also had a status sheet that showed whether each of the other sheets had errors to help the user find the sheet(s) that needed attention.
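For illustration, here is a rough sketch of the kind of function described in point 1. The original implementation isn't shown above, so the names and details are guesses, and whether Excel allows a worksheet function to add or delete comments this way varies by version:

Public Function TestAndComment(testPassed As Boolean, message As String) As String
    ' Return a space when the test passes and an X when it fails, keeping
    ' the comment on the calling cell in step with that result.
    Dim callingCell As Range
    Set callingCell = Application.Caller

    If testPassed Then
        TestAndComment = " "
        If Not callingCell.Comment Is Nothing Then callingCell.Comment.Delete
    Else
        TestAndComment = "X"
        If callingCell.Comment Is Nothing Then
            callingCell.AddComment message
        Else
            callingCell.Comment.Text message
        End If
    End If
End Function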

I hope this explanation is clear enough to give you some valuable ideas.

Posted by: Howard B. Golden on July 13, 2005 05:31 PM