Monday, June 8, 2009

UPDATEs in OLTP: A Design Flaw?

This one has been on my mind for the past year or so...

Should you do UPDATEs in an OLTP environment?

The answer is maybe, or it depends, of course.

When I ask this question I ask it in relation to INSERTs. There is a difference between creating a record and updating a record. I think it comes down to a design decision.

For instance, I once used a table to track sessions (web sessions, using Application Context in the database to "reconnect"), it was defined something like the following:


An Acceptable Time to use UPDATE
A session defaulted was created with a 30 minute window. Each page the user would hit would update the END_TIME to SYSDATE + 30 Minutes. If they had no activity for 30 minutes, the END_TIME would be less then the current time (SYSDATE) and they would be logged out. This table did exactly what I needed, UPDATEs worked perfectly here.

What if I wanted to track page hits though? Could I do that with the current table? I could possibly add PAGE_ID or something I suppose, but then I would have to write this "complicated" query to find the first START_TIME and then compare that with the last END_TIME. Kind of changes the meaning of the table right?

An Unacceptable Time to use UPDATE
You have a table of scheduled transactions or some sort of recurring billing.


You use this table by having a job that looks at the DATE_TO_POST, takes those and attempts to post them. If there is a problem, you update the DATE_TO_POST column to the next date/time (based on rules somewhere) and you increment the TIMES_POSTED counter. That doesn't sound so bad right? If you don't have any processing or billing failures, it's not. But if you do, you lose some valuable data, in my opinion.

First, get rid of the TIMES_POSTED column. You don't need it. Create a record for every single transaction you have posted. This obviously changes the meaning of the table and will force you to change your code.

Here's why it's good though (to remove the UPDATE): You now have a single record for every attempt. Your reporting off of this becomes much easier and is performed with SQL. With the UPDATE, you have to maintain some PL/SQL code ( TIMES_POSTED + 1). With the INSERT, you don't perform a calculation at all.

A slightly different example and one that may be more familiar to everyone:


This kind of structure is more suited for a data warehouse. The columns have been pivoted to make reporting easier. In an ideal environment, this wouldn't exist on the OLTP server. However we don't all work in ideal environments and this type of structure is often employed. I pretty much hate it (in OLTP, not DW). The volume and complexity of code needed to maintain this type of table is large. There are all sorts of computations that must be performed and then someone, like me, has to come in and maintain it. I always vote for scrapping it in favor of the following type structure(s):


Now you have INSERT statements as opposed to one or more UPDATE statements. Reporting tools can then handle the pivoting or whatever else you want to do to the data.

UPDATEs in an OLTP are not a flaw, but when about to write one, think of the implications. Is that what you really want to do? Wouldn't you rather just do an INSERT and be done with it?

2 comments:

  1. 1. Updating keys is always a design flaw.

    2. Updating non-keys is never a design flaw.

    3. If you disagree with 1, then no one can help you.

    4. If you disagree with 2, then the design flaw isn't the use of update, it is a flaw in the data model itself.

    Carry on.

    ReplyDelete
  2. #4 exactly.

    There is nothing inherently wrong with the UPDATE statement, it's just the way it is used. But like DISTINCT, it can be a clue in certain situations.

    ReplyDelete