Wednesday, September 16, 2009

The Database Cleanup

I found a recent discussion on Ask Tom about "Unused Objects" via David Aldridge's post, Metacode Gone Wrong. The original poster's question:
We are in a process of removing unused objects (tables/functions/procedures/packages) from the database. Is there any script(suggestions) or short-cut method to find these unused objects (tables/functions/procedures/packages not used in ddl/dml/select statements for more than 3 months).

There are more than 500 objects(tables/functions/procedures/packages) in our database.

At least PLEASE help me in finding unused TABLES.For other objects I'm thinking to check manually in the application code(using find and grep commands)

Please Help me.
To which Tom replied:
You'll have to enable auditing and then come back in 3 months to see.

We don't track this information by default -- also, even with auditing, it may be very possible to have an object that is INDIRECTLY accessed (eg: via a foreign key for example) that won't show up.

You can try USER_DEPENDENCIES but that won't tell you about objects referenced by code in
client apps or via dynamic sql

I'm always perplexed by this. How does one get into a production environment where by
they don't know what the objects are used by/for? No documentation or anything?
I've been in several environments where production was not documented very well (if at all). I guess that's fortunate for me, as there is always work to be done.

At my last gig, I went about an effort to clean up the database. We had close to 600 tables in a single schema. The one good thing (for me anyway), what that those tables were not accessible outside of the database, they were called through PL/SQL. Finding dependencies (DBA_DEPENDENCIES) was fairly easy...but I also ran across the caveat that he mentions, Dynamic SQL. Nothing strikes fear in you quicker than the realization that all of your work might be nullified because you didn't consider the use [Dynamic SQL] up front.

I would complain during code review/architectural sessions about the use of Dynamic SQL...not sure if it was listened, but I got my opinion in.

Documentation of a database, in my recent experience anyway, is the last thing on anyone's mind. It's seen as time-consuming and un-important. I like what David says in the Ask Tom comments:
I think that if documenting code makes people sad then they ought to be in their bedroom writing card games in VB. The sad thing is that it doesn't have to be a huge overhead, it just has to be well thoughtout and #actually done#.
How To Clean Up The Database
Since I've had so much experience at this of late, I'll list the steps I have taken in the hopes that you can find something useful yourself.

1. DBA_DEPENDENCIES - It's a great place to start, but it's not a panacea. You can get 90% of everything you need here. It's that last 10% that is the hardest. For example, I'll focus in on one table, query DBA_DEPENDENCIES, and then put that list into a spreadsheet where I can then track my progress. Usually I'll add "fixed", "fixed date" and "comments" columns so I'll know that I have addressed it. I'll typically have a worksheet for each table.

2. Privileges - Specifically in relation to tables. Do other database users have DML access? SELECT is one thing (still important) but INSERT/UPDATE/DELETE is entirely different. If other users do have access, are they service accounts (used by outside applications) or are they solely database users (another application)?

3. Auditing - I had never thought to use auditing for this purpose, but it might be helpful in the future.

4. Logging - If you suspect a piece of code is no longer used (naturally there is no documentation), but are not sure, you can add a bit of logging code to it. It's not the best method in the world, but it works. With all things, it's not a 100% guarantee either, the code may be called once a year, there is really no way to tell.

5. Thorough and Meticulous Analysis - This isn't really a method but it's going in here anyway. Document everything you can which includes everything you've done. At the very least, you'll have some documentation to show for it. At the very least, you'll have a much better understanding of your application and it's inner-workings.

Update 09/17/2009 12:30 PM
Dom Brooks reminded me of DBA_SOURCE in the comments so I'm adding that in.

6. DBA_SOURCE - A case-insensitive search of DBA_SOURCE is a must have as well. Allows you to find all the references to a table/procedure/etc. Some may have just be in comments, but some may also be contained in Dynamic SQL.

4 comments:

  1. Man, what a mess.

    One of the problems is that some objects may only be touched annually, and some may need to be retained for an extended period of time (24 months?).

    In situations like this I like to get management involved and bought into the cleanup. While you can audit & log to see what's accessed, eventually you'll need to start removing access from allegedly obsolete objects and perhaps renaming them.
    Inevitably, some process will barf but at least you'll still have all the data.

    ReplyDelete
  2. I'd add a case insensitive search of DBA_SOURCE to your list.

    ReplyDelete
  3. dom

    yup, completely forgot about that one. i'll update the post. gracias.

    ReplyDelete
  4. kenfar

    Agreed. I've had management behind me (mostly) that last couple of times. Of course it takes a lower priority to other things...

    As far as barfing goes, I'm OK with that. The tables are typically backed up before dropping so restoring them is easy.

    I'd prefer to have someone scream that something is broken (utimately anyway).

    good points about the extended time frames though. if it's used once a year or more, it would be almost impossible to detect.

    Which brings me to Apex, one of the greatest things about that tool is the ability to find object dependencies. I wish you could do the same (as easily) with other languages.

    ReplyDelete