Oracle Schema Diff Tool

By iotk

Built into IOTK is an Oracle schema difference tool. It takes a source Oracle schema and a target Oracle schema and writes the exact SQL commands necessary to modify the target so that it is identical to the source. Presently it can detect differences in:

  • Reference tables
  • Reference data
  • Tables
  • Indexes
  • Sequences
  • Functions
  • Procedures
  • Packages
  • Types
  • Columns
  • Constraints

I have found that this type of tool is invaluable for increasing the speed of development while maintaining a high level of quality and consistency.  All too often, modifications to the database of an application are tracked by a single person or each developer is required to record their changes so they can manually be applied to QA or production later.  People make mistakes or, under the pressure of a forthcoming deadline, forget to record things accurately if at all.  I advised a group that told me that their “database schema diff tool” was an Excel spreadsheet that was “checked out” and modified with DDL statements to be executed at the next release.  Their check out process was a meeting in which the person that wanted to modify the spreadsheet verified that no one else was either using it or had left it open on their computer.

A developer whom I respect highly once told me that computers are great at exactly the types of operations that make IOTK’s schema difference tool possible.  I would not have been able to imagine the amount of stress and anxiety such a tool can relieve a team of until I found that I could never build a product without one.  Thus the reason I wrote IOTK’s versions.

Before I give more details on the schema diff tool, let me explain how I typically do releases.  Everything I work on is managed by the Subversion revision control system.  The main line of my code is called “trunk” and it’s the active copy of the software against which all of the developers involved in the project are actively developing.  The team will develop a product or a set of features for a pre-determined amount of time, called a release cycle.  During the coding of the release, the trunk repository is very active with updates and additions.

At the end of the release, a copy of the code is created by what is called branching.  A branch is a free standing repository that is not affected by changes made to trunk.  Branching essentially creates a stable version of the software because ad hoc modifications to it are frozen.  The branch is than pushed to a QA environment where thorough testing is done.  In the event a defect is found trunk is modified to correct the problem and the changes are patched to the release branch.  This guarantees that the fix is present both in the active line of development (and thus future releases) and the current (soon to be production) branch.

Once QA has been completed you know that what is on the QA environment is exactly what you want in production.  Because QA has no legacy value, it can always be thrown away and rebuilt from scratch guaranteeing that how it operates is the new way in which you want production to operate.  In essence you have the environment that is currently live and the one you want to be live.  You just need to know what’s different between them.

I achieve this in two ways.  First I use rsync to determine the difference in the code files and I have rsync transmit only those differences to the production environment.  Once I run my “sync scripts” I know that the production environment contains exactly the code and logic I want it to.  However, it might not have the proper database structures to operate against.  That’s what the schema difference tool is for.  It will literally tell me exactly what SQL commands to execute to get production to look just like QA.

Below is what the schema difference tool looks like when you run it:

nobody@qa:/tmp$ iotk-schema-diff.php source/p@inst target/p@inst

// +--------------------------------------------------------------------------+
// | Executing Schema Diff                                                    |
// +--------------------------------------------------------------------------+

+ The source schema is source/p@inst.
+ The target schema is target/p@inst.
+ Upgrade scripts will be written to:
    ./schema_diff

+ Would you like to proceed? [y|n]

Pressing “y” will start the process. I have used the tool extensively on small to medium sized schemas and it will typically finish in 1-2 minutes. When it is done, the directory ./schema_diff will contain the following files:

10-start.sql       30-columns.sql      50-functions.sql   70-ref_tables.sql
15-ref_tables.sql  35-constraints.sql  55-procedures.sql  75-finish.sql
20-tables.sql      40-ref_data.sql     60-indexes.sql
25-types.sql       45-sequences.sql    65-packages.sql

You can safely execute each of these files in SQL*Plus on the target schema. You execute these files by following the numbers: execute 10-start.sql first and 75-finish.sql last. Once I have completed doing this I like to take two last steps:

  1. I add the above files to the branch for archiving.  If there are issues later on with a release I can look into whether or not the issue was caused by something executed from these files.  I can also use these files to manually roll back the database (if possible).  To date I have never had to do that.
  2. I execute the schema diff tool again to make sure that it produces no output.  This means the source and target are the same.

Building this type of tool is surprisingly easy, though time consuming, due to the Oracle Data Dictionary.  Since it is a relational model describing a database’s structure, it is easy enough to execute SQL statements against it to determine differences or to pull data back into an application to compute differences.

The time I spent on it, though, was worth it.  I can develop my application’s without being burdened with having to remember what changes I made or how I am going to push to production.  I do my thing and let the servers sort it out.

Part of my development of the tool was also building the unit testing framework around it.  I literally build two schemas as part of the unit tests and load a source with known objects and a target with some of the same objects and some different ones.  The similarities and differences are enough to test every part of the schema difference code.  The unit tests than execute the schema diff tool’s processes to ensure the resultant upgrade scripts are correct.

Leave a Reply