8i | 9i | 10g | 11g | 12c | 13c | 18c | 19c | 21c | 23c | Misc | PL/SQL | SQL | RAC | WebLogic | Linux

Home » Articles » Misc » Here

One True Lookup Table (OTLT)

The One True Lookup Table (OTLT) design, also know as "one lookup table to rule them all", is a classic anti-pattern that looks so sensible at first glance, that you will defend your decision to use it, even as your database performance takes a nose dive, your data integrity is compromised, and your developers stick pins into a voodoo doll in your likeness!

What is One True Lookup Table (OTLT)?

Most systems have lots of basic lookup tables that look similar to this.

-- DROP TABLE order_types PURGE;

CREATE TABLE order_types (
  order_type_id    NUMBER         NOT NULL,
  order_type_code  VARCHAR2(10)   NOT NULL,
  order_type_desc  VARCHAR2(100)  NOT NULL,
  CONSTRAINT order_types_pk PRIMARY KEY (id),
  CONSTRAINT order_types_uk UNIQUE (code)
);

-- DROP TABLE order_statuses PURGE;

CREATE TABLE order_statuses (
  order_status_id    NUMBER         NOT NULL,
  order_status_code  VARCHAR2(10)   NOT NULL,
  order_status_desc  VARCHAR2(100)  NOT NULL,
  CONSTRAINT order_statuses_pk PRIMARY KEY (id),
  CONSTRAINT order_statuses_uk UNIQUE (code)
);

Having many similar tables seems like a messy solution to some designers, so they feel the need to consolidate them into a single table, under the guise of making the data model simpler, and reducing the amount of maintenance screens. This consolidation results in the One True Lookup Table (OTLT) anti-pattern. The starting point for our shared lookup table might look like this.

-- DROP TABLE lookups PURGE;

CREATE TABLE lookups (
  lookup_id     NUMBER         NOT NULL,
  lookup_key    VARCHAR2(100)  NOT NULL,
  lookup_value  VARCHAR2(500)  NOT NULL,
  CONSTRAINT lookups_pk PRIMARY KEY (lookup_id),
  CONSTRAINT lookups_uk UNIQUE (lookup_key)
);

You will need a way of distinguishing between the types of lookup data, so we might add a lookup type or domain column. Something to remember is we might have the same key for multiple lookup types, so we have to include the lookup type in the unique key. This might give us something like this.

-- DROP TABLE lookups PURGE;

CREATE TABLE lookups (
  lookup_id     NUMBER         NOT NULL,
  lookup_type   VARCHAR2(20)   NOT NULL,
  lookup_key    VARCHAR2(100)  NOT NULL,
  lookup_value  VARCHAR2(500)  NOT NULL,
  CONSTRAINT lookups_pk PRIMARY KEY (lookup_id),
  CONSTRAINT lookups_uk UNIQUE (lookup_type, lookup_key)
);

The lookup type is not very descriptive, so it might be better to have a proper description kept somewhere.

-- DROP TABLE lookups PURGE;
-- DROP TABLE lookup_types PURGE;

CREATE TABLE lookup_types (
  lookup_type_id     NUMBER         NOT NULL,
  lookup_type_code   VARCHAR2(20)   NOT NULL,
  lookup_description VARCHAR2(100)  NOT NULL,
  CONSTRAINT lookup_types_pk PRIMARY KEY (lookup_type_id),
  CONSTRAINT lookup_types_uk UNIQUE (lookup_type_code)
);

CREATE TABLE lookups (
  lookup_id         NUMBER         NOT NULL,
  lookup_type_code  VARCHAR2(20)   NOT NULL,
  lookup_key        VARCHAR2(100)  NOT NULL,
  lookup_value      VARCHAR2(500)  NOT NULL,
  CONSTRAINT lookups_pk PRIMARY KEY (lookup_id),
  CONSTRAINT lookups_uk UNIQUE (lookup_type_code, lookup_key),
  CONSTRAINT lookup_lookup_type_fk FOREIGN KEY (lookup_type_code) REFERENCES lookup_types (lookup_type_code)
);

Note. I would not normally foreign key to a unique key like this, as things change over time, but it makes the SQL even tougher if we don't.

We pat ourselves on the back at our wonderful solution...

The problem with this solution is it is dealing with a single datatype for the key and the value. What happens if we need a variety of key-value pairs from a data type perspective? Either we ignore the data type variations and incur additional data type conversions, or we add more optional columns of differing types, like NUMBER, DATE and even string length variants. We tell yourself it's fine, as it doesn't waste a lot of space to have empty columns. The problem now is we have made an important bit of information optional, when it really shouldn't be.

CREATE TABLE lookups (
  lookup_id                  NUMBER         NOT NULL,
  lookup_type_code           VARCHAR2(20)   NOT NULL,
  lookup_key                 VARCHAR2(100)  NOT NULL,
  lookup_string_value        VARCHAR2(500),
  lookup_short_string_value  VARCHAR2(100),
  lookup_long_string_value   VARCHAR2(4000),
  lookup_number_value        NUMBER,
  lookup_date_value          DATE,
  lookup_ts_value            TIMESTAMP,
  CONSTRAINT lookups_pk PRIMARY KEY (lookup_id),
  CONSTRAINT lookups_uk UNIQUE (lookup_type_code, lookup_key),
  CONSTRAINT lookup_lookup_type_fk FOREIGN KEY (lookup_type_code) REFERENCES lookup_types (lookup_type_code)
);

Then a couple of our basic lookups need a few additional fields to be kept with them. That means splitting them out of the generic lookup table, and revisiting all bits of the application code that reference it, or adding some additional columns to the lookup table, maybe as "flex fields", which can be used to store different things. Invariably we pick the "easy option" and end up with something like this.

CREATE TABLE lookups (
  lookup_id                  NUMBER         NOT NULL,
  lookup_type_code           VARCHAR2(20)   NOT NULL,
  lookup_key                 VARCHAR2(100)  NOT NULL,
  lookup_string_value        VARCHAR2(500),
  lookup_short_string_value  VARCHAR2(100),
  lookup_long_string_value   VARCHAR2(4000),
  lookup_number_value        NUMBER,
  lookup_date_value          DATE,
  lookup_ts_value            TIMESTAMP,
  flex_field_1               VARCHAR2(4000),
  flex_field_2               VARCHAR2(4000),
  flex_field_3               VARCHAR2(4000),
  flex_field_4               VARCHAR2(4000),
  flex_field_5               VARCHAR2(4000),
  CONSTRAINT lookups_pk PRIMARY KEY (lookup_id),
  CONSTRAINT lookups_uk UNIQUE (lookup_type_code, lookup_key),
  CONSTRAINT lookup_lookup_type_fk FOREIGN KEY (lookup_type_code) REFERENCES lookup_types (lookup_type_code)
);

Before we know it, our "elegant solution" to save on reference tables becomes a complete disaster area.

So What's The Problem?

There are so many problems with this approach. Here are a few.

Ugly SQL?

We are probably going to join to this table multiple times from queries. Here is an example using ANSI and Non-ANSI join syntax.

-- ANSI Join
SELECT o.order_id,
       o.order_type_code,
       ot.lookup_value AS order_type_description,
       o.order_status_code,
       os.lookup_value AS order_status_description
FROM   orders o
       JOIN  lookups ot
         ON  o.order_type_code = ot.lookup_key
         AND ot.lookup_type_code = 'ORDTYP'
       JOIN  lookups os
         ON  o.order_status_code = os.lookup_key
         AND os.lookup_type_code = 'ORDSTAT'

-- Non-ANSI Join
SELECT o.order_id,
       o.order_type_code,
       ot.lookup_value AS order_type_description,
       o.order_status_code,
       os.lookup_value AS order_status_description
FROM   orders o,
       lookups ot,
       lookups os
WHERE  o.order_type_code = ot.lookup_key
AND    ot.lookup_type_code = 'ORDTYP'
AND    o.order_status_code = os.lookup_key
AND    os.lookup_type_code = 'ORDSTAT'

What we can see is every reference to the lookup table requires two columns, and we are going to see the same table repeated in the execution plan multiple times. The table name and the column names don't give us any clue about why we are using this table.

I've seen examples of this where the number of references to the lookup table were in double figures.

Alternatively, we can do the job properly and things look clean, with correct table and column references.

-- ANSI Join
SELECT o.order_id,
       ot.order_type_code,
       ot.order_type_desc,
       os.order_status_code,
       os.order_status_desc
FROM   orders o
       JOIN order_types ot ON o.order_type_id = ot.order_type_id
       JOIN order_statuses os ON o.order_status_id = order_status_id

-- Non-ANSI Join
SELECT o.order_id,
       ot.order_type_code,
       ot.order_type_desc,
       os.order_status_code,
       os.order_status_desc
FROM   orders o,
       order_types ot,
       order_statuses os
WHERE  o.order_type_id = ot.order_type_id
AND    o.order_status_id = os.order_status_id

Conclusion

The One True Lookup Tables (OTLT) approach seems to make sense at first. When designing a logical model it looks kind-of nice. Once you get down to implementing it with some real code, it starts to go wrong, and as the system evolves over time, that's just going to get worse.

Some people claim reasonable results where they stick to basic lookups, but I feel they've just not reached the pain point yet.

Please don't do this!

Hope this helps. Regards Tim...

Back to the Top.