Every year that goes by I sit in hope that I won’t see bad database design.
Every year I am disappointed.
As an example here is a table create statement that I saw the other day (table and column names have been changed to protect the innocent)
CREATE TABLE [dbo].BestestTableEVAR(
Col1 [int] IDENTITY(1,1) NOT NULL,
Col2 [uniqueidentifier] NULL,
Col3 [uniqueidentifier] NOT NULL,
Col4 [smallint] NULL,
Col5 [smallint] NOT NULL,
Col6 [bit] NOT NULL,
Col7 [xml] NULL,
Col8 [xml] NULL,
ColA [xml] NULL,
ColB [xml] NULL,
ColC [datetime2](2) NULL,
ColD [datetime2](2) NULL,
COlE [datetime2](2) NULL,
ColF [datetime2](2) NULL,
CONSTRAINT [PK_BestestTableEVAR] PRIMARY KEY CLUSTERED
(
Col3 ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY]
So what’s wrong with this?
The clustered primary key on this table is a GUID. Is that bad? That’s an unequivocal YES! Read all that Kimberly Tripp (blog|twitter) has to say about GUIDs in database design.
What makes this all the more crazy is that the table has an identity column. That’s a natural clustering key ready and waiting to be used and yet for some reason it’s not.
This table is going to fragment like crazy, it won’t scale and performance will be hideous. Additionally, thanks to the XML columns this table can’t even be rebuilt online meaning there’s no way to help the fragmentation or performance without actually taking the table offline to do it, meaning it can’t handle any transactions. This is a problem in a table on an OLTP system.
I would go back and change some things. Let’s say you wanted to keep the table structure the same, that’s fine, but let’s be smart about the keys and indexes.
It would make sense to change the identity column to be clustered (I would also make this the primary key) and then, to ensure uniqueness on Col2 which is the current primary key a unique index is warranted.
CREATE TABLE [dbo].BestestTableEVAR(
Col1 [int] IDENTITY(1,1) NOT NULL,
Col2 [uniqueidentifier] NULL,
Col3 [uniqueidentifier] NOT NULL,
Col4 [smallint] NULL,
Col5 [smallint] NOT NULL,
Col6 [bit] NOT NULL,
Col7 [xml] NULL,
Col8 [xml] NULL,
ColA [xml] NULL,
ColB [xml] NULL,
ColC [datetime2](2) NULL,
ColD [datetime2](2) NULL,
COlE [datetime2](2) NULL,
ColF [datetime2](2) NULL,
CONSTRAINT [PK_BestestTableEVAR] PRIMARY KEY CLUSTERED
(
Col1 ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY];
CREATE UNIQUE NONCLUSTERED INDEX UI_BestestTableEVAR_Col2 on BestestTableEVAR (Col2);
Sure, we still won’t be able to rebuild the indexes online, but we won’t have the same crazy levels of fragmentation that we would have had before.
I know I’ll be seeing a lot of bad design this year and I know that I’ll be forced to push that bad design into production. Doesn’t stop me trying to change things however. Help your devs out, let them know when their design is a problem. Who knows, maybe you’ll change things for the better.