Some database design guidelines and best practices

Below are some best practices known within the SQL Server database community use when designing tables.

1. All data columns should be properly sized. (i.e. a column that contains a description that will never exceed 200 characters should be sized as VARCHAR(200) or NVARCHAR(200), not NVARCHAR(MAX)). The reason for not using the MAX data type size is explained in the next guideline below.

2. The data size MAX (i.e. NVARCHAR(MAX)) should be avoided at all costs, unless no other size is available for a given column. Using the MAX data type size has serious performance hits associated with it. One example is it cannot be included in a non-clustered index and thus execution plans will perform table scans (extremely slow and inefficient) as opposed to index seeks to retrieve the data. This is critical for database performance and scalability. Query performance will degrade as the table grows with new data, the queries retrieving the rows will take longer and longer to execute due to the ever increasing number of rows to scan. The application will timeout as the queries will not return the dataset in the time required. Below are a few links regarding this issue:

EF Code First uses nvarchar(max) for all strings. Will this hurt query performance?
MAX Data Types Do WHAT?

3. A standard naming convention for table names, columns, indexes and such should be consistently used throughout the database.

4. Primary keys (PK) should be designed using the smallest appropriate data type. INT types with the IDENTITY property are ideal for primary key since they are incremental, only 4 bytes in size, and new rows are added to the end of the data file and data pages. If a primary key needs to be of type UNIQUEIDENTIFIER (to be non-guessable for example), then a surrogate IDENTITY INT should be used for the row identifier on the clustered index so as to keep page-splits and fragmentation to a minimum. Using a surrogate row identifier will allow for the primary key to be a UNIQUEIDENTIFIER without the performance hit. See my blog post below showing how to do this with minimal changes to your code and minimal performance hits to the execution plans.

Why using GUIDs as Primary Keys on a Clustered Index (Table) is not a good idea and some alternative approaches

5. Non-clustered indexes (NC) must be designed for each table before being released to production. Missing non-clustered indexes cause all queries that need data from a given table to perform table-scans instead of index seeks, which result in SQL server needing to scan ALL rows instead of the few it needs to complete the request. This can result in timeout errors on larger tables. Generally these indexes tend to be created after the data model has been settled but should be present before testing begins.

6. All database designs and changes (i.e. schema, sprocs, UDFs, indexes, Primary Keys, Foreign Keys etc.) should be reviewed by a DBA so as to verify that the design meets company standards, is scalable, performant and employs best practices.

Views – 4685

LincolnBlog.net

…my observations in the world of SQL Server, PowerShell, Linux and PostgreSQL etc…

Some database design guidelines and best practices