Char(1-2) vs Varchar(1-2) to be NULL or to be NOT NULL

Continuing the small research that I have been doing on the matter of the Varchar(1) & Varchar(2) datatypes usage, I am taking this article to consider the situations when the columns are defined as NOT NULL.

Following the same scheme as with the first post of this series, here is the code for creating 2 similar tables with CHAR and VARCHAR datatypes, which are different to the first post that those 3 columns are not null:

CREATE TABLE dbo.CharTestTableNN
(
	id INT IDENTITY NOT NULL PRIMARY KEY,
	client_name CHAR(20) NOT NULL DEFAULT 'SuperClient',
	sex CHAR(1) NOT NULL DEFAULT 'M',
	confirmation CHAR(1) NOT NULL DEFAULT 'N',
	state_code CHAR(2) NOT NULL DEFAULT 'WA',
	some_date DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
	some_amount DECIMAL(9,2) NOT NULL DEFAULT 128.34,
);

CREATE TABLE dbo.VarcharTestTableNN
(
	id INT IDENTITY NOT NULL PRIMARY KEY,
	client_name CHAR(20) NOT NULL DEFAULT 'SuperClient',
	sex VARCHAR(1) NOT NULL DEFAULT 'M',
	confirmation VARCHAR(1) NOT NULL DEFAULT 'N',
	state_code VARCHAR(2) NOT NULL DEFAULT 'WA',
	some_date DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
	some_amount DECIMAL(9,2) NOT NULL DEFAULT 128.34,
);

-- Insert 1.000.000 rows into both tables
SET NOCOUNT ON;
BEGIN TRAN
	DECLARE @i INT;
	SET @i = 0;

	WHILE @i < 1000000
	BEGIN 
		INSERT INTO dbo.CharTestTableNN
			DEFAULT VALUES;

		INSERT INTO dbo.VarcharTestTableNN
			DEFAULT VALUES;
			
		SET @i = @i + 1;
	END
COMMIT;

-- Lets compare the space occupied 
EXEC sp_spaceused 'dbo.CharTestTableNN';
EXEC sp_spaceused 'dbo.VarcharTestTableNN';

-- Drop Those Tables
DROP TABLE dbo.CharTestTableNN;
DROP TABLE dbo.VarcharTestTableNN;

The results speak for themselves: the difference is very noticeable, as in the case of the CHAR columns we have ~50MB occupied and which is exactly the same as when using NULLable columns. From the other side, our table VarcharTestTableNN which is using Varchar datatype for the columns jumped from ~46MB to almost 58MB, thanks to all the additional space occupied by the Varchar Datatype.

-- Consult the row length of the tables
select OBJECT_NAME(object_id) as TableName, min_record_size_in_bytes, max_record_size_in_bytes, alloc_unit_type_desc, index_level, record_count, page_count, avg_page_space_used_in_percent
	from sys.dm_db_index_physical_stats (db_id(), object_id('dbo.CharTestTableNN'), null, null, 'detailed');

select OBJECT_NAME(object_id) as TableName, min_record_size_in_bytes, max_record_size_in_bytes, alloc_unit_type_desc, index_level, record_count, page_count, avg_page_space_used_in_percent
	from sys.dm_db_index_physical_stats (db_id(), object_id('dbo.VarcharTestTableNN'), null, null, 'detailed');

Checking out the internal row length we can see that in this case that while the table with CHAR datatypes continues occupying 48 bytes per row, the table with Varchar datatype columns has jumped to quite fat 56 bytes per row. This naturally confirms that when you know the exact length, you should use the exact datatypes instead of the varying.

So, it looks quite logical to me, that when you have all the textual content filled out – CHAR datatype is definitely in occupying less space, but what about situations when our columns are defined as NOT NULL but they are actually empty or half empty? Lets check it out :

-- Lets create 2 additional tables, one with non null columns but with no content, and other with only half of the columns full
CREATE TABLE dbo.VarcharTestTableNN_Empty
(
	id INT IDENTITY NOT NULL PRIMARY KEY,
	client_name CHAR(20) NOT NULL DEFAULT 'SuperClient',
	sex VARCHAR(1) NOT NULL DEFAULT '',
	confirmation VARCHAR(1) NOT NULL DEFAULT '',
	state_code VARCHAR(2) NOT NULL DEFAULT '',
	some_date DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
	some_amount DECIMAL(9,2) NOT NULL DEFAULT 128.34,
);

CREATE TABLE dbo.VarcharTestTableNN_Avg
(
	id INT IDENTITY NOT NULL PRIMARY KEY,
	client_name CHAR(20) NOT NULL DEFAULT 'SuperClient',
	sex VARCHAR(1) NOT NULL DEFAULT 'M',
	confirmation VARCHAR(1) NOT NULL DEFAULT 'N',
	state_code VARCHAR(2) NOT NULL DEFAULT '',
	some_date DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
	some_amount DECIMAL(9,2) NOT NULL DEFAULT 128.34,
);



-- Insert 1.000.000 rows into both tables
SET NOCOUNT ON;
BEGIN TRAN
	DECLARE @i INT;
	SET @i = 0;

	WHILE @i < 1000000
	BEGIN 
		INSERT INTO dbo.VarcharTestTableNN_Empty
			DEFAULT VALUES;
			
		INSERT INTO dbo.VarcharTestTableNN_Avg
			DEFAULT VALUES;

		SET @i = @i + 1;
	END
COMMIT;

-- Lets compare the space occupied 
EXEC sp_spaceused 'dbo.VarcharTestTableNN_Empty';
EXEC sp_spaceused 'dbo.VarcharTestTableNN_Avg';

The results are quite logical - the empty space filled, NON NULL table VarcharTestTableNN_Empty occupies a 45.7 MB which is a bit more then a NULL table in the previous article (45.6MB). At the same time, the table with just half of the content columns filled (VarcharTestTableNN_Avg) occupies an intermediate amount of space - 53.9MB, which is exactly between the 46MB of the NULL and 58MB of the full ones.

Lets see what has changed in the terms of the row length for this tables:

-- Consult the row length of the tables
select OBJECT_NAME(object_id) as TableName, min_record_size_in_bytes, max_record_size_in_bytes, alloc_unit_type_desc, index_level, record_count, page_count, avg_page_space_used_in_percent
	from sys.dm_db_index_physical_stats (db_id(), object_id('dbo.VarcharTestTableNN_Empty'), null, null, 'detailed');

select OBJECT_NAME(object_id) as TableName, min_record_size_in_bytes, max_record_size_in_bytes, alloc_unit_type_desc, index_level, record_count, page_count, avg_page_space_used_in_percent
	from sys.dm_db_index_physical_stats (db_id(), object_id('dbo.VarcharTestTableNN_Avg'), null, null, 'detailed');

The row length speaks for itself, here with 44 bytes for NON NULL Empty table and with an intermediate result of 52 bytes for the table with only half content columns filled (VarcharTestTableNN_Avg).

These results need just one last comparison - the missing link of the Char datatype column with empty content, so lets check it out:

CREATE TABLE dbo.CharTestTableNN_Empty
(
	id INT IDENTITY NOT NULL PRIMARY KEY,
	client_name CHAR(20) NOT NULL DEFAULT 'SuperClient',
	sex CHAR(1) NOT NULL DEFAULT '',
	confirmation CHAR(1) NOT NULL DEFAULT '',
	state_code CHAR(2) NOT NULL DEFAULT '',
	some_date DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
	some_amount DECIMAL(9,2) NOT NULL DEFAULT 128.34,
);

-- Insert 1.000.000 rows 
SET NOCOUNT ON;
BEGIN TRAN
	DECLARE @i INT;
	SET @i = 0;

	WHILE @i < 1000000
	BEGIN 
		INSERT INTO dbo.CharTestTableNN_Empty
			DEFAULT VALUES;
			
		SET @i = @i + 1;
	END
COMMIT;

-- Checking on the row length
select OBJECT_NAME(object_id) as TableName, min_record_size_in_bytes, max_record_size_in_bytes, alloc_unit_type_desc, index_level, record_count, page_count, avg_page_space_used_in_percent
	from sys.dm_db_index_physical_stats (db_id(), object_id('dbo.CharTestTableNN_Empty'), null, null, 'detailed');

-- Checking out the space occupied
EXEC sp_spaceused 'dbo.CharTestTableNN_Empty';

The results are presented here on the screenshots with row length being 48 bytes and occupied space being around 50 MB. The Char datatype are consistent through different types of situations and seems to be quite predictable.

Basic results:

Lets compare all the results we have tested so far (I am already including the math of removing from the reserved space the pre-allocated unused space):

Name Description Row Length (Bytes) Size (KB)
CharTestTable Char NULL 48 48888
CharTestTableNN Char NO NULL full 48 48888
CharTestTableNN_Empty Char NO NULL empty 48 48888
VarcharTestTable Varchar NULL 44 45632
VarcharTestTableNN Varchar NO NULL full 56 57884
VarcharTestTableNN_Empty Varchar NO NULL empty 44 45632
VarcharTestTableNN_Avg VarChar NO NULL hallf full 52 53904

There are a couple of things visible to me from this table - first of all is that with CHAR you get the what you request in the terms of the space, independently from the final result you get some constant values. The tables with Varchar(1) or Varchar(2) are the same consistent story of the variable length columns - it can be smaller and it can be bigger, depending on the content and the rules.

The interesting thing is so far being confirmed that even those small datatypes perfectly having their own space in the database design. Also, if your table has a lot of NULL or empty spaces then go for Varchar datatype, it can improve occupied space by the table.

To be continued

4 thoughts on “Char(1-2) vs Varchar(1-2) to be NULL or to be NOT NULL

  1. Bradley Ball

    Hi Niko, really interesting article. I enjoyed it alot.

    When a Varchar Column conatains a null value, instead of having to store that value the Null Bitmap (2005 and up) marks those portians of the data record as null, and no variable length array or variable length data is stored in a regular data record.

    Very cool stuff!

Leave a Reply to Rob Volk Cancel reply

Your email address will not be published.