Varchar(x) where x <= 2 does not make sense… or does it?

While preparing my “Inheriting a database for Developers” session for SQLSaturday#162 in Cambridge, I have had a very interesting finding on the matter of the Varchar datatype usage. There is a more or less common knowledge of the SQL Server internals, that the Varchar datatype internally adds 2 additional bytes to the content of the column. Those 2 bytes are used to control the length of the text content internally, so SQL Server knows and understands where the text actually ends. From this point of view, even when the content of the column is empty, then it takes 2 bytes of space anyway.

With this knowledge, it was for me extremely easy and logical to believe that creating any columns with Varchar(2) or even Varchar(1) datatypes does not make any sense at all.
I know that I did believe in it until a couple of days ago, until I actually started playing with it. :)

This is a first blog in the series of exploring a little bit deeper this datatype usage.
I have found that the following situations are very interesting to investigate:
– Tables with Varchar(x) NULL;
– Tables with Varchar(x) NOT NULL, having different amounts of data;
– Tables with Varchar(x) NULL, using Sparse columns (SQL Server 2008+);
– Tables with Varchar(x) using different Data Compression methods (SQL Server 2008+);

All of the findings will be compared between CHAR(x) and VARCHAR(x) datatypes on SQL Server Box and Azure SQL Database editions.

Some basic information:
First of all, lets consider tha we should bare in mind that both datatypes Char and Varchar while being similar in purpose of holding a some short text content, are very different in their internal implementation nature. The Char datatype size is always well defined and is placed on the datapage at the location and at the sequence order defined by the table creation, while Varchar, like other variable length datatypes are being moved from their position to the end of the table definition, right after the last fixed size column datatype. This gives a very interesting twist to the final results of this experiment. :)

1. We shall create 2 similar tables using different datatypes (Char and Varchar) compare the respective occupied space. Both tables will have 7 columns, clustered primary key and the differences between them shall be marked in bold.
2. We shall insert 1.000.000 rows into each of the tables, so that we could have some information to play with. The final results will naturally be much bigger on bigger tables or in more extreme internal layouts, with much wider rows.

I won’t spent more time keeping you waiting, and lets dive into SQL Server 2012 and experiment the following code:

CREATE TABLE dbo.CharTestTable
(
	id INT IDENTITY NOT NULL PRIMARY KEY,
	client_name CHAR(20) NOT NULL DEFAULT 'SuperClient',
	sex CHAR(1) NULL,
	confirmation CHAR(1) NULL,
	state_code CHAR(2) NULL,
	some_date DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
	some_amount DECIMAL(9,2) NOT NULL DEFAULT 128.34,
);

CREATE TABLE dbo.VarcharTestTable
(
	id INT IDENTITY NOT NULL PRIMARY KEY,
	client_name CHAR(20) NOT NULL DEFAULT 'SuperClient',
	sex VARCHAR(1) NULL,
	confirmation VARCHAR(1) NULL,
	state_code VARCHAR(2) NULL,
	some_date DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
	some_amount DECIMAL(9,2) NOT NULL DEFAULT 128.34,
);

-- Insert 1.000.000 rows into both of the tables
SET NOCOUNT ON;
BEGIN TRAN
	DECLARE @i INT;
	SET @i = 0;

	WHILE @i < 1000000
	BEGIN 
		INSERT INTO dbo.CharTestTable
			DEFAULT VALUES;

		INSERT INTO dbo.VarcharTestTable
			DEFAULT VALUES;
			
		SET @i = @i + 1;
	END
COMMIT;

-- Lets compare the space occupied 
EXEC sp_spaceused 'dbo.CharTestTable';
EXEC sp_spaceused 'dbo.VarcharTestTable';

These are results of what those queries, with Varchar clearly occupying much less space. The difference is around 4MB which represents in this case around 10% of the total space, which is pretty amazing to say at least. What has just happened here? Why on planet Earth do Varchar is occupying much less space, when it was clearly stated, that it has 2 extra bytes, so it should be at least equal to the table with Char datatypes.

I was lucky enough to remember a great article on the matters of NULLable records when using variable datatypes, written by Kimberly Tripp. I don’t want to spoil all the fun of reading that article, but I shall just refer that the Varchar Nullable columns shall not occupy a lot of space when having a NULL value inserted. This actually means that the records of the second table (VarcharTestTable) are actually smaller. We can confirm this by consulting the internal information of the both tables:

-- Consult the internal data of our tables
select OBJECT_NAME(object_id) as TableName, min_record_size_in_bytes, max_record_size_in_bytes, alloc_unit_type_desc, index_level, record_count, page_count, avg_page_space_used_in_percent
	from sys.dm_db_index_physical_stats (db_id(), object_id('dbo.CharTestTable'), null, null, 'detailed');

select OBJECT_NAME(object_id) as TableName, min_record_size_in_bytes, max_record_size_in_bytes, alloc_unit_type_desc, index_level, record_count, page_count, avg_page_space_used_in_percent
	from sys.dm_db_index_physical_stats (db_id(), object_id('dbo.VarcharTestTable'), null, null, 'detailed');

Consulting the results brings some very interesting information: the minimum and the maximum row length of the CharTestTable is 48 bytes, while the rows at the VarcharTestTable occupy just 44 bytes. Also, at the last index level the number of pages (8K) differs between 6212 for CharTestTable and 5862 for VarcharTestTable.

This actually means that in the cases where almost every row of our table is NULL, the Varchar(1) or Varchar(2) datatype actually can be quite useful in the terms of space. In the second part of this post, we shall take this matters a little bit deeper and from more different angles. :)

-- Drop the tables if not needed anymore
DROP TABLE dbo.CharTestTable;
DROP TABLE dbo.VarcharTestTable;

Update for tests on Windows Azure SQL Database on 27th of August of 2012:

I have successfully tested almost the same script on Windows Azure with a couple of exceptions, which shall be revealed in the next post. I just reveal that after inserting the required registers I have rebuild both clustered indexes with this commands:

Alter Table dbo.CharTestTable
rebuild;

GO

Alter Table dbo.VarcharTestTable
rebuild;

Since at the moment of the writing, there is no support for the so much beloved sp_spaceused procedure, I have used the following script for consulting the created table sizes

select    
      obj.name as TableName, sum(reserved_page_count) * 8.0 / 1024 as SizeInMb
from sys.dm_db_partition_stats part
   join sys.objects obj
   on part.object_id = obj.object_id
group by obj.name

My results were 48,76 MB for CharTestTable and 44,64 MB for the VarcharTestTable, which are in par with the results obtained on the SQL Server Sunshine edition.

I have used then the very same script to consult the internal values of the row length for both tables and they were perfectly the same as on the Box version of SQL Server.

select OBJECT_NAME(object_id) as TableName, min_record_size_in_bytes, max_record_size_in_bytes, alloc_unit_type_desc, index_level, record_count, page_count, avg_page_space_used_in_percent
 from sys.dm_db_index_physical_stats (db_id(), object_id('dbo.CharTestTable'), null, null, 'detailed')
union all
select OBJECT_NAME(object_id) as TableName, min_record_size_in_bytes, max_record_size_in_bytes, alloc_unit_type_desc, index_level, record_count, page_count, avg_page_space_used_in_percent
 from sys.dm_db_index_physical_stats (db_id(), object_id('dbo.VarcharTestTable'), null, null, 'detailed');

Presenting at SQLRally 2012 Nordic

Yes, it is true! I have been selected to present at one of the biggest European SQL Server events – at the PASS SQLRally 2012 Nordic Edition! No words can describe the excitement and the honor that I am feeling of being able to present there. When I look at the list of the presenters, I cannot stop thinking that someone just made a mistake picking me up, because the agenda looks so amazingly impressive – Connor Cunningham, Mark Souza, Adam Jorgensen, Alberto Ferrari, Marco Russo, Chris Web, Hugo Kornelis, Tobias Ternstrom, Buck Woody, Thomas Kejser, Victor Isakov, Allan Hirt, Stacia Misner, Denny Lee, Lara Rubelke, Neil Hambly, Klaus Aschenbrenner and others.

I am in deep deep awe and while I am still conscious I am trying to send the warnings and some vital signals into my melting brain that I am definitely not belonging into that club. The most of those people are simply living legends of SQL Server, and I will be spending large number of hours preparing myself to try to deliver the best possible presentation of my life. You probably know the phrase that with a great powers comes great responsibility, and that is precisely what I am feeling at the moment. :)

One of the funniest things about the presentation that I am going to deliver at SQLRally is that it will be actually the third time that I am going to deliver it – and every time it will be slightly different, as well as the presentation language will be different. I have presented my session on Dynamic SSIS Frameworks on a regular meeting at SQLPort in July of 2012, and that presentation was delivered in Portuguese language. The second time I shall present this session (with some changes) will be on 15th of March in Munich at SQLSaturday #170 and the language of choice shall be German.
Naturally, for the SQLRally this presentation shall be adjusted (based on the comments and on the time constraints – 75 minutes instead of 60 min for SQLPort and 50 min for SQLSaturday#170), and naturally I shall be presenting it in English.

Join me at SQLRally and feel free to meet and greet me there – because there are no doubts that this event is going to be a very remarkable event in the life of so many, and especially in mine.

Going to and presenting at SQLSaturday #170 Munich

The first SQLSaturday in Germany is just a couple of weeks away and I am so excited about it. Knowing one of the organizers – Constantin Klein, I am confident that this event will be a great success and generally I am genuinely excited that Germany is finally stepping in for one of the best tech community events in the world. Looking at the presenter’s list, it feels like a very serious SQL Server experience – Hugo Kornelis, Allan Mitchel, Dejan Sarka and Jen Stirrup between others are very recognized in the community.

Doing the first SQLSaturday in Germany in Munich is an obvious choice for me, since Microsoft’s Headquarters in Germany are located exactly at the location of the SqlSaturday – at Unterschleißheim, near Munich. There is no need to describe the beauty or the importance of the Munich as a city on a global level. Coming there once in a lifetime is a must for anyone looking for culture or business.

I am very happy and deeply honored to be selected as one of the speaker, and I will be working fiercely to improve my presentation on “ETL Frameworks with Dynamic SSIS Packages” which I have already delivered exactly 2 weeks ago for my local Portuguese PASS Chapter. I will be doing my very best to deliver my presentation in German, so it will take some serious work from my side, since it has been quite some ages, since I talked German in public.

I am looking forward to meet and greet German SQL Server Community and to meet some good old friends like Mark Broadbent, Milos Radivojevic, Jen Stirrup and Hugo Kornelis between others. If you live in Germany, then there are definitely very few excuses to not to be at the single most important SQL Server Community event of the year in Germany.

SQLSaturday#148 part VII

Writing this last post about the SQLSaturday in Shanghai being almost 40 days away is such a rewarding task, mainly because it wakes up so many great memories and emotions about 3 days spent in China. There were so many amazing experiences for me personally, that it has inspired me writing so much on blog. The time has passed very fast, with a speed of a local train between Shanghai and Suzhou, which easily reaches 300 Km/h.

Was it a good time? Just look at the picture to get an idea. I have met people from the local SQL Server Community, a good old friend and made whole lot of new ones. I have learned more stuff about SQLServer while attending English-speaking sessions and even some Chinese-speaking ones. The whole experience was a big plus and I am definitely looking for more. I know that I will see around a lot of the people I met there – at the PASS Summit, at other SQLSaturdays. Hopefully soon, because I have had way too much fun and I am definitely looking to repeat it. :)

I just wish that the Chinese SQL Server community will use the impulse of this event for its own good. Those people in attendance are ready for more, and Chinese SQLPass Chapter in Shanghai can give them what is needed. I am confident that the next time they have an event, a lot more people around the world shall join their forces to support them.

Going to SQLSaturday #162 Cambridge

My good friend Mark Broadbent and his team are organizing the very first official SQLSaturdy in the UK – The SQLSaturday #162. This is going to be a huge and important landmark for the SQL Server community in the United Kingdom, which important can not be underestimated. There were already 10 editions of SQLBits with Saturday offering a free entrance for anyone, but this is a different kind of an animal – this is a SQL Server community event which I expect to set a new high standard on the SQL Server events.

Besides the free SQLSaturday, the SQLCambs team are organizing 3 pre-cons on friday with 3 great specilaists – Jen Stirrup, Buck Woody and Mark Rasmussen. This is going to be a very tough choice for the most SQLPeople for sure. I highly recommend anyone and everyone signing up and attending those pre-cons. I am loooking forward to find out if I can get a day extra to attend one of them. Which one? – I am not sure, since all 3 of them are looking so damn good. :)

I think there must be very few people in Europe or for that matter around the world, who has never heard about Cambridge – their University is constantly being referred as a top-5 University worldwide. Cambridge technology pool is also know for some big corporation that it has produced, and for SQL Server people, it will be enough to refer the name of Red Gate Software.

I feel deeply honored to be selected to present at the SQLSaturday #162 Cambridge. The organization committee has selected my session «Inheriting a database for Developers» and the moment I have received the confirmation started randomly jumping around my chair – with my coworkers probably picking up their phones and calling for the emergency and security. :) haha. I have already presented this session almost 1 year ago at SQLPort, but I am already working for a new version which I expect to have some things in common, but to be a new generation of it, a version 2.0 so to say.
I cant make any promises, besides that I will spend the biggest amount of time I have ever spent preparing a presentation, because I really want it to rock.

I am looking so much forward to this event that I really can’t describe – there are so many great people that are going to be around: Mark Broadbent, Neil Hambly, Jen Stirrup, Laerte Junior, Karla Landrum, Hugo Kornelis, Gavin Payne, Andrè Kamman, Rob Volk, Andy Leonard, and much more!

SQLSaturday#148 part VI

SQLSaturday #148 Speaker party was a dinner, and for those who attended it, it will stay in the memories for a very long time. Looking at the picture – do you see anyone being unhappy or unsatisfied? The dinner was absolutely amazing if you ask me – the food and the company were memorable, and we simply have had fun, enjoying the moment.
I have to say that I feel honored sharing all those moments with each and everyone who were present, and that the connections that were made at this event are to last.

Besides enjoying the company and having an enormous amount of fun, we still have had time to share our impressions on the event, to discuss some things that went right and wrong, and make some expectations on what the future might bring to the Chinese SQL Server community. Everyone wants to see the results of their work, their impact, to see things growing, each and everyone wants to leave their print on whatever they choose to do – and knowing that we did our best to help SQLFamily in China to grow is definitely a big satisfaction for me.

Sure thing, nobody knows how things will look in the future, but there was a strong spark of the community which should help community to carry on, reaching for better and bigger results. :)

On the next day, Tiffena was kind to take Mark Stacey, Matt Horn and myself for a trip to Suzhou for a Humble Administrator’s Garden visit. Besides being an amazing place by itself, this short trip (around 110km) provided me with a better understanding & insight into the Chinese society normal life. I have had experienced different types of transportation – subway, local train, bus, catching a cab on the train station (easy), catching a cab near the Gardens (impossible). :)

This was an important experience for me, because it was an outlook on a real life, that people in China are living.

There are actually very few words that can describe the beauty of the gardens – you gotta just walk around, get lost in them to be able to feel them. I enjoy greatly staying in any garden – being surrounded by the nature is a big pleasure, and I certainly believe that the gardens are created for relaxing and viewing pleasure – so if you are inviting me for a walk in Garden, most certainly I will accept it. The Humble Administrator’s Garden had a lot of places to view and a lot of opportunities to sit down and have a talk – which we certainly did a number of times through the day.

The main problem I have had during that day was the exhaustion and the temperature combined with a humidity – it was absolutely devastating. I had to stop, sit down and drink a lot of water in order to be able to carry on. Especially the humidity was something which I have found out was quite difficult to support. The beauty of the gardens and the company were really compensating – I have had so many great conversations that I am constantly coming back to, though it is being almost a month that since this trip has passed. The positive things were in such a superior quantity, that they are what I cherish the most from this trip.

This is a photo was taken near the gardens and when I look at it – it takes me back into a different century, into a different time and space where people were living in a much simpler way. I wish I had an experience of spending a week exploring this beautiful places like Bill Graziano did. I am sure that one day there will be an opportunity to come back with my family and explore chinese cultural heritage. :)

(to be finished)

SQLSaturday#148 part V

I have already written IV parts about my SQLSaturday experience in Shanghai, but this will be the first which will be dedicated to the event itself. Warning – this post is an honest point of view, and while you might not agree with it at all, it represents what I think and feel about the event itself. Do not forget that it takes so much courage and personal time to organize any event, and all organizers of the SQLSaturday #148 Shanghai deserve a lot of respect for all the hard work that was done. My most sincere admiration goes to Tiffena, who was the driving force behind this event, while being thousands of miles away in Seattle.

The schedule of the event was very nice for the first event of this kind in China – with PASS President Bill Graziano opening the “sql hostilities”, and then bringing some serious BI firepower with Pragmatic Works South Africa founder Mark Stacey. Han Wong then delivered the first chinese speaking session of the day, and then it was my time to carry one the mission right into the lunch break. After the lunch and some very interesting conversations with some of the attendees, we had another Pragmatic Works SA guy, Matt Horn delivering a very interesting session on the SSAS manipulation with a SQL Agent. The finishing touches for the event were given by Wee Hyong Tok and Miguel Granados, who were presenting on SSIS and DQS respectively.

Every great event has some things to improve, and this one is no exception – first of all, I felt that the right balance between the local and the foreign speakers was not achieved. I always try to balance the events I am organizing by giving around 50% of the time to the local speakers – this gives the opportunity for them to grow their CV into a bigger international events. Also, the event felt like a simple mixtures of sessions and not a cohesive event which were telling a story to the attendees: BI was dominating the event, but there were 2 T-SQL sessions (Bill’s and mine) and there was also some Azure with a Data Sync Introduction. I would like to repeat once again though that for the first SQLSaturday, organized by the people who has never been to a one, it was a very solid effort.

The expected number of attendees as we were told by Tony was around 70 (exact number of the registrations), which was quite surprising as for example in Portugal we have a rule of 2/3 for any big events, which means that only around 70% of the people will appear. The final count for me was 48, but I admit that I might have missed a couple of people. Also to note is that the knowledge and the experience level of the attendees was far from what was expected – the crowd was most definitely not a advanced professionals group as we were told a day before.

Unfortunately for Bill Graziano, he had to find out about it in the worst way – while delivering his presentation (having some advanced topics), because of the lack of the feedback he asked how many people actually knew how to read and to us the execution plans of the SQL Server. At this very moment I was extremely close to getting a heart attack, because there were only 3 people in the crowd who responded positively on his question. Since I have had 2 hours before my presentation, I started rewriting it in panic, throwing away all more advanced stuff.

From my personal experience of the event I really enjoyed it a lot, though I had to skip some amount of the presentations firstly because of the changes that I was doing to my presentation and secondly because of the timezone difference. I almost felt asleep at one of the sessions, and so asked Tiffena and Bill Graziano to accompany me to the nearest place to get some coffee.
I was a kind of fascinated by the Wee Hyong Tok’s presentation on the new functionalities in SSIS because I really enjoyed it a lot, and because I felt like I was really getting what he was saying in Chinese. Maybe it is because I already tried the new SSIS in SQL Server 2012, but I am really confident to say that it is definitely has to do with his skills as a presenter. To get an attention of somebody living half across the globe, while speaking a language which has nothing in common – this takes a lot of skill to do. :)

My own presentation went quite well, and I felt that I have managed to get the people interacting, which is always important when having a language barrier. The question about the EXEC WITH RESULT SETS function, asked by Peng, really brought me to investigate better the new functionality and I am definitely coming with a blog post on this matter in the next weeks. I am definitely looking forward to coming back and presenting once again in China in the future.

By the end of the event, all of the international speakers were really ready to take some rest. Actually the event ended sooner as planned because one of the local speakers did not show up, but the timezone difference, the hot temperature and all the excitement were definitely getting the best of us. We were ready to get some rest before the speaker party, organized by Tony & co.

Lessons learned :
1. To be more flexible and to be ready to change your presentation on the fly if you see that the attendees are simply not ready for the material. Naturally it depend on the situation, and if you have just a small part of the attendees that have no knowledge to keep up, then give a small introduction in order to bring them a little bit up to speed.
2. If your are presenting in a non-English speaking country, then adjust .. your .. presentations .. speed. The people might have difficulties in understanding some of the words and terms or they might be having difficulties in translating your accent.
I think that I have got a lot of interaction with the crowd, because I managed to speak slowly and adjusted my presentation to a simpler level.

Update on 17th of July, 2012: Tiffena sent me a message saying that actually after contacting a number of the attendees after the event, she found out that they have considered the presentations to be quite simple and easy. One more lesson for me to learn: in a multicultural environment you never know until you know.

(to be continued)