Upgrading you’re SQL Environment. Test, test and test.
Recently I was working on a situation where an application performed remarkably slower on SQL 2005 than it did on SQL 2000. Of course this doesn’t make any sense. SQL 2005 is a big improvement over SQL 2000 and the query optimizer is more optimized. Unfortunately this is not always the case. SQL 2005 and beyond are better but sometimes they make decisions different from SQL 2000 causing longer execution times.
The question is, is this due to SQL or is it bad database design. Well, I leave that discussion for another time and will provide you with a link demonstrating a situation where SQL 2005 is slower than SQL 2000. http://blogs.msdn.com/b/sqlserverfaq/archive/2009/04/14/performance-of-a-query-on-computed-column-can-degrade-in-sql-2005-with-the-increase-in-complexity-of-function-in-computed-column.aspx
Now in the case I was working on was a situation where after the upgrade from SQL 2000 to SQL 2005 an update took 12 minutes where it used to take a couple of seconds. The funny thing however was that this behavior was not experienced on the test system. This is an identical system, the database is a copy of the production site and the number of concurrent users was the same.
With a profiler trace the end-user experience was confirmed. The trace showed a stored procedure on production which took 12 minutes. I searched the same stored procedure in the trace file from the test system and it showed 4 seconds. More interesting was the number of reads on both systems. In the test system the stored procedure did around 200,000 reads and on the production system 40,000,000. That’s a whopping 200 more. One of the things you have to ask yourself with performance tuning is, is this execution time normal for this load? Well, 12 minutes for reading 40M records on a virtual machine seems right. At least it comes near the 4 seconds times 200, which is 800 seconds. Around 12 minutes. So, conclusion one: The execution time is correct. The next step is to check where the excessive reads come from.
Normally you would see this in a showplan right away. This stored procedure did around 70 individual updates (all with subselects like ‘where @id in (select …..)’) and a lot of before-insert and after-insert triggers. Even on a full HD monitor you couldn’t make out the plans easily. I decided to take a different approach and use the showplans as a last resort. One of the things you should do as a best practice after upgrading a database is perform an index rebuild and a full statistics scan. I did this on the production site and the performance did not change. I cleared out the procedure cache and tried it again but the results we’re the same. I did the same on the test system with a remarkable result. After rebuilding the index, full statistics scan and clearing the procedure cache the query took 12 minutes on the test system as well. A logical explanation would be that the cached plan on the test system was different than the one on the production system,causing less reads. The customer didn’t know what they tested before so I wasn’t able to reproduce this to see if that resulted in a different showplan. But then again,I’m quite sure it would under certain conditions.
Now I knew it for sure, the excessive reads we’re coming from a missing or wrong index. Using the dmv from Bart Duncan’s blog (http://blogs.msdn.com/b/bartd/archive/2007/07/19/are-you-using-sql-s-missing-index-dmvs.aspx) I created a list of index candidates. Since this wasn’t our application I wanted to investigate some more to make sure the indexes would be beneficial. I contacted the ISV and asked if it would hurt if I ran the query from SSMS. They indicated that this shouldn’t be a problem. So by adding SET STATISTICS IO ON I ran the query. This showed two tables where more than 20M reads we’re done. When I checked the table both of them together did not include more than 100K rows. So, 20M reads on 100K is a lot, more than needed. From the missing index query I took the indexes that involved these two tables and placed these. I ran the stored procedure and it got from 12 minutes to 3 seconds. That seemed fair for now. After two full tests with users no performance degradation was noticed and I could call it a day.
Moral of the story is that making your application run on SQL 2005 or 2008 is more than checking the upgrade advisor. SQL 2005/2008 do have a newer/better/smarter but different processing engine. The differences could be minimal, but a difference of 1 second could have a huge impact if it involved thousands of users. Do not just test of your queries work but also test if the number of reads are at least equal, compare showplans if possible.