Move along... No dacpac to see here!!!

By robert at April 26, 2010 22:19
Filed Under: SQL Server 2008 R2, Data Tier Applications, VS2010

So before I will show an alternative way to achieve application lifecycle management let's get a shortlist of what tools you need to even begin to use the Data-tier Applications. You must have:

  • SQL Server 2008 R2
  • Visual Studio 2010 Ultimate or Premium (Professional has limited power here)

So, by just using SQL Server 2008 R2 and Visual Studio 2010 Ultimate can I achieve the same as this awesome thing called a DACPAC? Good news, you can and I will show you how.

I've created the same database as in the first post. Now I will fire up VS2010 and choose a SQL Server 2008 Database project template. It looks the same as the Dacpac template but there are some important differences which we will notice later. First off you will have to import the schema either by using a script or connecting to the database and import it from there.

First off is to add a CustomerAddress column. This is the same as with the Dacpac. When that's done make life easy for the sake of the demo and right click the project and choose Properties. Select the Deploy tab and put in the right connection string and live database name. This is just for the sake of not typing all the stuff in the cmdbox. But if you are in a life environment get some guidance here.

Then start off a VS2010 command line to be able to use VSDBCMD which is a command line tool to deploy databases. And if you were thinking that a DBA does not have VS2010 check out this page to see how that's dealt with. Navigate to the debug folder of your project where all the output will be created. Notice the files in the screenshot below. Some of theme you will see in the Data-tier as well. Now run VSDBCMD /a:Deploy /dd+ /manifest:[Manifestfile]. This will deploy your database.

Now that was easy, nobody was hurt and no DACPACs were killed using this procedure. So let's see how VS2010 and this SQL Server 2008 Project does scenario 2 where we add a column with a default constraint.

  1. Add the OrderDate column as DATE data type with NOT NULL to the .sql script for the Orders table. Notice how this is trimmed down from the Data-tier version. There are no constraints here. So where are they? Well, where they belong, at the constraints folder. Nice to see everything nice and tidy.
  2. So navigate to the Constraint folder, right click and choose Add Item. From the next screen choose Default Constraint. Yes, it's there. It was not with the Data-tier Application, but it is supported on this SQL Server 2008 Project.
  3. So, add the constraint and click Build followed by Deploy.
  4. Fire up the command line again and run the same command as before and presto. Database updated.

And you might want to take notice of the next points as well:

  • No database was copied, dropped or created.
  • No data was copied unless absolutely necessary.
  • You can specify file groups and files within the database project if you like.

What's the catch? Nothing, this is all available in Visual Studio Premium and Ultimate as with the Data-tier applications. But wait, it gets better. This stuff works on older SQL versions as well. And it can do more, like Schema compare and data compare. It also implements with your build process and can be included in web deployment.

So we have a great tool (which we already had as Add-in for Visual Studio 2008) that does amazing things with a database. This tool has been around for somewhat two years now and is getting better with every release. And here comes SQL 2008 R2 to the rescue with the Data-tier application. It does less, does it worse and just creates a new database in essence. In my opinion it has the words EPIC FAIL written all over it. Maybe it's time Microsoft took a good look inside its own organization and starts normalizing. There is some redundant data there (remember LINQ2SQL and the Entity Framework).

The way the Data-tier application is implemented at this point I will not use it. If in a future release it will be extended and there will be more (live) migration possibilities with Multi Server Management it will be a whole different story.

Warning! DacPacs ahead!!!

By robert at April 24, 2010 16:36
Filed Under:

Last week I attended the European PASS Conference in Neuss and I delivered a session on the new DBA features in SQL 2008 R2. One of these new features is the Data Tier Application. The more I'm looking into the new features the more they seem useless to me. In this post I want to take you through some common scenarios and see how the dacpac handles these. In my next post I will show you an alternative way.

Scenario 1 – Add a simple column to the database

As a start I will take a simple database with a Customers table and an Orders table.

So first you will have to register this database as a Data Tier Application. Open up a SQL Management Studio and navigate to the database, right click and choose Tasks -> Register Data-tier Application. This will guide you through a simple wizard and register the database as a Data Tier Application.

Now the users comes in and wants to have a column in the Customer table so he can saves the Address if a customer supplies one. The address column will not be mandatory. So again, right click the database, choose Tasks -> Extract Data-tier application and go through the wizard. After the wizard has finished a dacpac will be created in the chosen folder. Remember this folder.

We will hand over the dacpac to the developer so he can import it into his Visual Studio Environment and add the desired column to the table. He will then change the build number and build the solution. This will create a new dacpac which he in turn can give to the DBA who can upgrade the existing database.

So, pretend to be a developer and open up Visual Studio 2010. Start a new project and choose the SQL Server Data-tier Application template. When the project is created you can right-click the project and choose Import Data-tier Application. This will fire up a wizard. Point the wizard to the earlier dacpac and import it. When the import is successful you can navigate to the Customer table and add the column as illustrated in the screenshot below.

The next step is to navigate to the project properties and raise the buildnumber, add a description if desired and build the solution.

Once the solution is build a new dacpac is created which the DBA can use to upgrade the database. To upgrade it, select the Management folder in SQL Server Management Studio and expand Data-tier Applications. Right click your application and choose Upgrade Data-tier Application. Step through the wizard and watch the result closely.

So, what just happened?

  1. SQL Created a new database.
  2. SQL created the tables.
  3. SQL disabled the constraints
  4. SQL changed the real database to SINGLE_USER
  5. SQL changed the real database to READ_ONLY
  6. SQL copied all the data from the real database to the copy
  7. SQL renamed the real database.
  8. SQL renamed the copy to reflect the old real name.

So, instead of an upgraded database, you now have two databases, old and new. Which is cool for back-out scenario's but not so cool if you have a large database. First it will take forever to load the data, second you end up with twice the data. But wait it gets better.

Check out the file properties of the old one and the new one. To illustrate it better I will show you a screenshot of what happened at my system.

So, what do you notice? My database files changed. In the original database I specified a growth in MB for the logfiles which is changed to a percentage growth. The explanation is easy. As you might recall a new database was created. And as most new databases the model database is used as template. Same case here. So you lose all your carefully planned growths and predefined sizes. Cool, imagine upgrading your 1TB database and see SQL move over the data when the logfile is set to 1MB en 10% growth. If it's done you can send your VLF count to Kimberly Tripp or Paul Randall for trying to achieve the most VLF's in a log file. And if you we're wondering how about multiple files in the primary filegroup: Gone!!! Multiple filegroups: not supported by Data-tier Applications.

Scenario 2 – Add a column with a default value

Next scenario is adding a column to the Orders table that will hold the OrderDate. If an order date is not supplied a default value of 01/01/1970 must be used. So the developer will continue on his solution and add the column and then the constraint. But something's going on here. You cannot add the constraint in the constraint folder in the project, it simply does not exist. So, why not add it to the .sql like the primary key constraint.

This will result in an error: SQL02014: A Default constraint can exist only at the column level in an ALTER TABLE statement.

So, last alternative is to add a ALTER TABLE statement in the post deployment scripts like this.

But if you upgrade the database the constraint is not added. In fact the post deployment script is never executed and if you would take the effort to deploy the dacpac from within Visual Studio you will be presented with an error saying the Orders object is not available. This is because the post deployment is executed in the MASTER database context.

So to recap, a Data-tier application is just another way to script out a database but in a bad way. If you just script out the database using the SQL Management Studio filegroups will be scripted as wel as multiple datafiles. And I haven't bothered to touch a different collation setting but in this whole experience it wouldn't surprise me if the collation setting was reset to the instance collation.

But do not be afraid, in my next post I will show you the proper way using the same tools you use for the Data-tier Application to deploy and upgrade databases.

Microsoft and CEC, do I care?

By robert at April 07, 2010 10:43
Filed Under: Microsoft, SQL Server, SharePoint, CEC

I was preparing a post on building reports with Report Builder against the SCOM data warehouse and why you would not use the reports in SCOM itself. But then I got distracted by a post on the Microsoft Common Engineering Criteria (CEC). While the home page, (http://www.microsoft.com/cec/en/us/default.aspx was promising "The goal of the CEC program is to reduce the overall total cost of ownership (TCO) through improved integration, manageability, security, reliability, and other critical infrastructure attributes that are expected by our customers." I was still a bit skeptic.

On the same page there is a scorecard link showing how each server product scores on the CEC. This image shows a part of the scorecard for SQL Server 2008

So for all the server products a scorecard is listed. There is however something odd going on. E.g. SQL Server 2005 and SQL Server 2008 are on the list but SharePoint 2007 is not. SharePoint 2010 is on the list but this isn't generally available yet. Same goes for Exchange, 2007 and 2010 are both in the list. Apparently SharePoint 2010 supports hot-add CPU and SQL does not. I was not aware of that. Also SharePoint complies with the Data Protection compliancy. If you read the description of this compliancy it says "Restores the application and operating system, including quick recovery". So if I wanted to restore a site collection, or a list within a site collection, or a document within a list, this would all be possible not to mention quick. Well, some would be of course, and depending on your infrastructure it would even be quick. This brings whole new insights to SharePoint for me. According to these scorecards SharePoint 2010 scores great, which I believe it does. But it would have been fair to put SharePoint 2007 on it as well. Just to see the improvement that was made with the next version.

But…. Something is missing. I don't see a compliancy "Follows best practices on SQL Server (or any other Microsoft product for that matter)". In my daily work I deal with a lot of Microsoft server products and I've been a DBA for quite some time now. I have some developer skills and I know about some of the best practices. I will always make sure my application works with the least privilege principle. And when I need to give permissions to objects I grant these permissions to a database role and map a login to the role.

So, SharePoint you ask. It works great with SQL, does it not?

  • According to the manual some SharePoint (2007) account needs to be DBCREATOR and SECURITYADMIN. No, it does not. There is even documentation on this but it's separate from the main installation documentation.
  • If you pre-create the databases and make an AD-group with your SharePoint admins DB_OWNER, installation fails. Has to do with the script which creates some objects specific in the DBO schema and some in no schema. Which in turn means that SQL will create a schema for the user (because you made an AD-group DB_OWNER and not a single user) and create the objects in that schema. And then SharePoint cannot find the objects because there not in the DBO schema. But in all fairness, the documentation states clearly to give the Setup User account DB_OWNER permissions and not the AD group the Setup User account is in. So, that's one on me trying to be smarter than SharePoint.
  • If you are en experienced DBA you might consider creating separate file groups. Not with SharePoint. Some SharePoint databases will explicitly create objects in the PRIMARY file group.

But you cannot just blame SharePoint alone. Have you ever worked with SCOM 2007, or tried to install Reporting Services for SCOM integrated with SharePoint. I love to see Microsoft products work well together. But the best I've come across is OCS 2007 R2, which is a big improvement on OCS 2007. This Microsoft product uses another Microsoft product, SQL Server. Famous for scalability, consolidation etc. etc. And this product (OCS) needs to be installed on a SQL Server with no other databases on it. Just alone on its own instance. And that's new too. There was a time you could not place a second instance on the same machine. I bet you cannot name the databases yourself either and the service account needs to be in the local administrators group and on SQL it needs the SYSADMIN, SECURITYADMIN and DBCREATOR server roles. Great.

It is completely beyond me why it seems impossible to follow some best practices Microsoft puts on its own site to be followed by its own products. And wait, SharePoint 2010 is way cooler because there are index tuning and maintenance timer jobs in SharePoint. So, behold DBA man, I'm SharePoint man and I will take care of those indexes for you.

Enough for now. There is good stuff to, don't let that be misunderstood. And SQL is a complex product as is developing a good database application as big as SharePoint. My next post will be about reporting in SCOM and why not use the built-in Reports. That's just me trying to change the Microsoft world one bit at a time.

Data Tier Apps – What they can and cannot do...

By robert at April 01, 2010 22:50
Filed Under: SQL Server 2008 R2, Data Tier Applications

One of the new features in the upcoming SQL Server 2008 R2 release is the Data Tier Application. The Data Tier Application is a single unit (dacpac) which contains all objects for the database. You can deploy it to a R2 server (for now) or pass it to a developer to have him alter it in Visual Studio 2010 en pass it back to upgrade the database.

So, that's all cool but there are some caveats. First, creating a dacpac. If you create (extract) one from an existing database it cannot contain all SQL Server objects. Actually refer to the list in Books Online with what can be there http://msdn.microsoft.com/en-us/library/ee210549(v=SQL.105).aspx. But in short, Extended Stored Procedures will not work as well as CLR functions.

Another thing to notice is what happens when you upgrade a Data Tier Application. Check out this screenshot. It's the final page after you successfully upgrade a Data Tier Application.

What is important to notice is that your database is not actually upgraded but a second database is created with all the objects, following the data is copied. When that's all done, the databases are switched and made available to the users. So, seems logical but what if you have a 1TB database. When you have it registered as a Data Tier Application and are upgrading it as such you will end up with two 1TB databases. Not a big problem per se, just something to remember.

Another thing to remember is what if you were to change the database. Say, for example, a VARCHAR column has a length that's too short and the developer is on a holiday. As an experienced DBA you corrected the developer's error but what if you upgrade the Data Tier Application. The alteration will be detected and you will not be able to upgrade unless you accept the possible loss of data.

There is a lot more to tell about this feature. The best way is to go through this excellent document on the Microsoft website. It explains all the (im)possibilities and has some great examples on using PowerShell in combination with Data Tier Applications.

Count # of words in a string

By robert at March 17, 2010 16:55
Filed Under: T-SQL, CLR, SqlZaterdag

Two weeks ago I was at the first Dutch SqlZaterdag which is the Dutch version of SQLBits or SqlSaturday. It was a great day and almost 200 people showed up. I’m sure there will be another SqlZaterdag.

This post is a reply to the session from @andrekamman and Henk van der Valk who has put the presentation online. In their session there was a question: What is the fastest way to count the words in a string? The correct answer according to Andre and Henk was:

select LEN(@s) - LEN(replace(@s, ' ', ''))+1;

Which indeed does the trick. If @s is “To be or not to be, that is the question.” it will return 10. But what if someone didn’t put a space after the colon or put two spaces. A typo is easily made. To test it I used this script:

declare @start datetime2 = SYSDATETIME();
declare @end datetime2;
declare @s varchar(100);
declare @s1 varchar(100);
declare @s2 varchar(100);

set @s = 'To be or not to be, that is the question.';
set @s1 = 'To be or not te be,that is the question.';
set @s2 = 'To be or not to be, that  is the question.';

select LEN(@s) - LEN(replace(@s, ' ', ''))+1;
select LEN(@s1) - LEN(replace(@s1, ' ', ''))+1;
select LEN(@s2) - LEN(replace(@s2, ' ', ''))+1;
set @end = SYSDATETIME();
select @start;
select @end;
select DATEDIFF(MICROSECOND, @start, @end);

The first time it takes a wile to run but the second time it will run much faster. The problem is that this will return three different outcomes, 10,9 and 11. So, yes it maybe the fastest way to count words in a string but not the most precise. But then again, if that’s not what you’re after, this is it.

If you would ask a developer for a solution he/she will say Regular Expressions. So, how do we get these in T-SQL. Yes, by using a CLR function.

Fire up Visual Studio 2008 and create a new Database – SQL Server Project. Give it a name, in my case I named it CountWords. Select your project and choose Add – User-Defined Function

image

Name the function RegExCount and replace the contents with this code.

using System;
using System.Text;
using System.Text.RegularExpressions;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

public partial class UserDefinedFunctions
{
    [Microsoft.SqlServer.Server.SqlFunction]
    public static SqlInt32 RegExCount(string pattern, string message)
    {
        Regex r = new Regex(pattern, RegexOptions.IgnoreCase);
        int i = r.Matches(message).Count;
        return new SqlInt32(i);
    }
};

This CLR function will accept a pattern and a search string. It will then count the occurrences of the pattern in the string and return it. Great thing about Visual Studio is that you can deploy this function directly on your SQL Server. So click Build and then Deploy to get the DLL inside your database. Or do it the proper way, build it and use CREATE ASSEMBLY to add it to a database.

Now run the following script:

declare @start datetime2 = SYSDATETIME();
declare @end datetime2;
declare @s varchar(100);
declare @s1 varchar(100);
declare @s2 varchar(100);

set @s = 'To be or not to be, that is the question.';
set @s1 = 'To be or not te be,that is the question.';
set @s2 = 'To be or not to be, that  is the question.';

select dbo.RegExCount('(\s+)|(\w,\w)|(\w\.\w)', @s)+1;
select dbo.RegExCount('(\s+)|(\w,\w)|(\w\.\w)', @s1)+1;
select dbo.RegExCount('(\s+)|(\w,\w)|(\w\.\w)', @s2)+1;
set @end = SYSDATETIME();
select @start;
select @end;
select DATEDIFF(MICROSECOND, @start, @end);

 

We feed the CLR-function with a RegEx Pattern:

  • \s+: Matches one or more whitespaces.
  • \w,\w: Matches a character followed by a colon followed by a character.
  • \w\.\w: Matches a character followed by a period followed by a character.

 

The result will be 10 for all three strings and although the first time it will take some time (the DLL needs to be compiled), subsequent runs are faster. So is this the fastest way? Maybe not but it is the more accurate one. And the pattern can be extended to catch more typo’s.

But still, SqlZaterdag was great!!!