SQL Server security best practice

Security! This is the word comes in mind of every concerned person when it come...

Change the Collation Settings in MS SQL Server

This post will show you how to change the collation settings in MS SQL Server for specific database...

Resolve collation conflict

In this post I will show you how you can resolve collation conflict error...

Book: SQL Server 2008 High Availability

In this book I have tried to cover every single piece of information that might requires for installing and configuring SQL Server HA option like Clustering, Replication, Log Shipping and Database Mirroring...

Why to recompile Stored Procedure

Generally, we create views and stored procedures (proc here after) ...

12/18/2009

Common Mistakes in SQL Server - Part 4


Last week we discussed advantages of recompilation in Common Mistakes in SQL Server – Part 3. This week I will discuss the Null Value and how it can cause an issue!!


What is NULL? 


Null is the value which indicates the value is unknown. Null is different than zero or an empty string, as a zero or empty string indicates that the value is known (refer http://en.wikipedia.org/wiki/Null_%28SQL%29 for further reading.)


Why should we avoid NULL values?


In one of my previous projects I was asked to perform some tests and do analysis on a specific database which had a tremendous growth rate, and it seemed that either the estimation related to growth was wrong, or the application inserts data more often than is expected. The database growing more than 300% in just 2 years wasn’t a good sign - something was really wrong. After I finished the analysis I found that there was a single table which had significantly contributed to the database size - that table itself was 450+ gigs in size; it had 217 columns and more than 50 million records, most of the columns allowing null values, which is where the problem was.


There is a misconception that if we have the NULL values in a table it doesn't occupy storage space. The fact is, a NULL value occupies space – 2 bytes. If we do not consider this point at the estimation stage we will have trouble with the growth of the database. Sometimes, however, application design demands some columns to be there for future use, and in this case, there will be NULL values.


Let me give you an example – create a table “tblEmployee” having thirteen columns of fixed length data type. I will add 150000 records and check the size of table. Remember, we are not creating a primary key/clustered index - thus this will be the Heap.


1. Specify the number of rows that will be present in the table
    Num_Rows = 150000


2. Specify the number of fixed length and variable length columns:
    Num_cols = 13
    Fixed_data_size= 4+15+15+4+25+25+10+10+3+4+4+35+35 = 189
    Num_Variable_Cols = 0
    Max_Var_Size = 0


3. Part of the row, known as the null bitmap, is reserved to manage column nullability:
    Null_Bitmap=2 + ((13+7)/8) = 4.5 = 4


4. Calculating the variable-length data size:
    Variable_Data_size: 0


5. Calculate the total row size:
    Row_size = Fixed_data_size + variable_data_size + Null_bitmap + 4
    Row_size = 189 +0 + 4 + 4
    Row_size = 197


6. Calculate the number of rows per pages (8096 free bytes per page)
    Rows_per_page = 8096 / (122+2) = 40.27 = 40


7. Calculate the number of pages required to store all the rows
    Num_Pages = Num_rows / Rows_per_page
    Num_Pages = 150000 / 40 = 3750


8. Calculate the amount of space required to store the data in the heap (8192 total bytes per page)
    Heap Size (bytes) = 8192 * 3750 = 30720000 Bytes
  
The size of the table tblEmployee is 30776KB which means storing 150,000 records would occupy 30 MB. Now, lets calculate this for 50 million records - you will get 9990 MB ((50000000/150000)*30). Remember we have calculated this for 13 columns only; if your table has more columns the size could be bigger than this. Download the demo script here


The above example shows how your estimation will be wrong if you do not consider NULL values while estimating database size. To estimate your table size correctly, you must consider this point, and you should try to narrow down the possibility of having NULL values in your tables.


Resources:

12/11/2009

Common Mistakes in SQL Server - Part 3


Last week in Common Mistakes in SQL Server - Part2 we discussed the advantages and disadvantages of using Identity columns. In this article I would like to shed some light on the advantages of recompile , and in what cases you should attempt recompilation!
Generally, we create views and stored procedures (“proc” here after) to avoid the re-write  T-SQL  which is called often by an application or on ad-hoc basis. And, there is strong reasoning to do so: both -  views and procs provides am edge over executing T-SQL as they can reuse cached plan stored in the system table syscacheobjects ( you can query it using DMV sys.syscacheobjects). 

We are aware that once we have created a view and/or proc and call it for the first time it gets compiled and executed(and the execution plan is stored in the syscacheobjects system table).  Then, every time we call this view or proc, it uses the same cached execution plan. By doing this, SQL Server Engine is reducing the overhead of recompiling the same set of T-SQL.


What exactly is an Execution Plan?
There are number of connections that can be made to the database server,  from one or  more different locations, and each of them are querying the database server to return some data. The specific data structure is called an execution plan (i.e. select * from Product where ProductSubCategoryId = 4).


What the SQL Server engine does here is - if the T-SQL came from a view or proc, it searches for the cached execution plan for the same T-SQL statement.  The SQL Server Engine will reuse the plan if it  is found ; otherwise, it will compile and execute the statement.  For example  we have to query Product table where SubCategoryId is 4 and 11 and this query is being called frequently( 500 times a day). If you query the syscacheobjects system table, or query DMV sys.syscacheobjects, you will notice that the Usecount value is not incremented.

Now, if you were to, create a view and execute it 5 times, and then query DMV sys.syscachedobjects again,  you will notice that the Usecount value has been incremented. This is because the SQL Server engine has used the cached execution plan.

Why should we recompile the proc or view?

There are certain events which invalidate the cached execution plan,  and it is advisable to recompile*  a proc or view if one of these events occur:

·         Schema changes / alter table
·         Alter index or drop index
·         Calling sp_recompile or using the  ‘with recompile’ or option (recompile)when executing the T-SQL
·         Large insertion or deletion of data in the tables
·         If you call DML and DDL in single set of T-SQL
·         Change the value of SET option

*Note: Recompilation is an expensive operation,  but it will help to avoid use of old/invalid plans.


Let us query the Product table in the AdventureWorks database; note that it uses the clustered Index Scan (PK_Product_ProductId)



Now we will create a non-clustered index on column ProductSubCategoryId in Product table.



Now we will create a procedure that uses the same select statements that we have used earlier (CMPart3_Proc1)


As we have the non-clustered index on the column ProductSubCategoryId, the relational database engine should use the non-clustered index. Let’s check if it does it by executing the query



Let's create a proc here, and execute it. If you notice, the database engine still uses the same execution plan even though we have  a newly created index on the ProductSubCategoryId.





We will free the procedure cache now and recompile the proc so that relational engine can create a new execution plan and use a newly created non-clustered index


You will note that as soon as we execute DBCC Freeproccache & sp_recompile ‘usp_ProdSubCatId’  Relational engine has created a new plan and is now using the correct index (in our case idx_ProdSubCatId).

That is why it is advisable to recompile proc or view as I've described above. You may download the demo script here.

12/03/2009

Common Mistakes in SQL Server - Part 2

Last week we discussed about whether to choose variable or fixed length data type and when to use when to use a Unicode data type like nchar or nvarchar in Common Mistakes in SQL Server – Part 1.
Let’s discuss about the 2nd common mistake I have observed - using Identity column as primary key.

Identity is a property which can be defined on an integer, decimal, numeric, small integer, big integer or tiny integer data type, an identity column is a column which - generally is used as primary key.  The  Identity property creates an incremental value for the specified column automatically, which is why it is widely used by developers when they designed the table and a primary key – an identifier column.

How does it work?
The Identity column property has two components: a seed value and an incremental value. A seed is the intiial value  set for that particular column , where as an incremental is added to last inserted value when a new row is created. 

When we set the Identity properties, we have to define the value for first record aka seed value (for example, 0) and the incremental value (for example 1). Once you have defined this property and you insert a record into the table for the first time, it will add 0 for the Identity column.  upon the 2nd time you insert a record it, will have a value 1 (seed of 0 plus the incremental of, 1),  and at 3rd record insertion it would have a value  of 2  (1+1). (Download the demo script here Part2_1 which illustrates the table definition with Identity column)

Advantage of using an Identity column as a Primary Key:
There are couple of advantages using an Identity column as a Primary Key, including:

  • SQL Server itself manages the  Identity values
  • Developers don’t need to write a code for new primary key value because Identity column value is created automatically
  • The size of index will be low as compared to index size on char based columns

Are there any catches?
While there are couple of advantages, there are some disadvantages which makes managing Identity columns a bit harder including:


1. You cannot control Identity values, because this column and its values are controlled and managed by SQL Server. Please note, however, we may use 'set Identity_insert tblName on' to insert an Identity value explicitly.

2. For any reason, if insertion fails, a value for Identity will get created. Then the next time you insert a record it will create gap in numbering.

Let us check this out with an example. Create a table called TestIdentity with three columns: one Identity column, one varchar(10) and  one varchar(5) column. We’ll insert 10 records here, and then 11th record will fail because we’ll try to insert a value which is greater in size then what is defined in table definition. Once it has failed, we’ll again pass the correct values. Please notice the gap created in the Identity value. (Download the demo script here Part2_2)

3. You cannot add an Identity property to an existing column. There are only two options that you have if you need to add an Identity column to a table - drop and recreate the table or add a new column with Identity value (and then drop the existing column, if applicable). (Download the demo script here Part2_3)

4. There has been a lot of said about the disadvantage of Identity columns , especially when it comes to a replicated database., you have to alter your identity property and say “Not for Replication”.






Then, if you have to restore your replicated database for any reason you will have to be very careful, as you will have lost all your replication settings unless you specify that you wish to  keep_replication. For further information on what all we have to aware of when we back up or restore replicated database please refer to this article.

Whenever the Identity has reached its threshold value you will have to reseed the Identity values and adjust Identity range at publisher end. This happens when we restore backup at subscriber end; you will have to find the last inserted Identity value and adjust the Identity range at publisher, using these steps:
a. Execute select ident_current(‘tblName’) for every subscriber you have
b. Note down the highest value
c. Go to your publication server and execute
dbcc checkident(‘tblName’,reseed,value*+1) where value = value found in step b
d. Finally execute sp_adjust
e. Now, go to your publication server and
execute sp_adjustpublisherIdentityrange PublicationName, TblName

Note: above case is for Merge Replication.

Here is a nice article on MSDN on Replicating Identity Columns, and you may also refer Hilary Cotter’s article Identity Crisis (generally known as Identity Crisis).

I personally believe that one should avoid using  the Identity property , especially as a primary key when you have replicated database to over come  the issues you face with Identity columns. Rob Volk has written an interesting observation on Identity and Primary Key. Yes, doing this will require some more efforts on your part to create incremental values for your Primary key; but it really is just a matter of writing a few more line of code. Additionally, there are alternate ways of doing this - for more information you may refer an article here on how to generate auto incremental values in SQL Server. For further reading you can refer 
http://msdn.microsoft.com/en-us/library/aa175244(SQL.80).aspx
 


11/26/2009

Common mistakes in SQL Server - Part-1

Since last 10 years I’ve worked on many projects where in the significant amount of work comes for Performance and Tuning, the first approach is to identify the culprit and quick fix it and the next course of action would be to fix it permanently. In most of the cases what I’ve found is the size of the database grows beyond the expectation (estimation!!), sometimes 300%. Sometimes it has a problem with the data type or index etc. There are many things we should have consider when we design a database and when we develop an application which uses MS SQL Server as backend.



There are some points I think we have to be careful when we design a database or an application go live, this will save our lots of effort in future!!

So, what are the Common Mistakes done in SQL Server?


Well, as I said they are so common that sometimes we just overlooked them and we came to know the significance of these pointers when we got trapped. Based on my experience I’ve pen down 10 points below; this is the part 1 of the series so keep visit this place.

I hope this would have answer some common questions like:


  • What data type to be used?
  • Difference between varchar and nvarchar!! and varchar vs nvarchar




The first thing which I’ve noticed is length of the data type to be used. One have to be very careful here, one should not use fixed length of data type when one is not sure about the length of the data which is being inserted either from website or from an application; if one fails to choose the correct length it will unnecessarily occupy the valuable disk resource. For example when data which is to be inserted is “Address” and we are not sure of the length we should use variable length data type such as varchar and when we are sure of the length of the data type we should use fixed length data type such as char for “Gender”.


There are cases when I’ve seen where Unicode data are not stored but people have used nvarchar data type for their websites and applications . This will give you a pain when database size grows over a period of time and one is trying to figure out what are the areas we’ve to consider!! nvarchar or nchar should be used when application or website requires to store data in Multilingual format. Both, nvarchar and nchar occupies double the space of stored data the reason is, ncahr or nvarchar uses UTF-16 (16 bit per character ) where as char or varchar uses UTF-8 (8 bit per character) .


Let’s see an example, create a table with 2 column (FirstName and LastName) which uses varchar and nvarchar data type respectively to store data. Now, insert record and then use function len() to count number of characters inserted and datalength() to know how much space it has occupied. Download the script


Also, we should be very careful about the trimming the white space this should be taken care either at front end or at back end.

Here, what I am advising is to use nvarchar or nchar whenever you required to add multilingual support and try to limit it to where it is required the most, say a particular column.

As I said earlier this is the part 1 of the series so keep visiting this place!!
Data Types: It requires careful attention when we design a table and choose type of data we are going to store here and what Data Type we are going to have for particular column of a table! And, that is why choosing right data type is most crucial part of database design, because, if one failed to choose the right data type there could be several pitfall he/she would encounter as the database size grows.

11/17/2009

SQL Server Cluster Best Practices

What is the best practices for SQL Server Cluster? What are the steps to be followed when we install SP on SQL Server Cluster? How to quick review SQL Server Cluster? Where I can find SQL Server Cluster checklists!!!  These are the FAQ on SQL Server Cluster, and, here are some links you can refer for the same.
http://technet.microsoft.com/en-us/library/cc785714(WS.10).aspx
http://blogs.msdn.com/jorgepc/archive/2009/01/21/how-to-quickly-check-your-sql-server-cluster-configuration-on-windows-server-2003.aspx
http://www.sql-server-performance.com/articles/clustering/clustering_best_practices_p1.aspx
 

and a list of articles on SQL Server Cluster on my fav http://www.sql-server-performance.com/   for your ready reference http://www.sql-server-performance.com/articles/clustering/index.aspx  
and http://technet.microsoft.com/en-us/library/cc917693.aspx

11/16/2009

Download SQL Server 2008 R2 November CTP

Microsoft has releases the new CTP of the MS SQL Server 2008 R2, here is the download page to Download November CTP of the MS SQL Server 2008 R2 http://www.microsoft.com/sqlserver/2008/en/us/R2Downloads.aspx


Find the information on the System Requirement here >http://msdn.microsoft.com/en-us/library/ms143506(SQL.105).aspx

More information about the SQL Server 2008 November CTP >http://www.microsoft.com/sqlserver/2008/en/us/R2.aspx

11/09/2009

SQL Server High Availability Options Preview

As it is named, all the SQL Servers and databases should be highly available (HA) to cater the business requirement, and, the business application should be up and running all the time. To achieve this Microsoft has introduced SQL Server Cluster which is one of the highly recommended and used solutions for SQL Server high availability which is server specific that means that all the databases on the SQL Server Cluster are available in case of failover. Database Mirroring (DM) and Log Shipping (LS) along with the Replication (Repl.) are the alternate solution for the high availability but all this options are database specific that means that you can set it up for specific database(s).

Here are the brief about each (I'll post detailed article on each in near future):

Replication: As it is named, replicates the data/database to one or more than one location. Replication was first introduced in SQL Server 6 with minor support and was enhanced in SQL Server 7.0; with launch of SQL Server 2000 it has added some more features to it. And, with SQL Server 2005 Microsoft has introduced Peer-to-Peer replication which is capable of doing load balancing.

The Application:  Based on the business requirement i.e. for reporting purpose, data from more than one location needs to be merged and or if wish to have transaction level replication of data.
Component of the Replication:
Publisher:  Is a server which published the data
Distributor: Is a server which is responsible to distribute data to the subscriber(s)
Subscribers: are servers which has the subscription, the distributor sends the data which is being published by the publisher here.
Subscription Type: Subscription has two method, pull and push. With Pull subscription, subscriber connects to the publisher periodically and pulls the data where as with Push subscription publisher connects with the subscribers and push the changes over subscriber.

 There are 4 type of Replication available: 
1.       Snapshot:  Snapshot replication is the very basic and simplest type of replication available. As name implies, it took snapshot of the published data, connect to the subscriber and then applied the snapshot on subscriber.  This is most suitable when data is required for reporting purpose and are not updated on the subscriber end, and high latency is acceptable.

Application:  Where higher latency is allowed and data is used for the reporting purpose only.

2.       Transactional:  In this mode of replication, it is replicating data transaction level by continuous monitoring.  To achieve this Log Reader Agent is being created when Transactional Replication is configured, it monitors every database which is configured for Transactional Replication and then copies the transactions which are marked for replication from T-Log to distribution database. Distribution database then send (either pull/push subscription) it to the subscriber.
       Application:  Where higher latency is not allowed and data is used for the reporting
       purpose, and   sometimes subscriber is allowed to updates.
     
3.       Merge:  Merge replication uses Triggers to track the changes in subscriber to publisher.  And, as it named it merges the records, irrespective of who updates/inserts records.  This is the most complex topology of replication.  As it is a merge replication wherein updates/modifications are allowed from more than one subscriber at a time, it has in-built facility to resolved conflict.  There are some pre-requisites which you need to have before Merge Replication can be configured, here is my article on Merge Replication Tips here
Application:
a.       Multiple Subscribers might update the same data at various times and propagate those changes to the Publisher and to other Subscribers.
b.      It might have a chance of occurring conflict, and if they do, there should be the mechanism to resolve it.
c.       One time modification is done at subscriber end and the same needs to update at publisher and other subscriber.

4.       Peer-to-Peer:  This is the best topology of Replication introduced up till. This has been introduced in with launch of SQL Server 2005. In, Peer-to-Peer replication the participants have the complete schema and all the data modifications (DML operations) are made at all nodes.  Now, you would be thinking, hey what is the catch here, what is new then merge replication? Well, the simplest answer is Peer-to-peer Replication is designed for Load Balancing by spreading out write activity across all available nodes and then based on the sync latency it synchronizes database, thus read and write operations become faster.  And, additionally if any of the server is not available it will continue to perform as it is with another node and write operation for this node is routed to another node.
Application: Best suited for online shopping application

Database Mirroring: Database mirror is the best and cost effective solution for automatic failover. Database mirroring is introduced with launch of SQL Server 2005. A database mirroring replicate the change to secondary server at transaction level and provides two modes: High Performance (Asynchronous) and High Safety (Synchronous). The difference between two is, with High Safety mode it will not send and write any data to the secondary node until it is committed on the primary node whereas with High Performance mode it will ship the transaction and write the transaction on to secondary server no matter it is committed or roll back on the primary server.

Log Shipping:   This is the cost effective solution, as it is named, it ships T-Log backup to the secondary node(s) on regular interval and then restore job performs T-Log restoration at secondary server to keep it updated with the primary server. The disadvantage is, it don’t have automatic failover feature and hence manual intervention for role change would take 2~3 minutes time (I guess this is allowable downtime).


SQL Server Cluster:  This is the best HA feature in SQL Server, which provides automatic failover for Entire Server, in comparison to other options which provides supports to particular database(s). But, this solution is costlier than other 3 options because it requires identical hardware.  

8/11/2009

How to recover from the Transaction Log (t-log) filling up

How to recover from the Transaction Log (t-log) filling up!!!
This is the most frequently asked question in the forums or blogged. Hence, I thought let me write something on this so that this would be helpful to the DBA while they face this issue.

What is Transaction Log (T-Log)?
Why T-Log is full?

Why my T-Log got filled up so fast?
How do I get rid of T-Log filling up so quickly?

What’s the issue all about?
The problem here lies in the statement itself, that’s T-Log is getting filled up or is full!!!!

To get better understanding of the problem statement let me give you some insight into T-Log and significance of it.
As the name says, it logs all the transactions that are executed in SQL Server, be it fully logged or minimal logged (based on the recovery model you’ve chosen). This is what the simplest definition I can give you.

What does T-Log do and why it is so critical?
As I said, it logs all the transactions that are executed in SQL Server. Now you would be thinking that why it is needed!! Let me explain you: when ever and what ever the transaction happened, you always wanted to ensure that the data you entered are in consistent state and whenever you want to see them they are available to you, the utmost requirement.

What role T-Log plays here is, while recording all these transactions it ensures that they are committed so as it will be available and will remain consistent. In the time of failure or non-availability of the database server or severe damaged at the physical failure of server; backup of the T-Log along with the full database backup (off course it is required) would really helps you to recover your database to the point-in-time (I’ll explain you this in my next article). When ever you restart your SQL Server, observer your SQL Server logs carefully you can see some thing like below:

SQL Server is recovering individual transactions, and all incomplete transactions to the database.
This message means, that T-Log writing of all the commited transactions and roll forward all the uncommited transactions.
What else T-Log can do?
T-Log is help full to recover database point-in-time in case of failure, it is also useful to setup High Availability for Server(I’ll explain more in my upcoming articles). That means you can setup Log Shipping and database mirroring with the help of T-Logs.

Now as we know what T-Log is and what is significance of it, let’s come back to the main topic,
Why T-Log is getting full?
The cause which leads to T-Log to fill up, I am penning down some points I’ve recollected, they are:
1. Transactions are not committed
2. Transaction is not open but is active for long time
3. While creating / modifying Index
4. Bulk Operations
5. Auto Growth option of database files
6. Group by and Order by clause will fill up tempdb
7. Un-replicated transactions in T-Logs
8. T-Log backups are not taken for long

How to get rid off this messages (Solutions):

1. Ensure that application is written appropriately, and each transaction is wrapped between begin and commit transaction. This will not only helps to keep your data in consistent form but helps you to reduce the blockings.

2. This are the transactions that are not open but active for long time, means while user is sending some input to the database server his/her system got hung, rebooted or shutdown and network doesn’t send this information appropriately to the database thus, SQL Server understands this transaction is open but not active. This kind of sessions can be killed provided you’ve examine what this process is doing exactly by using DBCC Inputbuffer (SPID), be cautious while killing any process as it would take 4 times longer. * To know more on how Kill command works click here… especially when it rollback transaction

3. This is the most important point; creating or recreating an index would occupy much of the space as it is fully logged operation. When you run DBCC DBREINDEX it will drop existing index and recreate index (will recreate every index of your table if you won’t give name of index) . I would advise that you care fully examine output of DBCC SHOWCONTIG or sys.dm_db_index_physical_stats and based on that fact you can go ahead and defrag your index. Recreate your index only when required.

4. Although, bulk operations are minimally logged but it occupies the space certainly, in case your SQL Server is in full recovery mode (which is highly recommended for point-in-time recovery and for production environment) it will fully logged Bulk Operations.

5. Auto Growth option could be really a horrible if it is not configured properly and your database size is considerably large, lets say few hundred gigs or greater. If you’ve set Auto Growth option set on and they are in percentage and not in fixed size you would observe more spics in disk I/O and decrease in performance, read complete article here.

6. As we all know Group by or Order by will resort data and ordering result set, this will increase the space occupied, try to avoid this operation unless it is necessary. This way you would save up your space and increase in performance.

7. When you have transactional replication configured in your environment the transaction which are marked for replication but not replicated actually are lying in your T-Log and then distribution database which will certainly occupy reasonable amount of space. For any possible reason, see if something is wrong with your Transactional replication and rectify it.

8. This is the best way to keep your T-Log in shape, the more frequent i.e. 15 minutes. Backup Log will empty the T-Log file and thus reducing the size of log file.


Summery: Transaction Log, as name says records every transaction that is executed. When database is in full recovery model or frequent bulk operations happening it is most likely that your T-Log get filled quickly.

The above explanations and probable solution I’ve given came from my 10+ years of IT Industry experience while working on projects and my participations in various SQL Server forums. I do hope that this quick points will help you and save your time.

7/22/2009

Surat User Group DNN Portal upgraded

Great News!!!!

Surat User Group's website which was running on older version of DotNetNuke is just upgraded to newer version!!!

Surat User Group is runing on DotNetNuke CMS v 5.0 since it has been launched officially, and I thought to upgrade it with the new release of DotNetNuke. Since this is the first time I am upgrading DNN Portal I was googling a good reference material for Upgrade Procedure, and I landed on a Mitchel Sellers Blog entry Upgrading from DotNetNuke 4.3.5 (and other 4.x versions) Although this is meant for old version it worked for the upgrade I did.

7/08/2009

Register for SQL Server 2008 R2 CTP

Microsoft is planing to release R2 CTP for SQL Server 2008, get your self register so as you will get notified as it becomes available :)

7/02/2009

TechEd on the road, Surat - Great show

Microsoft has organized TechED India 2009 at Hyderabad in the month of May during 13th to 15th. This was a biggest community event for IT professionals and Developers from all over country, and was big success and User Group leads of various cities have been asked to organized similar events in their cities, Surat was among those 10 cities.

Surat had its first TechEd event on 28th June 2009, and was organized by Surat User Group (IT Pro community) in association with DotNetChaps (Developer community). The event was organized at Umra Police Station auditorium and 150+ participants were registered for the event. You can see the event photographs



We had 80+ attendees inspite of heavy rain in the morning, we've got a mix of students, developers, IT Pro and lecturers from various institutes. We had 5 speakers who delivered power packed sessions on new technlogies, I spoke on ROW/Page compression in SQL Server 2008 and Virtualization 360 , Mrityunjaya Kumar spoke on Silverlight 3.0 and Windows 7 for Developers, Dhaval Faria spoke on What's new in .Net framework 4.0: VS 2010 and Aviraj Ajgekar spoke on Windows Server 2009 RC2.

It was really nice experience for me, all the technical sessions has got 4 out 5 rating and participants suggested to have more events in the future.

Let me say my thanks to our sponsors Microsoft, SQLPASS, UG Support Services, SolidQ Mentors, Sharma Infoway and XCellence-IT for their support, I would also like to thank Bhavin Gajjar | DotNetChaps lead for his support in organizing this event.

6/02/2009

TechEd - TechEd on the Road, Surat - 28th June 2009


Microsoft has organized Tech-ED India 2009 at Hyderabad in the month of May during 13th to 15th. This was a biggest community event for IT professionals and Developers from all over country, this event was big success.
To continue spreading knowledge and educate IT community, Microsoft in association with SQLPASS, Culminis and Ineta has asked User Group Leaders of various cities to organize mini version of Tech-ED “Tech.Ed on the Road” . TechEd on the Road will be held in 9 cities of India during the Month of June and Surat is one of those cities where TechEd on the Road to be organized on 28th June by Surat SQL Server User Group and DotNetChap. It is a great opportunity for Developers, IT professionals, Students to know about trend of IT and upcoming technologies from Microsoft.

We SQL Server User Group and DotNetChap feel immense pleasure to bring you the Mini TechEd - TechEd on the Raod, Surat event for IT Professionals, students and End User has lots to get back from this event in terms of Technical know how! There are mainly two tracks for this event Developer and IT Pro track.

This event is open to all and is free; the only thing required is to get your self registered for the Mini TechEd - TechEd on the Road as we have limited seats available.

To know about the Agenda and Speakers, click here

--- Hemantgiri S. Goswami | Surat SQL Server User Group Lead

5/05/2009

Integration Services (SSIS) in SQL Server Cluster

Often the question being asked is "How can I install SSIS in SQL Cluster" or "How to run DTS/SSIS package in cluster when SSIS is not running"

You can very much run DTS/SSIS package in SQL Server Cluster even when SSIS is not runing. So, what all you need to run DTS/SSIS package on SQL Server Cluster!! All you require is dtexec.exe or dtexecui.exe to execute the DTS/SSIS package. Apart from this, you may use SQL Agent to schedule a job to run DTS Package.

The fact is SQL Server Integration Service(SSIS) is not cluster aware service, hence it won't support fail-over. Although, if you still want to Install/Configure SQL Server Integration Service on SQL Server Cluster here are the reference
Install SQL Server Integration Services(SSIS) on SQL Server Cluster or Configure SSIS for SQL Server Cluster

How to Secure SQL Server - SQL Server Security Criteria

Security! This is the word comes in mind of every concerned person when it comes to store, access and share the data and database or database server. At times when applications are run in the geographically restricted area, there were less chance of its exploitation and with spread of internet and the availability of application over the internet, security of data, databases and database servers has become vital.

Why SQL Server Security has been so much important!!
Well, for every organization or institute or company or government has there data that is vital to them, and not to be accessed by the authorized person or entity only. And that is vital to set some security standards for such a critical data so that you can prevent your data as much as you can with proper security mechanism and set of standards within organization and in your scope for MS SQL Server Security.

I have pen down some pointers/criteria that requires serious consideration when it comes to secure MS SQL Server.

1. Guest User Account: It is always good practice to disable guest user account; this will keep our server robust from skilled user that can make use of guest user account to gain access over the server.

Skilled user always looked into a guest user account to establish a NULL session with which, he/she can compromise server.

2. Public Account: Do not grant permission to Public role, as every single user is a member of this role; hence if you grant any permission to this role it will be available to every user and that will breach security standards.

3. System Administrator Account: We would need to rename and disable System Administrator account as best security practice; to do so we need to create integrated account that has SA permissions then create users and assign them appropriate permission based on the fact who need to access what data!! One of the best practice is also to rename or disable the SA account, it is advisable that we follow this best practice. However, make sure you have one account having equivalent permissions as of SA, also advisable to remove "Built-in\Administrator" group.

4. Application Roles: While answering threads at Microsoft/Non-Microsoft Forums for SQL Server most of the time what I’ve found is developers/users/dba(s) just avoid creating and/or maintaining schemas/application roles. Actually it does lots of helps to us in terms of rework (granting and/or revoking permissions) and easy manageability of sql users. Above all if guest user is not mapped to application role of database, guest user cannot access to database object; we can audit application role’s activity also.

5. Strong Password Mechanism: This is the most vulnerable thing; if our password mechanism is not that strong, one can easily get into our box and steal our data or can do whatever he/she wants to do with; sometimes we may found users with NULL/TRIVIAL passwords. Again, this is much more critical if application is accessed over Internet we are openly inviting thieves!!!

6. Access port: By default SQL Server 2000 listens to TCP port 1433 and UDP port listens to 1434 which is known to everybody; we would need to change it to another port then default and firewall rule should be created accordingly for exceptions. SQL Server 2005 listens to dynamic port*

* Whenever SQL Server 2005 names instance starts it get port automatically, but as always you may change it to static. Refer below http://blogs.msdn.com/sqlserverfaq/archive/2008/06/02/how-to-change-the-dynamic-port-of-the-sql-server-named-instance-to-an-static-port-in-a-sql-server-2005-cluster.aspx

7. NTFS file system: NTFS file system is introduced in early 1993 with launch of Windows NT 3.5, this is the file system that has some good feature that were not available in FAT 16 and FAT 32 file system. File and Folder level security is the key benefit of this file system which also benefited us to keep our SQL Server files secure by assigning appropriate permissions.

8. Updating Server: Microsoft keep releasing service packs (SP) and hot fix time-to-time to keep software more secure, robust and bug free. It is recommended that we should update our box with latest SP(s) and hot-fix on regular basis.

9. Audit: We should enable audit for login failures and warning errors and need to be monitored on daily basis so that if any error, login failure or suspected login attempt will be notified and we could take necessary action based on the fact available in Log, this way we can foresee any probable vulnerability or can avoid it to be happened.

10. Integrated Logins: Using this feature one can assure him/herself that sql box is more secure; integrated/windows authentication* uses domain account to access server, database and database objects. Here, whenever user tries to access sql box his/her account is validated by domain controller first and then permit or denying to access the system without requiring separate login id and password; after this it will check with SQL Server for kind of permission this user(s) has.

The other benefit is one can use encrypted password, various handshake methods like PKI, Kerberos, EAP, SSL Certificates, NAP, LDP and IPSec policy; this will ensure our highly critical data are being sent securely over the network.

* http://msdn.microsoft.com/en-us/library/aa905171(SQL.80).aspx

http://msdn.microsoft.com/en-us/library/aa905172(SQL.80).aspx

http://msdn.microsoft.com/en-us/library/aa905190(SQL.80).aspx

11. Instances: We can create different Instances to isolate development/production environment from each other; and / or isolating users from accessing database which are not meant for them. This can be done by application roles/fixed DB / fixed server roles but using instances we can hide name of the databases from the users this way they don’t even come to know which databases are there on server and who is accessing which database.

12. Service account: Always use least privileged user account to start the server and agent service on the server. Domain account with the local admin privilege is enough to start the services; domain user account for services is required if we would have to work on some special services which requires network access also like replication, log shipping, mirroring, cluster, remote procedure calls, backing up-restoration on or from network and remote data access.

13. Network Library: Don’t installed and allowed network library except those are required.

14. Isolated from IIS: Though we can have both SQL Server and IIS on single machine; it is advisable to keep them on separate machine. The idea is; even if IIS server is compromise our database(s) are secure, the other benefit is if more memory and processer resource is eaten by IIS server SQL box will not suffer.

15. Stored Procedure: We should have to make a practice to wrap DML statements in SP(s) to avoid SQL injection.

16. Monitoring SQL Server and Windows Event Viewer: We shall keep out eye on SQL Error Log and Windows Event Viewer for any suspicious activity on daily basis so that we can take corrective action(s) as soon as it is identified.

17. Encrypting data: We should use WITH ENCRYPTION option to keep our code safe; and for data we could use symmetric encrypted columns in SQL Server 2005 or third party tools like activecrypt for data encryption in SQL Server 2000.

18. Linked server: Prevent access to linked server from those users who don’t need to access by assigning proper privileges.

19. System Stored Procedure(s): System stored procedures (SP) like xp_cmdshell, xp_regread, xp_regwrite needs to be restricted to access. Using this SP(s) one can easily read, modifies or deletes registry information or can manipulate system information.

20. Anti Virus: We should install Antivirus on SQL Box and exclude sql server database files; this way we can be assure from Virus/Trojan/Malware/Spam attacks on our servers that could harm our database files.

Conclusion:
As the time passes Securing data has become the most vital part, and we must agree and honor it. Because this is the information that contains our financial, social, business and historical data; and as a DBA it is our prime responsibility to make sure that this has been taken care and secure enough. These are the key points that I’ve collected so far, if anybody would like to draw my attention to some point that I might have missed out can write back there comments here.

Update: Edited point 3, added a line saying "SA" account should be renamed or disabled according to best practice. 
Updated on 2nd March 2010
photo credit: david_shankbone via photopin cc