Thursday, April 19, 2012

Bulk Insert data to specific columns


BULK INSERT will fail if we try to load data into only some of the columns of a table. For example, we have a Candidate table that stores the list of candidates and has an identity column, CandidateID:

CREATE TABLE Candidate
(CandidateID INT IDENTITY(1,1) PRIMARY KEY,
 CandidateName VARCHAR(60),
 DOB DATETIME
)
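
For illustration, assume F:\Candidates.csv holds rows like these (hypothetical sample data; note there is no CandidateID value in the file):

John Smith,1985-03-12
Jane Doe,1990-07-04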

Suppose our source file has only the columns "CandidateName" and "DOB", as in the sample above. If we BULK INSERT from the source file into the Candidate table like this,

BULK INSERT Candidate
FROM 'F:\Candidates.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n')

we end up with an error:

Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 1 (CandidateID).

An easy way to avoid this error is to create a view on the destination table that exposes only the columns we can supply as input, and then BULK INSERT into that view.

CREATE VIEW vw_Candidate
AS
SELECT CandidateName, DOB
FROM Candidate

BULK INSERT vw_Candidate
FROM 'F:\Candidates.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n')

Now, the BULK INSERT is successful and our base table has the data.


Thanks!

Wednesday, April 4, 2012

Adding dynamic connection in SSIS


I have seen many posts about "Adding dynamic connection in SSIS" in various SQL Server forums. The question usually goes like this:

Hi
What's the best way to edit the connection of a slowly changing dimension? I want to point to a new connection.
Cheers

The best way is to use an XML configuration file to change the connection dynamically. This method saves the SSIS configuration in an XML file, which we can later edit programmatically (from an application) or manually. The steps are as follows:
·         Right-click the package designer surface and choose "Package Configurations"; this opens the Package Configurations Organizer.
·         Click Add; this opens a wizard. Click Next.
·         In the "Select Configuration Type" window, choose "XML configuration file" as the configuration type. Specify a location to save the configuration file under "Configuration file name" and click Next.
·         In the "Select Properties to Export" window, expand "Connection Managers" and choose ServerName (and InitialCatalog, only if you need to edit the database as well) for both the source and destination connections, then click Next.
·         Specify a name for this configuration and click Finish to complete the wizard.


Check the XML configuration file created above; you can customize the connection by editing this config file.
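
For reference, the generated dtsConfig is plain XML, roughly like the trimmed sketch below (the connection manager name "SourceDB" and the server value are hypothetical). Changing the text inside ConfiguredValue repoints the connection the next time the package loads the configuration.

<?xml version="1.0"?>
<DTSConfiguration>
  <Configuration ConfiguredType="Property"
                 Path="\Package.Connections[SourceDB].Properties[ServerName]"
                 ValueType="String">
    <ConfiguredValue>NEWSERVER\INST1</ConfiguredValue>
  </Configuration>
</DTSConfiguration>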

Tuesday, May 3, 2011

Auditing Password Changes in SQL Server 2005


In this article I will demonstrate how to track SQL login password changes on SQL Server 2005. On SQL Server 2008 we can rely on the newly introduced SQL Server Audit feature to achieve this; since SQL Server 2005 does not have the Audit feature, let's see how to track these changes there.

By creating a custom trace on the Audit Login Change Password event class, we can track password changes to SQL logins. By default this event class is not part of the default trace.

This is how it’s done:

1.    Create a stored procedure that creates a custom trace on the Audit Login Change Password event class.

2.    Mark the stored procedure created above as a startup procedure.

1.  Creating the Stored Procedure to Create the Custom Trace.


Here I am using two system stored procedures:

1.    sp_trace_create: creates a new trace definition.
2.    sp_trace_setevent: once the trace is defined, we add events to it with this procedure; it adds or removes an event/column combination on a trace. Remember it may be executed only on an existing trace whose status is 'stopped'.

Script

USE [master]

GO

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

CREATE PROCEDURE [dbo].[usp_password_change_trace]
AS
BEGIN
      DECLARE @TraceID INT,
              @FileName NVARCHAR(245),
              @ErrorMessage NVARCHAR(4000),
              @ErrorSeverity INT,
              @ErrorState INT,
              @maxfilesize BIGINT
           
      BEGIN TRAN
      BEGIN TRY
     
      -- Specify the location and file name to which the trace will be written
      -- (sp_trace_create appends the .trc extension automatically)
      SELECT @FileName = 'D:\SQLTrace\Password_Change_Trace_'
             + CONVERT(CHAR(8), GETDATE(), 112)
             + '_' + REPLACE(CONVERT(CHAR(8), GETDATE(), 108), ':', ''),
             @maxfilesize = 50

      -- Create the trace definition; a new trace starts out with status 'stopped'
      EXEC sp_trace_create @TraceID OUTPUT, 0, @FileName, @maxfilesize, NULL

      /* Add Event Class 107 (Audit Login Change Password) to the trace.
         It occurs when a SQL Server login password is changed; passwords
         are not recorded. sp_trace_setevent arguments:
         @TraceID, @EventID (107), @ColumnID, @On (1 = add, 0 = remove) */

      -- Adds Column NTUserName: Windows user name of the client.
      EXEC sp_trace_setevent @TraceID, 107, 6, 1

      -- Adds Column NTDomainName: Windows domain to which the user belongs.
      EXEC sp_trace_setevent @TraceID, 107, 7, 1

      -- Adds Column HostName: Name of the computer on which the client is running.
      EXEC sp_trace_setevent @TraceID, 107, 8, 1

      /* Adds Column ApplicationName: Name of the client application that created
         the connection to an instance of SQL Server. This column is populated with
         the values passed by the application rather than the displayed name of the
         program. */
      EXEC sp_trace_setevent @TraceID, 107, 10, 1

      -- Adds Column LoginName: Name of the login of the user.
      EXEC sp_trace_setevent @TraceID, 107, 11, 1

      -- Adds Column StartTime: Time at which the event started, if available.
      EXEC sp_trace_setevent @TraceID, 107, 14, 1

      /* Adds Column EventSubClass: 1 = Password self changed, 2 = Password changed,
         3 = Password self reset, 4 = Password reset, 5 = Password unlocked,
         6 = Password must change */
      EXEC sp_trace_setevent @TraceID, 107, 21, 1

      -- Adds Column ServerName: Name of the instance of SQL Server being traced.
      EXEC sp_trace_setevent @TraceID, 107, 26, 1

      -- Adds Column ObjectType: Value representing the type of the object involved in the event.
      EXEC sp_trace_setevent @TraceID, 107, 28, 1

      -- Adds Column ObjectName: Name of the object being referenced.
      EXEC sp_trace_setevent @TraceID, 107, 34, 1

      -- Adds Column DatabaseName: Name of the database in which the user statement is running.
      EXEC sp_trace_setevent @TraceID, 107, 35, 1

      -- Adds Column DBUserName: SQL Server database user name of the client.
      EXEC sp_trace_setevent @TraceID, 107, 40, 1

      /* Adds Column LoginSid: Security identification number (SID) of the logged-in user.
         You can find this information in the sys.server_principals catalog view.
         Each SID is unique for each login in the server. */
      EXEC sp_trace_setevent @TraceID, 107, 41, 1

      /* Adds Column TargetLoginName: For actions that target a login (for example,
         adding a new login), the name of the targeted login. */
      EXEC sp_trace_setevent @TraceID, 107, 42, 1

      /* Adds Column IsSystem: Indicates whether the event occurred on a system process
         or a user process. 1 = system, 0 = user. */
      EXEC sp_trace_setevent @TraceID, 107, 60, 1

      /* Adds Column SessionLoginName: Login name of the user who originated the session.
         For example, if you connect to SQL Server using Login1 and execute a statement
         as Login2, SessionLoginName shows Login1 and LoginName shows Login2.
         This column displays both SQL Server and Windows logins. */
      EXEC sp_trace_setevent @TraceID, 107, 64, 1
      -- Start the trace (it was created in the 'stopped' state)
      EXEC sp_trace_setstatus @TraceID, 1

      COMMIT TRAN
      END TRY
     
      BEGIN CATCH
            IF @@TRANCOUNT > 0
                  ROLLBACK
            SELECT @ErrorMessage = ERROR_MESSAGE(),
                   @ErrorSeverity = ERROR_SEVERITY(),
                   @ErrorState = ERROR_STATE();

            RAISERROR (@ErrorMessage, @ErrorSeverity, @ErrorState);
      END CATCH
END

Execute the following query to check the trace we created:

SELECT * FROM ::fn_trace_getinfo(DEFAULT)
-- or
SELECT * FROM sys.fn_trace_getinfo(DEFAULT)

Rows where the property column equals 5 show the current status of the trace.
The value can be 0 = stopped or 1 = running.
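
For convenience, the same function can be filtered down to one status row per trace; this small sketch uses only the columns fn_trace_getinfo already returns:

-- Property 5 is the trace status: 0 = stopped, 1 = running
SELECT traceid, value AS TraceStatus
FROM ::fn_trace_getinfo(DEFAULT)
WHERE property = 5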


2. Mark the stored procedure to run when SQL Server starts.

By using sp_procoption you can set a stored procedure for auto-execution; this makes the stored procedure execute automatically whenever the SQL Server instance starts.

Before that, check that "scan for startup procs" is enabled by using sp_configure.

-- Check the run value of "scan for startup procs"
sp_configure 'scan for startup procs'

If it is disabled, enable it ("scan for startup procs" is an advanced option, so "show advanced options" must be on first):

sp_configure 'show advanced options', 1
RECONFIGURE
sp_configure 'scan for startup procs', 1
RECONFIGURE WITH OVERRIDE

-- Mark the stored procedure to run at startup
EXEC sp_procoption @procname = 'dbo.usp_password_change_trace', @optionname = 'startup', @optionvalue = 'true'

After making a password change, you can test this in two ways: by opening the trace file created, or by using the function fn_trace_gettable.
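
For example, a query along these lines reads the trace file back as a table. The file name below is hypothetical; it must match the name built inside the procedure, and sp_trace_create appends the .trc extension automatically:

SELECT LoginName, TargetLoginName, HostName, StartTime
FROM fn_trace_gettable('D:\SQLTrace\Password_Change_Trace_20110503_094500.trc', DEFAULT)
WHERE EventClass = 107   -- Audit Login Change Password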

To modify the state of the trace

By using sp_trace_setstatus, you can modify the current state of the trace (supply the trace ID reported by fn_trace_getinfo):

-- To start the trace
EXEC sp_trace_setstatus @TraceID, 1

-- To stop the trace
EXEC sp_trace_setstatus @TraceID, 0

-- To delete the trace, you must first stop it if it is running
EXEC sp_trace_setstatus @TraceID, 0
-- then delete its definition
EXEC sp_trace_setstatus @TraceID, 2


Friday, April 8, 2011

Top SQL Server performance killers..!


Here are some of the areas that degrade SQL Server performance. These are not in any particular order.

Improper Database and SQL Server Settings

tempdb is one of the busiest system databases. It is the place for temporary storage: temp tables, table variables, cursors, worktables for joins, and operations like sorting and row versioning. All of these operations can hurt I/O and CPU performance, so it is necessary to design the storage structure of tempdb adequately.
Another important problem can be incorrect storage design for data files, log files, SQL Server program files, and OS files; they should be placed on appropriate RAID levels.
You should also properly set SQL Server configuration settings such as memory, query timeout, and parallelism, and database settings such as statistics management and auto-growth of database files.
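
As a small sketch of that last point, sp_configure is the usual way to inspect and change instance-level settings; the 4096 MB cap below is purely illustrative:

sp_configure 'show advanced options', 1
RECONFIGURE
-- Cap SQL Server's memory so the OS and other processes are not starved
sp_configure 'max server memory (MB)', 4096
RECONFIGURE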

Database Design

Bad logical database design results in bad physical database design, and generally in poor database performance. Both the logical and the physical design must be right before you can expect good performance out of your database.
A database should be adequately normalized (and selectively de-normalized) to improve performance and reduce blocking. A normalized database reduces redundant data, which in turn reduces storage requirements. Row size should be well planned so that more rows fit on an 8 KB data page.
An over-normalized database is as bad as an under-normalized one; it increases the number and complexity of the joins needed to retrieve the data.

Index and Statistics

One of the most common and biggest performance problems on SQL Server is poor or missing indexes and outdated statistics. When a query executes, the query optimizer searches for useful indexes; the absence of a proper index may result in blocking, deadlocks, and a high amount of stress on disk, memory, and CPU, because SQL Server needs to process or retrieve much more data while executing the query.
As for statistics: statistics are information about the distribution of data in columns or indexes. The SQL Server query optimizer relies on statistics to generate an efficient plan to retrieve or update the data, and good, up-to-date statistics help it create a high-quality plan. By default SQL Server automatically creates and updates statistics. Outdated statistics give the query optimizer inaccurate information about the data distribution, such as the number of rows a predicate returns, which leads to inefficient plans.
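
As a quick illustration (the table and index names are hypothetical), you can inspect the statistics behind an index and refresh a table's statistics manually:

-- Show the histogram and density information behind one index
DBCC SHOW_STATISTICS ('dbo.Orders', IX_Orders_OrderDate)

-- Refresh all statistics on the table by reading every row
UPDATE STATISTICS dbo.Orders WITH FULLSCAN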

Excessive Fragmentation

SQL Server maintains indexes automatically: whenever an INSERT, UPDATE, or DELETE occurs on the underlying data, the changes can result in page splits and fragmentation. There are two types of fragmentation. External fragmentation: the leaf level of the index is stored in a non-orderly fashion. Internal fragmentation: the index pages are not used to their maximum volume. An index property called fill factor decides how much free space is left on index pages; severe internal fragmentation can lead to increased index size and cause additional reads to return the needed data, and these extra reads degrade query performance. You can overcome fragmentation by rebuilding or reorganizing the index.
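
For instance (hypothetical table and index names; the ~30% threshold is a common rule of thumb, not a hard rule):

-- Measure fragmentation
SELECT index_id, avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.Orders'), NULL, NULL, 'LIMITED')

-- Light fragmentation: reorganize (online, low impact)
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders REORGANIZE

-- Heavy fragmentation (roughly above 30%): rebuild
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders REBUILD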

Blocks and Dead Locks

When a modification occurs in a transaction, SQL Server isolates it from other concurrent transactions. This prevents transactions from seeing intermediate states of the data; a transaction sees the data either in its before state or in its after state. This is the default behavior of SQL Server, and it is implemented by a mechanism called locks. Because of this isolation, blocking occurs in a database.
A deadlock occurs when two processes attempt to escalate or expand their locked resources and conflict with one another. The query engine determines which process is the least costly to roll back and chooses it as the deadlock victim.
Thus the execution time of a query is adversely affected by the amount of blocking and the deadlocks it faces.
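
A quick way to see blocking as it happens is the sys.dm_exec_requests DMV (a minimal sketch):

-- Sessions that are currently blocked, and who is blocking them
SELECT session_id, blocking_session_id, wait_type, wait_time
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0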

Poor Query Design

The effectiveness of an index depends on how you write your queries. For example, applying a function to a column in the WHERE clause, e.g. YEAR(SaleDate) = 2010, results in an index scan rather than an index seek. Another problem is retrieving more data from the table than is required. You must ensure that your queries are fine-tuned so they can make use of the indexes effectively.
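
To make the example concrete (using a hypothetical Sales table), the second query below expresses the same filter as a range on the bare column, which lets the optimizer seek on an index on SaleDate:

-- Index scan: the function hides SaleDate from the optimizer
SELECT SaleID FROM Sales WHERE YEAR(SaleDate) = 2010

-- Index seek: the same filter as a range predicate
SELECT SaleID FROM Sales WHERE SaleDate >= '20100101' AND SaleDate < '20110101'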

T-SQL is a set-based language, meaning it operates on sets of data rather than individual rows, and it performs best when it deals with sets. A non-set-based approach makes use of cursors and loops rather than the more efficient joins and subqueries, and it kills performance. If you are forced to deal with cursors, be sure to use an efficient cursor type such as fast-forward-only.
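
As a sketch of the difference (hypothetical Accounts table), both snippets below flag overdrawn accounts; the second does the same work in a single set-based statement:

-- Row-by-row with a cursor (FAST_FORWARD, if a cursor is truly unavoidable)
DECLARE @ID INT
DECLARE cur CURSOR FAST_FORWARD FOR
    SELECT AccountID FROM Accounts WHERE Balance < 0
OPEN cur
FETCH NEXT FROM cur INTO @ID
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE Accounts SET Status = 'Overdrawn' WHERE AccountID = @ID
    FETCH NEXT FROM cur INTO @ID
END
CLOSE cur
DEALLOCATE cur

-- Set-based: one statement, no per-row overhead
UPDATE Accounts SET Status = 'Overdrawn' WHERE Balance < 0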

Poor Execution Plans and Recompilation

When a query comes in for execution, the optimizer checks whether a plan to execute it is already available; if not, it generates one, and the plan is usually cached in memory for future use. There is a fair amount of CPU expense in creating a query plan. If a plan is built in such a way that new variable values cannot be plugged into it, SQL Server creates a fresh plan every time the query is resubmitted with a different value. This eats up CPU cycles, so it is important to submit your queries in a form that SQL Server can cache and reuse.

One of the standard ways of ensuring a reusable execution plan, independent of the variable values used in a query, is to use a stored procedure. Using a stored procedure to execute a set of SQL queries allows SQL Server to create a parameterized execution plan.
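
A minimal sketch (the procedure and table names are hypothetical): the plan for the SELECT is compiled once and reused for every @DeptID value passed in:

CREATE PROCEDURE dbo.usp_GetEmployeesByDept
    @DeptID TINYINT
AS
BEGIN
    SELECT ID, EmpName
    FROM dbo.Employee
    WHERE DeptID = @DeptID
END
GO

EXEC dbo.usp_GetEmployeesByDept @DeptID = 10   -- reuses the cached plan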




Thursday, March 24, 2011

Number of reads and writes against tables


Is it possible to find the number of reads and writes performed against each table in a database? We can rely on the sys.dm_db_index_usage_stats DMV to achieve this, at least partially. The columns user_lookups, user_scans, user_seeks, and user_updates are counters that are incremented for each individual seek, scan, lookup, or update on a specific index of a table when a query is executed.
Here is the query,

SELECT o.name,
       SUM(user_lookups + user_scans + user_seeks) AS Reads,
       SUM(user_updates) AS Updates
FROM   sys.dm_db_index_usage_stats u
       INNER JOIN sys.objects o ON o.object_id = u.object_id
WHERE  o.type = 'U'
AND    u.database_id = DB_ID()
GROUP BY o.name

The Reads column displays the number of times SQL Server accessed the indexes on that table, and the Updates column represents the number of times SQL Server performed an update/write operation on the indexes of that table.

For example – I have a table, Employee

CREATE TABLE Employee
       (
         ID INT PRIMARY KEY,
         EmpName NVARCHAR(30),
         DeptID TINYINT
       )


CREATE INDEX IX_NCL_Employee_EmpName
ON Employee(EmpName)


CREATE INDEX IX_NCL_Employee_DeptID
ON Employee(DeptID)


Our Employee table has three indexes: one clustered index on the ID column, and two non-clustered indexes on EmpName and DeptID. For each insert against the Employee table, SQL Server must also update every underlying index on the table, in our case all three. If we update the ID column, SQL Server again needs to update all three indexes, because when you update a clustered index key all the non-clustered indexes are updated as well. What if we update DeptID? This time SQL Server needs to update two indexes: the index on DeptID and the clustered index.
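
To see this in action (illustrative statements only), run a few writes against the table:

-- Touches all three indexes: the new row goes into each of them
INSERT INTO Employee (ID, EmpName, DeptID) VALUES (1, N'John', 1)

-- Touches two indexes: the clustered index and IX_NCL_Employee_DeptID
UPDATE Employee SET DeptID = 2 WHERE ID = 1

Re-running the aggregate query above should now show the Updates counter for Employee incremented accordingly.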

Well, there are some limitations with this query. The major one is that the counters reset when SQL Server restarts. Another limitation is that if we drop any index on the table, we lose the read/update information recorded against that index. So this method is only a temporary solution!