Mr Pro Tools


Hi I have a production table in SQL Server 2005 that has approx 500,000 records---every 6 hours this table needs to be truncated and filled

The basic SSIS package uses a script compentant and imports the data into a staging table which has the same structure as the production table. I have a final Execute SQL Task that Truncates the production table and does a Insert into production-select * from stage table.

Takes around 30 seconds to run this last Execute SQL Task--problem is there is a risk that our webservice will query this table during the Execute SQL Task and return incorrect results.

Q1: In this last Execute SQL Task if I used a BEGIN TRANSACTION and COMMIT TRANSACTION; will this be any quicker

Q2: In this last Execute SQL Task- would it be better to use a RENAME TABLE technique in TSQL--any code examples

Q3:Is there any way in TSQL I can compare EVERY FIELD in two tables ie Stage and Production which have identical data structures and figure out a way to update only the records that changed Is SQL Server Replication the best way to do this

thanks kindly

Dave




Re: What technique to use to update production table from staging table

jwelch


1: Probably not.

2: You could use sp_rename. You'd have to drop the other table first.

3: The checksum function might help with this, but the built-in SQL Server function tends to have a high collision rate (same checksum value for different input values). You also need a primary key to compare on. So you'd do a join between the tables on the primary key, run all the columns from each table into its own checksum function, and include a WHERE clause looking for checksum values that don't match. Not sure how good performance will be. It helps if you can modify the table definition to include the checksum value as a column, so you don't have to do the calc each time.

Hope this helps.







Re: What technique to use to update production table from staging table

Phil Brammer

Would replication help here Just curious, as I don't know the answer to that....





Re: What technique to use to update production table from staging table

Mr Pro Tools

Thanks Im really interested in the checksum technique but have no idea of how to implement this; I have complete control of the table design If I make up a simple example like this can you please show me the TSQL to achieve (do a join between the tables on the primary key, run all the columns from each table into its own checksum function, and include a WHERE clause looking for checksum values that don't match.)

Stage table
--------------------
S_Foxprofile1
S_ID (AutoInt) PK
PhoneNum (BigInt)
Account (Varchar 50)

Prod Table
--------------------
P_Foxprofile1
P_ID (AutoInt) PK
PhoneNum (BigInt)
Account (Varchar 50)

Finally this must be a very common ETL problem to update a production table with only the data that has changed from its mirror stage table with all fields in scope.
I was hoping for a common design pattern in the Microsoft world for this.

thanks kindly Dave





Re: What technique to use to update production table from staging table

Mr Pro Tools

Oh the Replication idea was probaly not correct --- the Checksum idea is a really good lead
Thanks Im really interested in the checksum technique but have no idea of how to implement this; I have complete control of the table design If I make up a simple example like this can you please show me the TSQL to achieve (do a join between the tables on the primary key, run all the columns from each table into its own checksum function, and include a WHERE clause looking for checksum values that don't match.)

Stage table
--------------------
S_Foxprofile1
S_ID (AutoInt) PK
PhoneNum (BigInt)
Account (Varchar 50)

Prod Table
--------------------
P_Foxprofile1
P_ID (AutoInt) PK
PhoneNum (BigInt)
Account (Varchar 50)

Finally this must be a very common ETL problem to update a production table with only the data that has changed from its mirror stage table with all fields in scope.
I was hoping for a common design pattern in the Microsoft world for this.

thanks kindly Dave





Re: What technique to use to update production table from staging table

Phil Brammer

Have you read the sticky post on the first page of this forum Checking to see if a record exists, if so do XXX if not do YYY







Re: What technique to use to update production table from staging table

Phil Brammer

Also, I just created a blog entry in regards to using the Konesan's Checksum Transformation.

http://www.ssistalk.com/2007/03/09/ssis-using-a-checksum-to-determine-if-a-row-has-changed/





Re: What technique to use to update production table from staging table

Mr Pro Tools

Thanks Phil,

Would you mind ziping up that SSIS project in your blog entry and emailing to me [removed e-mail address to keep spam bots away - Phil Brammer]

I have also thought of a much simpler idea --stick to idea of loading data into a stage table that mirrors production and then have a final execute SQL task that does as outlined here http://www.mssqltips.com/tip.asp tip=1023

example of using a CHECKSUM is to store the unique value for the entire row in a column for later comparison. This would be helpful in a situation where all of the rows in a table need to be compared in order to perform an UPDATE.

SET NOCOUNT ON
-- SQL Server 2005
USE ProfilerTest;
GO
UPDATE a
SET TextData = 'Not Applicable'
FROM dbo.ProfilerResults3 a
WHERE a.RowCheckSum = CHECKSUM(RowNumber, EventClass, ApplicationName);
GO

SELECT * 
FROM dbo.ProfilerResults3;
GO




Re: What technique to use to update production table from staging table

jwelch

I would be careful when using the built-in SQL Checksum function. HashBytes might be a better choice. From Books Online:

"CHECKSUM satisfies the properties of a hash function: CHECKSUM applied over any two lists of expressions returns the same value if the corresponding elements of the two lists have the same type and are equal when compared using the equals (=) operator. For this definition, null values of a specified type are considered to compare as equal. If one of the values in the expression list changes, the checksum of the list also generally changes. However, there is a small chance that the checksum will not change. For this reason, we do not recommend using CHECKSUM to detect whether values have changed, unless your application can tolerate occasionally missing a change. Consider using HashBytes instead. When an MD5 hash algorithm is specified, the probability of HashBytes returning the same result for two different inputs is much lower than that of CHECKSUM."

http://msdn2.microsoft.com/en-us/library/ms189788.aspx

I've also had good experiences using the Checksum transform from SQLIS.com (http://www.sqlis.com/21.aspx).






Re: What technique to use to update production table from staging table

Phil Brammer

Mr Pro Tools wrote:

Thanks Phil,

Would you mind ziping up that SSIS project in your blog entry and emailing to me



Sent.

Phil