Dotnet Fellow


On another thread a poster suggested that you can experience performance gains in Ssis by not using stored procedures.

I asked if I could have some Microsoft articles that articulated this position, but received no response.

Does anyone have links to Microsoft supported articles articulating when/if it is appropriate from Ssis to Not exec Stored Procedures for data manipulation or data collection




Re: when should ssis package use embedded sql query?

jwelch


The best way to get performance data is to test it yourself. Everyone's hardware, data, etc. varies, so the best results are usually the ones you create with your data.

My experience is that any operation that can be performed as a batch is going to be faster than a single row operation. Since most stored procs are single row operations (they perform a single row insert, update, or delete), they are slower than performing a batch operation (using an INSERT...SELECT statement for example). This really isn't anything to do with SSIS, it is a simple fact of how most relational databases are designed to work. They function best of sets of data, rather than single rows.







Re: when should ssis package use embedded sql query?

Dotnet Fellow

Actually most selects return a set of data, and one key question here is, "Is it faster to embed a dynamic sql select clause in an Ole Db Source object, or is it faster to have the Ole Db Source object execute a stored procedure on the database engine that returns the dataset " Some of the objections I have heard in the past about embeddding a dynamic sql select clause in an application is that the database server is not able to pre-optimize the query, whereby if you place your select clause in a stored procedure, then the database engine can create a plan ahead of time, thus speeding up the process. However, like I said, some have suggested that Ssis has been built to performance within Sql Server and that bypassing the sproc layer is actually faster. I am trying to separate myth from fact and need hard evidence, such as a Microsoft performance tuning article(s) as it relates to Ssis and this specific topic of calling sprocs from Ssis.

Having come from a more traditional Sql Server background, I can appreciate the school of thinking that a sproc layer is more efficient. However, as I use Ssis I am finding tremendous performance improvements. The problem I am having is with our operational department, which insists that all queries be performed through a sproc. So, I need some ammunition in my arsenal to convince the establishment that embedded dynamic sql within Ssis Ole Db objects is okay in some if not most cases. And when it is more efficient to call a sproc layer, I should be able to outline the reasons for those occassions as well.

Any help anyone can provide would be greatly appreciated.






Re: when should ssis package use embedded sql query?

jwelch

My original response was focusing more on the data manipulation half of your original question. If you are only talking about using stored procs that SELECT data, and not writing data with them, then it is likely that the stored procs will have a performance advantage over the embedded queries, due to the reasons you mention above. Again, the best way to determine this is to run some tests in your environment, using both embedded sql and stored procs. That way, you have numbers specific to your environment, and you know exactly what the difference in performance is.

A non-performance related reasons that you might want to use embedded SQL is easier maintenance. If you need to update the package, you make the changes in one place. If you use a stored proc, and the set of columns being returned is modified, you still have to open the package to make sure the metadata is updated.






Re: when should ssis package use embedded sql query?

Dotnet Fellow

It fascinates me there seems not to exist a best practice for referencing stored procedures vice embedded dynamic sql from inside Ssis.



Re: when should ssis package use embedded sql query?

Phil Brammer

Dotnet Fellow wrote:
It fascinates me there seems not to exist a best practice for referencing stored procedures vice embedded dynamic sql from inside Ssis.


Because there is no best practice. It's up to the user and his/her environment to determine best practices.





Re: when should ssis package use embedded sql query?

Dotnet Fellow

Hi Phil, In a prior thread you suggested there were performance gains to be had by not calling sprocs from Ssis. I am just trying to discover in what scenarios that is the case and where there are articles elucidating these occassions.



Re: when should ssis package use embedded sql query?

Phil Brammer

Dotnet Fellow wrote:
Hi Phil, In a prior thread you suggested there were performance gains to be had by not calling sprocs from Ssis. I am just trying to discover in what scenarios that is the case and where there are articles elucidating these occassions.


Do you have that thread handy, by chance







Re: when should ssis package use embedded sql query?

Kent Waldrop Jn07

"...My experience is that any operation that can be performed as a batch is going to be faster than a single row operation. Since most stored procs are single row operations (they perform a single row insert, update, or delete), they are slower than performing a batch operation (using an INSERT...SELECT statement for example). ..."

OK, this may be true, but isn't this more to do with bad coding practice rather than any inherent property of stored procedures (or functions) Most INSERT .. SELECT statements that can be done outside of the scope of a procedure can also be done within the scope of a procedure. It sounds to me like what you are talking about are stored procedures that base their process on a cursor rather than a set based process -- is this the case





Re: when should ssis package use embedded sql query?

Dotnet Fellow

Post number 4 in this thread here:

http://forums.microsoft.com/MSDN/ShowPost.aspx PostID=1785287&SiteID=1

You said: "They are missing the boat on performance gains possible by NOT using stored procs."

I asked but received no response to: "If you could point me to those performance gains articles then this will help me better communicate to our operations department the merits in going with data manipulation from within ssis and bypassing the sproc layer."

I am just looking for a set of guidelines or principles of when to and not to call sprocs from within Ssis.





Re: when should ssis package use embedded sql query?

Phil Brammer

Kent Waldrop Jn07 wrote:

jwelch wrote:
"...My experience is that any operation that can be performed as a batch is going to be faster than a single row operation. Since most stored procs are single row operations (they perform a single row insert, update, or delete), they are slower than performing a batch operation (using an INSERT...SELECT statement for example). ..."

OK, this may be true, but isn't this more to do with bad coding practice rather than any inherent property of stored procedures (or functions) Most INSERT .. SELECT statements that can be done outside of the scope of a procedure can also be done within the scope of a procedure. It sounds to me like what you are talking about are stored procedures that base their process on a cursor rather than a set based process -- is this the case



But using, for instance, the OLE DB Destination using fast load (when properly configured), should be faster than an INSERT ... SELECT statement.





Re: when should ssis package use embedded sql query?

Kent Waldrop Jn07

I have no arguments with that and I guess I am therefore a bit off base. I guess I misunderstood because I thought what was being implied was that stored procedures were generally "single record processes" -- which is in general not true. Meaning the answer to my second question was "no"; that is not what was being talked about. Certainly the biggest point was the virtues of set-based processes -- with which I fully agree.

Sorry about the misunderstanding.

Dave





Re: when should ssis package use embedded sql query?

Dotnet Fellow

Okay so if I am understanding correctly, it sounds as though some of the arguments in favor of not using a sproc layer called from Ssis, may include:

1) When inserting data utilize Ole Db Destination fast load. This will outperform calling a sproc layer from Ssis to insert data.

Are there other arguements favoring performing data manipulation directly from Ssis

2) . . .

I have another question, when you call a select sproc to retrieve data from Sql, do you have to wait for the completion of the sproc before any data is returned to Ssis My thought is that there may be a argument for performing an embedded ad-hoc select from the Ole Db Source, if data immediately begins streaming back to the Ssis package, because then the Ssis can begin queueing and passing the data down the pipeline instead of having to wait for the completion of the entire sproc to return the dataset before continuing down the pipleine





Re: when should ssis package use embedded sql query?

jwelch

Kent Waldrop Jn07 wrote:

I have no arguments with that and I guess I am therefore a bit off base. I guess I misunderstood because I thought what was being implied was that stored procedures were generally "single record processes" -- which is in general not true. Meaning the answer to my second question was "no"; that is not what was being talked about. Certainly the biggest point was the virtues of set-based processes -- with which I fully agree.

Sorry about the misunderstanding.

Dave

Poor wording in my original post on this. I wasn't really talking about stored procedures in general, I was talking about them in the context of their use from SSIS. If you are using stored procs to add, update, or delete data from SSIS, and you are calling them from the data flow, they are likely to implement single record operations (since that is how the data flow itself operates). If you aren't using data flows, and are just using SSIS to orchestrate a series of Execute SQL tasks, then that wouldn't be true.






Re: when should ssis package use embedded sql query?

Dotnet Fellow

Post 9 by Donald Farmer dated 5/21/2006, 5:38 PM, here:

http://forums.microsoft.com/MSDN/ShowPost.aspx PostID=415180&SiteID=1

"It is imporant to remember that SSIS is simply a client of SQL Server, like any other. If you are seeing issues and differences between queries in management studio and SSIS, then the reasons will be found using profiler. I know a lot of people have found the information here http://www.sql-server-performance.com/stored_procedures.asp very useful. Donald"

I find the above quote by Donald Farmer interesting because if you go to the link he gives it explicitly states an absolute and emphatic rule that "Whenever a client application needs to send Transact-SQL to SQL Server, send it in the form of a stored procedure instead of a script or embedded Transact-SQL."

However, if what Phil and John suggest in this thread is true, then going through a sproc is not always the best performing technique.

I think that Sql Server Integration Services and Database Engine engineers need to get on the same page about the best approach of when to and not to use sprocs from Ssis.