Tamim Sadikali


Hi All,

Using a data access mode of SQL Command, I have a statement that returns one date (e.g. SELECT MAX(LastChangedDate) FROM TblA).

I want to store the result in a variable. How do I do this?

Cheers,

Tamim.





Re: OLE DB Source Adapter & variables.

DarrenSQLIS


If you want to return just a single value and store it in a variable, the Data Flow is the wrong task to use. Try the Execute SQL Task, with the ResultSet option set to Single Row. You can set the output on the Result Set page.

Some more detailed examples are covered in this article-

The ExecuteSQL Task
(http://www.sqlis.com/58.aspx)
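As a rough sketch of that setup (names are illustrative, and assume a DateTime variable has already been created in the package):

```sql
-- Execute SQL Task configuration (a sketch):
--   ResultSet       : Single Row
--   SQLStatement    : the query below
--   Result Set page : Result Name 0  ->  Variable Name User::LastChangedDate
SELECT MAX(LastChangedDate) AS LastChangedDate
FROM TblA;
```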







Re: OLE DB Source Adapter & variables.

Tamim Sadikali

...but this will be the first step in each of 4 logical threads of processing (e.g. SELECT MAX(LastChangedDate) FROM TblA, SELECT MAX(LastChangedDate) FROM TblB, ...), and I intend executing each of them in parallel, which I can only do within the Data Flow. Surely there must be a way of persisting the result (in a variable or in the buffer), to pass amongst other source/destination adapters and transforms?

Hope you can elucidate,







Re: OLE DB Source Adapter & variables.

Phil Brammer

But you can have more than one Execute SQL task in your control flow, and in your control flow you can do your parallel processing. You can create four data flows, all referenced in parallel by the control flow.





Re: OLE DB Source Adapter & variables.

DarrenSQLIS

"I intend executing each of them in parallel, which I can only do within the Data Flow" -- Why? You could use multiple SQL tasks. What would be wrong with that?

You could use a Script Component to write to a variable, but really the idea of having a data flow (path) with just one row seems silly. Even if you have 4 sources in the data flow, this still does not make sense to me. The data flow is all about moving data, and you aren't. That said, I'm all for bending something to do your work, but in this case I still think it would make more sense to use the Exec SQL Task.






Re: OLE DB Source Adapter & variables.

Ryan.Kelley

sigh... doesn't anyone use the internet anymore?

Read up on using variables here, and how to pass them in, out and around environments.

http://blogs.conchango.com/search/SearchResults.aspx?q=variable&PageIndex=3

http://blogs.conchango.com/jamiethomson/archive/2005/02/28/1085.aspx

You need to use a SQL statement to get the date. There is a place within the task to set the results to a variable.





Re: OLE DB Source Adapter & variables.

Tamim Sadikali

If I have two unconnected Execute SQL Tasks in my Control Flow, will they be executed in parallel? I tried this out, and when I went to the Execution Results tab it was clear from the start/finish times of each that said two tasks were executed sequentially, and not in parallel.

'...You can create four data flows, all referenced in parallel by the control flow...'

So you are saying that 4 unconnected Execute SQL Tasks, each of which (on success) executes a Data Flow task, will result in SSIS processing each of the 4 logical threads in parallel? I thought that the only way to achieve parallel processing was within the Data Flow Task itself, wherein I could initiate parallel processing with multiple data sources. So is this simply incorrect?






Re: OLE DB Source Adapter & variables.

Phil Brammer

Why don't you connect them? (I can see why you wouldn't if they are the start of the control flow.) You can have one upstream control flow item pointing to more than one downstream item. SSIS will try to run in parallel anything it can, depending on MaxConcurrentExecutables and also depending on your machine.





Re: OLE DB Source Adapter & variables.

Tamim Sadikali

"I intend executing each of them in parallel, which I can only do within the Data Flow" -- Why? You could use multiple SQL tasks. What would be wrong with that?

So multiple unconnected Execute SQL Tasks would be processed in parallel? Please confirm, as this is a crucial point of my (mis)understanding. Repeating a point I made to Phil above, how come when I tested this through a dummy package (consisting of nothing more than 2 unconnected Execute SQL Tasks), the Execution Results tab clearly showed (through start/end times) that the two tasks were run sequentially?

"The data flow is all about moving data and you aren't"

I am, Darren - I just haven't explained the full situation:

  1. I firstly establish when data last flowed down into each of four distinct/disjoint areas - SELECT MAX(LastChangedDate) FROM TblA, SELECT MAX(LastChangedDate) FROM TblB, ...
  2. Then for each of these areas I extract all new (or newly updated) data from their sources, based on the relevant LastChangedDate.
  3. Then for each of these areas I transform the data.
  4. Then for each of these areas I load the results into their respective destinations.

Steps 2-4 are of course Data Flow Task work, and I'll want to perform each area's ETL work in parallel. (However, I previously thought that if I wanted to perform step 1 for the four areas in parallel too, I could only do that in the Data Flow.) Hope this shows that I'm not trying to bastardise the Data Flow task too much.
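For step 2, the OLE DB Source in each Data Flow can be parameterised on the last-changed-date found in step 1 - a sketch, assuming a variable named User::LastChangedDateA holds the result for the first area:

```sql
-- OLE DB Source, Data access mode: SQL command (a sketch; names illustrative).
-- On the Parameters page, map the ? placeholder to User::LastChangedDateA.
SELECT *
FROM TblA
WHERE LastChangedDate > ?;
```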

Hope you can clarify my (mis)understanding re. Control Flow & parallel processing, and sincere thanks for your time.

Cheers,

Tamim.






Re: OLE DB Source Adapter & variables.

Tamim Sadikali

MaxConcurrentExecutables - we have lift-off! Thanks Phil - I've corrected a potentially disastrous misunderstanding here, in thinking that the control flow could only handle sequential processing. My two disjoint Execute SQL Tasks now run in parallel.

Sincere thanks,

Tamim.






Re: OLE DB Source Adapter & variables.

DarrenSQLIS

If you have two or more tasks without any workflow to enforce order, then they can execute in parallel. MaxConcurrentExecutables could prevent this, as mentioned above. That said, they may not execute at exactly the same time. The same goes for two sources in a single data flow. It may not be visible, since you have no component-level diagnostics as you do with a task, but they may still be serialised, blocked or otherwise constrained. Trust me, I have seen this just from the pretty colours you get when running a large package with several sources in a Data Flow. One source may experience network latency and run after another, for example. One query may take longer than another. They do not run at the same time.

Your scenario is a common one: get the last date, then use that as the basis for a new extract. Point 1 is not data movement. Points 2-4 are.

Why can you not have 4 Exec SQL Tasks and 4 Data Flow Tasks? You could even have 4 Exec SQL Tasks and 1 Data Flow Task, or combinations in between. What is the issue with running them at exactly the same time? That is really not a realistic goal, as resources and/or connection behaviour may well prevent it. If you need to ensure that you are extracting data from multiple tables in a consistent (as in atomic consistency) manner, then use a transaction; don't try and build something that "looks" right - the 4 sources in a Data Flow - but is not really doing what you want.

So Control Flow can be serial or parallel, and in some ways this is not up to you: you can force serial, but not ensure parallel. The same goes for the Data Flow. So why do you need parallel? Would a transaction give you what you want and need, and actually ensure it happens the way you want? You could use tasks in a transaction. Either use the RetainSameConnection transaction pattern (Google it) or group them in a sequence container and transact the container. Then do the Data Flow (your points 2-4) afterwards.
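A minimal sketch of the RetainSameConnection pattern mentioned above (assuming RetainSameConnection = True on the OLE DB connection manager, with each statement in its own Execute SQL Task against that same connection):

```sql
-- First Exec SQL Task:
BEGIN TRANSACTION;

-- Middle Exec SQL Tasks - the four reads, e.g.:
SELECT MAX(LastChangedDate) FROM TblA;
SELECT MAX(LastChangedDate) FROM TblB;

-- Final Exec SQL Task:
COMMIT TRANSACTION;
```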

Any clearer at all?






Re: OLE DB Source Adapter & variables.

Rafael Salas

Tamim Sadikali wrote:

"I intend executing each of them in parallel, which I can only do within the Data Flow" -- Why? You could use multiple SQL tasks. What would be wrong with that?

So multiple unconnected Execute SQL Tasks would be processed in parallel?

Yes, they should run in parallel (hardware permitting). A single processor, little memory, etc. may make it harder to see such parallelism.

Tamim Sadikali wrote:

  1. I firstly establish when data last flowed down into each of four distinct/disjoint areas - SELECT MAX(LastChangedDate) FROM TblA, SELECT MAX(LastChangedDate) FROM TblB, ...
  2. Then for each of these areas I extract all new (or newly updated) data from their sources, based on the relevant LastChangedDate.
  3. Then for each of these areas I transform the data.
  4. Then for each of these areas I load the results into their respective destinations.

Your approach looks right; I would recommend you break that logic into smaller pieces (e.g. create a single package per area), then use a 'master' package to call the other packages. It would make maintenance easier, and make it easier to re-arrange the flow if necessary.






Re: OLE DB Source Adapter & variables.

Tamim Sadikali

That's much clearer Darren, thanks.

The issue in running them in parallel is just about saving time - nothing else. Therefore if this isn't (entirely) realised, it's no biggy. You've captured this maxim well: '...you can force serial, but not ensure parallel...' Understood. The 4 Exec SQL Tasks, each followed by a Data Flow Task, seems a good way to go.

Many thanks to you & everyone else for their input here,

Tamim.

P.S. Am enjoying 'Professional SQL Server 2005 Integration Services' - it's a very good ramp-up.