PALSUM

Hi ALL,

I am working on a Windows-based application that uses SQL Server 2000 as its database. There are a few tables in the application (referred to below as parent tables) which are uploaded by a separate application.

My application fetches data from the parent tables and puts it into separate application tables (child tables).
I am using a DataSet to fetch data from the parent tables and insert/update data in the child tables.
The problem is that the number of records is very high (3-4 million), so the data is too large, the process takes a very long time to complete, and the application server's CPU utilization shoots up to the maximum.

What would be the best way to achieve this?
1. Should I use a DataRepeater instead of a DataSet? Or
2. Should I do the processing in chunks? If so, how can I do the processing in chunks? (See the sketch after this question.)

Or is there any other way I can process the data?
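For illustration, "processing in chunks" could look roughly like the sketch below: the copy is done with a set-based INSERT ... SELECT in fixed-size batches, so the rows never have to be pulled into a DataSet on the application server. The table and column names (ParentTable, ChildTable, Id, Payload) and the connection string are made-up placeholders, and it assumes .NET 2.0 with the SqlClient provider.

' Minimal sketch of chunked, set-based copying. Schema (ParentTable/ChildTable,
' key column Id, data column Payload) is assumed, as is SQL Server 2000 syntax.
Imports System.Data.SqlClient

Module ChunkedCopy
    Sub CopyInChunks(ByVal connectionString As String)
        ' Each round trip moves at most 50,000 parent rows that are not yet in the child table.
        Dim batchSql As String = _
            "INSERT INTO ChildTable (Id, Payload) " & _
            "SELECT TOP 50000 p.Id, p.Payload " & _
            "FROM ParentTable p " & _
            "WHERE NOT EXISTS (SELECT 1 FROM ChildTable c WHERE c.Id = p.Id)"

        Using conn As New SqlConnection(connectionString)
            conn.Open()
            Dim rowsCopied As Integer
            Do
                Using cmd As New SqlCommand(batchSql, conn)
                    cmd.CommandTimeout = 300              ' generous timeout per batch
                    rowsCopied = cmd.ExecuteNonQuery()    ' number of rows inserted in this batch
                End Using
            Loop While rowsCopied > 0                     ' stop when nothing is left to copy
        End Using
    End Sub
End Module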

Thanks




Re: Architecture General large dataset problem : please help

Ahmed Nagy

Hi;

Why are you using a DataSet for this purpose? What I understand is that you have an ASP.NET web application that processes this data.

I believe that all you are doing is data transformation and that no human input is required. My guess is based on the size of the data you are talking about.

If my guess is right, you should look for another approach. In SQL Server 2000 you can use DTS (Data Transformation Services) to do such a transformation.
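If the transformation ends up as a DTS package, the existing timer-based service would not have to move any rows itself; it could simply launch the package. A rough sketch, assuming a package saved on the server as ParentToChildCopy and a trusted connection; the dtsrun switches should be double-checked against the SQL Server 2000 Books Online.

' Sketch: launching a saved DTS package from the VB.NET service via dtsrun.exe.
' /S = server, /E = trusted connection, /N = package name (all assumed values).
Imports System.Diagnostics

Module RunDtsPackage
    Sub ExecutePackage()
        Dim args As String = "/S MYSERVER /E /N ParentToChildCopy"
        Using proc As Process = Process.Start("dtsrun.exe", args)
            proc.WaitForExit()              ' block until the package finishes
            If proc.ExitCode <> 0 Then
                ' Log or alert: the package reported a failure.
            End If
        End Using
    End Sub
End Module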

Let me know if my guess is correct and if I can help further in this.






Re: Architecture General large dataset problem : please help

PALSUM

Thanks for the reply... However, as mentioned, it is not an ASP.NET web application. It is a kind of Windows service developed in VB.NET that fetches data from the parent tables and populates the child tables.

No user input is required. It is deployed as a Windows service and, based on a timer, it checks whether data is available in the parent tables; if so, it pulls the data from the parent tables and populates the child tables.
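For what it's worth, the polling part described above could be as simple as the sketch below, which checks on each timer tick whether any parent rows are missing from the child table. The timer interval, table names, and the NOT EXISTS check are all assumptions, and it presumes .NET 2.0.

' Sketch of the timer-driven check described above; the detection query is an assumption.
Imports System.Timers
Imports System.Data.SqlClient

Public Class CopyScheduler
    Private ReadOnly _timer As New Timer(60000)     ' poll the parent tables every 60 seconds
    Private ReadOnly _connectionString As String

    Public Sub New(ByVal connectionString As String)
        _connectionString = connectionString
        AddHandler _timer.Elapsed, AddressOf OnTick
    End Sub

    Public Sub StartPolling()
        _timer.Start()
    End Sub

    Private Sub OnTick(ByVal sender As Object, ByVal e As ElapsedEventArgs)
        Dim checkSql As String = _
            "SELECT CASE WHEN EXISTS (SELECT 1 FROM ParentTable p " & _
            "WHERE NOT EXISTS (SELECT 1 FROM ChildTable c WHERE c.Id = p.Id)) " & _
            "THEN 1 ELSE 0 END"
        Using conn As New SqlConnection(_connectionString)
            conn.Open()
            Using checkCmd As New SqlCommand(checkSql, conn)
                If Convert.ToInt32(checkCmd.ExecuteScalar()) = 1 Then
                    ' New parent rows exist: hand off to the copy routine here.
                End If
            End Using
        End Using
    End Sub
End Class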

Thanks

P






Re: Architecture General large dataset problem : please help

Ahmed Nagy

Hi;

Well, I think my suggestion is still valid regardless of whether it is a .NET app or a Windows service. Look at DTS to do the transformation for you.






Re: Architecture General large dataset problem : please help

Martin Platt

I don't know how you're implementing this, or planning to, but it seems to me that the fact that there's a timer means you're incrementally building data, so it would make sense to send across only the deltas: the data that has changed since the last time you looked.
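One common way to send across only the deltas is a high-water mark: remember the largest key (or last-modified timestamp) already copied and select only the parent rows above it on the next run. A rough sketch, assuming the parent table has a monotonically increasing key column (here called Id) and placeholder table/column names.

' Sketch of delta processing with a high-water mark. Assumes ParentTable has a
' monotonically increasing key column Id that the child table mirrors.
Imports System.Data.SqlClient

Module DeltaCopy
    Sub CopyNewRowsOnly(ByVal connectionString As String)
        Using conn As New SqlConnection(connectionString)
            conn.Open()

            ' High-water mark: the highest parent key that has already been copied.
            Dim lastId As Object
            Using maxCmd As New SqlCommand("SELECT ISNULL(MAX(Id), 0) FROM ChildTable", conn)
                lastId = maxCmd.ExecuteScalar()
            End Using

            ' Copy only the delta: parent rows added since the last run.
            Dim copySql As String = _
                "INSERT INTO ChildTable (Id, Payload) " & _
                "SELECT p.Id, p.Payload FROM ParentTable p WHERE p.Id > @lastId"
            Using copyCmd As New SqlCommand(copySql, conn)
                copyCmd.Parameters.AddWithValue("@lastId", lastId)
                copyCmd.ExecuteNonQuery()
            End Using
        End Using
    End Sub
End Module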

In terms of scalability, I think my approach would be to use a connection and a DataReader. However, again, I don't know what you're doing, or why you have chosen a DataSet and a DataRepeater as your shortlist of plausible solutions...

Without requirements, or a business context, it's quite difficult to answer the questions that you have asked thus far.






Re: Architecture General large dataset problem : please help

Ahmed Nagy

Hi Martin;

I think .NET technology is not suitable for such a large amount of data (3-4 million records). DTS in SQL 2000 is optimized in terms of processing and memory consumption. Let me know what you think.






Re: Architecture General large dataset problem : please help

Martin Platt

Yes, I would agree with you there. The more that you can leave up to the server box to do, the better...

Kind of like writing granular services and then combining the results: it works, but not very fast!






Re: Architecture General large dataset problem : please help

Ron Jacobs

A DataSet is the wrong choice for this type of application. You should be using a DataReader.

From MSDN: Best Practices for Using ADO.NET

To determine whether to use the DataSet or the DataReader when you design your application, consider the level of functionality that is needed in the application.

Use the DataSet in order to do the following with your application:

  • Navigate between multiple discrete tables of results.
  • Manipulate data from multiple sources (for example, a mixture of data from more than one database, from an XML file, and from a spreadsheet).
  • Exchange data between tiers or using an XML Web service. Unlike the DataReader, the DataSet can be passed to a remote client.
  • Reuse the same set of rows to achieve a performance gain by caching them (such as for sorting, searching, or filtering the data).
  • Perform a large amount of processing per row. Extended processing on each row returned using a DataReader ties up the connection serving the DataReader longer than necessary, impacting performance.
  • Manipulate data using XML operations such as Extensible Stylesheet Language Transformations (XSLT transformations) or XPath queries.

Use the DataReader in your application if you:

  • Do not need to cache the data.
  • Are processing a set of results too large to fit into memory.
  • Need to quickly access data once, in a forward-only and read-only manner.
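To make the DataReader option concrete, here is a minimal sketch: the parent rows are streamed forward-only on one connection and written out through a prepared, parameterised insert on a second connection, so only one row is held in memory at a time. Table and column names are placeholders, and in practice the write side would usually be batched or wrapped in transactions.

' Sketch: stream parent rows with a SqlDataReader and insert them one at a time
' on a second connection. Schema (ParentTable/ChildTable, Id, Payload) is assumed.
Imports System.Data
Imports System.Data.SqlClient

Module StreamingCopy
    Sub CopyWithDataReader(ByVal connectionString As String)
        Using readConn As New SqlConnection(connectionString)
            Using writeConn As New SqlConnection(connectionString)
                readConn.Open()
                writeConn.Open()

                ' Prepared, parameterised insert reused for every row on the write connection.
                Dim insertCmd As New SqlCommand( _
                    "INSERT INTO ChildTable (Id, Payload) VALUES (@id, @payload)", writeConn)
                insertCmd.Parameters.Add("@id", SqlDbType.Int)
                insertCmd.Parameters.Add("@payload", SqlDbType.VarChar, 255)
                insertCmd.Prepare()

                ' Forward-only, read-only stream of the parent rows.
                Dim readCmd As New SqlCommand("SELECT Id, Payload FROM ParentTable", readConn)
                Using reader As SqlDataReader = readCmd.ExecuteReader()
                    While reader.Read()
                        insertCmd.Parameters("@id").Value = reader.GetInt32(0)
                        insertCmd.Parameters("@payload").Value = reader.GetString(1)
                        insertCmd.ExecuteNonQuery()
                    End While
                End Using
            End Using
        End Using
    End Sub
End Module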





Re: Architecture General large dataset problem : please help

Arnon Rotem Gal Oz

I agree with Ahmed; you should use Data Transformation Services (DTS), or SSIS if you can upgrade to SQL 2005.
There's an article by Diane Lersen on TechNet that can give you an introduction to DTS.

Arnon





Re: Architecture General large dataset problem : please help

Ahmed Nagy

Hi ;

Thanks Arnon. Just to mention, DTS and SSIS give you the time-based scheduling you might need. They also include transformation functionality that is suitable for extracting data, and DTS includes scripting capabilities that allow for any transformation you might think of.

Be careful when you use any technique to transform a large amount of data: you should expect downtime for your service. Any transformation of a large amount of data consumes the memory and processing power of the server, so the more you optimize your transformations, the less memory and processing power they will consume.






Re: Architecture General large dataset problem : please help

Ron Jacobs

Congratulations - your post has been selected for an ARCast.TV Rapid Response!



To listen to Architect MVP Scott Hanselman and Ron Jacobs address your question click here


Hope this helps,

Ron Jacobs






Re: Architecture General large dataset problem : please help

LivetoCodeCodetoLive!

Another solution would be to move all that processing to a stored procedure and use cursors to traverse the data. That way it won't matter whether it is a Windows or a web application. This solution is optimal if all the tables are on the same physical server.
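To show what this looks like from the service side, here is a rough sketch: the client issues a single call to a hypothetical stored procedure (usp_CopyParentToChild), and all of the cursor or set-based row handling stays on the SQL Server box.

' Sketch: the VB.NET service delegates all row processing to a hypothetical
' stored procedure usp_CopyParentToChild that lives on the SQL Server.
Imports System.Data
Imports System.Data.SqlClient

Module StoredProcCopy
    Sub CopyViaStoredProcedure(ByVal connectionString As String)
        Using conn As New SqlConnection(connectionString)
            conn.Open()
            Using cmd As New SqlCommand("usp_CopyParentToChild", conn)
                cmd.CommandType = CommandType.StoredProcedure
                cmd.CommandTimeout = 0      ' no client-side timeout; the copy may run for a long time
                cmd.ExecuteNonQuery()       ' all cursor or set-based work happens on the server
            End Using
        End Using
    End Sub
End Module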

I hope this helps.

Guys, please give me your feedback on this solution.