I have a problem designing an app.
There is a SQL Server 2000 database with 5 tables that need to be joined in order to generate a report.
Each table contains about 1-2 million records.

What's the best practice for designing the app?

Pull all the tables into a local copy (a DataSet), or
run all the joins in a stored proc and return one huge table?

Any ideas?

Re: .NET Framework Data Access and Storage Bulk record processing

Figo Fei - MSFT


Since the number of records reaches several million, you could use a DataReader to read out only what you currently need, instead of loading all the records into a DataSet, which may cause memory pressure or performance issues.


Re: .NET Framework Data Access and Storage Bulk record processing

Sean Fowler

For that amount of data I wouldn't process it client-side. If you have that much data there will doubtless be other reports that are needed in the future.

If you want to do it client-side you definitely shouldn't try to load it all into memory, and should use the DataReader instead as Figo suggested.

However I'd be looking at Sql Reporting Services, which is designed for exactly this kind of workload.

Also if it's not acceptable to slow the database server down while you're running the report you can look at having a separate database server with its own copy of the data and run the reports on that.

However that's not a complete solution. Typically the best way to do this is to have a separate database that contains the amalgamated data. You populate it fully the first time you run the report. The next time you run the report it uses this pre-amalgamated data, after updating just the things that have changed since the last time the report was run. It's more complicated to set up but the performance can be thousands of times better than processing everything every time. This is a simplified implementation of data-warehousing.
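A minimal sketch of that incremental refresh, assuming the source table has a LastModified column and the amalgamated data lives in a separate ReportDB database (all table and column names here are made up for illustration):

```sql
-- Assumes dbo.Orders has a LastModified datetime column, and ReportDB
-- holds the pre-amalgamated summary plus a one-row refresh log.
DECLARE @LastRun datetime
SELECT @LastRun = LastRunTime FROM ReportDB.dbo.RefreshLog

-- Remove summary rows whose source data changed since the last run
DELETE s
FROM ReportDB.dbo.SalesSummary s
JOIN dbo.Orders o ON o.CustomerID = s.CustomerID
WHERE o.LastModified > @LastRun

-- Re-aggregate only the customers that changed
INSERT INTO ReportDB.dbo.SalesSummary (CustomerID, OrderCount, TotalAmount)
SELECT o.CustomerID, COUNT(*), SUM(o.Amount)
FROM dbo.Orders o
WHERE o.CustomerID IN
      (SELECT CustomerID FROM dbo.Orders WHERE LastModified > @LastRun)
GROUP BY o.CustomerID

UPDATE ReportDB.dbo.RefreshLog SET LastRunTime = GETDATE()
```

The report then reads from ReportDB.dbo.SalesSummary, so each run only pays for the rows that changed since the last run rather than re-aggregating millions of rows every time.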

Anyway the first thing I'd do is to look into Sql Reporting Services and go from there.


Re: .NET Framework Data Access and Storage Bulk record processing


Hi All,

thanks for the replies.

I actually tried using DTS because it can generate CSV files,

and it took 4-5 hours per report, and I have 8 reports to generate.

I'll look into Sql Reporting Services; I've never used it before.

When you said "The best way to do this is to have a separate database...", do you mean on the same server or a different server?


Re: .NET Framework Data Access and Storage Bulk record processing

Sean Fowler

It can be either on a separate server or the same one. Typically it would be on a separate server to avoid slowing down the main server so much while reports are being run.

4-5 hours per report is a loooooong time, but then you knew that. You say you have millions of rows in each of the tables, but how many rows does each report contain? Has the data in the reports been aggregated (i.e. do you have GROUP BYs, SUMs, COUNTs, etc.)?

If you post the Sql you use for one of the reports I'll take a look to see if we can speed it up, and hopefully others will too.

There are two main ways to write complex queries. One is to try to do as much as possible in a single SELECT statement. I used to do it this way, but I found a far better way. In your first statement, include only the tables you need in order to select the right rows, and put the data you need from those tables into a temp table, along with any columns you'll need for joining to the other tables. The temp table should also contain columns for the additional data you need from those other tables. These additional columns are then filled by subsequent UPDATE statements, typically one for each table you need to get data from.

This sounds as though it should be slower, but in fact it's generally much faster. Sql Server can handle multiple simple queries far better than a single big one. It's also far easier to write and maintain the Sql when it's written this way.
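To make that concrete, here's a rough sketch of the pattern; the table and column names are invented for illustration, but the shape (one INSERT to seed the temp table, then one simple UPDATE per additional table) is the technique described above:

```sql
-- Step 1: select the driving rows into a temp table, with empty
-- columns for the data that later updates will fill in.
CREATE TABLE #Report (
    OrderID      int          NOT NULL,
    CustomerID   int          NOT NULL,
    OrderDate    datetime     NOT NULL,
    CustomerName varchar(100) NULL,  -- filled from Customers
    LineTotal    money        NULL   -- filled from OrderLines
)

INSERT INTO #Report (OrderID, CustomerID, OrderDate)
SELECT OrderID, CustomerID, OrderDate
FROM dbo.Orders
WHERE OrderDate >= '20060101'

-- Step 2: one simple UPDATE per additional table
UPDATE r
SET CustomerName = c.Name
FROM #Report r
JOIN dbo.Customers c ON c.CustomerID = r.CustomerID

UPDATE r
SET LineTotal = t.Total
FROM #Report r
JOIN (SELECT OrderID, SUM(Quantity * Price) AS Total
      FROM dbo.OrderLines
      GROUP BY OrderID) t ON t.OrderID = r.OrderID

SELECT * FROM #Report
DROP TABLE #Report
```

Each statement touches only the rows already narrowed down in #Report, which is why the optimizer usually handles this much better than one five-table join.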

You may already be writing it this way, but if not it's a potential area for performance gains.