Friday, March 30, 2012

help importing large flat file into relational tables

I am trying to find the best (fastest) way to import large text files into sql server 2000 relational tables. The database has simple recovery model.

The total size of data to be load is about 1gb (each file)

source:

Multiple source files with different layout.


Destination:

3 sql server tables; a parent table with a ID (IDENTITY) primary key and a 3 column unique index (alternate key) and a detail table that has a DETAIL_ID (IDENTITY) primary key and a foreign key ID to the parent table.


So, for each source file, I need to convert some columns to decimal and separate the data into parent and detail tables.

Here is one way to do this:

For each source file

* use a thread for each file with Microsoft Visual Studio .NET 2003

* run insert query with SP's por each record, and join parent table with details.

* process next record

My app works fine in XP and SQL express, When I run the queries the CPU utilization is consitantly around 90%. This is my development environment.

My production enviroment is with SQL 2005 and Win 2003 Enterprise Edition Sp1.

When I run the queries the CPU utilization is consitantly around 3%.

I have 2 Questions:

1- Is there a better way to do this?

2- Why XP has utilization around 90%, and Win2003 3%?

Speed is the primary concern.

Thank you for any suggestions.

Using SPs calls is not the fastest way. The quickest way is to generate files including the identity values and then bcp/.bulk insert the data into SQL. You could use SSIS to do this as well.

The reason you should generate you ID values outside of the database is so that when spliting your input file into your tables you can just insert the data straight into the tables. You can let SQL generate the PK identity values of the detail tables.

No comments:

Post a Comment