Friday, March 30, 2012

Help importing large flat files into relational tables

I am trying to find the best (fastest) way to import large text files into
SQL Server 2005 relational tables. The database uses the simple recovery model.
The amount of data to be loaded is about 1 GB per file.
Source:
Multiple source files, each with a different layout.
Destination:
3 SQL Server tables: a parent table with an ID (IDENTITY) primary key and a
3-column unique index (alternate key), and a detail table with a DETAIL_ID
(IDENTITY) primary key and a foreign key ID referencing the parent table.
So, for each source file, I need to convert some columns to decimal and
separate the data into parent and detail tables.
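Roughly, the destination tables look like this (the column names here are just
placeholders, not the real ones):

-- Simplified schema: parent with an IDENTITY PK and a 3-column unique
-- (alternate) key, detail with an IDENTITY PK and an FK to the parent.
CREATE TABLE dbo.Parent (
    ID   INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Key1 VARCHAR(20) NOT NULL,   -- placeholder alternate-key columns
    Key2 VARCHAR(20) NOT NULL,
    Key3 VARCHAR(20) NOT NULL,
    CONSTRAINT UQ_Parent_AltKey UNIQUE (Key1, Key2, Key3)
);

CREATE TABLE dbo.Detail (
    DETAIL_ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    ID        INT NOT NULL REFERENCES dbo.Parent (ID),
    Amount    DECIMAL(18,4) NULL  -- example of a column converted to decimal
);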
Here is one way to do this:
For each source file:
* use a separate thread per file in a Microsoft Visual Studio .NET 2003 app
* run an insert query through stored procedures for each record, joining the
parent table with the details (a simplified sketch of the procedure follows)
* process the next record
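To give an idea of that per-record pattern, the procedure does something like
this (again, names and parameters are simplified placeholders):

-- Per-row procedure: find or create the parent via the alternate key,
-- then insert one detail row linked to it. Called once per source record.
CREATE PROCEDURE dbo.usp_InsertRecord
    @Key1   VARCHAR(20),
    @Key2   VARCHAR(20),
    @Key3   VARCHAR(20),
    @Amount DECIMAL(18,4)
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @ParentID INT;

    -- Look up the parent by its alternate key.
    SELECT @ParentID = ID
    FROM   dbo.Parent
    WHERE  Key1 = @Key1 AND Key2 = @Key2 AND Key3 = @Key3;

    -- Create it if it does not exist yet.
    IF @ParentID IS NULL
    BEGIN
        INSERT INTO dbo.Parent (Key1, Key2, Key3)
        VALUES (@Key1, @Key2, @Key3);
        SET @ParentID = SCOPE_IDENTITY();
    END

    -- Attach the detail row to its parent.
    INSERT INTO dbo.Detail (ID, Amount)
    VALUES (@ParentID, @Amount);
END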
My app works fine on XP with SQL Express: when I run the queries, the CPU
utilization is consistently around 90%. This is my development environment.
My production environment is SQL Server 2005 on Windows 2003 Enterprise Edition SP1.
There, when I run the queries, the CPU utilization is consistently around 3%.
I have 2 questions:
1 - Is there a better way to do this?
2 - Why is CPU utilization around 90% on XP but only 3% on Windows 2003?
Speed is the primary concern.
Thank you for any suggestions.

Macisu wrote:
> I am trying to find the best (fastest) way to import large text files into
> SQL Server 2005 relational tables. The database uses the simple recovery model.
SQL Server Integration Services is the most obvious solution to try.
Read about Integration Services in Books Online. Depending on the
format of your files, BCP may also be an option.
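If Integration Services doesn't fit, another common approach is to bulk load
each file into a staging table and then split it into the parent and detail
tables with set-based statements. A rough sketch only, since I'm guessing at
your table names, file layout and path (BULK INSERT is the T-SQL counterpart
of bcp):

-- Staging table mirrors the flat-file layout; everything lands as text first.
CREATE TABLE dbo.StageFile1 (
    Key1      VARCHAR(20) NULL,
    Key2      VARCHAR(20) NULL,
    Key3      VARCHAR(20) NULL,
    AmountTxt VARCHAR(30) NULL
);

-- With the simple recovery model, TABLOCK and an empty heap this load
-- can be minimally logged.
BULK INSERT dbo.StageFile1
FROM 'C:\load\file1.txt'   -- assumed path and delimiters
WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n', TABLOCK);

-- Insert any parents that do not exist yet (one pass, not one call per row).
INSERT INTO dbo.Parent (Key1, Key2, Key3)
SELECT DISTINCT s.Key1, s.Key2, s.Key3
FROM   dbo.StageFile1 AS s
WHERE  NOT EXISTS (SELECT 1 FROM dbo.Parent AS p
                   WHERE p.Key1 = s.Key1 AND p.Key2 = s.Key2 AND p.Key3 = s.Key3);

-- Insert details, converting the text column to decimal and joining back
-- to the parent on the alternate key to pick up the IDENTITY value.
INSERT INTO dbo.Detail (ID, Amount)
SELECT p.ID, CAST(s.AmountTxt AS DECIMAL(18,4))
FROM   dbo.StageFile1 AS s
JOIN   dbo.Parent     AS p
       ON p.Key1 = s.Key1 AND p.Key2 = s.Key2 AND p.Key3 = s.Key3;

Each file layout would need its own staging table, but the parent/detail split
is the same pattern for all of them.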
--
David Portas, SQL Server MVP
Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.
SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx