sunliners81
Starting Member
4 Posts |
Posted - 2008-10-30 : 13:38:33
|
I'm having some problems with exporting data from a view to text files. Here are the issues;1. The view returns about 1.2 billion rows.2. I've partitioned the primary table in this view by date range, but only one processor of four is max at 100% while running this operation.3. I'm using the following script to output by date.DECLARE @YEAR CHAR(4), @MONTH VARCHAR(2), @FILENAME VARCHAR(50), @BCP VARCHAR(1000), @SQL VARCHAR(200); SET @YEAR = '1999'; SET @MONTH = '1'; WHILE (@YEAR <> '2007') BEGIN WHILE (@MONTH <> '13') BEGIN SET @FILENAME = 'Output_' + @YEAR + '_' + @MONTH + '.TXT'; SET @SQL = 'SELECT * FROM Master_DMI.DMI.ALL_DMI_Comp_View WHERE YEAR(Original_Date) = ' + @YEAR + ' AND MONTH(Original_Date) = ' + @MONTH SET @BCP = 'bcp "' + @SQL + '" queryout \\StorageServer\OutData\' + @FILENAME + ' -T -w -t^| -r\n' EXEC xp_cmdshell @BCP; SET @MONTH = @MONTH + 1; END SET @YEAR = @YEAR + 1; SET @MONTH = 1;4. Each file takes from 3.5 to 4 hours to write and each file is about 3.5 to 4GB. (So that means that it will take almost two weeks to write!)Can anyone suggest a better way to write out 1.2 billion rows to flat files? Should I partition by some other key other than data is my query selects by date?Other this I might consider, like modifying my current indexes?I've changed the Recovery Model to Bulk Logged. Is there other Server settings that may help?One issue is bottle neck issues with our direct attached storage. We have four arrays, and with my partition schema, I may not be spreading the data well enough.CREATE Partition FUNCTION pfALL_DMI (Datetime) AS RANGE RIGHT FOR VALUES ('20011231', '20040630', '20061231') GO CREATE PARTITION SCHEME psALL_DMIAS PARTITION pfALL_DMITO ( [PRIMARY], [ALL_DMI_01_20011231], [ALL_DMI_20020101_20040630], [ALL_DMI_20040701_20061231]); CREATE TABLE [DMIALL_DMI [Record_Code] [varchar](4) NULL, [Original_Date] [datetime] NOT NULL, [Contract_Date] [datetime] NULL, [NDC] [char](11) NULL, [Contract_CD] [varchar](15) NULL, [ItemNo] [varchar](20) NULL, [Material_Group_Code] [varchar](1) NULL, [Customer_Number] [varchar](10) NULL, [Code_Unknown] [varchar](10) NULL, [Vendor_CD] [varchar](10) NULL, [Unit_Of_Measure] [char](2) NULL, [Quantity] [int] NULL, [Total_Cost] [money] NULL, [ChargeBack] [money] NULL, [Manufacturer] [varchar](50) NULL, [BW_SOURCE] [numeric](4, 0) NULL, [BW_ID] [numeric](18, 0) IDENTITY(1,1) NOT NULL, CONSTRAINT [PK_ALL_DMI] PRIMARY KEY CLUSTERED ( [Original_Date] ASC, [BW_ID] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [psALL_DMI]([Original_Date]))Thanks, Clay |
|
cvraghu
Posting Yak Master
187 Posts |
Posted - 2008-10-31 : 02:20:04
|
I don't have much experience with this. But here are my thoughts - 1. Instead of a single script, why can't multiple scripts( or jobs) with different date ranges run parallely?2. Instead of Year and Month funtion in where clause, use 'Original_Date between ' + @Month + '/01/' + @YEAR + ' AND ' @Month + '/31/' + @YEAR3. NOLOCK used in view definition? |
 |
|