Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 Unpivot XLS with dynamic headers

Author  Topic 

ryan.gillies
Starting Member

27 Posts

Posted - 2013-11-13 : 02:48:57
Hi all

I have a rather niche problem that I need to overcome. My company has hundreds of Excel spreadsheets with pivoted data (stores across the top, dates down the left, amounts in the middle) that we need to pull into our SQL Server.

I'm going to automate the process via Powershell, but pieceing the sql together is proving more difficult than I thought it would.

So far I have the following:
DECLARE @y AS INT
DECLARE @m AS INT

SET @y = 2013
SET @m = 1

SELECT dbo.DateSerial(@y, @m, [Day]) AS pDate, [store], [amount]

FROM
(
SELECT * FROM OPENROWSET ('Microsoft.Jet.OLEDB.4.0','Excel 8.0;Database=Z:\POS\XLS\01.xls;HDR=YES','SELECT * FROM [102$]') WHERE [??] IS NOT NULL
) AS pvt

UNPIVOT
(
amount FOR store IN ([Bade], [Longtan])
) AS unpvt

ORDER BY [Day]

This works successfully and provides me with three columns (pdate, store & amount) for the two stores Bade and Longtan.

However I need to pull all of the stores, but I can't 'hardcode' them as above because from sheet to sheet, the header row can dynamically change with different stores (where some have closed and opened over time). This means I need to dynamically pull the stores from each header row in each xls sheet... but how to do this I don't know.

I previously used the following method to dynamically generate a pivot, but I can't quite figure out if I can use it to get the header row for unpivoting.

DECLARE @query NVARCHAR(4000)
DECLARE @cols NVARCHAR(2000)

SELECT @cols = STUFF(( SELECT DISTINCT TOP 100 PERCENT
'],[' + Store.Store
FROM GenshenHR.dbo.Store
INNER JOIN (SELECT DISTINCT StoreID FROM CTT WHERE YEAR([date]) = @year AND MONTH([date]) = @month) AS aStores
ON Store.StoreID = aStores.StoreID
ORDER BY '],[' + Store.Store
FOR XML PATH('')
), 1, 2, '') + ']'


SET @query = 'SELECT [d], [dw], '+ @cols +'
.......

My thinking is to do something similar and pull the names of each column from each xls sheet and structure in a variable that looks like the following to stuff into my unpivot statement...
[Store1],[Store2],[Store3],[Store4],[Store5],etc etc etc

Does anyone have any thoughts...!?

ryan.gillies
Starting Member

27 Posts

Posted - 2013-11-13 : 03:00:36
Doing the following will return a single row containing the names of each column header that I suppose I could concatenate somehow, but again I'm now stuck with a dynamically changing set of columns from F1 through to Fn

SELECT * FROM OPENROWSET ('Microsoft.Jet.OLEDB.4.0','Excel 8.0;Database=Z:\POS\XLS\01.xls;HDR=NO','SELECT * FROM [102$B1:Z1]')
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2013-11-13 : 07:58:47
I previously used the following method to dynamically generate a pivot, but I can't quite figure out if I can use it to get the header row for unpivoting.

you can use the list to UNPIVOT data in same way
something like

DECLARE @SQL varchar(2000)
SET @SQL='SELECT *
FROM TableName t
UNPIVOT(ValueField FOR HeaderField IN(' + @cols + '))u'
EXEC(@SQL)


------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs
Go to Top of Page

ryan.gillies
Starting Member

27 Posts

Posted - 2013-11-13 : 08:42:43
quote:
Originally posted by visakh16
you can use the list to UNPIVOT data in same way
something like

DECLARE @SQL varchar(2000)
SET @SQL='SELECT *
FROM TableName t
UNPIVOT(ValueField FOR HeaderField IN(' + @cols + '))u'
EXEC(@SQL)




That's the obvious part, the difficult part is how to generate @cols in the first place.

In my previous example I pull all of the distinct records from one column in a table and stuff them together. This time I need to take the names of all of the columns from a header row within an xls file.

At this point I'm thinking its only possible with powershell, but does anyone else think otherwise...?
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2013-11-13 : 09:29:29
quote:
Originally posted by ryan.gillies

quote:
Originally posted by visakh16
you can use the list to UNPIVOT data in same way
something like

DECLARE @SQL varchar(2000)
SET @SQL='SELECT *
FROM TableName t
UNPIVOT(ValueField FOR HeaderField IN(' + @cols + '))u'
EXEC(@SQL)




That's the obvious part, the difficult part is how to generate @cols in the first place.

In my previous example I pull all of the distinct records from one column in a table and stuff them together. This time I need to take the names of all of the columns from a header row within an xls file.

At this point I'm thinking its only possible with powershell, but does anyone else think otherwise...?


you can use OPENROWSET based query itself for that

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs
Go to Top of Page

ryan.gillies
Starting Member

27 Posts

Posted - 2013-11-13 : 20:11:02
quote:
Originally posted by visakh16
you can use OPENROWSET based query itself for that



visakh16, whilst I appreciate the effort, you've answered twice now with two solutions I had already provided.

I know I can use openrowset, as I previously mentioned:
quote:
Originally posted by ryan.gillies
Doing the following will return a single row containing the names of each column header that I suppose I could concatenate somehow, but again I'm now stuck with a dynamically changing set of columns from F1 through to Fn
SELECT * FROM OPENROWSET ('Microsoft.Jet.OLEDB.4.0','Excel 8.0;Database=Z:\POS\XLS\01.xls;HDR=NO','SELECT * FROM [102$B1:Z1]')


The issue I have is that there is only ONE row, with a dynamically changing number of columns. I need to combine together into one string (@cols) the value of each column in that row.

Please don't answer for the sake of answering.

Anyone else...?!
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2013-11-14 : 02:36:13
The issue I have is that there is only ONE row, with a dynamically changing number of columns. I need to combine together into one string (@cols) the value of each column in that row.
Why cant you use OPENROWSET query itself for that? I didnt understand what the difficulty is. Even if columns are dynamic the OPENROWSET will give you column headers right as the first row. Dump the result into a table using SELECT INTO syntax


SELECT * INTO TestTable FROM OPENROWSET ('Microsoft.Jet.OLEDB.4.0','Excel 8.0;Database=Z:\POS\XLS\01.xls;HDR=YES','SELECT * FROM [102$B1:Z1]')

Now you can use INFORMATION_SCHEMA.COLUMNS Catalog view to get column info

SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'TestTable'

to make it into concatenated list you can use FOR XML PATH('') syntax you used and I dont think I need to repeat that

Please don't answer for the sake of answering.

Anyone else...?!

I've tried to understand your scenario with whatever info you provided and given the solution accordingly.
I dont have to answer it just for the sake of it as I've been posting more than enough already. I'm posting this whilst working on something else so it may not always be possible to elaborate it with query etc. What I expect is that you make a start based on pointers given and try to get solution. In case of doubts you can always post back.


------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs
Go to Top of Page

ryan.gillies
Starting Member

27 Posts

Posted - 2013-11-14 : 03:41:56
Sorry visakh16, that was a bit too harsh of me, its been a long day and that wasn't the best way for me to let off steam; so my apologies.

I couldn't see where you were going with the OPENROWSET as I couldn't find anyway to read the schema, but you're correct in that creating tables would allow me to do so. However as I'm working with hundreds of xls files with multiple sheets in each the amount of tables created and dropped seems like an unnecessary overhead to me. In the end I went for the following Powershell solution:

$sqlHDR = "SELECT * FROM OPENROWSET ('Microsoft.Jet.OLEDB.4.0','Excel 8.0;Database=" + $dir + $m + ".xls;HDR=NO','SELECT * FROM [" + $y + "`$B1:Z1]')"
$doHDR = invoke-sqlcmd -query $sqlHDR -ServerInstance $sqlserver -Database $db

foreach($prop in $doHDR.psobject.properties | ? {$_.name.StartsWith("F")} | select value) #Get column headers from XLS
{
$hdr = $hdr + "[" + $prop.Value + "], "
}
$hdr = $hdr -replace "], $", "]"

This loops through all the properties of the dataset, filtering it down to those beginning with F (as that's the default column heading for an excel spreadsheet with no HDR) and then mashes all of them together into a string to use in my UNPIVOT statement.
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2013-11-14 : 05:35:15
quote:
Originally posted by ryan.gillies

Sorry visakh16, that was a bit too harsh of me, its been a long day and that wasn't the best way for me to let off steam; so my apologies.

I couldn't see where you were going with the OPENROWSET as I couldn't find anyway to read the schema, but you're correct in that creating tables would allow me to do so. However as I'm working with hundreds of xls files with multiple sheets in each the amount of tables created and dropped seems like an unnecessary overhead to me. In the end I went for the following Powershell solution:

$sqlHDR = "SELECT * FROM OPENROWSET ('Microsoft.Jet.OLEDB.4.0','Excel 8.0;Database=" + $dir + $m + ".xls;HDR=NO','SELECT * FROM [" + $y + "`$B1:Z1]')"
$doHDR = invoke-sqlcmd -query $sqlHDR -ServerInstance $sqlserver -Database $db

foreach($prop in $doHDR.psobject.properties | ? {$_.name.StartsWith("F")} | select value) #Get column headers from XLS
{
$hdr = $hdr + "[" + $prop.Value + "], "
}
$hdr = $hdr -replace "], $", "]"

This loops through all the properties of the dataset, filtering it down to those beginning with F (as that's the default column heading for an excel spreadsheet with no HDR) and then mashes all of them together into a string to use in my UNPIVOT statement.


No problems
it happens to all
Glad that you got it sorted out with Powershell.

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs
Go to Top of Page
   

- Advertisement -