Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 SELECT In SP Based On Multi-Dimension Array Param

Author  Topic 

johnvm
Starting Member

18 Posts

Posted - 2007-07-26 : 04:47:19
Hi, I'm making a C# application that has to query a large # of rows from SQL. Right now I have a situtation where I have a large datatable (500,000+ rows) that contains cell values like:

"ABC","12/12/06"
"DEF","12/13/06"
"GHI","12/14/06"
"JKL","12/15"06"

(The above being representations of the cell values in 2 columns).

I need to run a SELECT query for each row of data in this datatable, such as:

SELECT TOP 1 * FROM [TABLE] WHERE (FIELD1 = "ABC" AND FIELD2 = "12/12/06") ORDER BY FIELD2 DESC
SELECT TOP 1 * FROM [TABLE] WHERE (FIELD1 = "DEF" AND FIELD2 = "12/13/06") ORDER BY FIELD2 DESC
SELECT TOP 1 * FROM [TABLE] WHERE (FIELD1 = "GHI" AND FIELD2 = "12/14/06") ORDER BY FIELD2 DESC
SELECT TOP 1 * FROM [TABLE] WHERE (FIELD1 = "JKL" AND FIELD2 = "12/15/06") ORDER BY FIELD2 DESC

Right now, I have the C# program execute the query procedure for each row individually, and this is very very slow, particularly when I have 500,000+ rows, running the query code 500,000 times. What I'd like to do is pass these two datatable columns as an array/table to a SQL stored procedure, and get a result set back with all of the selected values in it.

Does anyone know where I could find an example SP to do this/how to do this/if this is possible? I'll take any help I can get, even if its a subpar example, but if you know what would be the most efficient way to go about doing this too that'd be greatly appreciated. I've searched this forum + google and have yet to find (as far as I can tell) quite what I'm looking for. Thanks.

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2007-07-26 : 05:23:09
maybe you shoul dstart with telling us your main business requirement. what are you trying to do logically?

_______________________________________________
Causing trouble since 1980
blog: http://weblogs.sqlteam.com/mladenp
Go to Top of Page

johnvm
Starting Member

18 Posts

Posted - 2007-07-26 : 05:35:50
I have a large database of financial information. In this database I have a large table with fields for

SYMBOL nvarchar(8), DATETIME datetime, PRICE real

I want to look up the prices of stocks at 500,000 datapoints -- for instance, basically my programs asking: "What is the price of GOOG at '12/15/2007 15:15:15'" and doing that 500,000 times.

So, my C# program has a table with these datapoint requests, with 2 columns - symbol and datetime, and want to pass that datatable to a SQL stored procedure and get back a result set from the sql stored procedure with the price of each stock at the datetime specified in the datatable i initially passed.

So if the datatable I initially passed only had 3 rows in it (as mentioned, it really has ~500,000), with the 2 column values of SYMBOL and DATETIME being:
'GOOG','12/15/2007 15:15:15'
'AAPL','12/15/2007 15:15:15'
'MSFT','12/15/2007 15:15:15'

I'd get back a result set from the SQL stored procedure with 3 rows in it (as I passed 3 rows of symbol/datetime pair lookups) containing the individual price values selected from the sql table :
<PRICE OF 'GOOG' AT '12/15/2007 15:15:15'>
<PRICE OF 'AAPL' AT '12/15/2007 15:15:15'>
<PRICE OF 'MSFT' AT '12/15/2007 15:15:15'>

This function would therefore allow me to look up a large # prices for stocks at specific datetimes.
Go to Top of Page

DonAtWork
Master Smack Fu Yak Hacker

2167 Posts

Posted - 2007-07-26 : 07:10:58
So, you are going to pass a HALF MILLION symbol/datetime pairs to your procedure? How many symbols are you going to allow your user to look for at one time? You have to have some kind of limit here...

[Signature]For fast help, follow this link:
http://weblogs.sqlteam.com/brettk/archive/2005/05/25.aspx
Learn SQL
http://www.sql-tutorial.net/
http://www.firstsql.com/tutor.htm
http://www.w3schools.com/sql/default.asp
Go to Top of Page

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2007-07-26 : 07:13:12
pass xml doc in and use sql server 2005 xml capabilities to create a table to which you join to.
look up nodes() function for the xml datatype

_______________________________________________
Causing trouble since 1980
blog: http://weblogs.sqlteam.com/mladenp
Go to Top of Page

johnvm
Starting Member

18 Posts

Posted - 2007-07-26 : 07:19:56
There are no users except for myself - I am the only person who uses this. I'm self employed and use SQL as my database solution. I do a lot of algorithmic trading and need this functionality to backtest strategies - so yeah, gotta pass half a mil. To put it in perspective, theres about a 400 million new rows of data per month in this database...

spirit1: Any good examples for how to go about doing that?

Thanks everyone for the help so far!
Go to Top of Page

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2007-07-26 : 07:30:55
this should give you some ideas:
http://weblogs.sqlteam.com/mladenp/archive/2007/05/22/60213.aspx

you can easily extend it to 2 values

_______________________________________________
Causing trouble since 1980
blog: http://weblogs.sqlteam.com/mladenp
Go to Top of Page

johnvm
Starting Member

18 Posts

Posted - 2007-07-26 : 09:20:52
Alright, what I've played with this morning is using .NET CLR assemblies and serialize a dataset and send it to the SP as XML adnd then deserialize it on the other side to work with. Works fine - so thanks for the suggestion on using XML.

I'm still faced with the issue of having to issue 500,000 SELECT statements correct? There's not much else I can think of to do except run a loop with a SELECT in it each time....

AFAIK an IN won't work for single-pair queries like this (where i want 1 result per combination, with multiple combinations). There's no better way to do this than 500k SELECT right? Also, will the .NET CLR method be substantially slower than T-SQL due to the fact that it'd be so many individual SELECTs? If so, how much slower should I expect it to be?
Go to Top of Page

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2007-07-26 : 09:44:37
no you don't have to do a select for each row.
you pass the xml that holds the values to a stored procedure.
then turn thax xml to a table with the nodes() function of the xml datatype.
then you simply join that to your table that holds all values and return the resultset.


_______________________________________________
Causing trouble since 1980
blog: http://weblogs.sqlteam.com/mladenp
Go to Top of Page

johnvm
Starting Member

18 Posts

Posted - 2007-07-26 : 12:12:24
Great solution! I've done what you said. At the inner join, I'd like to have it only return ONE result max for the s.datetime > x.datetime. I tried a nested query with a s.datetime = (SELECT TOP 1....) but that still returned multiple b/c there are multiple rows per exact timestamp in many cases, plus it really slowed it down. How can I go about limiting it to a single row join?

Thanks.

SELECT *
FROM XmlResultSet x
INNER JOIN [2006] s
ON (s.SYMBOL = x.symbol) WHERE (s.datetime > x.datetime) ORDER BY s.DATETIME ASC
Go to Top of Page

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2007-07-26 : 12:20:46
look here how to get the max for each group easily:
http://www.sqljunkies.com/WebLog/madhivanan/default.aspx
point 3

_______________________________________________
Causing trouble since 1980
blog: http://weblogs.sqlteam.com/mladenp
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-07-26 : 12:22:14
Try this, since you are using SQL Server 2005!
SELECT Field1, Field2 FROM (
SELECT Field1, Field2, ROW_NUMBER() OVER (PARTITION BY Field1 ORDER BY Field2 DESC) AS RecID
FROM [Table]) AS d WHERE RecID = 1



E 12°55'05.76"
N 56°04'39.42"
Go to Top of Page
   

- Advertisement -