Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 Need to Avoid a Cursor...

Author  Topic 

Stumblewyk
Starting Member

2 Posts

Posted - 2011-01-28 : 10:55:43
I have a hairy issue I can't come up with a decent solution for.

I've built a large table (2.5 million records) using a SELECT INTO that contains a number of fields concerning donors and one field containing a reference to a UDF called "CURRENCY" in a stored procedure.

CURRENCY is populated as a string, like DBName.dbo.UDFName(DonorID). Each donor has about 35 different UDFs that need to be executed on them, and there is a row for each "udfname" in the database for each donor. So, there are roughly 72,000 unique donors, and 35-ish "udfname" records for each.

I'd LOVE to be able to build my "DBName.dbo.UDFName(DonorID)" and execute it during my SELECT INTO, but I can't seem to find a reasonable method to do so. Sadly, I can't anticipate what UDFs will be called for each donor, and the idea is for this process to be extensibile. I.E. if we add new UDFs for donors to another table, this process will automate and read in the new UDFs and incorporate them. That's why I'm building the call to the UDF as a string.

So, my SELECT INTO executes quickly. But then I need to loop back over the records and call that CURRENCY field's value, and then execute it, and UPDATE.

The easiest, and most logical method to me to accomplish this some dynamic SQL building an UPDATE query in a cursor. I've optimized my cursor as much as possible, and I'm still looking at HOURS to complete a full batch of 2.5 million rows.

I *know* there has to be a better way to do this. I'm just striking out finding a solution. So...hopefully someone here can help me.

For completeness' sake, I've included my SQL:

DROP TABLE DBName.dbo.TableName

SELECT row_number() OVER (ORDER BY AA.Sequence) AS 'ID'
,AA.AttributeTypesID
,'DBName.dbo.' + AT.Description + '(' + C.ConsID + ')' AS 'Currency'
,AA.Sequence
,GetDate() AS 'AttributesDate'
,R.ID AS 'ParentID'
,C.ConsID + '*' + AA.Sequence AS 'ImportID'
,AA.Description AS 'Comment'
,AT.Description
INTO DBName.dbo.TableName
FROM DBName.dbo.Table1 C
CROSS JOIN DBName.dbo.Table2 AA
LEFT OUTER JOIN DBName.dbo.Table3 AT ON AT.AttributeTypesID = AA.AttributeTypesID
LEFT OUTER JOIN DBName.dbo.Table4 R ON R.Constituent_ID = C.ConsID

CREATE UNIQUE INDEX IDX_Attributes_ID
ON DBName.dbo.TableName(ID)

DECLARE @Currency nvarchar(100)
DECLARE @ConsID nvarchar(50)
DECLARE @ID nvarchar(50)
DECLARE @SQLcmd nvarchar(max)

DECLARE Currency_Cursor CURSOR
FORWARD_ONLY
FAST_FORWARD
READ_ONLY FOR
SELECT Currency, ID
FROM DBName.dbo.TableName

OPEN Currency_Cursor

FETCH NEXT FROM Currency_Cursor
INTO @Currency, @ID

WHILE @@FETCH_STATUS = 0
BEGIN
SET @SQLcmd =
'UPDATE DBName.dbo.TableName
SET Currency = ' + @Currency + '
WHERE ID = ' + @ID

EXECUTE sp_executesql @SQLcmd

FETCH NEXT FROM Currency_Cursor
INTO @Currency, @ID
END

CLOSE Currency_Cursor;
DEALLOCATE Currency_Cursor;

jcelko
Esteemed SQL Purist

547 Posts

Posted - 2011-01-28 : 13:34:55
>> I've built a large table (2.5 million records [sic: rows are not records [sic]]) using a SELECT INTO that contains a number of fields [sic: columns are not fields] concerning donors and one field [sic] [sic] containing a reference to a UDF [SQL avoid this kind of coding] called "CURRENCY" in a stored procedure. <<

Your real problem is that your mindset is still locked into file systems and procedural code. You think of cursors because they let you mimic magnetic tape files. Also, functions and procedures are given names with the general template "<verb>_<object>" and not simple nouns.

>> CURRENCY is populated as a string, like DBName.dbo.UDF_name(donor_id). Each donor has about 35 different UDFs that need to be executed on them, and there is a row for each "udf_name" in the database for each donor. So, there are roughly 72,000 unique donors, and 35-ish "udf_name" records [sic] for each. <<

This is insane. The purpose of a database is to store data, not to mimic the cells in a spreadsheet, which can store code. I have never seen anything like this mess. This might actually be worse than an EAV design.

>> I'd LOVE to be able to build my "DBName.dbo.udf_name(donor_id)" and execute it during my SELECT INTO, but I can't seem to find a reasonable method to do so. Sadly, I can't anticipate what UDFs will be called for each donor, and the idea is for this process to be extensible. I.E. if we add new UDFs for donors to another table, this process will automate and read in the new UDFs and incorporate them. That's why I'm building the call to the UDF as a string. <<

I'd love to throw this whole thing out the window. This i not how RDBMS is meant to be used. You are plowing a field with a motorboat instead of a tractor.

>> So, my SELECT INTO executes quickly. But then I need to loop back over the records [sic] and call that CURRENCY field [sic]'s value, and then execute it, and UPDATE.

>> The easiest, and most logical method to me to accomplish this some dynamic SQL building an UPDATE query in a cursor. I've optimized my cursor as much as possible, and I'm still looking at HOURS to complete a full batch of 2.5 million rows.

I *know* there has to be a better way to do this. I'm just striking out finding a solution. So...hopefully someone here can help me. <<

>> For completeness' sake, I've included my SQL: <<

This is scary bad code.

Please stop using single quotes for column names; T_SQL supports the ANSI/ISO double quotes if you need them; if you don't need then don't put them in the code.

There is no such thing as "attribute_type_id"; a specific attribute can have a property that is a type encoding or it can be an identifier. Likewise, "sequence" is a property of an attribute, not a generic, magical ghost floating in cyber-space -- "cheque_seq" would be valid. And of course you have the vague, universal "id" that is like a Khabalah number.

Could be more vague than Table1, .. Table4? When we worked with tape files, we would often name scratch tapes with that kind of template, but it was usually "WORKxxxx" because we had an eight character limit on names. In RDBMS, each table models a set of entities or a relationship -- no generics, please. So what entities are modeled by Table1? What does a typical table1 look like?

I gather that "constituent_id" and "cos_id" are the same data element with two different names. I will bet you have no data dictionary to pick up design flaws like that. I also see that you waste space for "DECLARE @cos_id NVARCHAR(50);" and never use it.

The old Sybase getdate() is now CURRENT_TIMESTAMP. T-SQL is becoming more ANSI/ISO compliant as it matures.

Building dynamic SQL is a common newbie misuse of SQL. But the generated SQL is simple statements that work around a bad schema design that hampered or prevented clean, maintainable code. No a good way to code, but at least an "honest kludge" to solve a problem.

Loading UDFs into a table is something I have never seen before. Trust me, after 10 years on the SQL Standards committee, eight books and over 1000 articles on the language and 35+ years of consulting, i have seen a lot of awful code and design.

I am not kidding about throwing this out completely.

--CELKO--
Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Go to Top of Page

fan2005
Yak Posting Veteran

85 Posts

Posted - 2011-01-28 : 13:59:35
So what is the solution? programing in .net instead of query?
Go to Top of Page

Stumblewyk
Starting Member

2 Posts

Posted - 2011-01-28 : 14:00:11
quote:
Originally posted by jcelkoThis is scary bad code.


Yeah. I know. I'm supporting some scary bad code.

For the record "AttributeTypesID" is a column name, referring to an ID in another table called, amazingly enough "AttributeTypes". And "Sequence" is, once again, a column name from another table.

Each and every one of these records has to be imported into a 3rd party application, out of my hands. These UDFs that are getting stored in a table are being generated to query giving data based on user-defined criteria, and then the results of those queries are imported into our 3rd party app.

I can stop using single quotes. That's fine. The reference to @ConsID has been removed from my original code. I copied it here before I deleted it and missed it when posting. I've replaced GetDate(). But nothing else you offered me is anything close to helpful, or even constructive criticism. I see a lot of insults, a lot misinformed assumptions, and a bad attitude.

And HOLY CRAP. Can you not tell that I've scrubbed my code to remove actual database and table names? Are you dense? (See? I can be insulting too!)

But hey, thanks for a little better than nothing, I suppose.

>_>
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2011-01-28 : 14:12:23
quote:
But nothing else you offered me is anything close to helpful, or even constructive criticism. I see a lot of insults, a lot misinformed assumptions, and a bad attitude.
You must keep in mind that the Internet is Joe's psychiatrist.

Some questions about the UDFs:

- What do they do? What values do they output? Are they relatively simple?
- Are there any UDFs that are common to all donors?
- Do they have to run in a specific sequence? (i.e. "UDF_A" must run before "UDF_B") How do you maintain that execution sequence? What happens if they're run out of sequence?

I do agree with Joe that this design is flawed, particularly in storing executable code as data, at least the way you're trying to do it. Unless these UDFs are very complicated (and if they are they're poor candidates for UDFs) it's more likely the overall process can be rewritten to work as a stored procedure.
Go to Top of Page

KenW
Constraint Violating Yak Guru

391 Posts

Posted - 2011-01-28 : 16:40:49
quote:
Originally posted by robvolk
You must keep in mind that the Internet is Joe's psychiatrist.



Maybe Joe should try to read a book instead of boasting about the ones he's written while behaving like an a**, or visit an actual psychiatrist?

Seriously, this site has always been about being helpful to people trying to learn, not about trying to be a pedant and rude boor. If Joe can't respect that and behave appropriately, maybe he shouldn't waste his vast expertise and knowledge on losers here.
Go to Top of Page

fan2005
Yak Posting Veteran

85 Posts

Posted - 2011-01-28 : 17:01:06
look at this post is there any thing can help
if so thanks robvolk

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=155896
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2011-01-28 : 17:14:03
That's flattering but that solution won't work to solve Stumblewyk's problem.
Go to Top of Page
   

- Advertisement -