Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
kien
Starting Member
27 Posts |
Posted - 2002-06-14 : 16:53:40
|
| Hi everyoneFirst off, thank you to everyone who responded to my previous posting. Your guildance is very much appreciated !!!I'm trying to accomplish the following:stored procedure receives string eg "ontario toronto canada etc" - etc. meaning one can also specify north america or whatever. Note: the input text is in no specific order, they can specify city, country or province in any order desired.The stored procedure has to search through columns "country", "city", "province", "continent" etc. and return records that have ALL matching criteriaplease start me offthanks! |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2002-06-14 : 17:05:42
|
| To make your life easier, and to have a chance at making this work, change the way the search terms are entered. Do not allow people to enter them in any order. Have separate data entry boxes for city, province, country and continent. Instead of:Enter your search: ontario toronto canadaDo this:Enter city/cities: torontoEnter province(s): ontarioEnter country(ies): canadaYou can still apply the city search term to the country column, for example, in case someone enters the wrong term in the wrong box. At least you won't be stuck trying to parse an indeterminant string of values.You can use the article I linked earlier to apply these search terms to multiple columns:http://www.sqlteam.com/item.asp?ItemID=5857Just add the additional search columns to the WHERE clause. |
 |
|
|
kien
Starting Member
27 Posts |
Posted - 2002-06-17 : 13:14:43
|
Thanks for the reply Rob. I have a quick question. Do you know how search engines work?Do they include all information in one single column?If yes do you suggest I do the same - have country, city, state, continent all in one column and make it like a key word search.The reason why i avoided this method is because i have over 6 million records to search through.I'm trying to make the search as simple as possible, thus I am reluctant at the moment to implement your previous suggestion - to have user enter city, country etc. in different fields.Thanks again |
 |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2002-06-17 : 13:26:31
|
| Usually some kind of full-text indexing is used, and SQL Server supports this. Take a look in Books Online for more information, I don't know enough about it myself. But it is specifically designed to handle these kinds of searches. I know there is also a way to full-text search multiple columns and keywords, and also rank the results.Also remember that Yahoo! and those sites have huge server farms and/or boxes with 32 or more CPUs, so unless you have a similar setup don't get too discouraged if it takes 10 seconds for the query to run instead of 5. |
 |
|
|
jasper_smith
SQL Server MVP & SQLTeam MVY
846 Posts |
Posted - 2002-06-17 : 17:54:19
|
Whilst SQL2000 FREETEXT and FREETEXTABLE do search across multiple columns they are far less efficent than CONTAINS or CONTAINSTABLE.However CONTAINS and CONTAINSTABLE have a drawback in searching across multiple columns in that in the "straight" form they try and match the search phrase in 1 column only. You can get around this by unioning searches but there is the obvious performance penalty of multiple searches. What I have done in the past in some situations is to create a search table and concatenate the fields into one wide column which is then full text indexed, this allows the use of CONTAINSTABLE and hits the FT index only once. This depends on how dynamic your data is. A couple of easys thing you can do to improve performace is to use CONTAINSTABLE with the top_n_by_rank qualifier to limit the number of rows returned, noise word analysis and use NTFS file compression on your Full Text Catalogs.Here's Hilary Cotter's (FTS Guru) top FTS performace boosts(If you want to pursue FTS and I suggest you do then [with no disrespect to this forum ] check out the microsoft fulltext newsgroup - you will find a wealth of information)1) access should be done through an isapi filter2) check your web server logs and figure out your most frequent queries and precompile them3) dedicate one cpu to mssearch (in task manager go to the processes tab,right click on mssearch select set affiinity set it to 1 cpu)4) dedicate 500 mgs to the OS5) put your catalogs on a seperate controller with a RAID 0, RAID 1, or RAID01, RAID 10 disk array on it6) use ntfs file compression on your catalogs7) do a noise word analysis to increase the speed of your querying8) set resouce_usage to 5 (sp_fulltext_service ('resource_usage',5)HTHJasper Smith |
 |
|
|
kien
Starting Member
27 Posts |
Posted - 2002-06-18 : 09:28:41
|
Thanks Jasper for your guildance!When you say "this depends on how dynamic your data is" - well my database is basically static, but with over 6 million records.If I were to implement the full text search in one column, i would have no more than 10 words per field (city, country, continent), would this be a suitable candidate for this search method?When you say in "some situations", you implemented the full text search, what are the characteristics of these situations?I'm still fairly new with sql server 2000, so a lot of your lingo was beyond my understanding. From your experience, and from the description of my database, which search method would you recommend to be the most efficient (level of coding difficulty would also be considered)?Thank you once again |
 |
|
|
Onamuji
Aged Yak Warrior
504 Posts |
Posted - 2002-06-18 : 09:47:28
|
| I don't know if this will help but it's a ranking search across multiple columns with a space dilimeted keyword search....SELECT p.ProjectID, COUNT(p.ProjectID) FROM Sequences AS s(NOLOCK) INNER JOIN Projects AS p(NOLOCK) ON p.Title LIKE '%' + SUBSTRING(' ' + @Keywords + ' ', s.Value, CHARINDEX(' ', ' ' + @Keywords + ' ', s.Value) - s.Value) + '%' OR p.Scope LIKE '%' + SUBSTRING(' ' + @Keywords + ' ', s.Value, CHARINDEX(' ', ' ' + @Keywords + ' ', s.Value) - s.Value) + '%' OR p.Justification LIKE '%' + SUBSTRING(' ' + @Keywords + ' ', s.Value, CHARINDEX(' ', ' ' + @Keywords + ' ', s.Value) - s.Value) + '%' OR CONVERT(VARCHAR, p.ProjectID) LIKE '%' + SUBSTRING(' ' + @Keywords + ' ', s.Value, CHARINDEX(' ', ' ' + @Keywords + ' ', s.Value) - s.Value) + '%' WHERE s.Value <= LEN(' ' + @Keywords + ' ') AND SUBSTRING(' ' + @Keywords + ' ', s.Value - 1, 1) = ' ' AND CHARINDEX(' ', ' ' + @Keywords + ' ', s.Value) - s.Value > 0 GROUP BY p.ProjectID |
 |
|
|
Page47
Master Smack Fu Yak Hacker
2878 Posts |
Posted - 2002-06-18 : 09:50:34
|
quote: Hi everyoneFirst off, thank you to everyone who responded to my previous posting. Your guildance is very much appreciated !!!
I gotta know. What this your post? If so, how did the set based method stack up (performance wise) against the iterative methods. I am very curious to hear if it worked for you....<O> |
 |
|
|
kien
Starting Member
27 Posts |
Posted - 2002-06-18 : 11:59:14
|
Thank you EVERYONE for contributing to this posting and to all my other postings. You help is overwhelming Onamuji - thanks for the code, I will try it out as soon as i get a chance to.Page47 - Yes "this" is also my posting. Forgive me, but what is "stack up" and "interactive method"? I have not had the opportunity to test out the suggestions from this posting yet...will get back to you when i have a chance to.First off, I need to brush up on my sql stills so that i can understand all of these coded =)Thanks again! |
 |
|
|
Page47
Master Smack Fu Yak Hacker
2878 Posts |
Posted - 2002-06-18 : 13:03:19
|
| What I mean is how did my method (set based method) perform compared to the other methods suggested (row-by-row iteration...looping thru a cursor)...<O> |
 |
|
|
jasper_smith
SQL Server MVP & SQLTeam MVY
846 Posts |
Posted - 2002-06-18 : 13:08:31
|
| For your situation Full Text search sounds perfect. It should be massively quicker than a table scan of a 6 million row table which is what you will get if you use a standard like '%@word%'and if the data's fairly static, FT index maintenance will be minimal.It should be said that you need pretty decent hardware to get the best performance from FTS.However it depends on how users will be searching. If the idea is to allow a user to enter a string like "toronto canada" and it is to produce results matching ALL inputs thenits simple enough to split it either on the client or in SQL to produce the search phrase '"toronto" AND "canada"' which is used asselect fields... from CONTAINSTABLE(searchtable,search_col,'"toronto" AND "canada"',100)join mytable b on a.[key]=b.Primary_Keyorder by [rank] descIf there are individual fields to enter search criteria then normal indexing would be useful in getting back your results as you could use the "=" operator.HTHJasper Smith |
 |
|
|
|
|
|
|
|