| Author |
Topic |
|
mukherjee12
Starting Member
6 Posts |
Posted - 2008-08-19 : 14:04:13
|
| OK folks, I need some help here. We are planning to localize our site into 8 languages, and the vendor is asking for a word count. I need to find a way (statements or scripts) to go through all my tables and get a distinct word count. I am looking to do this for all our PR tables, headers, tool tips... basically the entire site. Any help here is much appreciated! |
|
|
TG
Master Smack Fu Yak Hacker
6065 Posts |
Posted - 2008-08-19 : 14:31:49
|
| Not sure I'm following. Are you saying that you want to go through all tables, all (character-based) columns, and all rows, store each word, and then get a distinct count of those words? Is a "word" defined as a string separated by a space, tab, line feed, period, semicolon, question mark, etc.?
Be One with the Optimizer
TG |
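To make that definition of a "word" concrete, here is a minimal T-SQL sketch that treats space, tab, line feed, carriage return, period, comma, semicolon, and question mark as delimiters. The function name dbo.fnSplitWordsSketch and its output column val are hypothetical and do not come from the thread. It assumes SQL Server 2005 or later (for nvarchar(max)) and favors readability over speed.

```sql
-- Hypothetical helper, not from the thread: returns one row per "word".
CREATE FUNCTION dbo.fnSplitWordsSketch (@text nvarchar(max))
RETURNS @words TABLE (val nvarchar(4000) NOT NULL)
AS
BEGIN
    DECLARE @s nvarchar(max), @pos bigint, @next bigint;

    -- Normalize every delimiter to a single space character.
    SET @s = @text;
    SET @s = REPLACE(@s, NCHAR(9),  N' ');   -- tab
    SET @s = REPLACE(@s, NCHAR(10), N' ');   -- line feed
    SET @s = REPLACE(@s, NCHAR(13), N' ');   -- carriage return
    SET @s = REPLACE(@s, N'.', N' ');
    SET @s = REPLACE(@s, N',', N' ');
    SET @s = REPLACE(@s, N';', N' ');
    SET @s = REPLACE(@s, N'?', N' ');

    -- A trailing space terminates the last word like all the others.
    SET @s = @s + N' ';

    SET @pos  = 1;
    SET @next = CHARINDEX(N' ', @s, @pos);

    WHILE @next > 0
    BEGIN
        -- Skip the empty tokens produced by consecutive delimiters.
        IF @next > @pos
            INSERT @words (val)
            VALUES (CAST(SUBSTRING(@s, @pos, @next - @pos) AS nvarchar(4000)));

        SET @pos  = @next + 1;
        SET @next = CHARINDEX(N' ', @s, @pos);
    END;

    RETURN;
END;
```

Calling SELECT val FROM dbo.fnSplitWordsSketch(N'Save changes? Yes; no.') should return four rows: Save, changes, Yes, and no. A NULL input returns no rows. Words longer than 4000 characters are cut off by the CAST, and a row-by-row loop like this will be slow on large columns.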
 |
|
|
mukherjee12
Starting Member
6 Posts |
Posted - 2008-08-19 : 14:42:32
|
| Yes, I want to go through all tables and all columns and get a distinct count of those words. Basically, we have a site (SaaS) that we are looking to translate: all the words on each page into language x, y, z. The translation vendor is asking for our word count. |
 |
|
|
mukherjee12
Starting Member
6 Posts |
Posted - 2008-08-19 : 14:43:22
|
| They charge us per word, and they have a memory tool, so they don't charge us for the same words twice. Thanks! |
 |
|
|
TG
Master Smack Fu Yak Hacker
6065 Posts |
Posted - 2008-08-19 : 14:48:03
|
| Are any of the columns of datatype text, ntext, varchar(max), or nvarchar(max)? Do you have columns that should be ignored, like names, emails, URLs, etc.? EDIT: How big is the database? |
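One way to answer the datatype question is to ask the catalog directly. The query below is only a sketch: it lists every character column in base tables and labels the large types. The size_class column and its labels are made up for illustration. In INFORMATION_SCHEMA.COLUMNS, a CHARACTER_MAXIMUM_LENGTH of -1 marks a (max) column.

```sql
-- Inventory of character columns, with large-object types flagged.
SELECT  c.TABLE_SCHEMA,
        c.TABLE_NAME,
        c.COLUMN_NAME,
        c.DATA_TYPE,
        c.CHARACTER_MAXIMUM_LENGTH,
        CASE
            WHEN c.DATA_TYPE IN ('text', 'ntext') THEN 'legacy LOB'
            WHEN c.CHARACTER_MAXIMUM_LENGTH = -1  THEN '(max) type'
            ELSE 'regular'
        END AS size_class                       -- illustrative label only
FROM    INFORMATION_SCHEMA.COLUMNS AS c
JOIN    INFORMATION_SCHEMA.TABLES  AS t
          ON  t.TABLE_SCHEMA = c.TABLE_SCHEMA
          AND t.TABLE_NAME   = c.TABLE_NAME
WHERE   t.TABLE_TYPE = 'BASE TABLE'             -- skip views
  AND   c.DATA_TYPE IN ('char', 'nchar', 'varchar', 'nvarchar', 'text', 'ntext')
ORDER BY c.TABLE_SCHEMA, c.TABLE_NAME, c.ORDINAL_POSITION;
```

The same list is a convenient place to mark which columns to skip (names, emails, URLs) before any words are extracted.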
 |
|
|
mukherjee12
Starting Member
6 Posts |
Posted - 2008-08-19 : 15:00:23
|
| Yes, we have all those columns. In terms of size, we have 2 main files, but the size would change once we export. We want to ignore names, emails, URLs, and any data that would be entered by the end user (we have a Google translator for this). |
 |
|
|
blindman
Master Smack Fu Yak Hacker
2365 Posts |
Posted - 2008-08-19 : 15:01:11
|
| Please, dear God, tell me that this vendor is not merely going to do a search and replace on the words? What does a distinct word count have to do with the difficulty of translation? I think this is a red flag that you should find another vendor, lest you find all your strings translated into Engrish: http://www.engrish.com/
Boycott Beijing Olympics 2008 |
 |
|
|
mukherjee12
Starting Member
6 Posts |
Posted - 2008-08-19 : 15:07:42
|
| No, no, definitely not... they work with an editor to make sure it all makes sense. But we're shopping for a vendor, and they want to know a word count. I'd rather give them a word count than vice versa. |
 |
|
|
TG
Master Smack Fu Yak Hacker
6065 Posts |
Posted - 2008-08-19 : 15:16:30
|
| The tough part will be to split any value into a set of words. You can probably use any one of the many "split functions" that have been posted here. Here is one thread on the subject:
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=50648
I think there are some functions in this thread that split ntext datatypes, though I've never used them.

For example, for one table and one column, this would get every "word" as delimited by a space, assuming you have a function called fnParseString and a table called [words]:

    insert words (words)
    select ca.val
    from <table> t
    cross apply dbo.fnParseString(t.<characterColumn>, ' ') ca

Once you have that working, it is just a matter of setting up a couple of nested loops with the information_schema views. For each table and each character column, generate and exec a dynamic statement like the one above.

Get started and post back with any questions... have fun :) |
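The looping step could be sketched roughly as follows. This is one possible shape, not the poster's code. It uses a single cursor over (table, column) pairs instead of two nested loops. It assumes that dbo.fnParseString already exists, takes a string and a delimiter, and returns a column named val, and that a table dbo.words with a column named words exists, as described in the post. Whether fnParseString accepts text or ntext input is unknown, so those columns may need a CAST or a different split function.

```sql
-- Assumed to exist already (see the post): dbo.fnParseString and dbo.words.
DECLARE @schema sysname, @table sysname, @column sysname, @sql nvarchar(max);

DECLARE col_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT c.TABLE_SCHEMA, c.TABLE_NAME, c.COLUMN_NAME
    FROM   INFORMATION_SCHEMA.COLUMNS AS c
    JOIN   INFORMATION_SCHEMA.TABLES  AS t
             ON  t.TABLE_SCHEMA = c.TABLE_SCHEMA
             AND t.TABLE_NAME   = c.TABLE_NAME
    WHERE  t.TABLE_TYPE = 'BASE TABLE'
      AND  c.DATA_TYPE IN ('char', 'nchar', 'varchar', 'nvarchar', 'text', 'ntext')
      -- Do not read the output table back into itself.
      AND  NOT (c.TABLE_SCHEMA = 'dbo' AND c.TABLE_NAME = 'words');

OPEN col_cursor;
FETCH NEXT FROM col_cursor INTO @schema, @table, @column;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Build the one-table, one-column statement from the post.
    -- QUOTENAME guards against unusual characters in object names.
    SET @sql = N'INSERT dbo.words (words) '
             + N'SELECT ca.val '
             + N'FROM ' + QUOTENAME(@schema) + N'.' + QUOTENAME(@table) + N' AS t '
             + N'CROSS APPLY dbo.fnParseString(t.' + QUOTENAME(@column) + N', '' '') AS ca;';

    EXEC sp_executesql @sql;

    FETCH NEXT FROM col_cursor INTO @schema, @table, @column;
END;

CLOSE col_cursor;
DEALLOCATE col_cursor;
```

Columns that should be ignored (names, emails, URLs, user-entered data) would need extra conditions in the cursor's WHERE clause. It is probably worth trying this on a copy of the database first, since it reads every row of every character column.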
 |
|
|
blindman
Master Smack Fu Yak Hacker
2365 Posts |
Posted - 2008-08-19 : 15:24:55
|
| A word count makes sense. A distinct word count makes no sense at all. There is a huge difference between the word count of a short story and the word count of a 400-page novel. There would only be a small difference in distinct word count between the two. Are you sure they do not need a total word count? |
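Both numbers are cheap to produce once the words have been collected, so the vendor can be given either one. A small sketch, assuming the dbo.words table (with a words column) from the earlier post has been filled:

```sql
-- Total versus distinct word count from the collected words.
SELECT  COUNT(*)              AS total_words,
        COUNT(DISTINCT words) AS distinct_words
FROM    dbo.words
WHERE   words <> N'';          -- ignore empty tokens (this also drops NULLs)
```

Whether "Save" and "save" count as one distinct word depends on the collation of the words column. Under a case-insensitive collation they are counted once.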
 |
|
|
mukherjee12
Starting Member
6 Posts |
Posted - 2008-08-19 : 15:43:09
|
| I'll try! Thank you!!! You the man |
 |
|
|
|