Author |
Topic |
ACCESS-SQL_MARIO
Starting Member
18 Posts |
Posted - 2004-11-01 : 12:59:56
|
Hi!I need a query to this column in ACCESS databaseKILLERKILLEDBOYBOYSCARRECALLCALLetc.I wantKILLERKILLEDRECALLCALLetc. simply all records that have a common string of at least 4 characters |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2004-11-01 : 13:36:33
|
will this do?select *from Table1where ColName like '%kill%' or ColName like '%call%'Go with the flow & have fun! Else fight the flow |
 |
|
ACCESS-SQL_MARIO
Starting Member
18 Posts |
Posted - 2004-11-01 : 13:42:13
|
I need not only*CALL**KILL*but *any-duplicated-atleast4character-strings* :-)I was told sql/access cannot do that (?) anyway thanks for a sprint reply |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2004-11-01 : 13:46:02
|
so you want to get them all without knowing the 4 char word you want??Go with the flow & have fun! Else fight the flow |
 |
|
ACCESS-SQL_MARIO
Starting Member
18 Posts |
Posted - 2004-11-01 : 13:47:08
|
yes! |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2004-11-01 : 13:49:24
|
well i guess it can be done but that is highly resource hungry...you'd need to put all 4 letter words in a temp table then go row by rowlooking for the words from a temp table...way messy...why do you want to do that?Go with the flow & have fun! Else fight the flow |
 |
|
ACCESS-SQL_MARIO
Starting Member
18 Posts |
Posted - 2004-11-01 : 13:57:29
|
It's a long story:My company has a database with two colmuns1'English word' 2'Polish translation'1 X 2 Y,Z1 Q 2 R,YWe are looking for the Y string in the 'Polish translation' column |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2004-11-01 : 14:01:25
|
that's bad db design man.... sorry for you...so what do you want to do then?Go with the flow & have fun! Else fight the flow |
 |
|
ACCESS-SQL_MARIO
Starting Member
18 Posts |
Posted - 2004-11-01 : 14:06:57
|
Maybe redesign the DB but my boss would go mad about it because as you said it would be time-consumingI quess the DB should look like1'English word' 2'Polish translation-a' 3'Polish translation-b'1 X 2 Y 3 Z1 Q 2 R 3 Ythen I would simply look for duplicatesbut it would still be difficult because the Polish languge has a lot of flection suffixes (Slavic languages have this problem as you know :-)) |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2004-11-01 : 14:14:02
|
yeah i know.your best bet would be this:tables:English (Eng_Id, wordName)Polish (Pol_Id, wordName)EnglistToPolish(Eng_Id, Pol_Id) then you simply link the correct words in the intermediate table EnglistToPolish.i don't think that redesign would be that lengthy. i guess the pros would be way up than conssearches way faster, normalized data... redesign man. Go with the flow & have fun! Else fight the flow |
 |
|
ACCESS-SQL_MARIO
Starting Member
18 Posts |
Posted - 2004-11-01 : 14:26:58
|
Dislaimer: I am a Newbie to sql/accessI cannot see the reason for creatingEnglistToPolish(Eng_Id, Pol_Id)I just want to search in only one column for duplicated strings of at least 4 characters!!! God help me! I want my Atari back!!!PS Somebody replied to my request like this:You'll have to use the InStr() function.It will have to be worked a lot though to read every 4 characters of a word.Ex.: if you have the words Recall and Caller, it will have to look for these:Is RECA in Caller? No!Is ECAL in Caller? No!Is CALL in Caller? Yes!It sounds simple, but try it with the words "Information" and "Confrontation"...So, you'll have to learned about "Recursive" and "InStr()".Good luck! |
 |
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2004-11-01 : 14:43:22
|
this will be incredibly slow, but here's one way to do it. It requires a table of numbers, from 1 - 100 or so. (whatever your longest word is). Create this table in your database, call it "Numbers", give it 1 column, call it "N", make it an int. store values from 1-100 in it.ONce you have that numbers table, you can write a query to get all 4-letter combos, and put them into a newly created temp table (this may take a while to run):select distinct mid([Word],N,4) as Segmentinto Segmentsfrom YourTable, Numberswhere Numbers.N <= Len([Word]) - 3Once you have that, you can join that to your words table like this:select Segments.Segment, YourTable.Wordfrom YourTable, Segmentswhere YourTable.Word LIKE "*" & Segments.Segment & "*"Now you have a list of all 4-letter segments, and all of the words that match them. Does this help at all?- Jeff |
 |
|
ACCESS-SQL_MARIO
Starting Member
18 Posts |
Posted - 2004-11-01 : 14:51:39
|
OK Jeff I will try to do it now (it's a long night :-))but for now read what I was advised elsewhere:Unfortunately, this is one of "those" problems that even a 3.2 GHz machine won't do fast, because it is a sliding-window-subset-matching problem, in which both the object string AND THE SUBJECT string have sliding windows.If I were to even ATTEMPT this problem for someone, I would advise them that the problem is in the category of "computable, N! * L" - i.e. computable but expensive. 'cause first, it is a factorial combination of every table member against every other table member. (That's the N! part.) And because you can match internal or trailing string elements, you can't even make it (N-1)! by pre-sorting the strings. Second, it is further proportional to the average-length (of the strings) because that governs the number of subsets you can have.Without knowing the full rules, the only way I know to do this is to brute-force it. Since letters in your sub-strings can be duplicated (e.g. KILL duplicates 'L'), I can't even get subtle by using some form of masking generator to show the letters in the string as a pre-filter exercise. |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2004-11-01 : 14:53:41
|
quote: I cannot see the reason for creatingEnglistToPolish(Eng_Id, Pol_Id)
the reson is normalization say you have an english word: blablathe polish language has two meanings for that word:blablaPL1 and blablaPL2 so in your table you haveEnglistToPolishEng_Id Pol_Id 1 1 1 2 therefore joins are much simpler. no logic is needed for parsing csv strings...any clearer??Go with the flow & have fun! Else fight the flow |
 |
|
ACCESS-SQL_MARIO
Starting Member
18 Posts |
Posted - 2004-11-01 : 14:59:32
|
But spirit look:the Eng_Ids will not be the same (X has a different id from Q)1'English word' 2'Polish translations' 1 X 2 Y, Z1 Q 2 R, YDisclaimer: I may be mistaken |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2004-11-01 : 15:08:02
|
[code]EnglishEng_Id wordName 1 X 2 QPolishPol_Id wordName 1 R 2 Y 3 ZEnglistToPolishEng_Id Pol_Id 1 2 1 3 2 1 2 2[/code]any clearer?Go with the flow & have fun! Else fight the flow |
 |
|
ACCESS-SQL_MARIO
Starting Member
18 Posts |
Posted - 2004-11-01 : 15:18:49
|
I'm beginnig to get the point I will check both ways (spirit's and Jeff's) on the 10000 record of vocabulary DB at my office tommorow. I will be back with a report then.Cheers!Grateful MARIO from Warsaw |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2004-11-01 : 15:37:09
|
can't wait...Go with the flow & have fun! Else fight the flow |
 |
|
ACCESS-SQL_MARIO
Starting Member
18 Posts |
Posted - 2004-11-02 : 01:39:56
|
JEFF, it works!. The DB must be cut into several smaller DBs. But it is exactly what I need.SPIRIT, Your way is applicable for redesigning of the DB. And I'd better implement it pretty soon.MARIO |
 |
|
|