Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 converting utf8 data to windows 1252

Author  Topic 

munishgupta.asr
Starting Member

4 Posts

Posted - 2014-10-15 : 08:36:01
Hi,

I have UTF8 data (cyrillic characters) in my DB. while searching the same with input, i have to convert it to windows 1252 format and match.
is their any way in sql to do it in query
like
select * from tablename where convert(fieldnameinfo, windows1252) = 'inputstring'

pls help me to resolve it.

gbritton
Master Smack Fu Yak Hacker

2780 Posts

Posted - 2014-10-15 : 08:37:52
you're confusing character encoding (UTF8) with codepage (1252). These are unrelated concepts.
Go to Top of Page

munishgupta.asr
Starting Member

4 Posts

Posted - 2014-10-15 : 09:04:48
ok, let me post my exact problem.

I have russian characters in SQL DB (UTF8 Encoded) which i want to be searched in case insensitive way with my query.
for that i used collation "Cyrillic_General_CI_AS" in SQL expression.
as per documentation of Cyrillic_General_CI_AS, it supports the characters in windows 1252 format (different bytes) to do case insensitive search.

my input string and info in db are both utf8 encoded and i have to use Cyrillic_General_CI_AS as part of sql expression to do case insensitive search for russian characters, while collation supports windows 1252 format.

can you suggest me any way out, it would be great help.
Go to Top of Page

gbritton
Master Smack Fu Yak Hacker

2780 Posts

Posted - 2014-10-15 : 10:20:01
why not just:

select * from tablename where fieldnameinfo = 'inputstring' collate Cyrillic_General_CI_AS
Go to Top of Page

munishgupta.asr
Starting Member

4 Posts

Posted - 2014-10-15 : 10:24:22
because as per documentation of Cyrillic_General_CI_AS, it supports the characters in Windows-1251 encoding.
I am doing like that only what you have mentioned and not getting the correct results for all Russian characters.
Go to Top of Page

gbritton
Master Smack Fu Yak Hacker

2780 Posts

Posted - 2014-10-15 : 11:07:10
So, the DB has collation SQL_Latin1_General_CP1251_CI_AS and the input data has collation Cyrillic_General_CI_AS? Is that the situation?

If so, why not make a temp table with a column using the Cyrillic_General_CI_AS collation, then insert rows from the db with the other collation into it. SQL will convert the encoding IIUC. Then you can compare using the temp table. Is that an option?
Go to Top of Page

munishgupta.asr
Starting Member

4 Posts

Posted - 2014-10-15 : 11:18:11
Yes, that could work.
but can we avoid creating temp db? is their any way to handle it as part of query itself.
kind of
select * from tablename where convert(utf8fieldnameinfo, 1251encoding) = 'inputstring' collate Cyrillic_General_CI_AS
Go to Top of Page

gbritton
Master Smack Fu Yak Hacker

2780 Posts

Posted - 2014-10-15 : 11:34:52
I know of no built-in conversion function like that. You could try to make a scalar-valued function to do it, but it would need to create a temp table with the right collation and will probably run really slowly for any significant input.
Go to Top of Page

gbritton
Master Smack Fu Yak Hacker

2780 Posts

Posted - 2014-10-16 : 09:34:41
Have you tried

select * from tablename where cast(fieldnameinfo collate Cyrillic_General_CI_AS as nvarchar(len(fieldnameinfo))) = 'inputstring'
Go to Top of Page
   

- Advertisement -