Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 Why are spaces treated as blank?

Author  Topic 

Marioi
Posting Yak Master

132 Posts

Posted - 2005-11-30 : 21:49:13
I think I saw somewhere that there is a setting that controls how spaces are evaluated in T-SQL. In my SS2000 DB, they are evaluated as blank and this is probably the default. This is driving me nuts. The following code illustrates the point. To me '' is NOT ' ' and SS says it is. Also, SS says that a space has a length of 0. Please point me to the relevant reading.

DECLARE @Space nvarchar(10)
SET @Space = ' '
PRINT 'Len() of a space: '''+CAST(Len(@Space) AS varchar(20))+''' (ouch)'
IF @Space = ' '
PRINT 'Space = space: True (as expected)'
ELSE
PRINT 'Space = space: False'
GO
DECLARE @Space nvarchar(10)
SET @Space = ''
PRINT 'Len() of a blank: '''+CAST(Len(@Space) AS varchar(20))+''' (as expected)'
IF @Space = ' '
PRINT 'Blank = space: True (ouch)'
ELSE
PRINT 'Blank = space: False'
GO

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2005-11-30 : 23:34:00
When SQL Server compares two strings, it makes a comparison from left to right and treats the shorter expression as though it was padded with spaces to equal the length of the longer string. SQL Server does this with =, but with LIKE all characters are compared. If you convert the strings to varbinary, the = works the way you were expecting.

LEN Returns the number of characters, rather than the number of bytes, of the given string expression, excluding trailing blanks.

This code illustrates all of this:

declare @Space nvarchar(10)
declare @Space2 nvarchar(10)

set @Space = ''
set @Space2 = ' '

select [len] = len(@Space2),[datalength]=datalength(@Space2)

if @Space = @Space2
print '@Space = @Space2'
else
print '@Space <> @Space2'

if convert(varbinary(20),@Space) = convert(varbinary(20),@Space2)
print 'Binary @Space = Binary @Space2'
else
print 'Binary @Space <> Binary @Space2'


if @Space like @Space2
print '@Space Like @Space2'
else
print '@Space Not Like @Space2'



len datalength
----------- -----------
0 2

(1 row(s) affected)

@Space = @Space2
Binary @Space <> Binary @Space2
@Space Not Like @Space2


CODO ERGO SUM
Go to Top of Page

druer
Constraint Violating Yak Guru

314 Posts

Posted - 2005-12-01 : 10:02:41
Per books on line:
LEN Returns the number of characters, rather than the number of bytes, of the given string expression, excluding trailing blanks.

DATALENGTH Returns the number of bytes used to represent any expression.

I would imply that behind the scenes the LEN function does an RTRIM before it takes the length. And I'm not sure why they would automatically do that since it is exactly the opposite of what I would think a TSQL developer would want, but when you you return a result set and it makes no sense to have all of the trailing blanks coming across the network you have to trim them yourself. C'est la vie.
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2005-12-01 : 10:35:03
RTRIM = LEFT(MyColumn, LEN(MyColumn))

Q.E.D.

Kristen
Go to Top of Page

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2005-12-01 : 10:35:09
quote:
but with LIKE all characters are compared


It's not symmetrical: as BOM explains, LIKE treats trailing spaces in the pattern (RHS) as significant, but not in the match_expression (LHS).


declare @Space nvarchar(10)
declare @Space2 nvarchar(10)

set @Space = ''
set @Space2 = ' '

if @Space like @Space2
print '@Space Like @Space2'
else
print '@Space Not Like @Space2'

if @Space2 like @Space
print '@Space2 Like @Space'
else
print '@Space2 Not Like @Space'

@Space Not Like @Space2
@Space2 Like @Space

Go to Top of Page

Marioi
Posting Yak Master

132 Posts

Posted - 2005-12-01 : 10:40:11
quote:
Originally posted by Michael Valentine Jones

When SQL Server compares two strings, it makes a comparison from left to right and treats the shorter expression as though it was padded with spaces to equal the length of the longer string.


All much clearer now - thank you! It sounds like LIKE is more like an equality in other programming languages and = is interpreted. So far I learned that '' = ' ' and NULL <> NULL. To make it even more interesting, there are sentences like this one in Books Online: "Unicode LIKE is compatible with the SQL-92 standard. ASCII LIKE is compatible with earlier versions of SQL Server." Makes me want to do all the work in the middle tier.

The question remains: how do I find a correct number of characters in the case of one or more spaces? datalength() and len() of a binary value both return the number of bytes, not characters. Do I have to divide bytes by 2 for all unicode types? This must be a huge stumbling block for most novice SQL programmers.

By the way, how can I retrieve the datatype of a table column? Is there a way to retrive the datatype of a variable?
Go to Top of Page

Marioi
Posting Yak Master

132 Posts

Posted - 2005-12-01 : 10:47:30
quote:
Originally posted by Kristen

RTRIM = LEFT(MyColumn, LEN(MyColumn))


These behaviors (space equals blank, which is not a null, which doesn't equal null) make me wonder how people build robust queries, let alone more generic solutions.
Go to Top of Page

Marioi
Posting Yak Master

132 Posts

Posted - 2005-12-01 : 10:52:26
quote:
Originally posted by Arnold Fribble
It's not symmetrical: as BOM explains, LIKE treats trailing spaces in the pattern (RHS) as significant, but not in the match_expression (LHS).


I am getting symmetrical results in SS2000:

@Space Not Like @Space2
@Space2 Not Like @Space
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2005-12-01 : 11:00:33
"make me wonder how people build robust queries"

They learn from experience, and then become consultants and charge Big Bucks

"'' = ' ' and NULL <> NULL"

That's too funny!

FWIW we don't [in the main] store trailing spaces in our databases (and no CHAR columns, only VARCHAR)

Kristen
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2005-12-01 : 11:04:50
quote:
Originally posted by Marioi

quote:
Originally posted by Kristen

RTRIM = LEFT(MyColumn, LEN(MyColumn))


These behaviors (space equals blank, which is not a null, which doesn't equal null) make me wonder how people build robust queries, let alone more generic solutions.



The main thing is: why is a space important in your data? Typically, it is helpful for the database to ignore the case of the characters and/or trailing spaces in comparisons and joins. I suspect that the "trailing spaces don't matter" idea was implemented due to the existance of the CHAR() datatype, in which all strings are a fixed width and padded with spaces as needed.

How about an example of some of the data you are working with, and why it is important that trailing spaces and such are recognized? If your data has some values of "XYZ" and some of "XYZ " in key columns in your tables and these need to be recognized as being different, I might suggest that maybe that is not the best database design.

Also -- there *may* be a collation available that you can use that takes spaces into account during compares, but I am not 100% sure about this (check BOL).
Go to Top of Page

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2005-12-01 : 11:07:00
Oops, should have tested that, shouldn't I?
I rashly assumed that because

SELECT CASE WHEN '' LIKE ' ' THEN 1 ELSE 0 END
SELECT CASE WHEN ' ' LIKE '' THEN 1 ELSE 0 END

gives you 0 and 1 respectively, it would work for the example.
Hmm... now why is that different...

Erk! It's the difference between nvarchar and varchar.

SELECT CASE WHEN N'' LIKE N' ' THEN 1 ELSE 0 END
SELECT CASE WHEN N' ' LIKE N'' THEN 1 ELSE 0 END

returns 0 and 0.

Which is also documented in BOL

Go to Top of Page

Marioi
Posting Yak Master

132 Posts

Posted - 2005-12-01 : 11:31:15
quote:
Originally posted by jsmith8858
The main thing is: why is a space important in your data?


I am writing sprocs and UDFs in which spaces are significant. An example is a UDF that splits a string based on a delimiter (I found several examples through a FAQ thread on this forum). Since the delimiter can be a space I have to deal with it. Or a UDF that counts words.
Go to Top of Page

Marioi
Posting Yak Master

132 Posts

Posted - 2005-12-01 : 11:43:25
quote:
Originally posted by Arnold Fribble

Oops, should have tested that, shouldn't I?


I played with nchar and char and there the results are predictable: a blank string and a space are treated as 10 spaces or 20 bytes.

But with nvarchar, why is len of varbinary returning 2 (the line in code is

Print 'len of @Space2 as varbinary: '''+CAST(Len(CAST(@Space2 AS varbinary(20))) AS varchar(20))+''''


and the line in results is

len of @Space2 as varbinary: '2'


(I figured it out - see my reply to this message.)

The complete code is:

declare @Space nchar(10)
declare @Space2 nchar(10)

set @Space = ''
set @Space2 = ' '

Print 'EXAMPLE WITH nchar:'
Print 'len of @Space2: '''+CAST(len(@Space2) AS varchar(20))+''''
Print 'datalength of @space2: '''+CAST(datalength(@Space2) AS varchar(20))+''''
Print '-------------------'

Print 'len of @Space2 as varbinary: '''+CAST(Len(CAST(@Space2 AS varbinary(20))) AS varchar(20))+''''
Print 'datalength of @Space2 as varbinary: '''+CAST(datalength(CAST(@Space2 AS varbinary(20))) AS varchar(20))+''''
Print '-------------------'

if @Space = @Space2
print '@Space = @Space2'
else
print '@Space <> @Space2'

if CAST(@Space AS varbinary(20)) = CAST(@Space2 AS varbinary(20))
print 'Binary @Space = Binary @Space2'
else
print 'Binary @Space <> Binary @Space2'


if @Space like @Space2
print '@Space Like @Space2'
else
print '@Space Not Like @Space2'

PRINT '-------------------'

-- Testing whether LIKE is symmetrical.

if @Space like @Space2
print '@Space Like @Space2'
else
print '@Space Not Like @Space2'

if @Space2 like @Space
print '@Space2 Like @Space'
else
print '@Space2 Not Like @Space'
GO

I get:
EXAMPLE WITH nchar:
len of @Space2: '0'
datalength of @space2: '20'
-------------------
len of @Space2 as varbinary: '20'
datalength of @Space2 as varbinary: '20'
-------------------
@Space = @Space2
Binary @Space = Binary @Space2
@Space Like @Space2
-------------------
@Space Like @Space2
@Space2 Like @Space

EXAMPLE WITH nvarchar:
len of @Space2: '0'
datalength of @space2: '2'
-------------------
len of @Space2 as varbinary: '2'
datalength of @Space2 as varbinary: '2'
-------------------
@Space = @Space2
Binary @Space <> Binary @Space2
@Space Not Like @Space2
-------------------
@Space Not Like @Space2
@Space2 Not Like @Space
Go to Top of Page

Marioi
Posting Yak Master

132 Posts

Posted - 2005-12-01 : 12:03:47
quote:
Originally posted by Marioi
But with nvarchar, why is len of varbinary returning 2


I got it - because of the padding SS does in string comparisons of unequal lengths. This is a problem because I can't anticipate which side of the comparison the string will be shorter.

My conclusions are:
- Use the same type of string variable throughout (nvarchar in my case)
- If you have to use char() and nchar(), RTrim it before you blink
- Instead of = always use LIKE
- Instead of Len() always use Datalength() and divide the number by 2 (what happens if the number happens to be odd - that will never happen, right?)

Anything else?

Was it George Orwell who wrote 'All spaces are created equal, but some are more equal than others?'
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2005-12-01 : 12:44:58
"I am writing sprocs and UDFs in which spaces are significant"

Where trailing spaces are significant??

Kristen
Go to Top of Page

Marioi
Posting Yak Master

132 Posts

Posted - 2005-12-01 : 13:46:02
quote:
Originally posted by Kristen
Where trailing spaces are significant??


All spaces are significant because I am trying to programmatically manipulate strings supplied by any consumer, not only retrieved from SS data. The problem is I don't know when I will receive a string of one or more spaces, or characters followed by one or more spaces. Whatever methodology I develop, I want to be sure that spaces (Char(32) will be treated just like any other character.
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2005-12-01 : 15:41:57
A specific example would be helpful. I can't think of a scenerio in which the way SQL handles spaces would cause difficulty; I know I've written dozens of UDFs/procs along these lines and I've never had a problem or needed to do anythign specific to deal with spaces.

How about an example UDF -- give us some sample input, a description of what the UDF needs to do, and then the desired output. I suspect you may be making things more complicated than needed.
Go to Top of Page

Marioi
Posting Yak Master

132 Posts

Posted - 2005-12-01 : 18:30:41
quote:
Originally posted by jsmith8858

A specific example would be helpful.


I started really looking into this problem when I wanted to support space as the delimiter in this UDF, but I also got stung earlier when dealing with our own metadata (don't remember the details). I wrote this before I learned about datalength() and using LIKE instead of = so now I think I have a solution, albeit not as elegant as I would prefer.

To test, change the second parameter in the SELECT code on the very bottom to any number of space characters.

IF EXISTS (SELECT * 
FROM sysobjects
WHERE name = N'fnSplit')
DROP FUNCTION fnSplit
GO

------------------ fnSplit ----------------------

CREATE FUNCTION dbo.fnSplit
(@RowData nvarchar(4000),
@SplitOn nvarchar(5))

-- AUTHOR: MI 11/30/05
-- Adapted from Seventhnight's posting on http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=50648.
-- The function preserved starting and trailing spaces.
-- IMPORTANT: Current does NOT support spaces as delimiters.
-- PURPOSE:
-- Split a uniformly delimited string into a table.
-- PARAMETERS:
-- @RowData: the string to split.
-- @SplitOn: the delimiter.
-- RETURNS:
-- Table with 2 columns: counter and Data.
-- EXAMPLE 1:
-- SELECT * FROM dbo.fnSplit
-- ('Test|string||that needs to|be||split',
-- '|')
-- GO
-- EXAMPLE 2:
-- SELECT * FROM dbo.fnSplit
-- (' Test\\\ string\\\that needs to \\\\\\ be \\\ split \\\\\\',
-- '\\\')
-- GO

RETURNS @RtnValue table
(Id int identity(1,1),
Data nvarchar(4000))
AS
BEGIN
-- Ensure that the SplitOn parameter has at least 1 character. If not,
-- we would loop infinitely
-- Len returns 0 when only spaces are in @SplitOn. Spaces are not
-- not supported in @SplitOn because they cause an infinite loop
-- further down.
IF Len(@SplitOn) < 1
BEGIN
-- RAISERROR ('Please provide a valid character in the %d parameter.',
-- 16, 1, '@SplitOn')
Return
END
ELSE
BEGIN
Declare @Cnt int
Declare @sValue nvarchar(4000)
Set @Cnt = 1

While (Charindex(@SplitOn,@RowData)>0)
Begin
(Select @sValue = Substring(@RowData,1,Charindex(@SplitOn,@RowData)-1)) -- removed LTrim(RTrim())

Insert Into @RtnValue (data)
Select Data = @sValue

Set @RowData = Substring(@RowData,Charindex(@SplitOn,@RowData)+Len(@SplitOn),len(@RowData))
Set @Cnt = @Cnt + 1
End

Select @sValue = @RowData -- removed LTrim(RTrim())
Insert Into @RtnValue (data)
Select Data = @sValue
END
Return
END

GO


-- =============================================
-- Example to execute function
-- =============================================
SELECT * FROM dbo.fnSplit
(' Test\\\ string\\\that needs to \\\\\\ be \\\ split \\\\\\',
'\\\')
GO
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2005-12-01 : 18:48:52
The problem here is a logic one: If you want to support spaces as a delimiter, but also "preserve" all spaces, then you have conflicting things you are trying to do.

How should
"This  is  a string"
be delimited into words, if you want to use a space as a delimiter, but also "preserve" all spaces?

1) "This ", "is ", "a", "string"

or

2) "This", "", "is", "", "a","string"

or

3) "This", " is", " a","string"

or

4) "This", " is ", "a", "string"

etc

?

What about
"hello   there"
(3 spaces) ?

Should it be

1) "hello"," ","there"
2) "hello ","there"
3) "hello "," there"
4) "hello"," there"
5) "hello"," "," "," ","there"
6) "hello","","","","there"
etc

?

You can't preserve spaces and also allow a space character to be a delimiter! By definition, the delimiter character is removed from the string, isn't it? What happens in your example of delimiting by "///" if you say you wish to preserve all instances of "///" ?

Your problem isn't a T-SQL problem, it's a problem with what you want to do. Your first step before dealing with *any* T-SQL code is clearly defining your specs with logic that makes sense.
Go to Top of Page

Marioi
Posting Yak Master

132 Posts

Posted - 2005-12-01 : 19:12:15
quote:
Originally posted by jsmith8858
What about
"hello   there"
(3 spaces) ?


I don't want to preserve spaces, I want to properly interpret them. Your example above is the same as 'hello,,,there' if the comma was a separator so the result is:

"hello"+""+""+"there"

And if the delimiter were ' ' (2 spaces), then the result would be:

"hello"+" there"
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2005-12-01 : 19:32:41
OK, now we are getting somewher. Remember this from earlier in the thread:
quote:

LEN Returns the number of characters, rather than the number of bytes, of the given string expression, excluding trailing blanks.

DATALENGTH Returns the number of bytes used to represent any expression.



And let's get back to the problem: your fn doesn't work for spaces. Yet, twice in your function you use the LEN() function around the delimiter variable, where you'd like to allow a space!

quote:
CREATE FUNCTION dbo.fnSplit
(@RowData nvarchar(4000),
@SplitOn nvarchar(5))

-- AUTHOR: MI 11/30/05
-- Adapted from Seventhnight's posting on http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=50648.
-- The function preserved starting and trailing spaces.
-- IMPORTANT: Current does NOT support spaces as delimiters.
-- PURPOSE:
-- Split a uniformly delimited string into a table.
-- PARAMETERS:
-- @RowData: the string to split.
-- @SplitOn: the delimiter.
-- RETURNS:
-- Table with 2 columns: counter and Data.
-- EXAMPLE 1:
-- SELECT * FROM dbo.fnSplit
-- ('Test|string||that needs to|be||split',
-- '|')
-- GO
-- EXAMPLE 2:
-- SELECT * FROM dbo.fnSplit
-- (' Test\\\ string\\\that needs to \\\\\\ be \\\ split \\\\\\',
-- '\\\')
-- GO

RETURNS @RtnValue table
(Id int identity(1,1),
Data nvarchar(4000))
AS
BEGIN
-- Ensure that the SplitOn parameter has at least 1 character. If not,
-- we would loop infinitely
-- Len returns 0 when only spaces are in @SplitOn. Spaces are not
-- not supported in @SplitOn because they cause an infinite loop
-- further down.
IF Len(@SplitOn) < 1
BEGIN
-- RAISERROR ('Please provide a valid character in the %d parameter.',
-- 16, 1, '@SplitOn')
Return
END
ELSE
BEGIN
Declare @Cnt int
Declare @sValue nvarchar(4000)
Set @Cnt = 1

While (Charindex(@SplitOn,@RowData)>0)
Begin
(Select @sValue = Substring(@RowData,1,Charindex(@SplitOn,@RowData)-1)) -- removed LTrim(RTrim())

Insert Into @RtnValue (data)
Select Data = @sValue

Set @RowData = Substring(@RowData,Charindex(@SplitOn,@RowData)+Len(@SplitOn),len(@RowData))
Set @Cnt = @Cnt + 1
End

Select @sValue = @RowData -- removed LTrim(RTrim())
Insert Into @RtnValue (data)
Select Data = @sValue
END
Return
END




As mentioned, to allow for spaces, you need to use DATALENGTH. And you also need to remember that unicode uses 2 bytes per character, not 1, and DATALENGTH returns the # of bytes in the string. So what happens if you replace the two instances of

LEN(@SplitOn)

with

DATALENGTH(@SplitOn)/2

?
Go to Top of Page
    Next Page

- Advertisement -