Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Scrambling Production Data

Author  Topic 

DBA_nupe
Starting Member

11 Posts

Posted - 2008-04-01 : 13:21:04
This is probably an easy solution for some of you seasoned DBA Vets but here is my problem.

I have to take production data and scramble certain sensitive columns such as SSN, DOB, Address, First Name so that our Management team can use it as demo material. Is there a quick solution to this issue?

Thanks,

JC

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2008-04-01 : 13:35:26
Have a look at this. i think it refers to your problem:-

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=35880
Go to Top of Page

DBA_nupe
Starting Member

11 Posts

Posted - 2008-04-01 : 13:55:05
Unfortunately it doesnt help. No one on that thread provides a clear push in the right direction...
Go to Top of Page

jhocutt
Constraint Violating Yak Guru

385 Posts

Posted - 2008-04-01 : 13:56:55
[code]
--Random SSN
select
[SSN] =
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)
from
(select ch =
replicate('0123456789',8)) a


--Change the Date
declare @BIRTHDAY_COLUMN datetime
set @BIRTHDAY_COLUMN = '06/15/1961'
select
DateAdd(yy, -1 * cast(substring(ch,convert(int,rand()*len(ch)-1),1) as int), @BIRTHDAY_COLUMN )
from
(select ch =
replicate('123456789',8)) a

-- Random Name (Not really a name but a string of characters)
select
[Name] =
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)
from
(select ch =
replicate('ABCDEFGHJKLMNPQURSUVWXYZ',8)+
replicate('abcdefghjkmnpqursuvwxyz',8)) a
[/code]

"God does not play dice" -- Albert Einstein
"Not only does God play dice, but he sometimes throws them where they cannot be seen."
-- Stephen Hawking
Go to Top of Page

DBA_nupe
Starting Member

11 Posts

Posted - 2008-04-01 : 14:42:44
quote:
Originally posted by jhocutt


--Random SSN
select
[SSN] =
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)
from
(select ch =
replicate('0123456789',8)) a


--Change the Date
declare @BIRTHDAY_COLUMN datetime
set @BIRTHDAY_COLUMN = '06/15/1961'
select
DateAdd(yy, -1 * cast(substring(ch,convert(int,rand()*len(ch)-1),1) as int), @BIRTHDAY_COLUMN )
from
(select ch =
replicate('123456789',8)) a

-- Random Name (Not really a name but a string of characters)
select
[Name] =
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-1),1)
from
(select ch =
replicate('ABCDEFGHJKLMNPQURSUVWXYZ',8)+
replicate('abcdefghjkmnpqursuvwxyz',8)) a


"God does not play dice" -- Albert Einstein
"Not only does God play dice, but he sometimes throws them where they cannot be seen."
-- Stephen Hawking



Thank you very much for showing me that bit of code. I ran it against the table and it brought back a random number...my next question is in order to perform an UPDATE against the table for so that it updates all SSNs...would i say something like

Update TableName
set SSN = a

and then provide a Where condition?

Thanks...

FYI - i am not a strong developer, so i really appreciate all of your help.
Go to Top of Page

jhocutt
Constraint Violating Yak Guru

385 Posts

Posted - 2008-04-01 : 14:57:44
-- Try something like this
declare @mytable table (pk int identity(1,1), ssn varchar(10), fname varchar(10), birthdate datetime )
insert into @mytable
select '111223333', 'Fred', '03/10/1960' union
select '111223334', 'Mary', '04/11/1961' union
select '111223335', 'Paul', '05/12/1962' union
select '111223336', 'George', '06/13/1963'

-- See the date
select * from @MyTable

--Update the SSN to Null we will update one row at a time till all of them are done
update @MyTable set ssn = NULL

while (select count(*) from @MyTable where ssn is null or len(ssn) < 9 ) > 0
BEGIN
set rowcount 1
update @MyTable
set ssn =
-- --Random SSN
(
select
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-2),1)+
substring(ch,convert(int,rand()*len(ch)-3),1)+
substring(ch,convert(int,rand()*len(ch)-4),1)+
substring(ch,convert(int,rand()*len(ch)-5),1)+
substring(ch,convert(int,rand()*len(ch)-6),1)+
substring(ch,convert(int,rand()*len(ch)-7),1)+
substring(ch,convert(int,rand()*len(ch)-8),1)+
substring(ch,convert(int,rand()*len(ch)-9),1)
from
(select ch =
replicate('0123456789',12)) a
)
-- --Change the Date
, birthdate = (
select
DateAdd(yy, -1 * cast(substring(ch,convert(int,rand()*len(ch)-1),1) as int), birthdate )
from
(select ch =
replicate('123456789',8)) a
)
-- -- Random Name (Not really a name but a string of characters)
, fname = (
select
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-2),1)+
substring(ch,convert(int,rand()*len(ch)-3),1)+
substring(ch,convert(int,rand()*len(ch)-4),1)+
substring(ch,convert(int,rand()*len(ch)-5),1)+
substring(ch,convert(int,rand()*len(ch)-6),1)+
substring(ch,convert(int,rand()*len(ch)-7),1)+
substring(ch,convert(int,rand()*len(ch)-8),1)
from
(select ch =
replicate('ABCDEFGHJKLMNPQURSUVWXYZ',8)+
replicate('abcdefghjkmnpqursuvwxyz',8)) a
)
where ssn is null
or len(ssn) < 9 -- Sometimes thge SSN does not get 9 chars
set rowcount 0
END

select * from @MyTable





"God does not play dice" -- Albert Einstein
"Not only does God play dice, but he sometimes throws them where they cannot be seen."
-- Stephen Hawking
Go to Top of Page

DBA_nupe
Starting Member

11 Posts

Posted - 2008-04-01 : 15:20:31
quote:
Originally posted by jhocutt

-- Try something like this
declare @mytable table (pk int identity(1,1), ssn varchar(10), fname varchar(10), birthdate datetime )
insert into @mytable
select '111223333', 'Fred', '03/10/1960' union
select '111223334', 'Mary', '04/11/1961' union
select '111223335', 'Paul', '05/12/1962' union
select '111223336', 'George', '06/13/1963'

-- See the date
select * from @MyTable

--Update the SSN to Null we will update one row at a time till all of them are done
update @MyTable set ssn = NULL

while (select count(*) from @MyTable where ssn is null or len(ssn) < 9 ) > 0
BEGIN
set rowcount 1
update @MyTable
set ssn =
-- --Random SSN
(
select
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-2),1)+
substring(ch,convert(int,rand()*len(ch)-3),1)+
substring(ch,convert(int,rand()*len(ch)-4),1)+
substring(ch,convert(int,rand()*len(ch)-5),1)+
substring(ch,convert(int,rand()*len(ch)-6),1)+
substring(ch,convert(int,rand()*len(ch)-7),1)+
substring(ch,convert(int,rand()*len(ch)-8),1)+
substring(ch,convert(int,rand()*len(ch)-9),1)
from
(select ch =
replicate('0123456789',12)) a
)
-- --Change the Date
, birthdate = (
select
DateAdd(yy, -1 * cast(substring(ch,convert(int,rand()*len(ch)-1),1) as int), birthdate )
from
(select ch =
replicate('123456789',8)) a
)
-- -- Random Name (Not really a name but a string of characters)
, fname = (
select
substring(ch,convert(int,rand()*len(ch)-1),1)+
substring(ch,convert(int,rand()*len(ch)-2),1)+
substring(ch,convert(int,rand()*len(ch)-3),1)+
substring(ch,convert(int,rand()*len(ch)-4),1)+
substring(ch,convert(int,rand()*len(ch)-5),1)+
substring(ch,convert(int,rand()*len(ch)-6),1)+
substring(ch,convert(int,rand()*len(ch)-7),1)+
substring(ch,convert(int,rand()*len(ch)-8),1)
from
(select ch =
replicate('ABCDEFGHJKLMNPQURSUVWXYZ',8)+
replicate('abcdefghjkmnpqursuvwxyz',8)) a
)
where ssn is null
or len(ssn) < 9 -- Sometimes thge SSN does not get 9 chars
set rowcount 0
END

select * from @MyTable





"God does not play dice" -- Albert Einstein
"Not only does God play dice, but he sometimes throws them where they cannot be seen."
-- Stephen Hawking



Thank you soo much for your help. I am going to run this against one of my dev databases and let you know the results tomorrow 4/2.
Go to Top of Page

jhocutt
Constraint Violating Yak Guru

385 Posts

Posted - 2008-04-01 : 16:01:20
[code]
-- Faster on larger sets
declare @mytable table (pk int identity(1,1), ssn varchar(10), fname varchar(10), birthdate datetime )
insert into @mytable
select '535629876', 'Fred', '03/10/1960' union
select '846842684', 'Mary', '04/11/1961' union
select '977616541', 'Paul', '05/12/1962' union
select '687687135', 'George', '06/13/1963'

-- See the data
select * from @MyTable

--Update Data
update @MyTable
set
ssn = cast(right(ssn,2) as varchar)+cast(left(ssn,2) as varchar)+reverse(substring(ssn,3,5))
, fname = (
select
upper(substring(ch,convert(int,rand()*len(ch)-substring(ssn,1,1)),1))+
substring(ch,convert(int,rand()*len(ch)-substring(ssn,2,1)),1)+
substring(ch,convert(int,rand()*len(ch)-substring(ssn,3,1)),1)+
substring(ch,convert(int,rand()*len(ch)-substring(ssn,4,1)),1)+
substring(ch,convert(int,rand()*len(ch)-substring(ssn,5,1)),1)+
substring(ch,convert(int,rand()*len(ch)-substring(ssn,6,1)),1)+
substring(ch,convert(int,rand()*len(ch)-substring(ssn,7,1)),1)+
substring(ch,convert(int,rand()*len(ch)-substring(ssn,8,1)),1)
from
(select ch =
replicate('abcdefghjkmnpqursuvwxyz',8)) a
)
, birthdate=DateAdd(m, convert(int,case substring(ssn,2,1) when 0 then substring(ssn,3,1) else substring(ssn,2,1) end), birthdate)

--See the changes
select * from @MyTable
[/code]

"God does not play dice" -- Albert Einstein
"Not only does God play dice, but he sometimes throws them where they cannot be seen."
-- Stephen Hawking
Go to Top of Page

DBA_nupe
Starting Member

11 Posts

Posted - 2008-04-02 : 15:09:44
Thank you...the SSN part worked like a charm. I dont need to scramble around the name however when I run the update for DOB, this is what I get:

"The data type datetime is invalid for the substring function. Allowed types are: char/varchar, nchar/nvarchar, and binary/varbinary."

this is the statement

Update AMER_DOB
SET AMER_DOB_DT =
DateAdd(m, convert(int,case substring(AMER_DOB_DT,2,1)
when 0 then substring(AMER_EMP_DOB_DT,3,1)
else substring(AMER_DOB_DT,2,1) end), AMER_DOB_DT)

Any ideas...
Go to Top of Page

jhocutt
Constraint Violating Yak Guru

385 Posts

Posted - 2008-04-02 : 15:37:42
Yeah, use the SSN column instead of the date.
I was using ssn to get a pseudo random number, to alter the date field.
You cant use a date time field for this unless you convert it to a charachter string of numeric values.

"God does not play dice" -- Albert Einstein
"Not only does God play dice, but he sometimes throws them where they cannot be seen."
-- Stephen Hawking
Go to Top of Page

AndrewMurphy
Master Smack Fu Yak Hacker

2916 Posts

Posted - 2008-04-03 : 11:17:53
I think red-gate.com have a product which does this sort of thing (for a minor fee)...or as I read their blobs/forums this seems to be the product functionality.
Go to Top of Page

kranpura
Starting Member

1 Post

Posted - 2011-05-20 : 19:01:36
Thanks this post helped me.
Go to Top of Page
   

- Advertisement -