Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 SSN Masking - Performance Issue

Author  Topic 

8022421
Starting Member

45 Posts

Posted - 2009-08-18 : 01:27:34
Actually we have an entire email content stored in a single column in a table. Our aim here is mask the SSN numbers in the email content like (123456789 to XXXXXXXXX) OR (123-12-1234 like xxx-xx-xxxx). The table has two million records, we have 10 different formats occuring in the email content. for example the SSN pattern can be like this.
[1-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]
[ ][1-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][ ]

We have used PATINDEX to do this function. But what we found that it was taking very long time to execute. I have provided the script below.

SELECT DISTINCT MLIEEID, EECLIENT, EEEMPIDCD, PATINDEX('%[0-9][0-9][0-9][./,_^*+\][0-9][0-9][./,_^*+\][0-9][0-9][0-9][0-9]%', [EEEMPIDCD])
FROM [EEISTAT]
WHERE PATINDEX('%[0-9][0-9][0-9][./,_^*+\][0-9][0-9][./,_^*+\][0-9][0-9][0-9][0-9]%', [EEEMPIDCD]) > 0

The above script is taking a long time to execute. Can you please guide us in reframing the above query so that it takes a minimum amount of time to execute.

Thanks in advance.
   

- Advertisement -