By a strange quirk of fate I have not really had to do too much with CLR functions in SQL Server until I started hanging out on Twitter. Then I saw this question..
“How do I remove non-numeric characters from a VARCHAR?”
Easy I thought until I realised that T-SQL isn’t really that good with string functions. Anyway I have cracked it but it isn’t going to be fast against a huge table, because I have a loop inside a scalar function ..
ALTER FUNCTION production.ufnChartoNumeric
DECLARE @CleanNumber int — the final product
DECLARE @CleanString nvarchar(50) =” — the inputstring with only numbers in
DECLARE @StringLength int = len(@String)+1 — the length of the string +1
DECLARE @Loop int = 1 — Loop counter
— If the string is already a number we can assign it to the result and finish
IF isNumeric(@String) = 1
SET @CleanString = @String
— If itsn’t a number we need to loop through the string a character at a time
— and test if each character is a number.
— if it is append it to our clean string if it isn’t move on
WHILE @Loop < @StringLength — loop for each character in the string
IF isNumeric(Substring(@String,@Loop,1)) = 1 SET @CleanString += Substring(@String,@Loop,1)
SET @Loop += 1
— if there’s a decimal point and you want to keep it, (but which one if there are 2!)
— you’ll need more code to detect it
— and maybe return a float, but the first test will keep it
— so for this example I want to eliminate it.
SET @CleanString = Replace(@CleanString, ‘-‘,”)
SET @CleanString = Replace(@CleanString, ‘+’,”)
SET @CleanString = Replace(@CleanString,’.’,”)
SET @CleanNumber = Cast(@CleanString as int)
As you can see I also had to deal with decimal points and plus and minus signs as these could occur many times in a string. BTW there are little bits of T-SQL 2008 in here such as ‘+=’ and DECLARE .. = .. which you’ll need to change for older versions.
It does at least work, (I tested it against a few random string and against production.product.productnumber in AdventureWorks), but the only way I can think of to improve the speed is to rewrite it as a CLR which is much better at handling in row string manipulation.
But I leave that for another day as I am not sure what this is exactly needed for, and often something that works now is better than something fast tomorrow.