Published: Wednesday, March 10, 2004
Creating a User Defined Function in SQL for Comma-Delimited Searches
By Corey Aldebol
Introduction
In a previous article, Using Sets with SQL Server, author
Scott Mitchell looks at how to use the IN operator in a SQL WHERE
clause. The IN operator can be used to filter records that match a specified comma-delimited list, or the results
of another SQL query. For example:
Select
employeeId, -- integer (int)
FirstName,
LastName
from Employees
Where
employeeId IN (1,3,6,7,13)
|
Will return employees from the Employees table where the employeId field equals 1, 3, 6, 7, or
13. Similarly, we could just get employees Bob, Mary, and John using the following query:
Select
employeeId,
FirstName, -- character data (nvarchar)
LastName
from Employees
Where
FirstName IN ('Bob','Mary','John')
|
The IN operator also can be used with a separate SQL SELECT statement. For example, if we had a separate
table firedEmployees, which had a record for each fired employee, along with information about their
termination (when they were fired, why they were let go, etc.).
If we wanted to obtain employee information about those who had been fired, we could use:
Select *
from Employees
Where
employeeId IN (Select employeeId From firedEmployees)
|
Armed with this knowledge about the IN operator, you might want to create a stored procedure that takes in a comma-delimited
string and retrieves those records that match the string. Something like:
CREATE PROCEDURE getEmployeesByFirstName
(
@CommaDelimitedListOfFirstNames nvarchar(1000)
)
AS
Select *
from Employees
Where
FirstName IN (@CommaDelimitedListOfFirstNames)
|
However, the IN operator cannot accept a comma-delimited string. It only accepts comma-delimited constants or the results
of a SQL query. There are two workarounds that I'm familiar with to overcome this hurdle. One is to create an nvarchar
local variable in the stored procedure, and concatenate the SQL query in with the comma-delimited list to use with the IN
operator. An example of this is discussed in this SQL Guru post. While this
option is viable, it leads to rather unreadable stored procedure syntax that can be difficult to debug.
A cleaner approach is to create a user-defined function (UDF) that can take a comma-delimited string and convert it into
the table format the IN operator can work with. In this article we'll examine how to implement this latter approach.
Creating a Split UDF
What we need to do is to create a function that takes a comma-delimited string and splits it into a recordset representation.
Since this functionality is similar to the Split function in various programming languages, which splits
a delimited string into a string array, I decided to call this UDF Split.
There are generally two parameters to a split function: the list
to split and the charatcer(s) to split on, the delimiter. In the following function we begin by declaring
our input variables - @List, the list to split, and @SplitOn, the delimiter(s) to split on.
The return value of this UDF is a table with two fields: Id, an identity column, and Value,
an nvarchar(100) column.
What Split will do is turn a delimited list into a table with a row for each delimited "piece" of the
input string. So, if you passed in the string "1,3,6,7,10" and specified the delimiter as ",", you'd get back a five-record
table with the row values: "1", "3", "6", "7", and "10".
Now, to create a user-defined function for Microsoft SQL Server, use Enterprise Manager and expand the database for
which you want to add a UDF. You should see a 'User Defined Functions' label expanded underneath the database.
Right-click and select 'New User Defined Function' (see the screenshot below). From here, enter the code
for the UDF. (Note: we'll be examining the entire UDF in pieces over the remainder of this article - you need to add the
entire code to the UDF. You can view the complete UDF code here. For more
information on UDFs, be sure to read: User Defined Functions.)
CREATE FUNCTION dbo.Split
(
@List nvarchar(2000),
@SplitOn nvarchar(5)
)
RETURNS @RtnValue table
(
Id int identity(1,1),
Value nvarchar(100)
)
AS
BEGIN
|
The main body of the function simply loops through the string finding the first occurrence of the delimiter on each
pass. Once the delimiter has been found, the string is broken into two pieces. The first piece is inserted into
the result table while the second piece replaces the original list. The loop continues until no more occurrences of
the delimiter are found. Lastly, the remainder of the list is added to the result table.
To start the loop, we want our while condition to check for the existence of the delimiter in our list string. The
T-SQL function Charindex will return the character position of the first occurrence of the delimiter.
To use the Charindex function we pass first the delimiter (@SplitOn) and second the list
string (@List). (For more information on CharIndex be sure to consult the
technical documentation.)
While (Charindex(@SplitOn,@List)>0)
Begin
|
Once we are inside the loop, we know that the delimiter still exists in the string. Using the Substring
function, which requires the 'source string', the 'starting character number', and the 'number of characters', we can
retrieve all of the characters before the delimiter. The 'number of characters' is determined by using the
Charindex function. We use the same parameters we used in the while condition and are returned the
position of the first character of the delimiter. Since we do not want to include any part
of the delimiter in our value we subtract one from the position to get the length. (For more information on the
Substring function, consult the technical
docs.)
Insert Into @RtnValue (value)
Select
Value = ltrim(rtrim(Substring(@List,1,Charindex(@SplitOn,@List)-1)))
|
Now that we have stored the first value in the list, we want to remove the first value and the delimiter that follows
it. We will again use the Substring function. This time the 'starting character number' is found by
adding the first position to the length of the delimiter. The 'number of characters' is the length of the
list, which will exceed the length of the string (In SQL this is not a problem for Substring, but not
a good practice across the board). The End continues the loop.
Set @List = Substring(@List,Charindex(@SplitOn,@List)+len(@SplitOn),len(@List))
End
|
Finally, after finishing the loop there may be a value left in the list, so we store what ever is left just in case.
Return the table and you have a split function.
Insert Into @RtnValue (Value)
Select Value = ltrim(rtrim(@List))
Return
END
|
Using the Split UDF
Now that we've created the Split user-defined function, let's look at using it in a SQL query or stored
procedure. First, before seeing an example, realize that the Split UDF returns a recordset with the
Value as type nvarchar. Depending on the data type of the column that's participating in
the IN operator, you might need to use the T-SQL CONVERT method to cast Value to that data type.
For example, in the following syntax you can see how Value must be cast to an int because
employeeId is an int.
CREATE PROCEDURE getEmployeesByID(@employeeList nvarchar(1000))
AS
Select
employeeId, -- integer (int)
FirstName,
LastName
from #myTable
Where
employeeId IN (Select convert(int,Value) from dbo.Split(@employeeList,','))
|
Notice how this function is used - Split returns a recordset, and we pick out the Value
portion of this recordset using a SELECT statement where the table name is Split.
Split receives two input parameters - the delimited list (@employeeList, in this example),
and the delimiter.
An ASP/HTML Example
One of the projects I have worked on was a user management system that maintained the access rights to a several
internal Web applications as well as various permissions in each application. I stored the access rights in a table
called hasAccess that contained the columns userId and permissionId. To manage a
user's access, I used an ASP page with two multiple <select> lists.
One list provides a list of permissions the user doesn't have, the other a list of what permissions they do
have. The user interface, then, allows the user to move permissions back and forth between this two disparate lists.
The HTML markup sent to the browser for a user with no access to permissions #1, #5, #7, and #8, but access for
permissions #2, #3, #4, and #6, would look like:
<select name="noAccessList">
<option value="1">Permission #1</option>
<option value="5">Permission #5</option>
<option value="7">Permission #7</option>
<option value="8">Permission #8/option>
</select>
...
<select name="hasAccessList">
<option value="2">Permission #2</option>
<option value="3">Permission #3</option>
<option value="4">Permission #4/option>
<option value="6">Permission #6/option>
</select>
|
With such permissions there would be four rows in the hasAccess table for this user:
userId permissionId
----------- ------------
1 2
1 3
1 4
1 6
If I want to delete multiple permissions from the hasAccess table - say permissions 2, 3, and 6 - then I
will select those three permissions in the list box and then submit my form. On the other side of the post,
Request.Form("hasAccess") will return the comma separate list "2,3,6". I can then pass this string to a
stored procedure and have it use the Split UDF to remove those permissions.
CREATE PROCEDURE DeletePermissions(@UserId int, @removeAccessList nvarchar(255))
AS
Delete From [hasAccess]
Where userId = @UserId and
permissionId in (Select convert(int,Value) From dbo.Split(@removeAccessList,','))
|
Happy Programming!
By Corey Aldebol