It is often necessary to associate multiple rows within one table with a single row within another table, for example several contacts associated with one company, or several categories mapped to one company or contact.
This raises the problem of how to allow your end users to search on multiple items rather than just one, while displaying the main (root) item only once.
Let me explain a scenario that you might encounter: –
You have a contacts table, which is linked to a companies table, which is in turn linked to a keywords table. Any one company can have multiple contacts and multiple keywords associated with it.
Now say you want to allow a multi-keyword search against the keywords table, where a company must have all of the searched keywords to match, and you only want one row per contact to be returned. You may also simply want to display all the keywords associated with a company when viewing a list of contacts.
Traditionally this would have been quite complex to achieve. However, with the CROSS APPLY syntax supported in Microsoft SQL Server 2005 and above, this becomes a whole lot easier.
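To make the scenario concrete, here is a minimal schema sketch. The table names cont_companies and cont_keyword and the columns Company_ID, CompanyID and Keyword are taken from the queries later in this post; the data types, the CompanyName column and the whole cont_contacts table are assumptions made purely for illustration.

```sql
-- Hypothetical schema: only cont_companies.Company_ID, cont_keyword.CompanyID
-- and cont_keyword.Keyword appear in the post; everything else is assumed.
CREATE TABLE cont_companies (
    Company_ID  int          NOT NULL PRIMARY KEY,
    CompanyName varchar(100) NOT NULL              -- assumed column
)

CREATE TABLE cont_keyword (
    CompanyID int         NOT NULL REFERENCES cont_companies (Company_ID),
    Keyword   varchar(50) NOT NULL
)

CREATE TABLE cont_contacts (                       -- assumed table
    ContactID   int          NOT NULL PRIMARY KEY,
    CompanyID   int          NOT NULL REFERENCES cont_companies (Company_ID),
    ContactName varchar(100) NOT NULL
)

-- Sample rows matching the outputs shown in this post (company 19).
-- One INSERT per row, because multi-row VALUES lists are not available in SQL 2005.
INSERT INTO cont_companies (Company_ID, CompanyName) VALUES (19, 'Example Ltd')
INSERT INTO cont_keyword (CompanyID, Keyword) VALUES (19, 'Reseller')
INSERT INTO cont_keyword (CompanyID, Keyword) VALUES (19, 'Ireland')
INSERT INTO cont_contacts (ContactID, CompanyID, ContactName) VALUES (1, 19, 'A. Contact')
```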
CROSS APPLY
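Put simply, the APPLY operator works like a join between a table and a table-valued function: CROSS APPLY behaves like an INNER JOIN (rows with no match are dropped), while its counterpart OUTER APPLY behaves like a LEFT OUTER JOIN.
A simple example is one commonly shown for the Microsoft AdventureWorks sample database: –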
USE AdventureWorks
GO
CREATE FUNCTION Sales.fnTopNOrders(@CustomerID AS int, @n AS int)
RETURNS TABLE
AS
RETURN
    SELECT TOP(@n) SalesOrderID, ShipDate = convert(char(10), ShipDate, 112),
        TotalDue = convert(varchar, TotalDue, 1)
    FROM AdventureWorks.Sales.SalesOrderHeader
    WHERE CustomerID = @CustomerID
    ORDER BY TotalDue DESC
GO
SELECT StoreName = s.Name, [Top].ShipDate, [Top].SalesOrderID,
    TotalDue = '$' + [Top].TotalDue
FROM AdventureWorks.Sales.Store AS s
JOIN AdventureWorks.Sales.Customer AS c
    ON s.CustomerID = c.CustomerID
CROSS APPLY
    AdventureWorks.Sales.fnTopNOrders(c.CustomerID, 5) AS [Top]
WHERE CustomerType = 'S'
ORDER BY StoreName, convert(money, TotalDue) DESC
GO
In this sample the table-valued function returns the top ‘n’ (largest) orders for a store. The SELECT statement would normally be limited to the stores, but with the CROSS APPLY join in place the store information can be combined (joined) with the results for the top orders (in this case the top 5 orders).
The same result could have been achieved previously by using a temporary table and joining with that, but that can be tedious to code and maintain. Using CROSS APPLY to join the table to the function’s results is much neater and can be quite powerful.
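One thing worth knowing about the query above is that CROSS APPLY drops any store for which the function returns no rows. If you want to keep those stores, OUTER APPLY is the counterpart to use. The following is only a sketch of that variation, reusing the Sales.fnTopNOrders function defined above:

```sql
-- Same join as before, but OUTER APPLY keeps stores that have no orders;
-- for those stores the [Top] columns come back as NULL.
SELECT StoreName = s.Name, [Top].ShipDate, [Top].SalesOrderID, [Top].TotalDue
FROM AdventureWorks.Sales.Store AS s
JOIN AdventureWorks.Sales.Customer AS c
    ON s.CustomerID = c.CustomerID
OUTER APPLY
    AdventureWorks.Sales.fnTopNOrders(c.CustomerID, 5) AS [Top]
WHERE c.CustomerType = 'S'
ORDER BY StoreName
```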
Back to the problem…
So how does CROSS APPLY help us list all our keywords in one column for our company?
Well the answer comes from SQL’s XML features… which can return results as a table-value, with just one row and one column… and we can manipulate the format of the output so we have something nice and neat to work with.
FOR XML Clause
The FOR XML clause can run in one of several modes, which determine the shape of the resulting XML. These modes are RAW, AUTO, EXPLICIT and PATH.
Without going into too much detail, the mode that helps us with our problem is PATH, so we will be using the FOR XML PATH clause in our query.
The PATH mode provides a simple way to mix elements and attributes, in order to output some XML based on our data (for more in-depth details take a look at http://msdn2.microsoft.com/en-us/library/ms189885.aspx and http://msdn2.microsoft.com/en-us/library/ms190922.aspx)
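As a quick illustration of mixing elements and attributes, here is a small sketch against the cont_keyword table used in the examples that follow. In PATH mode a column alias starting with "@" becomes an attribute of the row element, and an alias of text() becomes the element's text; the element name 'keyword' is simply one I picked for this example.

```sql
-- Attribute columns ([@id]) must be listed before the other columns.
SELECT CompanyID AS [@id], Keyword AS [text()]
FROM cont_keyword
WHERE CompanyID = 19
FOR XML PATH('keyword')
```

With the two keywords used in this post, this should return something like <keyword id="19">Reseller</keyword><keyword id="19">Ireland</keyword> (the row order is not guaranteed without an ORDER BY).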
Now in our problem we are looking to retrieve all the keywords for a company and display them in one column. So let’s see what we can get from SQL using the FOR XML clause.
If we simply used the following: –
SELECT Keyword FROM cont_keyword WHERE CompanyID = 19
FOR XML PATH
We would be returned one table with one row and one column, the contents of which would be something like: –
<row>
<Keyword>Reseller</Keyword>
</row>
<row>
<Keyword>Ireland</Keyword>
</row>
This is not quite what we are looking for, so we shall modify the PATH mode clause to exclude the row element: –
SELECT Keyword FROM cont_keyword WHERE CompanyID = 19
FOR XML PATH('')
Now we get an output something like: –
<Keyword>Reseller</Keyword>
<Keyword>Ireland</Keyword>
This is getting much closer to something we can use. Now, when outputting something as XML in SQL we can tell it to output any column as plain text without any element wrappers. This would be done as follows: –
SELECT Keyword AS [text()] FROM cont_keyword
WHERE CompanyID = 19 FOR XML PATH('')
But this isn’t much use, as it would just return one string with all the keywords joined together: –
ResellerIreland
Modifying our query a little more will allow us to separate our keywords: –
SELECT '{' + Keyword + '},' AS [text()]
FROM cont_keyword WHERE CompanyID = 19
FOR XML PATH('')
I have chosen to surround my keywords with "{" and "}"; this allows me to easily search for specific categories in a LIKE statement. If I simply separated them with commas, then where there are categories that are similar (for example "Architect", "Structural Architect", etc.) I would run into problems when constructing my LIKE clause. The query above would give a result looking something like: –
{Reseller},{Ireland},
Now this is something that we can use… well almost, you’ll notice if you run it, the table column is given a random XML name, as is the table the result is presented in.
SIDE NOTE: – You may have been tempted to use square brackets instead ("[" and "]"), but we must remember that these have a special meaning in LIKE patterns (they delimit a set of characters to match) and would have to be escaped in any subsequent queries.
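To see the difference between the delimiters in practice, here is a small self-contained sketch. The variable names and sample strings are made up for the illustration; each column shows whether a search for the keyword "Architect" matches a list that only contains "Structural Architect" and "Engineer".

```sql
DECLARE @commas varchar(100), @braces varchar(100), @brackets varchar(100)
SET @commas   = 'Structural Architect,Engineer,'
SET @braces   = '{Structural Architect},{Engineer},'
SET @brackets = '[Structural Architect],[Engineer],'

SELECT
    -- 'match': a false positive, "Architect" is found inside "Structural Architect"
    CASE WHEN @commas LIKE '%Architect%' THEN 'match' ELSE 'no match' END AS plain_commas,
    -- 'no match': the braces force a whole-keyword comparison
    CASE WHEN @braces LIKE '%{Architect}%' THEN 'match' ELSE 'no match' END AS braces,
    -- 'match': unescaped, [Architect] means "any one of the characters A, r, c, h, i, t, e"
    CASE WHEN @brackets LIKE '%[Architect]%' THEN 'match' ELSE 'no match' END AS brackets_unescaped,
    -- 'no match': [[] matches a literal "[", so this looks for the text "[Architect]"
    CASE WHEN @brackets LIKE '%[[]Architect]%' THEN 'match' ELSE 'no match' END AS brackets_escaped
```

So square brackets can be made to work, but every pattern needs the extra [[] escape, which is why the braces are the easier choice here.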
If we just need the column to have a legible name, we could use something like: –
SELECT (SELECT '{' + Keyword + '},' AS [text()]
    FROM cont_keyword WHERE CompanyID = 19
    FOR XML PATH('')) AS keywordlist
For our purposes, however, we also need it defined as a table, which is where part of the CROSS APPLY syntax comes in…
Back to the problem
Now we shall use a simple CROSS APPLY join to join our companies table and keywords table-value together.
SELECT
    cont_companies.*,
    LEFT(kw.keywordlist, LEN(kw.keywordlist) - 1) AS keywords
FROM
    cont_companies
CROSS APPLY
    (SELECT '{' + Keyword + '},' AS [text()]
    FROM cont_keyword
    WHERE CompanyID = cont_companies.Company_ID
    FOR XML PATH('')) kw(keywordlist)
In the code above you will notice a few things. First off, we select all the columns from the companies table; we then take the output from our XML query and trim off the trailing comma. The subquery inside the CROSS APPLY is filtered on the company ID field, so we get a list of keywords for each company.
In this sample the keyword output is put into a table called ‘kw’, with the column ‘keywordlist’. ‘keywordlist’ contains the full output from the XML query we built above, which includes an additional comma on the end. In our select statement we trim this off because it is not needed, and should not be displayed at any point to end users.
We select only the columns from the companies table, plus the trimmed keywords column, because there is no reason to return the untrimmed ‘keywordlist’ column from the ‘kw’ derived table as well.
For my purposes I’d actually define this query as a view. Be aware, however, that SQL 2005 Management Studio will tell you that it cannot parse the query if you edit or create the view through the designer; as with a couple of other features (like ‘newsequentialid()’), the designer isn’t able to parse this type of query.
Because the FOR XML clause will only ever return one row an alternative to using the CROSS APPLY join would be to call the FOR XML clause SELECT twice, for example: –
SELECT
    cont_companies.*,
    LEFT((
        SELECT '{' + Keyword + '},' AS [text()]
        FROM cont_keyword
        WHERE CompanyID = cont_companies.Company_ID
        FOR XML PATH(''))
    ,
    LEN((
        SELECT '{' + Keyword + '},' AS [text()]
        FROM cont_keyword
        WHERE CompanyID = cont_companies.Company_ID
        FOR XML PATH(''))
    ) - 1 ) AS keywords
FROM cont_companies
Now this will give the same result but would run the FOR XML query twice, and if we have several thousand companies with several keywords associated with each, then the overhead of running the FOR XML twice as many times as is needed will soon add up.
Using the CROSS APPLY join should therefore be more efficient and easier to manage.
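To round off the original problem, here is a sketch of how the multi-keyword search could look once the CROSS APPLY query is wrapped in a view. The view name vw_company_keywords, the cont_contacts table and its ContactName and CompanyID columns are all assumptions made for illustration; only cont_companies and cont_keyword come from the queries above.

```sql
-- vw_company_keywords is a hypothetical name; the body is the CROSS APPLY query from above.
CREATE VIEW vw_company_keywords
AS
SELECT
    cont_companies.*,
    LEFT(kw.keywordlist, LEN(kw.keywordlist) - 1) AS keywords
FROM
    cont_companies
CROSS APPLY
    (SELECT '{' + Keyword + '},' AS [text()]
    FROM cont_keyword
    WHERE CompanyID = cont_companies.Company_ID
    FOR XML PATH('')) kw(keywordlist)
GO

-- Companies that have BOTH keywords; each company is returned once.
SELECT *
FROM vw_company_keywords
WHERE keywords LIKE '%{Reseller}%'
    AND keywords LIKE '%{Ireland}%'

-- One row per contact, assuming a cont_contacts table with a CompanyID column.
SELECT c.ContactName, v.keywords
FROM cont_contacts AS c
JOIN vw_company_keywords AS v
    ON v.Company_ID = c.CompanyID
WHERE v.keywords LIKE '%{Reseller}%'
    AND v.keywords LIKE '%{Ireland}%'
```

One caveat: FOR XML escapes characters such as &, < and > in its output (an ampersand comes back as &amp;), so a keyword containing one of those characters would need to be searched for in its escaped form.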
Update: – 22nd March 2008
======================
On further usage I found that the resulting "keywords" column from both the queries shown above would, on occasion, return the data as binary data. As such they should have a CONVERT statement applied to them in order to correctly return the required data as a TEXT data-type, i.e.: –
SELECT
    cont_companies.*,
    CONVERT(TEXT,
        LEFT(kw.keywordlist, LEN(kw.keywordlist) - 1)
    ) AS keywords
FROM
    cont_companies
CROSS APPLY
    (SELECT '{' + Keyword + '},' AS [text()]
    FROM cont_keyword
    WHERE CompanyID = cont_companies.Company_ID
    FOR XML PATH('')) kw(keywordlist)
or
SELECT
    cont_companies.*,
    CONVERT(TEXT, LEFT((
        SELECT '{' + Keyword + '},' AS [text()]
        FROM cont_keyword
        WHERE CompanyID = cont_companies.Company_ID
        FOR XML PATH(''))
    ,
    LEN((
        SELECT '{' + Keyword + '},' AS [text()]
        FROM cont_keyword
        WHERE CompanyID = cont_companies.Company_ID
        FOR XML PATH(''))
    ) - 1 )) AS keywords
FROM cont_companies
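A possible variation on the CONVERT fix: the TEXT data type is marked as deprecated in Microsoft's documentation in favour of varchar(max) (or nvarchar(max) for Unicode data), both of which are available from SQL 2005. The sketch below simply swaps the target type. I am assuming it addresses the binary issue in the same way as converting to TEXT, so treat it as something to test rather than a confirmed fix.

```sql
-- Same CROSS APPLY query, converting to varchar(max) instead of TEXT.
-- Use nvarchar(max) instead if the Keyword column holds Unicode data.
SELECT
    cont_companies.*,
    CONVERT(varchar(max),
        LEFT(kw.keywordlist, LEN(kw.keywordlist) - 1)
    ) AS keywords
FROM
    cont_companies
CROSS APPLY
    (SELECT '{' + Keyword + '},' AS [text()]
    FROM cont_keyword
    WHERE CompanyID = cont_companies.Company_ID
    FOR XML PATH('')) kw(keywordlist)
```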