How to… Showing a sub-query containing multiple rows within one column of a SELECT statement (using CROSS APPLY and SQL’s XML abilities)

It is often necessary to associate multiple rows within one table to a single row within another table, for example several contacts associated with one company, or several categories mapped to one company or contact.

This raises the problem of how to allow your end users to search on multiple items rather than just one, and only display the main route item once.

Let me explain a scenario that you might encounter: –

You have a contacts table, which is linked to a companies table, which is again liked to a keywords table. Any one company can have multiple contacts and multiple keywords associated with it.

Now say you want to allow a multi keyword search against the keywords table, but the search must contain both keywords for the same company and you only want one row per contact to be returned. You may also simple want to display all the keywords associated with a company when viewing a list of contacts.

Traditionally this would have been quite complex to achieve. However with Microsoft SQL 2005 and above’s support for the CROSS APPLY syntax, this becomes a whole lot easier to achieve.


CROSS APPLY

Put simply CROSS APPLY is an INNER or OUTER JOIN between a table and a table-valued function.

A simple example is one commonly shown for the Microsoft AdventureWorks smaple database: –

USE

AdventureWorks
GO

CREATE

FUNCTION
   
Sales
.fnTopNOrders(
        @CustomerID
AS int, @n AS INT) RETURNS TABLE AS RETURN

    SELECT TOP(@n) SalesOrderID, ShipDate = convert(char(10), ShipDate,112),
       
TotalDue=convert(varchar,TotalDue,1) 
   
FROM AdventureWorks.Sales.SalesOrderHeader
   
WHERE CustomerID = @CustomerID ORDER BY TotalDue DESC
GO

SELECT
   

StoreName=s.Name, [Top].ShipDate, [Top].SalesOrderID, 
    TotalDue
=‘$’+[Top].TotalDue
FROM AdventureWorks.Sales.Store AS s
JOIN AdventureWorks.Sales.Customer AS c
ON s.CustomerID = c.CustomerID
CROSS APPLY
    AdventureWorks.Sales.fnTopNOrders(c.CustomerID, 5) AS [Top]
WHERE CustomerType=‘S’
ORDER BY StoreName, convert(money,TotalDue) DESC
GO

In this sample the table valued function returns the top ‘n’ (largest) orders for a store. The SELECT statement would normally be limited to the stores, but with the CROSS APPLY join in place the store information can be combined (joined) with the results for the top orders (in this case the top 5 orders).

The same result could have been achieved previously by using a temporary table and joining with that, but that can be tedious to code and maintain. Using CROSS APPLY to join the table to the function’s results is much neater and can be quite powerful.


Back to the problem…

So how does CROSS APPLY help us list all our keywords in one column for our company?

Well the answer comes from SQL’s XML features… which can return results as a table-value, with just one row and one column… and we can manipulate the format of the output so we have something nice and neat to work with.


FOR XML Clause

The FOR XML Clause has several modes that can run, which will determine the shape of that resulting XML. These modes are RAW, AUTO, EXPLICIT or PATH.

Without going into too much detail the mode we are interested in to help us with our problem is the PATH mode. So using the FOR XML PATH clause in our query.

The PATH mode provides a simple way to mix elements and attributes, in order to output some XML based on our data (for more in-depth details take a look at http://msdn2.microsoft.com/en-us/library/ms189885.aspx and http://msdn2.microsoft.com/en-us/library/ms190922.aspx)

Now in our problem we are looking to retrieve all the keywords for a company and display them in one column. So lets see what we can get from SQL using the FOR XML Clause

If we simply used the following: –

SELECT Keyword FROM cont_keyword WHERE CompanyID = 19
FOR XML PATH

We would be returned one table with one row that had one column. The contents of which would be something like: –

<

row>
    <
Keyword>Reseller</Keyword>
</
row>
<
row>
    <
Keyword>Ireland</Keyword>
</
row>

This is not quite what we are looking for so we shall modify the PATH mode clause to exclude the row element: –

SELECT

Keyword FROM cont_keyword WHERE CompanyID = 19
FOR XML PATH() 
 

Now we get an output something like: –

<

Keyword>Reseller</Keyword>
<
Keyword>Ireland</Keyword>

Which is getting much closer to something we can use. Now when outputting something as XML in SQL we can tell it to output any column as plain text without any element wrappers. This would be done as follows: –

SELECT

Keyword AS [text()] FROM cont_keyword
WHERE CompanyID = 19 FOR XML PATH()

But this isn’t much use as it would just return one string with every keywords joined together.

ResellerIreland

Modifying our query a little more will allow us to separate our keywords: –

SELECT

‘{’ + Keyword + ‘},’ AS [text()]
FROM cont_keyword WHERE CompanyID = 19
FOR XML PATH()
 

I have chosen to surround my keywords with "{" and "}", this is to allow me to easily search for specific categories in a LIKE statement. If I simply separated them with commas then if there are categories that are similar (for example "Architect", "Structural Architect", etc), I would run into problems when constructing my LIKE clause. The query above would give a result looking something like: –

{Reseller},{Ireland},

Now this is something that we can use… well almost, you’ll notice if you run it, the table column is given a random XML name, as is the table the result is presented in.

SIDE NOTE: – You may have been tempted to use square brackets instead ("[" and "]") but we must remember that these are reserved characters in LIKE statements and would have to be escaped in any subsequent queries.

We could use something like: –

SELECT (SELECT ‘{’ + Keyword + ‘},’ AS [text()]
FROM cont_keyword WHERE CompanyID = 19
FOR XML PATH()) AS keywordlist
 

If we just need the column to have a legible name, but for our purposes we also need it defined as a table, which is where part of the CROSS APPLY syntax comes in…


Back to the problem

Now we shall use a simple CROSS APPLY join to join our companies table and keywords table-value together.

SELECT
    cont_companies
.*, 
    LEFT(kw.keywordlist, LEN(kw.keywordlist) 1) AS keywords
FROM 
    cont_companies
CROSS APPLY
    (SELECT ‘{’ + Keyword + ‘},’ AS [text()]
        FROM cont_keyword
        WHERE COMPANYID = cont_companies.Company_ID
        FOR XML PATH()) kw(keywordlist)

Now in the code above you will notice a few things, first of, we only select all the columns from the companies table, then we actually take the output from our XML table and trim of the last comma character. The CROSS APPLY join is done based on the CompanyID filed, so we will get a list of keywords for each company.

In this sample the keyword output is put into a table called ‘kw’, with the column ‘keywordlist’. ‘keywordlist’ contains the full output from the XML query we built above, which includes an additional comma on the end. In our select statement we disregard this because it is not needed, and should not be displayed at anypoint to end users.

We only select all the columns from the companies table, plus the trimmed keywords column, because we are not interested in any of the other data that will be in the ‘kw’ temporary table, so why return it.

For my purposes I’d actually define this query as a view… you need to watch however, because SQL 2005 Management Studio will tell you that it cannot parse the query if you edit or create this through the designed… Along with a couple of other features (like ‘newsequentialid()’) the SQL 2005 Management Studio isn’t able to parse this type of query and will tell you so.

Because the FOR XML clause will only ever return one row an alternative to using the CROSS APPLY join would be to call the FOR XML clause SELECT twice, for example: –

SELECT

cont_companies.*,
    LEFT((
        SELECT ‘{’ + Keyword + ‘},’ AS [text()]
        FROM cont_keyword
        WHERE CompanyID = cont_companies.Company_ID
        FOR XML PATH())
        ,
        LEN((
            SELECT ‘{’ + Keyword + ‘},’ AS [text()]
            FROM cont_keyword
            WHERE CompanyID = cont_companies.Company_ID
            FOR XML PATH())
        ) 1 ) AS keywords
FROM cont_companies

Now this will give the same result but would run the FOR XML clause query twice, and if we have several thousand companies with several keywords associated with each, then the overhead of running the FOR XML twice as many times as is needed will soon add up.

Using the CROSS APPLY join should therefore be more efficient and easier to manage.

Update: – 22nd March 2008
======================

On further usage I found that the resulting "keywords" column from both the queries shown above would, on occasion, return the data as binary data. As such they should have a CONVERT statement applied to them in order to correctly return the required data as a TEXT data-type, i.e.: –

SELECT
    cont_companies
.*, 
    CONVERT(TEXT,
    LEFT(
kw.keywordlist, LEN(kw.keywordlist) 1)
   
)
AS keywords
FROM 
    cont_companies
CROSS APPLY
    (SELECT ‘{’ + Keyword + ‘},’ AS [text()]
        FROM cont_keyword
        WHERE COMPANYID = cont_companies.Company_ID
        FOR XML PATH()) kw(keywordlist)

or

SELECT

cont_companies.*,
    CONVERT(TEXT, LEFT((
        SELECT ‘{’ + Keyword + ‘},’ AS [text()]
        FROM cont_keyword
        WHERE CompanyID = cont_companies.Company_ID
        FOR XML PATH())
        ,
        LEN((
            SELECT ‘{’ + Keyword + ‘},’ AS [text()]
            FROM cont_keyword
            WHERE CompanyID = cont_companies.Company_ID
            FOR XML PATH())
        ) 1 )) AS keywords
FROM cont_companies

The SQL & MS Access LIKE statement

Anyone who deals with SQL and Access queries is probably familiar with the LIKE statement and the use of the standard wildcards (% or *).

Less documented however is the additional wild cards that you should be aware of if you are building queries that will be run against your database when constructing more complicated LIKE statements.

The following describes the wildcard’s that are available in Microsoft SQL.

Microsoft T-SQL LIKE statement wildcards

Wildcard character

Description

Example

%

Any string of zero or more characters.

SELECT * FROM [myTable] WHERE [title] LIKE ‘%computer%’This will find all the entries the column ‘title’ from ‘myTable’ contains the word ‘computer’ anywhere in the field.

_ (underscore)

Any single character.

SELECT

* FROM [myTable] WHERE [surname] LIKE ‘sm_th’

This will find all five-letter last names that start with ‘sm’ and end with ‘th’ (for example Smith, Smeth, etc).

[…]

Any single character within the specified range ([a-f]) or set ([abcdef]) or a numeric range [0-9].

SELECT

* FROM [myTable] WHERE [surname] LIKE ‘[C-P]arsen’

This will find all the last names ending with arsen and starting with any single character between C and P, for example Carsen, Larsen, Karsen, Paresen, etc.

[^]

Any single character not within the specified range ([^a-f]) or set ([^abcdef]).

SELECT

* FROM [myTable] WHERE [surname] LIKE ‘de[^l]%’

This will get all the last names starting with de and where the following letter is not l, for example it will return Desmond, but not Delimetry.

In addition after an MS SQL LIKE statement you can include an additional definition of an escape character, which can be placed in front of a wildcard character so that it is treated as a regular character or you could the square brackets as escape characters for your query, for example: –

SELECT * FROM [discounts] WHERE [discount] LIKE ’30[%]’ 
                    or
SELECT * FROM [discounts] WHERE [discount]
   
LIKE ’30!%’ ESCAPE ‘!’

This will select any discounts that are 30%.

Reference: – http://msdn2.microsoft.com/en-us/library/ms179859.aspx


In Microsoft Access the wildcards and syntax is slightly different but similar

The following describes the wildcard’s that are available in Microsoft Access.

Microsoft Access Query LIKE statement wildcards

Wildcard character

Description

Example

*

Any string of zero or more characters.

SELECT

* FROM [myTable] WHERE [title] LIKE ‘*computer*’

This will find all the entries the column ‘title’ from ‘myTable’ contains the word ‘computer’ anywhere in the field.

?

Any single character.

SELECT * FROM [myTable] WHERE [surname] LIKE ‘sm?th’

This will find all five-letter last names that start with ‘sm’ and end with ‘th’ (for example Smith, Smeth, etc).

[…]

Any single character within the specified range ([a-f]) or set ([abcdef]).

SELECT

* FROM [myTable] WHERE [surname] LIKE ‘[C-P]arsen’ 

This will find all the last names ending with arsen and starting with any single character between C and P, for example Carsen, Larsen, Karsen, Paresen, etc.

[!…]

Any single character not within the specified range ([!a-f]) or set ([!abcdef]).

SELECT

* FROM [myTable] WHERE [surname] LIKE ‘de[!l]*’

This will get all the last names starting with de and where the following letter is not l, for example it will return Desmond, but not Delimetry.

#

Any single numeric character.

SELECT

* FROM [myTable] WHERE [postcode] LIKE ‘eh## #??’

This will get all the postcodes starting with EH followed by two numeric characters a space another numeric character and a further two characters, for example it would return EH54 6LY, EH55 8TZ, etc but not EH4 1NJ.

In Microsoft Access there is no ESCAPE definition so you must use square brackets, for example: –

SELECT

* FROM [discounts] WHERE [discount] LIKE ’30[%]’

This will select any discounts that are 30%.


Query Syntax in Microsoft Access on Linked SQL tables 

Now as anyone who has worked with Microsoft Access linked SQL tables will likely know, performing like queries can be complicated… this is because it can sometimes be difficult to determine where a SQL statement/query might be executed (on the SQL Server or my the local Microsoft Access engine).

As a general rule of thumb, if you are running the statement (Microsoft Access query or Access DAO Recordset) directly against the linked table, it will use the SQL Server syntax. For example: –

SELECT

* FROM tblContacts
   
WHERE Firstname LIKE ‘Pa%’ –[WILL WORK]

If however your WHERE clause is in a join query or your query runs some kind of function on some of the data. For example : –

[a bad example at that, yes I would never recommend a statement like this]
SELECT Nz(contactid, 1), * FROM tblContacts
   
WHERE Firstname LIKE ‘Pa%’ –[WILL NOT WORK]

This statement would not return the correct results and would have to be modified to the following: –

SELECT

Nz(contactid, 1), * FROM tblContacts
   
WHERE Firstname LIKE ‘Pa*’ –[WILL WORK]

A more likely scenario would be where you are joining two tables within your Microsoft Access query and you needed to perform a like statement against it. For example: –

SELECT

* FROM tblContacts
   
LEFT JOIN tblCompanies ON
    tblContacts
.CompanyID = tblCompanies.CompanyID
   
WHERE Firstname LIKE ‘Pa%’ –[WILL NOT WORK]

In MS Access this statement would not return the correct results, as the join is performed by MS Access and would have to be modified to the following: –

SELECT

* FROM tblContacts
   
LEFT JOIN tblCompanies ON
        tblContacts
.CompanyID = tblCompanies.CompanyID
   
WHERE Firstname LIKE ‘Pa*’ –[WILL WORK]

This also applies to any MS Access Queries based other queries, even if the other Query is a direct Query against a linked SQL table.


Something to remember when generating queries from code

Now you are most likely writing code that will generate your LIKE statements based on a user’s selections. This applies whether you are coding from MS Access or through ASP, ASP.NET, VB, etc… but not if you are running through parameterised objects, for example a .NET 2.0 dataset query.

If you have to code in your SQL you need to remember to change key characters, this means these reserved wildcards as well as the standard single quotes. For example (in VBA): –

SET

myRS = CurrentDB.OpenRecordset( _
   
"SELECT * FROM tblContacts WHERE FirstName LIKE ‘" & _
    Replace(someVariable,
"’","”",1,-1,1) & "%’", _
    dbOpenDynaset, dbSeeChanges)
‘[WILL NOT WORK]

Though this will work in most situations, if "someVariable" contained any of the wildcard characters, you will experience unexpected results. So you should really do something more like: –

SET

myRS = CurrentDB.OpenRecordset( _
  
"SELECT * FROM tblContacts WHERE FirstName LIKE ‘" & _
    Replace( _
     Replace( _
      Replace( _
       Replace( Replace( _
        Replace(someVariable,
"’","”",1,-1,1) _
       ,
"%","[%]",1,-1,1), "#","[#]",1,-1,1) _
      ,
"[","[[]",1,-1,1) _
     ,
"]","[]]",1,-1,1) _
    ,
"_","[_]",1,-1,1) & "%’", _
    dbOpenDynaset, dbSeeChanges)
‘[WILL WORK IF RAN AGAINST SQL LINKED TABLE]

Or if ran against MS Access table or situation described above: –

SET

myRS = CurrentDB.OpenRecordset( _
    "SELECT * FROM tblContacts WHERE FirstName LIKE ‘" & _
     Replace( _
      Replace( _
       Replace( _
        Replace( Replace( _
         Replace(someVariable,
"’","”",1,-1,1) _
        , "*","[*]",1,-1,1), "#","[#]",1,-1,1)
_
       ,
"[","[[]",1,-1,1) _
      ,
"]","[]]",1,-1,1) _
     ,
"?","[?]",1,-1,1) & "*’", _
    dbOpenDynaset, dbSeeChanges) 
‘[WILL WORK IF RAN AGAINST ACCESS TABLE/QUERY]

As with any form of coding it is usually a good idea to extract out common functions into a small library that can be used throughout your applications.

As such if you do build queries based on user input I’d recommend using a standard function to tidy up any users data and prevent possible conflicts or security flaws. The function could look something like (VBA): –

Function

fnTidyLikeQuery(ByVal myString As Object) As String

        On Error Goto ExitHere
   
‘as it does no harm to escape all characters 
    ‘(SQL or Access specific) this function can 
    ‘take care if both).

    If isNull(myString) Then
       
Goto ExitHere
   
Else
       
fnTidyLikeQuery = _
            Replace( Replace( _
             Replace( Replace( _
               Replace( Replace( _
                Replace( Replace( _
                 myString _
                ,
"’","”",1,-1,1), "#","[#]",1,-1,1) _
               ,
"*","[*]",1,-1,1), "[","[[]",1,-1,1) _
              ,
"%","[%]",1,-1,1), "_","[_]",1,-1,1) _
            
, "]","[]]",1,-1,1), "?","[?]",1,-1,1) 
   
End If

    Exit Funtion
ExitHere:
    fnTidyLikeQuery =
""
End Function

Function

fnTidyQueryString(byVal myString as Variant) as String

    On Error Goto ExitHere

    If isNull(myString) Then
       
Goto ExitHere
   
Else
        fnTidyQueryString= Replace(myString, "’","”",1,-1,1) 
   
End If

    Exit Funtion
ExitHere:
    fnTidyQueryString =
""
End Function

A sample query based on the above functions would look something like: –

SET

myRS = CurrentDB.OpenRecordset( _
    "SELECT * FROM tblContacts WHERE FirstName LIKE ‘*" & _
    fnTidyLikeQuery(someVariable) &
"*’ AND Surname = ‘" & _
    fnTidyQueryString(someOtherVariable) &
"’", _
    dbOpenDynaset, dbSeeChanges) 
‘[WILL WORK IF RAN AGAINST ACCESS TABLE/QUERY (uses * as wildcard)]


Some more samples of LIKE statements

SQL Clause

Access Equivalent

Result

Like ‘%’

Like ‘*’

All except blanks

Like ‘A%’

Like ‘A*’

Starting with ‘A’

Like ‘%A%’

Like ‘*A*’

‘A’ somewhere in the field

Like ‘[A,B,D]’
  or
Like ‘[ABD]’

Like ‘[A,B,D]’
  or
Like ‘[ABD]’

One character an ‘A’ or ‘B’ or ‘D’

Like ‘[A-C]%’

Like ‘[A-C]*’

One character ‘A’ through ‘C’ as the first character

Like ‘[A-C][A-H]%’

Like ‘[A-C][A-H]*’

‘A’ through ‘C’ as the 1st character and ‘A’ through ‘H’ as the 2nd character

Like ‘SM_TH’

Like ‘SM?TH’

Starting with ‘Sm’, ending with ‘th’, and anything for the 3rd character

Like ‘[0-9]%’

Like ‘[0-9]*’
  or
Like ‘#*’

Digit for the 1st character

Like ‘[^a-c]’

Like ‘[!a-c]’

Not in a range of letters ‘a’ through to ‘c’

Like ‘[^a-c]%’

Like ‘[!a-c]*’

Not start with a range of letters

Like ‘[^0-9]%’

Like ‘[!0-9]*’

Not start with a number

How To… Referencing The DAO Engine

In VB Script (or sometimes in general application development and office application development where late binding is needed), it is necessary to create a DAO engine object. Either to access database through DAO, or given there are better ways to access the data more likely to access properties within an Access Database file, and manage tables or hidden file properties for a legacy system.
 
Traditionally you may have used something like: –

Dim

DBEngine As Object
Set DBEngine = CreateObject("DAO.DBEngine")

Which you would expect to get the latest installed/registered version of the engine. This however nolonger works, so we must be a little more indepth and use a function to get our object. Such as: –

Public

Function DAO_DBEngine() As Object

On Error Resume Next

‘try Access 2007 (DAO Engine version 120)
Set GetDBEngine = CreateObject("DAO.DBEngine.120")

If Err.Number <> 0 Then

‘try DAO version 3.6
Err.Clear
Set GetDBEngine = CreateObject("DAO.DBEngine.36")

If Err.Number <> 0 Then

Set GetDBEngine = CreateObject("DAO.DBEngine.35")

End If

End If

End Function

If the function returns nothing the object has not been created, otherwise it will return the latest available version of DAO, and allow the querying of the database engine.