Parse delimited values in SQL Server with STRING_SPLIT

September 24, 2020

Have you ever needed to get values from a delimited string in SQL Server? This post will look at how to do this with the SQL Server function, STRING_SPLIT.

Before the inclusion of STRING_SPLIT, there were a couple of options to achieve this. Define a string splitting function yourself, or use recursive CTEs. Thankfully there's a way that's simpler and clearer.

A note on availability: STRING_SPLIT is only available in SQL SERVER 2016 and later and requires the compatibility to be 130 or above.

STRING_SPLIT

STRING_SPLIT takes two required inputs, a string (varchar, nvarchar, nchar, or char type), and a separator. The function has the following syntax.

STRING_SPLIT(string, seperator)

The function returns a table with a single column called value with a row per separated value.

Let's see a simple example.

DECLARE @big_five VARCHAR(MAX) = 'Apple,Amazon,Alphabet,Facebook,Microsoft'
SELECT * FROM STRING_SPLIT(@big_five, ',')

Because the function returns a table, we can use it as we would any other table. Here we'll split the values from our simple example and use them to filter a table.

INSERT INTO #large_companies(company_name, market_cap)
SELECT
    company_name
    ,market_cap
FROM
(
VALUES
('Saudi Aramco',1684.8),('Microsoft',1359),('Apple', 1285.5)
,('Amazon', 1233.4),('Alphabet', 919.3),('Facebook', 583.7)
,('Alibaba', 545.4),('Tencent Holdings', 509.7),('Berkshire Hathaway', 455.4)
) as data(company_name, market_cap)

SELECT
    #large_companies.company_name
    ,#large_companies.market_cap
FROM
    #large_companies
WHERE
    #large_companies.company_name IN (SELECT [value] FROM STRING_SPLIT(@big_five, ','))

A quick note about spaces: the STRING_SPLIT function won't trim spaces before or after the delimiter. For example, a string "A, B, C" will include the space before B and C.

Uses

STRING_SPLIT can be used in different ways. We'll set up some dummy data in a temp table as our data source to demonstrate these.

DROP TABLE IF EXISTS #world_cup_players
CREATE TABLE #world_cup_players (
    player VARCHAR(30)
    ,tournaments VARCHAR(MAX)
)

INSERT INTO #world_cup_players(player, tournaments)
SELECT
    data.player
    ,data.tournaments
FROM
(
VALUES
('Lionel Messi','2006,2010,2014,2018')
,('Thierry Henry','1998,2002,2006,2010')
,('Ronaldo','1998,2002,2006')
,('Paolo Maldini','1990,1994,1998,2002')
,('Oliver Kahn','2002,2006')
) as data(player,tournaments)

Use STRING_SPLIT with a table.

If the separated values are in a column in a table, we can use CROSS APPLY to apply the function per row. The result is a row per delimited value.

SELECT 
    #world_cup_players.player
    ,tournament.[value]
FROM 
    #world_cup_players
        CROSS APPLY STRING_SPLIT(#world_cup_players.tournaments, ',') as tournament

As an aggregator

We can take the above technique further and use it to aggregate the values returned by STRING_SPLIT.

SELECT 
    #world_cup_players.player
    ,COUNT(tournament.[value]) as num_played
FROM 
    #world_cup_players
        CROSS APPLY STRING_SPLIT(#world_cup_players.tournaments, ',') as tournament
GROUP BY
    #world_cup_players.player

Filter value in a separated list

We can use the function to filter records that have a specific value inside the delimited column.

Below, we filter for all players who played at the 1998 World Cup tournament.

SELECT 
    #world_cup_players.player
FROM 
    #world_cup_players
WHERE
    '1998' IN (SELECT value FROM STRING_SPLIT(#world_cup_players.tournaments, ','))

Combine with ROW_NUMBER

Using ROW_NUMBER, we can give each value a row number to act as an index. This is useful for finding the Nth value in the delimited string.

SELECT
    base_data.player
    ,base_data.[value] as second_tournament
FROM
(
SELECT 
    #world_cup_players.player
    ,ROW_NUMBER() OVER (PARTITION BY #world_cup_players.player ORDER BY tournament.[value]) as tournament_num    ,tournament.[value]
FROM 
    #world_cup_players
        CROSS APPLY STRING_SPLIT(#world_cup_players.tournaments, ',') as tournament
) as base_data
WHERE
    base_data.tournament_num = 2

Conclusion

And there we have it! A brief look at STRING_SPLIT, a simple but useful function.

See also: