Working with concatenated strings in SQL can be a pain, especially when all the information you need is smashed into a single column. Thankfully, SQL Server provides a solution to cleanly split strings into individual parts using the STRING_SPLIT
function. Let’s walk through how it works, how to join it with other tables, and even how to rearrange split data into separate columns.
Why Use the STRING_SPLIT Function?
Imagine you’ve got a single column in your database jam-packed with multiple values, all separated by a comma (or some other delimiter). Parsing and separating these values manually can be frustrating and error-prone. While you could use other methods such as CHARINDEX
to locate delimiters, that approach quickly becomes tedious, especially with long strings or multiple rows.
Here’s where STRING_SPLIT
saves the day. This function breaks a concatenated string into smaller pieces based on a specified delimiter. The output? It’s not just split strings, but an entire table with each split value appearing as a row.
How STRING_SPLIT Works
The STRING_SPLIT
function requires two inputs:
- The string you want to split.
- The delimiter that separates the values.
Here’s a quick example. Let’s say you’ve got a string, "Adam,Smith"
:
SELECT value
FROM STRING_SPLIT('Adam,Smith', ',');
When running this query, the output will be:
value |
---|
Adam |
Smith |
Each part of the string appears as a separate row in the value
column.
In real-world cases, though, you’ll typically use it on data from a table. Let’s see how that’s done and tackle more advanced scenarios.
Joining STRING_SPLIT with Table Data
Say you want to split employee names stored in a single column within a table called dbo.Employee
. Here’s how you can do it:
-
Setup: Write a
SELECT
statement that pulls values fromSTRING_SPLIT
and the employee table. -
Link with Cross Apply: Since
STRING_SPLIT
returns a table, you’ll need to use theCROSS APPLY
operator to join it with your existing table. This allows each string in the column to be split into separate rows.
Here’s an example query:
SELECT EmployeeID, value
FROM dbo.Employee
CROSS APPLY STRING_SPLIT(Name, ',');
What Happens?
For each EmployeeID
, this query generates separate rows for each value in the Name
column:
EmployeeID | value |
---|---|
1 | Adam |
1 | Smith |
Notice that the function splits the string "Adam,Smith"
into two rows. However, this raises a challenge: What if you need these parts to appear as columns instead?
Turning Rows into Columns
Splitting strings into rows is useful, but sometimes you need those pieces as columns like FirstName
and LastName
. To achieve this, use the PIVOT
function.
Preparing the Data
First, assign each split piece a unique identifier using the ROW_NUMBER
function. This will help you map parts of the string (e.g., first name or last name) to specific columns later.
Here’s how it works:
WITH NameCTE AS (
SELECT
EmployeeID,
value,
ROW_NUMBER() OVER (PARTITION BY EmployeeID ORDER BY EmployeeID) AS RowNum
FROM dbo.Employee
CROSS APPLY STRING_SPLIT(Name, ',')
)
In this Common Table Expression (CTE)
, each split value now has a RowNum
:
EmployeeID | value | RowNum |
---|---|---|
1 | Adam | 1 |
1 | Smith | 2 |
Applying the PIVOT Function
Next, use PIVOT
to rearrange the rows into columns:
SELECT
EmployeeID,
[1] AS LastName,
[2] AS FirstName
FROM NameCTE
PIVOT (
MAX(value)
FOR RowNum IN ([1], [2])
) AS PivotTable;
The PIVOT
function transforms the RowNum
values into column headers, organizing the split parts as LastName
and FirstName
in the output:
EmployeeID | LastName | FirstName |
---|---|---|
1 | Adam | Smith |
Why Use MAX?
The MAX
function ensures the query correctly selects values from the pivoted rows while ignoring nulls. Think of it like picking the most meaningful value for each row-column combination.
Common Issues and Fixes
-
Extra Spaces: After splitting strings, you might notice unwanted spaces in values. Run the
TRIM
function to remove them:TRIM(value)
-
Multiple Delimiters: If your strings use multiple delimiters, you’ll need additional logic to handle them. This might require
REPLACE
functions or other custom setups. -
Dynamic Columns: If a string has a variable number of parts, dynamic SQL may be useful to handle differing numbers of split values.
Conclusion
The STRING_SPLIT
function is a powerful tool for breaking down concatenated strings in SQL Server. When paired with CROSS APPLY
and PIVOT
, it becomes even more versatile, allowing you to restructure and clean tabular data in meaningful ways. Whether you’re working with long-winded columns or just need a better way to parse text, this combination of SQL functions will save you time and frustration.
Got questions or tips for working with STRING_SPLIT
? Drop them in the comments! And don’t forget to check out our full guide on dynamic pivots for even more advanced use cases.
Learn More
For any questions, reach out to learn@knowstar.org.
Consider expanding your expertise with these certifications:
- Google Data Analytics Professional Certificate: https://imp.i384100.net/OR37oQ
- Google Advanced Data Analytics Professional Certificate: https://imp.i384100.net/eK1WmQ
- Best SQL and Data Analytics Books
- T-SQL Fundamentals (By Itzik Ben-Gan) - https://amzn.to/4koKGdC
- Ace the Data Science Interview - https://amzn.to/3D2ne5n