SQL Server: Splitting Strings

Today I was asked how to extract the directional component from a street address. For example, the following 3 street addresses should have a directional value of ‘East’:

  1. E Street
  2. East Road
  3. Big River Blvd E

My first approach was to do the old fashioned case statement.

Select street_address
, case when address like ‘E%’ then ‘East’
when address like ‘%East%’ then ‘East’
when …
else ‘Unknown’
end as address_direction
From table_address;

I started realizing this was going to get complicated fast and prone to error. If I look for patterns for E%, I’ll be associating a directional value of East to a street named Eastwood. Oops! So spaces are important. But then if I look for a ‘E %’, I will only capture where E is at the beginning of the string. What about a street address where E is in the middle of the string?

There’s actually a much simpler way. Use the STRING_SPLIT function. (All of a sudden I have a hankering for a banana split…).

Select street_address
,value
From table_address
Cross Apply STRING_SPLIT(street_address, ‘ ‘);

This will split up your street_address field into individual records using a space ‘ ‘ as the character to split on. So I would end up with a table like:

street_address value
E Street E
E Street Street
East Road East
East Road Road
Big River Blvd E Big
Big River Blvd E River
Big River Blvd E Blvd
Big River Blvd E E

 

You can see that your address is repeated for each split value. Now you can simply create a lookup table and join to it to derive the directional component. For sake of this scenario, I’ll just limit to 4 directions (East, West, North, South), but you should consider including Northwest, etc.

;with cte_direction as (
Select ‘East’ as street_address_direction
,’E’ as street_address_split_value
union
Select ‘East’ as street_address_direction
,’East’ as street_address_split_value
union
Select ‘West’ as street_address_direction
,’W’ as street_address_split_value
union
Select ‘West’ as street_address_direction
,’West’ as street_address_split_value
union
Select ‘North’ as street_address_direction
,’N’ as street_address_split_value
union
Select ‘North’ as street_address_direction
,’North’ as street_address_split_value
union
Select ‘South’ as street_address_direction
,’S’ as street_address_split_value
union
Select ‘South’ as street_address_direction
,’South’ as street_address_split_value
)
SELECT ta.property_id
,ta.street_address
,value
,d.street_address_direction
FROM table_address ta
cross apply string_split(ta.street_address,’ ‘)
inner join cte_direction d on d.street_address_split_value = value;

This gave me a record per property for each street address that contained directions of East, West, North or South. Mind you, there might still be some issues with duping. If there is a street address called E Sound Dr N, you’ll get 2 records for that property. You might want to locate those addresses that are duping and setup some prioritization/sorting rules.

OK – banana split time!

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Powered by WordPress.com.

Up ↑

%d