TheGeekery

The Usual Tech Ramblings

SQL 2008 and MERGE

A feature I’ve long awaited in Microsoft SQL (TSQL) has been a function similar to MySQL’s REPLACE function. For those that haven’t used it, the REPLACE function allows you to do an insert of a new record, or replace an existing record1 if one matches the primary key. In SQL 2008, Microsoft introduced a function called MERGE. This is like REPLACE, but a whole bunch of extra goodies…

About MySQL Replace

Before I get into the fun stuff of MERGE, lets look at an example of MySQL’s REPLACE function. We’ll do this by example, as everybody loves a good sample. We’ll take a company that has a ticket tracking system, with a table containing the employee ID number, and their names. We have an employee, Susan Johnson, who is about to get married to her long time high-school sweetheart, Scott Drew. Susan’s employee ID is 55458. There are two ways which we could update her last name in the database, the basic UPDATE statement, which will look like this:

1
UPDATE employees set last_name = 'Drew' where employee_id = 55458

Then there is the REPLACE method, which looks similar to an INSERT:

1
REPLACE INTO employees (employee_id, last_name) VALUES (55458, 'Drew')

In the event that HR had not done a good job, the REPLACE into statement would create a record if one did not exist for Susan, or if a record did exist for her, it would update her last name. Pretty cool huh? There are some caveats to this function, so take a look at the documentation, and the user comments.

T-SQL MERGE

Microsoft did a good job with this new function, making it quite flexible. For a detailed look into it, you should read the documentation, it’s quite extensive, and gives a complex example, but for us, we’ll stick with our simple example of Susan Johnson.

1
2
3
4
5
6
7
8
MERGE employees AS target
  USING (55458, 'Drew') AS source (employee_id, last_name)
  ON (target.employee_id = source.employee_id)
  WHEN MATCHED THEN
    UPDATE SET last_name = source.last_name
  WHEN NOT MATCHED THEN
    INSERT (employee_id, last_name)
    VALUES (source.employee_id, source.last_name)

Okay, that’s an awful lot of code that you managed to do with a single line in MySQL, but unlike MySQL the T-SQL code allows you to match on any field, not just a primary key like employee ID. Lets boost up the example, and show you what we can really do here…

For this, I’m going to use a sample of what I do nearly every day. We get requests to add vehicles to a store’s data, or update data if it already exists. Originally this would require two separate runs, an insert (using LEFT JOINs to find the missing target data), and an update.

1
2
3
4
5
6
7
8
MERGE vehicle_data AS target
  USING new_customer_vehicles AS source
  ON (target.vin = source.vin and target.stock_number = source.stock_number and target.in_date = source.in_date)
  WHEN MATCHED THEN
    UPDATE SET target.retail_price = source.retail_price
  WHEN NOT MATCHED THEN
    INSERT (vin, stock_number, in_date, retail_price,status_id)
      VALUES (source.vin, source.stock_number, source.in_date, source.retail_price,13)

Now with this example, we can see I do a lot more than match on the primary ID of the table (which in our case is actually vehicle_id). This update would not have been possible with MySQL because it would have required you to use the vehicle_id from the vehicle_data table, which the new_customer_vehicles table did not have. Whilst it still looks like a fair bit of code, it is now a single statement over 2 different statements to achieve the same thing. There is something else handy about the MERGE function too, you get to do something with the unmatched data in the target side. Taking the same as above, we’ll assume the customer sent us data straight from their system, and what they gave us must match, which means stuff we have showing, should no longer be showing. We do this by changing the status ID. Appending the following code allows us to tweak the records that don’t match…

1
2
3
  WHEN NOT MATCHED BY SOURCE
    THEN
      SET target.status_id = 48

Now that is a handy, and powerful function. I dare say I’ll be using it a lot more now I’ve discovered it, and might even start poking our developers to start integrating it into places where it’d work better than what they are currently using.

This is just one of the cool things they added to SQL 2008 that I like, what other things have they added that you like too?


  1. This is really a bit false, the term replace would hint it does an update, however it actually does a DELETE then an INSERT, so if you have triggers, be careful.

Comments