Querying external databases in Azure SQL

By | January 25, 2018

When I’ve started working with Azure SQL there were some differences to SQL Server that I’ve needed to get used to. One of the first was that you just can’t query other databases that you’re already in, querying for [OtherDB].[dbo].[SomeTable] just wasn’t possible anymore. It appears that it may not be possible, but you actually can query for data in other databases.

Let’s start with some explanation and context. Personally I think that if you need to use data from two different databases in a single query on T-SQL level on a usual basis, there is probably something wrong with your database design, you’re doing something you shouldn’t and generally, you should rethink what’re you trying to do at all. If you need to do anything I’m writing about bellow in your regular queries, a warning should appear in your head because it’s not your everyday SELECT something FROM somewhere. But we all know how the real world works, that sometimes you just must do something you really shouldn’t fast and maybe you need to use this just once or twice like I did a while ago.

Let’s start with a simple scenario and create a database on Azure. It will be named UserDb, and it would store some user data.

It’s pretty simple for now. Imagine our database grow, our software evolves and one day we need to create another DB for an entire new application. We need to use a separate database, but we need to seed it with existing data. We could just export them, write a simple console app for that or find some solution. In SQL Server we would probably query for [UserDb].[dbo].[Users] and it would be done easily, on Azure SQL it’ll be a little harder than that but it is possible to achieve that entirely on T-SQL level so you can just write a short script and send it to anyone who’s managing production Azure SQL databases.

So let’s create our second database called OrderDb.

So, how do we fill Customers table with data from our other database? That’s where the real fun begins. Those scripts below are going to be executed on our target database, in our case, it’s going to be OrderDb. Before doing anything our database must have Master Key. We can open it in session or create a new one. In our case we’re going to create new, temporary one, I guess if you already have one you know how to open it.

Now we can create another piece of the puzzle. Credentials to access our external database. Those will be exactly same ones you’re using to perform SQL Authentication. Remember that login you’re going to use should have all permissions required to access any resources you’re going to use.

Now let’s define source that we’re going to fetch data from. Be warned that we can also access Azure Blob or Hadoop (which I haven’t tried yet) in a similar way. In our case, we’re accessing other SQL DB so in TYPE we must write RDBMS. Other parameters are server address, database name and credentials we’ve created a second ago. And yes – you can access databases outside server you’re already on.

We’re almost finished now. So let’s use our newly created DATA SOURCE to create EXTERNAL TABLE. And be warned – table name and column types must match those in your source. It is possible to skip columns you’re not going to use but anything you’re declaring now must be exactly same as in an external database. Also, remember about WITH clause where you’re going to pass DATA SOURCE you’re just created.

Everything should work now so let’s fill our table with data.

Everything worked so now, to be on safe side we can drop created external sources, tables etc. (Or we can play a little with it, in that case, don’t do it just yet)

Before we start something more, here’s entire script that creates external stuff in one part.

So, let’s play. And before we start doing stuff, I must repeat – if you really going to do that think twice or thrice because if you really need to, there is a possibility you’re mistaken. We’ll get back to this at the end of this post.

So, can we, for example, insert data into an external table? Let’s try that.

Sadly we’re going to get an error – DML Operations are not supported with external tables. What are those DML operations? Those are Data Manipulation Language statements like INSERT, DELETE, SELECT and some other. And yes, I know I included SELECT , which we’ve used a while ago because it is, in fact, DML statement, it’s because error we’ve seen hasn’t been 100% accurate.

So we can’t insert things. But can we somehow JOIN our Customers table with external Users table? You’ll never know until you try.

Hooray, we have our results so it’s possible. So let’s create a schemabound SQL View from this and check if it’s possible and if it’ll prevent us from dropping external table (schemabinding should do that).

And it appears schemabinding is not available with external tables. But if we remove the second line of this code we’re going to have a working SQL View that’ll access data from an external table.

Let’s go a little bit further. So how about creating User Defined Function that will fetch user age from an external table when we’ll pass user id into id?

It works so now when we’ll try this or similar query we’re going to get an age of any user.

It just works, if you don’t believe me – you can use this code yourself check this and play with it yourself. I’m encouraging you to that. So let’s do one little final thing. Let’s add a computed column to our Customer table that will use our function and display the age of Customer in its table.

And … it works! Holy potatoes, how cool is that?!

Well, it isn’t. Results are really fun, you can access data really fast and if you know how to use SQL tools like UDFs, procedures and/or views you can achieve a lot without rebuilding you’re entire schema and database design. Sometimes it’s important, it’s not always a bad way to go but there are two reasons that this would be a smelly solution in a long run.

  1. Those databases were separate because of some reason.
  2. Forget about maxing your performance, queries could be slow.
  3. The first query in session will have some performance overhead because it does need to authenticate against external database first.

Still, playing with it was kinda fun. Hope you like it.