When I’ve started working with Azure SQL there were some differences to SQL Server that I’ve needed to get used to. One of the first was that you just can’t query other databases that you’re already in, querying for [OtherDB].[dbo].[SomeTable] just wasn’t possible anymore. It appears that it may not be possible, but you actually can query for data in other databases.
Let’s start with some explanation and context. Personally I think that if you need to use data from two different databases in a single query on T-SQL level on a usual basis, there is probably something wrong with your database design, you’re doing something you shouldn’t and generally, you should rethink what’re you trying to do at all. If you need to do anything I’m writing about bellow in your regular queries, a warning should appear in your head because it’s not your everyday SELECT something FROM somewhere. But we all know how the real world works, that sometimes you just must do something you really shouldn’t fast and maybe you need to use this just once or twice like I did a while ago.
Let’s start with a simple scenario and create a database on Azure. It will be named UserDb, and it would store some user data.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
--UserDb CREATE TABLE [dbo].[Users] ( [Id] bigint IDENTITY(1,1) NOT NULL, [Name] nvarchar(400) NOT NULL, [Email] nvarchar(400) NOT NULL, [Age] int NOT NULL, CONSTRAINT [PK_Users] PRIMARY KEY CLUSTERED ([ID] ASC) ) GO INSERT INTO [dbo].[Users] ([Name], [Email], [Age]) VALUES ('Rafal Hryniewski', 'my@email.com', 30), ('John Smith', 'notmy@email.com', 40), ('Gandalf the Grey', 'some@email.com', 50) GO |
It’s pretty simple for now. Imagine our database grow, our software evolves and one day we need to create another DB for an entire new application. We need to use a separate database, but we need to seed it with existing data. We could just export them, write a simple console app for that or find some solution. In SQL Server we would probably query for [UserDb].[dbo].[Users] and it would be done easily, on Azure SQL it’ll be a little harder than that but it is possible to achieve that entirely on T-SQL level so you can just write a short script and send it to anyone who’s managing production Azure SQL databases.
So let’s create our second database called OrderDb.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
--OrderDb CREATE TABLE [dbo].[Customers] ( [Id] bigint IDENTITY(1,1) NOT NULL, [Name] nvarchar(400) NOT NULL, [UserId] bigint NOT NULL, CONSTRAINT [PK_Customers] PRIMARY KEY CLUSTERED ([ID] ASC) ) CREATE TABLE [dbo].[Orders] ( [Id] bigint IDENTITY(1,1) NOT NULL, [Shipped] bit NOT NULL, [CustomerId] bigint NOT NULL, CONSTRAINT [PK_Orders] PRIMARY KEY CLUSTERED ([ID] ASC), CONSTRAINT [FK_Orders_Customers] FOREIGN KEY ([CustomerId]) REFERENCES [dbo].[Customers]([Id]) ) |
So, how do we fill Customers table with data from our other database? That’s where the real fun begins. Those scripts below are going to be executed on our target database, in our case, it’s going to be OrderDb. Before doing anything our database must have Master Key. We can open it in session or create a new one. In our case we’re going to create new, temporary one, I guess if you already have one you know how to open it.
1 |
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'MasterKeyPassword1'; |
Now we can create another piece of the puzzle. Credentials to access our external database. Those will be exactly same ones you’re using to perform SQL Authentication. Remember that login you’re going to use should have all permissions required to access any resources you’re going to use.
1 2 3 |
CREATE DATABASE SCOPED CREDENTIAL ExternalAccessCredential WITH IDENTITY = 'admin', --Login that we're using to access database (must match login in external db and must have permissions necessary to access resources we're using) SECRET = 'Password1234'; --Pasword for this login |
Now let’s define source that we’re going to fetch data from. Be warned that we can also access Azure Blob or Hadoop (which I haven’t tried yet) in a similar way. In our case, we’re accessing other SQL DB so in TYPE we must write RDBMS. Other parameters are server address, database name and credentials we’ve created a second ago. And yes – you can access databases outside server you’re already on.
1 2 3 4 5 6 7 8 |
CREATE EXTERNAL DATA SOURCE ExternalDataSource WITH ( TYPE = RDBMS, LOCATION = 'dbserver.database.windows.net', --External db server address DATABASE_NAME= 'UserDb', --External db name CREDENTIAL = ExternalAccessCredential --Credential we've created before ); |
We’re almost finished now. So let’s use our newly created DATA SOURCE to create EXTERNAL TABLE. And be warned – table name and column types must match those in your source. It is possible to skip columns you’re not going to use but anything you’re declaring now must be exactly same as in an external database. Also, remember about WITH clause where you’re going to pass DATA SOURCE you’re just created.
1 2 3 4 5 6 7 8 9 10 11 |
CREATE EXTERNAL TABLE [dbo].[Users] -- Table name and schema must match external table ( [Id] bigint NOT NULL, [Name] nvarchar(400) NOT NULL, [Email] nvarchar(400) NOT NULL, [Age] int NOT NULL ) WITH ( DATA_SOURCE = ExternalDataSource --Previously created DATA SOURCE ); |
Everything should work now so let’s fill our table with data.
1 2 3 4 |
INSERT INTO [dbo].[Customers] --Target table ([Name], [UserId]) SELECT [Name], [Id] FROM [dbo].[Users] -- External table |
Everything worked so now, to be on safe side we can drop created external sources, tables etc. (Or we can play a little with it, in that case, don’t do it just yet)
1 2 3 4 |
DROP EXTERNAL TABLE [dbo].[Users] DROP EXTERNAL DATA SOURCE ExternalDataSource DROP DATABASE SCOPED CREDENTIAL ExternalAccessCredential DROP MASTER KEY |
Before we start something more, here’s entire script that creates external stuff in one part.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'MasterKeyPassword1'; CREATE DATABASE SCOPED CREDENTIAL ExternalAccessCredential WITH IDENTITY = 'admin', --Login that we're using to access database (must match login in external db and must have permissions necessary to access resources we're using) SECRET = 'Password1234'; --Pasword for this login CREATE EXTERNAL DATA SOURCE ExternalDataSource WITH ( TYPE = RDBMS, LOCATION = 'dbserver.database.windows.net', --External db server address DATABASE_NAME= 'UserDb', --External db name CREDENTIAL = ExternalAccessCredential --Credential we've created before ); CREATE EXTERNAL TABLE [dbo].[Users] -- Table name and schema must match external table ( [Id] bigint NOT NULL, [Name] nvarchar(400) NOT NULL, [Email] nvarchar(400) NOT NULL, [Age] int NOT NULL ) WITH ( DATA_SOURCE = ExternalDataSource --Previously created DATA SOURCE ); |
So, let’s play. And before we start doing stuff, I must repeat – if you really going to do that think twice or thrice because if you really need to, there is a possibility you’re mistaken. We’ll get back to this at the end of this post.
So, can we, for example, insert data into an external table? Let’s try that.
1 2 |
INSERT INTO [dbo].[Users] ([Name], [Email], [Age]) VALUES ('Some Outsider', 'notfromthere@email.com', 60) |
Sadly we’re going to get an error – DML Operations are not supported with external tables. What are those DML operations? Those are Data Manipulation Language statements like INSERT, DELETE, SELECT and some other. And yes, I know I included SELECT , which we’ve used a while ago because it is, in fact, DML statement, it’s because error we’ve seen hasn’t been 100% accurate.
So we can’t insert things. But can we somehow JOIN our Customers table with external Users table? You’ll never know until you try.
1 2 3 4 |
SELECT * FROM [dbo].[Customers] c JOIN [dbo].[Users] u ON c.[UserId] = u.[Id] |
Hooray, we have our results so it’s possible. So let’s create a schemabound SQL View from this and check if it’s possible and if it’ll prevent us from dropping external table (schemabinding should do that).
1 2 3 4 5 6 7 8 9 10 11 12 |
CREATE VIEW [dbo].[CustomerWithAge] WITH SCHEMABINDING AS ( SELECT c.[Id] AS CustomerId, c.[Name] AS CustomerName, u.[Age] AS Age FROM [dbo].[Customers] c JOIN [dbo].[Users] u ON c.[UserId] = u.[Id] ) |
And it appears schemabinding is not available with external tables. But if we remove the second line of this code we’re going to have a working SQL View that’ll access data from an external table.
Let’s go a little bit further. So how about creating User Defined Function that will fetch user age from an external table when we’ll pass user id into id?
1 2 3 4 5 6 7 8 9 |
CREATE FUNCTION [dbo].[GetUserAge](@userId bigint) RETURNS int AS BEGIN DECLARE @age int; SELECT @age = FIRST_VALUE([Age]) OVER (ORDER BY [Id]) FROM [dbo].[Users] WHERE [Id] = @userId RETURN @age; END GO |
It works so now when we’ll try this or similar query we’re going to get an age of any user.
1 |
SELECT [dbo].[GetUserAge](1) |
It just works, if you don’t believe me – you can use this code yourself check this and play with it yourself. I’m encouraging you to that. So let’s do one little final thing. Let’s add a computed column to our Customer table that will use our function and display the age of Customer in its table.
1 2 |
ALTER TABLE [dbo].[Customers] ADD [Age] AS [dbo].[GetUserAge]([UserId]) |
And … it works! Holy potatoes, how cool is that?!
Well, it isn’t. Results are really fun, you can access data really fast and if you know how to use SQL tools like UDFs, procedures and/or views you can achieve a lot without rebuilding you’re entire schema and database design. Sometimes it’s important, it’s not always a bad way to go but there are two reasons that this would be a smelly solution in a long run.
- Those databases were separate because of some reason.
- Forget about maxing your performance, queries could be slow.
- The first query in session will have some performance overhead because it does need to authenticate against external database first.
Still, playing with it was kinda fun. Hope you like it.