Support database sharding
[idea moved from Windows Azure Feature Voting]
9 comments
-
Cihan
commented
This is done. Here is the annoucement!
http://blogs.msdn.com/b/cbiyikoglu/archive/2011/12/12/sql-azure-federations-is-open-for-business.aspx -
Rodney Willis
commented
For hosting/developing multi-tenant applications, this is critical. The only real workaround is to create a new database for each customer.
-
CuriousGeorge
commented
This would be truly awesome. I wonder if this is too much to ask though?
-
Martin
commented
This would be fantastic! Are there any news on this?
-
Jason
commented
I keep finding videos and other resources from Microsoft that mention the plan to have 'automatic partitioning'. Where can we learn more?
The strategy suggested sounds possible. We have a multi-tenant database where every table has a CustomerID field. It would be fantastic if SQL Azure could know to partition data based on that. Especially if we can write normal queries without knowledge of the partitions.
-
pita.o
commented
I think I have a plausible solution to the relational integrity concern for shards: You can make partitionKey an inductive field. By which I mean that it's value per entity is resolved by evaluating the entities with which this candidate entities share a relationship. The implication of this is that all entities that relate to one another will have the same partitionKey. Re: my last comment (below), code tables will be treated differently
-
pita.o
commented
@Jamie: This might be true. Or, maybe this problem can be solved by
a layer of abstraction
+ a vtable-like manager that takes a PartitionKey argument and tells you how many physical databases of the same schema type to UNION for a particular query to be UNIONed
+ a fabric-aware organizer to know how and when to move data around physical spaces and update the vtableTwo impacts I see:
1. Sql Azure 'per database' pricing structure will have to be modified to accommodate that complexity.then
--------------- OPTION A ---------
2. The application will have to take on a bit more responsibility: Entities (records) that share a domain relationship (a customer and his purchases) must have a consistent partitionKey. This should be enforced in insert (and so partitionKey should not be updateable).
3. Code tables and system tables should be replicated and synchronized. I don't mind of you add a syntax for CREATE TABLE &TABLENAME (.....) and make azure specially treat the & as code for this is a code table.--------------OPTION B ---------
let Azure make a radical break and figure out a way to re-implement relational integrity that departs from the traditional (may impact performance but not as much as you think). -
Jamie Thomson
commented
Difficulty with this is that a databse has foreign keys thus you can't just partition a table, you have to partition (or "shard") the entire database. The problem you then get is that reference/lookup data has to be replicated across all shards.
SQL Azure then should support automatic sharding where it does 2 things:
1) Decides which shard a pirce of data goes onto
2) Replicates reference/lookup data across all shardsIf this "feature vote" encompasses what I just said then I'm all in - 3 votes!