'Provide an easy or automatic mechanism for horizontal partitioning' has been merged into this idea

Support database sharding

[idea moved from Windows Azure Feature Voting]

362 votes
    Haris Majeed (Admin, Microsoft Windows Azure) shared this idea
    pita.o shared a merged idea: Provide an easy or automatic mechanism for horizontal partitioning

    9 comments

      • Rodney Willis commented

        For hosting/developing multi-tenant applications, this is critical. The only real workaround is to create a new database for each customer.

      • Jason commented

        I keep finding videos and other resources from Microsoft that mention the plan to have 'automatic partitioning'. Where can we learn more?

        The strategy suggested sounds possible. We have a multi-tenant database where every table has a CustomerID field. It would be fantastic if SQL Azure could know to partition data based on that. Especially if we can write normal queries without knowledge of the partitions.
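        One way to picture partitioning on CustomerID is a routing function that maps each customer to a shard. The sketch below is only an illustration, not how SQL Azure works: the shard names are hypothetical, and hashing is just one possible placement rule.

```python
import hashlib

# Hypothetical shard (database) names, used only for illustration.
SHARDS = ["shard_0", "shard_1", "shard_2", "shard_3"]


def shard_for(customer_id, shards=SHARDS):
    """Pick the shard that holds every row for this CustomerID.

    A stable hash is used so the same customer always maps to the
    same shard, across processes and restarts.
    """
    digest = hashlib.sha256(str(customer_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]
```

        A data-access layer could call shard_for(customer_id) before each query, so application code keeps writing ordinary single-tenant queries.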

      • pita.o commented

        I think I have a plausible solution to the relational integrity concern for shards: make partitionKey an inductive field. By that I mean its value for each entity is resolved by evaluating the entities with which the candidate entity shares a relationship. The implication is that all entities that relate to one another will have the same partitionKey. Re: my last comment (below), code tables will be treated differently.
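        The "inductive" key could be sketched with a union-find structure: every entity starts in its own group, and declaring a relationship merges the two groups, so related entities resolve to one key. This is an assumption about how such a key might be computed, not something the comment specifies; the class name and the (table, id) tuples are hypothetical.

```python
class PartitionKeyResolver:
    """Assigns one partition key to each group of related entities."""

    def __init__(self):
        self._parent = {}  # entity -> parent entity in the union-find forest

    def _find(self, entity):
        self._parent.setdefault(entity, entity)
        root = entity
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited node straight at the root.
        while self._parent[entity] != root:
            self._parent[entity], entity = root, self._parent[entity]
        return root

    def relate(self, a, b):
        """Record that entities a and b share a relationship."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def partition_key(self, entity):
        """Return the key shared by everything related to this entity."""
        return self._find(entity)


resolver = PartitionKeyResolver()
resolver.relate(("customer", 1), ("order", 10))
resolver.relate(("order", 10), ("order_line", 100))
resolver.relate(("customer", 2), ("order", 20))
```

        Here customer 1, order 10 and order line 100 resolve to the same key, while customer 2 and order 20 share a different one. Relating two groups that already exist would change a key, so a real system would have to move data in that case.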

      • pita.o commented

        @Jamie: This might be true. Or, maybe this problem can be solved by
        a layer of abstraction
        + a vtable-like manager that takes a PartitionKey argument and tells you how many physical databases of the same schema type to UNION for a particular query
        + a fabric-aware organizer to know how and when to move data around physical spaces and update the vtable
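
        A minimal sketch of the vtable-like manager, assuming it is just a lookup from PartitionKey to the physical databases holding that key's rows. PartitionTable, fan_out and the run_query callback are hypothetical names; the "UNION" is modelled by concatenating per-database results rather than by real cross-database SQL.

```python
class PartitionTable:
    """Maps a partition key to the physical databases that hold its rows."""

    def __init__(self):
        self._locations = {}  # partition key -> list of database names

    def assign(self, partition_key, database):
        databases = self._locations.setdefault(partition_key, [])
        if database not in databases:
            databases.append(database)

    def move(self, partition_key, source, target):
        """Called by the organizer after it relocates data to another database."""
        databases = self._locations[partition_key]
        databases.remove(source)  # raises ValueError if source was not listed
        if target not in databases:
            databases.append(target)

    def databases_for(self, partition_key):
        return list(self._locations.get(partition_key, []))


def fan_out(table, partition_key, run_query):
    """Run the same query on every relevant database and combine the rows."""
    rows = []
    for database in table.databases_for(partition_key):
        rows.extend(run_query(database))
    return rows
```

        The application only passes a PartitionKey; which databases get queried stays behind the abstraction, so the organizer can move data without the caller changing.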

        Two impacts I see:
        1. SQL Azure's 'per database' pricing structure will have to be modified to accommodate that complexity.

        then
        --------------- OPTION A ---------
        2. The application will have to take on a bit more responsibility: entities (records) that share a domain relationship (a customer and his purchases) must have a consistent partitionKey. This should be enforced on insert (and so partitionKey should not be updatable).
        3. Code tables and system tables should be replicated and synchronized. I don't mind if you add a syntax such as CREATE TABLE &TABLENAME (.....) and have Azure treat the & as a marker that this is a code table.

        --------------OPTION B ---------
        Let Azure make a radical break and figure out a way to re-implement relational integrity that departs from the traditional approach (this may impact performance, but not as much as you think).

      • Jamie Thomson commented

        The difficulty with this is that a database has foreign keys, so you can't just partition a table; you have to partition (or "shard") the entire database. The problem you then get is that reference/lookup data has to be replicated across all shards.
        SQL Azure should then support automatic sharding, where it does 2 things:
        1) Decides which shard a piece of data goes onto
        2) Replicates reference/lookup data across all shards

        If this "feature vote" encompasses what I just said then I'm all in - 3 votes!
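
        The two steps can be sketched with in-memory SQLite databases standing in for shards. This is a toy model under stated assumptions, not SQL Azure behaviour: the Country/Customer schema, the function names and the modulo placement rule are all made up for illustration.

```python
import sqlite3

# Hypothetical schema: Country is reference data, Customer is sharded data.
SCHEMA = """
CREATE TABLE Country (CountryID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Customer (
    CustomerID INTEGER PRIMARY KEY,
    Name TEXT NOT NULL,
    CountryID INTEGER NOT NULL REFERENCES Country(CountryID)
);
"""


def make_shards(count):
    """Create `count` independent databases that all share one schema."""
    shards = []
    for _ in range(count):
        conn = sqlite3.connect(":memory:")
        conn.execute("PRAGMA foreign_keys = ON")  # enforce foreign keys per shard
        conn.executescript(SCHEMA)
        shards.append(conn)
    return shards


def shard_for(shards, customer_id):
    """Step 1: decide which shard a piece of data goes onto."""
    return shards[customer_id % len(shards)]


def add_country(shards, country_id, name):
    """Step 2: replicate reference/lookup data across all shards."""
    for conn in shards:
        conn.execute("INSERT INTO Country VALUES (?, ?)", (country_id, name))
        conn.commit()


def add_customer(shards, customer_id, name, country_id):
    """Insert into one shard only; the foreign key is checked locally."""
    conn = shard_for(shards, customer_id)
    conn.execute(
        "INSERT INTO Customer VALUES (?, ?, ?)", (customer_id, name, country_id)
    )
    conn.commit()
```

        Because every shard carries its own copy of Country, the foreign key on Customer can be enforced inside a single shard with no cross-database lookup.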
