How can we improve Windows Azure Data Management: Tables?

Add an Upsert command in Azure tables

It is quite typical in data warehousing scenarios to have to insert a row if it doesn't exist, and update it if it does ("Upsert").
In the current version, if you want to do that for a batch of 100 entities, you have to retrieve each entity first so you know whether to update it or insert it. And since there is no way of retrieving a batch of 100 entities in one request, even when you know their Partition Key/Row Key, you pretty much have to make one request per entity, costing you 100 transactions before you can do a single entity group transaction of 100 entities.

If Azure supported Upsert, there would be no need to retrieve the entity before the update/insert, so batching 100 upserts would be possible for the cost of only 1 transaction.
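
The transaction arithmetic above can be sketched with a toy in-memory model. This is not the Azure client API: `ToyTable`, `save_with_lookup` and `save_with_upsert` are hypothetical names invented for illustration, and the only thing the model tries to capture is how many round trips each approach costs.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Hypothetical in-memory stand-in for a table; counts round trips."""
    rows: dict = field(default_factory=dict)
    transactions: int = 0

    def get(self, pk, rk):
        # A point lookup costs one transaction.
        self.transactions += 1
        return self.rows.get((pk, rk))

    def batch(self, ops):
        # One entity group transaction: same partition, at most 100 operations.
        if len(ops) > 100:
            raise ValueError("a batch holds at most 100 operations")
        if len({pk for _, pk, _, _ in ops}) > 1:
            raise ValueError("a batch must target a single partition")
        self.transactions += 1
        # Validate everything before writing so the batch is all-or-nothing.
        for kind, pk, rk, _ in ops:
            exists = (pk, rk) in self.rows
            if kind == "insert" and exists:
                raise KeyError(f"{rk} already exists")
            if kind == "update" and not exists:
                raise KeyError(f"{rk} does not exist")
        for _, pk, rk, value in ops:
            self.rows[(pk, rk)] = value


def save_with_lookup(table, pk, entities):
    # Without upsert: read each entity to decide between insert and update.
    ops = []
    for rk, value in entities.items():
        kind = "update" if table.get(pk, rk) is not None else "insert"
        ops.append((kind, pk, rk, value))
    table.batch(ops)


def save_with_upsert(table, pk, entities):
    # With upsert: no reads, a single batch.
    table.batch([("upsert", pk, rk, value) for rk, value in entities.items()])


entities = {f"row{i:03d}": {"n": i} for i in range(100)}
# Pretend 40 of the 100 entities already exist.
seed = {("p", rk): {"n": -1} for rk in list(entities)[:40]}

with_lookup = ToyTable(rows=dict(seed))
save_with_lookup(with_lookup, "p", entities)

with_upsert = ToyTable(rows=dict(seed))
save_with_upsert(with_upsert, "p", entities)

print(with_lookup.transactions, with_upsert.transactions)  # 101 1
```

Both tables end up with identical contents; only the transaction count differs.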

Since data warehousing is one of the few scenarios where Azure-type storage is really interesting compared to relational databases, the upsert operation is really a must.

71 votes
    Flavien Charlon shared this idea

    8 comments

      • Julian Dominguez commented

        Although this is now implemented in Azure, it would be great if it was also supported in the storage emulator.

      • Konstantin Isaev commented

        I have a question about this: what should the service return after such a request, especially a batched one? A "bool?" value, an enumeration (inserted, updated, error)?
        The format of this site does not allow submitting a detailed technical proposal...
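
One possible shape for the answer to that question, sketched under assumptions: a per-entity enumeration for the two success cases, with errors raised as an exception for the whole batch because an entity group transaction is all-or-nothing. `UpsertOutcome` and `upsert_batch` are hypothetical names, and this does not describe what the service actually returns.

```python
from enum import Enum


class UpsertOutcome(Enum):
    """Hypothetical per-entity result of an upsert."""
    INSERTED = "inserted"
    UPDATED = "updated"


def upsert_batch(rows, entities):
    """Apply every upsert and report, per key, which branch was taken."""
    outcomes = {}
    for key, value in entities.items():
        outcomes[key] = (
            UpsertOutcome.UPDATED if key in rows else UpsertOutcome.INSERTED
        )
        rows[key] = value
    return outcomes


rows = {("p", "r1"): {"n": 0}}
result = upsert_batch(rows, {("p", "r1"): {"n": 1}, ("p", "r2"): {"n": 2}})
for key, outcome in result.items():
    print(key, outcome.value)
```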

      • vermorel commented

        Upsert is really a must indeed. Without it, it is not even possible to achieve decent idempotent semantics (idempotency of operations is a key property for async distributed apps).
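
The idempotency point can be illustrated with a minimal sketch, assuming a plain dictionary as the store and a retry that delivers the same request twice (for example after a lost acknowledgement). The function names are hypothetical and not part of any Azure library.

```python
def insert(rows, key, value):
    # Plain insert: fails if the entity is already there.
    if key in rows:
        raise KeyError(f"{key} already exists")
    rows[key] = value


def upsert(rows, key, value):
    # Upsert: same final state no matter how many times it runs.
    rows[key] = value


def deliver_twice(operation, rows, key, value):
    """Simulate a retried request: the same operation arrives twice."""
    outcomes = []
    for _ in range(2):
        try:
            operation(rows, key, value)
            outcomes.append("ok")
        except KeyError:
            outcomes.append("conflict")
    return outcomes


insert_outcomes = deliver_twice(insert, {}, ("p", "r1"), {"n": 1})
upsert_outcomes = deliver_twice(upsert, {}, ("p", "r1"), {"n": 1})

print(insert_outcomes)  # ['ok', 'conflict']
print(upsert_outcomes)  # ['ok', 'ok']
```

With insert, the caller has to tell a genuine conflict apart from its own retry; with upsert, the retry is harmless.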

      • zmorris commented

        Not having this feature is a huge performance drain. Could someone from MS please comment on where this might be in the roadmap?

      • Stuart (Cirrious) commented

        Totally agree, an Upsert command would be great. It would save a lot of querying time. 3 votes from me.

      • Jason commented

        Agreed. Perhaps if LINQ's Contains were permitted, you could get around the multiple single queries. You could basically perform a "where RowKey in (1,6,20,55)" sort of query. But that only solves half the problem.

        Upsert is definitely a must!
