Add an Upsert command in Azure tables
It is quite typical in data warehousing scenarios to have to insert a row if it doesn't exist, and update it if it does ("Upsert").
In the current version, if you want to do that for a batch of 100 entities, you have to retrieve each entity first, so you know whether to update it or insert it. And since there is no way of retrieving a batch of 100 entities when you know their Partition Key/Row Key, you pretty much have to do one request for every single entity, costing you 100 transactions before you are able to do one entity group transaction of 100 entities.
If Azure supported Upsert, there would be no need to retrieve the entity prior to the update/insert, so batching 100 upserts would be possible for the cost of only 1 transaction.
Since data warehousing is one of the only scenarios where Azure-type storage is really interesting compared to relational databases, the upsert operation is really a must.
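To make the transaction count concrete, here is a minimal sketch in Python. It does not use any real Azure client library: `FakeTable`, `exists`, `submit_batch`, `save_with_lookup` and `save_with_upsert` are all hypothetical names, and the table is just an in-memory dictionary that counts one transaction per request.

```python
from dataclasses import dataclass, field


@dataclass
class FakeTable:
    """Hypothetical in-memory stand-in for a table (not a real Azure API)."""
    rows: dict = field(default_factory=dict)
    transactions: int = 0

    def exists(self, partition_key, row_key):
        # One request (one transaction) per single-entity lookup.
        self.transactions += 1
        return (partition_key, row_key) in self.rows

    def submit_batch(self, operations):
        # One entity group transaction: at most 100 operations, one partition.
        if len(operations) > 100:
            raise ValueError("a batch holds at most 100 operations")
        if len({pk for _, pk, _, _ in operations}) > 1:
            raise ValueError("all entities in a batch must share a partition key")
        # Validate everything first so the batch is all-or-nothing.
        for kind, pk, rk, _ in operations:
            if kind == "insert" and (pk, rk) in self.rows:
                raise KeyError(f"entity {(pk, rk)} already exists")
            if kind == "update" and (pk, rk) not in self.rows:
                raise KeyError(f"entity {(pk, rk)} does not exist")
        self.transactions += 1
        for _, pk, rk, props in operations:
            self.rows[(pk, rk)] = dict(props)


def save_with_lookup(table, entities):
    """Without upsert: look up each entity, then batch inserts and updates."""
    operations = []
    for pk, rk, props in entities:
        kind = "update" if table.exists(pk, rk) else "insert"
        operations.append((kind, pk, rk, props))
    table.submit_batch(operations)


def save_with_upsert(table, entities):
    """With upsert: no lookups, a single batch."""
    table.submit_batch([("upsert", pk, rk, props) for pk, rk, props in entities])


entities = [("p1", f"{i:03d}", {"value": i}) for i in range(100)]

# Pre-populate half of the rows so the batch mixes inserts and updates.
seed = {(pk, rk): {"value": -1} for pk, rk, _ in entities[:50]}

lookup_table = FakeTable(rows=dict(seed))
save_with_lookup(lookup_table, entities)

upsert_table = FakeTable(rows=dict(seed))
save_with_upsert(upsert_table, entities)

print(lookup_table.transactions)  # 101: 100 lookups + 1 batch
print(upsert_table.transactions)  # 1: the batch alone
```

Both tables end up with the same 100 rows; only the number of round trips differs.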
8 comments
-
Julian Dominguez
commented
Although this is now implemented in Azure, it would be great if it was also supported in the storage emulator.
-
Takekazu Omi commented
thanks
-
Azure
commented
Upsert was added in the Sept 2011 release:
-
Konstantin Isaev commented
I have a question about this: what should the service return after such a request, especially for a batched one? A "bool?" value, an enumeration (inserted, updated, error)?
The format of this site does not allow submitting a well-described, detailed tech-offer...
-
vermorel
commented
Upsert is really a must indeed. Without it, it is not even possible to achieve decent idempotent semantics (idempotency of operations is a really key property for async distributed apps).
-
zmorris
commented
Not having this feature is a huge performance drain. Could someone from MS please comment on where this might be in the roadmap?
-
Stuart (Cirrious)
commented
Totally agree, an Upsert command would be great. It would save a lot of querying time. 3 votes from me.
-
Jason
commented
Agreed. Perhaps if LINQ's Contains were permitted, you could get around the multiple single queries. You could basically perform a "where RowKey in (1,6,20,55)" sort of query. But that only solves half the problem.
Upsert is definitely a must!
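For what it's worth, the "RowKey in (...)" idea can be approximated today by chaining `eq` comparisons with `or` in a filter string. A rough sketch in Python, where `build_row_key_filter` is a made-up helper name and the result is only the filter text, not a full request:

```python
def build_row_key_filter(partition_key, row_keys):
    """Build an OData-style filter matching several row keys in one partition."""
    if not row_keys:
        raise ValueError("at least one row key is required")

    def quote(value):
        # OData string literals escape a single quote by doubling it.
        return "'" + str(value).replace("'", "''") + "'"

    row_clause = " or ".join(f"RowKey eq {quote(rk)}" for rk in row_keys)
    return f"PartitionKey eq {quote(partition_key)} and ({row_clause})"


print(build_row_key_filter("p1", ["1", "6", "20", "55"]))
```

This only cuts down the number of lookups. You still have to split the results into inserts and updates yourself, so it remains half a solution.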