@@ -200,3 +200,52 @@ is no explicit prohibition on SRFs in UPDATE, but the net effect will be
that only the first result row of an SRF counts, because all subsequent
rows will result in attempts to re-update an already updated target row.
This is historical behavior and seems not worth changing.)
+
+ Speculative insertion
+ ---------------------
+
+ Speculative insertion is a process that the executor manages for the
+ benefit of INSERT ... ON CONFLICT UPDATE. The basic idea is that
+ values that do not currently exist within the index access methods
+ (AMs) are "speculatively locked". If a consensus to insert emerges
+ among all unique indexes, we proceed with physical index tuple
+ insertion for each unique index in turn, releasing value locks as each
+ physical insertion is performed. Otherwise, we must UPDATE the
+ existing value (or IGNORE). "Value locks" are implemented using
+ special "speculative heap tuples", which represent an attempt to lock
+ values (with special handling for race conditions).
+
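The consensus step described above can be pictured with a small toy model. This is only a sketch under stated assumptions, not PostgreSQL code: each unique index is modelled as a set of keys, a "value lock" is just an entry in a local list, and the names `speculative_insert`, `unique_indexes`, and `locked` are hypothetical.

```python
# Toy model only: not PostgreSQL code. All names here are hypothetical.

def speculative_insert(unique_indexes, row):
    """Return ("insert", None) or ("conflict", index_name).

    unique_indexes maps an index name to a pair
    (key_function, set_of_existing_keys).
    """
    locked = []  # value locks taken so far, as (index_name, key) pairs
    for name, (key_of, existing) in unique_indexes.items():
        key = key_of(row)
        if key in existing:
            # No consensus: drop every value lock and report the conflict.
            locked.clear()
            return ("conflict", name)
        locked.append((name, key))

    # Consensus reached: insert into each index in turn. In this model,
    # performing the physical insertion is what releases the value lock.
    for name, key in locked:
        unique_indexes[name][1].add(key)
    return ("insert", None)
```

In this model a conflict in any one index leaves every index untouched, which mirrors the requirement that insertion proceeds only when all unique indexes agree.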
+ "Speculative insertion" is prepared to release "value locks" when a
+ conflict occurs. This prevents "unprincipled deadlocks". In essence,
+ we cannot allow other xacts to wait on our speculatively-inserted
+ tuple as if it were a properly inserted tuple. They would have to
+ wait until xact end, which might be too long, and which would also
+ risk "unprincipled deadlocks". We are prepared for conflicts both
+ when "value locking" and when row locking.
+
+ When we UPDATE, value locks are released before an opportunistic
+ attempt at locking a conclusively visible conflicting tuple occurs. If
+ this process fails, we retry. We may retry indefinitely. Holding on
+ to value locks would serve no practical purpose, since they don't
+ prevent many of the types of conflict that the UPDATE case must care
+ about, and would be actively harmful, since it would result in
+ unprincipled deadlocking under high concurrency.
+
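The retry behavior described above can be sketched as a loop. Again this is a toy model under stated assumptions rather than the executor's actual code: the table is a dict, `try_lock_row` is a hypothetical callback standing in for the opportunistic row-lock attempt, and `max_attempts` exists only so the sketch can be bounded in a test (the text says retries may be indefinite).

```python
# Toy model only: not PostgreSQL code. All names here are hypothetical.

def upsert(table, key, new_value, try_lock_row, max_attempts=None):
    """Insert key -> new_value, or update the existing row on conflict.

    try_lock_row(key) returns True if the conflicting row was locked.
    Returns a pair (outcome, number_of_attempts).
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        if key not in table:
            table[key] = new_value  # no conflict: plain insert
            return ("inserted", attempts)
        # Conflict: value locks are treated as already released here,
        # before the row lock is attempted.
        if try_lock_row(key):
            table[key] = new_value  # row locked: update it
            return ("updated", attempts)
        # Lock attempt failed (the row changed underneath us): restart.
    return ("gave_up", attempts)
```

For example, `upsert({}, "k", 1, lambda k: True)` takes the insert path on its first attempt, while a callback that fails twice before succeeding makes the loop run three times before the update happens.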
+ The UPDATE is represented as a separate query tree, auxiliary to the
+ main INSERT query tree, and its plan is not formally a subplan of the
+ parent INSERT's plan. Rather, the plan's state is used selectively by
+ its parent.
+
+ Having successfully locked a definitively visible tuple, we update it,
+ applying the EvalPlanQual() query execution mechanism to the latest
+ (as just determined by an amcanunique AM) conclusively visible, now
+ locked tuple. Earlier versions are not evaluated against our qual,
+ and we never directly walk the update chain in the event of the tuple
+ being deleted/updated (which is conceptually a conflict). The process
+ simply restarts without making useful progress in the present
+ iteration. It is sometimes necessary to UPDATE a row where no row
+ version is visible to our snapshot. It would therefore seem
+ inconsistent to require that earlier versions (including any version
+ that is visible to our command's MVCC snapshot) satisfy the qual
+ merely because a visible version happened to exist, when otherwise no
+ such evaluation would occur.