petergeoghegan
diff --git a/‎src/backend/executor/README
Copy file name to clipboardExpand all lines: src/backend/executor/README
+49Lines changed: 49 additions & 0 deletions b/‎src/backend/executor/README
Copy file name to clipboardExpand all lines: src/backend/executor/README
+49Lines changed: 49 additions & 0 deletions
@@ -200,3 +200,52 @@ is no explicit prohibition on SRFs in UPDATE, but the net effect will be
 that only the first result row of an SRF counts, because all subsequent
 rows will result in attempts to re-update an already updated target row.
 This is historical behavior and seems not worth changing.)
+
+Speculative insertion
+---------------------
+
+Speculative insertion is a process that the executor manages for the
+benefit of INSERT...ON CONFLICT UPDATE... .  The basic idea is that
+values within AMs (that do not currently exist) are "speculatively
+locked".  If a consensus to insert emerges among all unique indexes,
+we proceed with physical index tuple insertion for each unique index
+in turn, releasing value locks as each physical insertion is
+performed.  Otherwise, we must UPDATE the existing value (or IGNORE).
+"Value locks" are implemented using special "speculative heap tuples",
+that represent an attempt to lock values (with special handling for
+race conditions).
+
+"Speculative insertion" is prepared to release "value locks" when a
+conflict occurs.  This prevents "unprincipled deadlocks".  In essence,
+we cannot allow other xacts to wait on our speculatively-inserted
+tuple as if it was a properly inserted tuple.  They'd have to wait
+until xact end, which might be too long, while also implying
+"unprincipled deadlocks".  We are prepared for conflicts both when
+"value locking", and when row locking.
+
+When we UPDATE, value locks are released before an opportunistic
+attempt at locking a conclusively visible conflicting tuple occurs. If
+this process fails, we retry.  We may retry indefinitely.  Failing to
+release value locks serves no practical purpose, since they don't
+prevent many types of conflicts that the UPDATE case must care about,
+and is actively harmful, since it will result in unprincipled
+deadlocking under high concurrency.
+
+The representation of the UPDATE query tree is as a separate query
+tree, auxiliary to the main INSERT query tree, and its plan is not
+formally a subplan of the parent INSERT's.  Rather, the plan's state
+is used selectively by its parent.
+
+Having successfully locked a definitively visible tuple, we update it,
+applying the EvalPlanQual() query execution mechanism to the latest
+(at just determined by an amcanunique AM) conclusively visible, now
+locked tuple.  Earlier versions are not evaluated against our qual,
+and we never directly walk the update chain in the event of the tuple
+being deleted/updated (which is conceptually a conflict).  The process
+simply restarts without making useful progress in the present
+iteration.  It is sometimes necessary to UPDATE a row where no row
+version is visible, so it seems inconsistent to require that earlier
+versions (including a version that may exist that is visible to our
+command's MVCC snapshot) must satisfy the qual just because there
+happened to be a version visible, where otherwise no evaluation would
+occur.