I have a very simple table for a task queue and keep taking records out one by one. If I add the filter_selector
after filling/resetting the table, postgresql will use it and run super fast. But if I reset the table by calling UPDATE "public"."testqueue" T SET "status"=0
then it will start using the primary key which is 10 times slower.
How can I make sure postgresql keeps using the filter_selector
?
Table:
CREATE TABLE PUBLIC."testqueue" (
"id" BIGINT NOT NULL GENERATED ALWAYS AS IDENTITY,
"guid" "uuid" NOT NULL DEFAULT GEN_RANDOM_UUID (),
"created" TIMESTAMP WITHOUT TIME ZONE NOT NULL DEFAULT NOW(),
"status" SMALLINT NOT NULL DEFAULT 0,
"notbefore" TIMESTAMP WITHOUT TIME ZONE NOT NULL DEFAULT NOW(),
"task" CHARACTER VARYING(255) NOT NULL,
"data" "text" NULL,
"completed" TIMESTAMP WITHOUT TIME ZONE NULL,
PRIMARY KEY (ID)
)
INSERT INTO "public"."testqueue" ("task") SELECT 'DoSomething' FROM GENERATE_SERIES(1, 1000000)
Filter index:
CREATE INDEX "filter_selector" ON "public"."testqueue" ("id") INCLUDE ("status","notbefore") WHERE "status"=0
Explain before updating every status to 0, after ~100000 runs:
Update on public.testqueue t (cost=0.98..9.02 rows=1 width=40) (actual time=0.063..0.065 rows=1 loops=1)
Output: t.id
Buffers: shared hit=16
CTE cte
-> Limit (cost=0.42..0.55 rows=1 width=14) (actual time=0.032..0.032 rows=1 loops=1)
Output: testqueue.id, testqueue.ctid
Buffers: shared hit=6
-> LockRows (cost=0.42..129018.39 rows=999967 width=14) (actual time=0.031..0.032 rows=1 loops=1)
Output: testqueue.id, testqueue.ctid
Buffers: shared hit=6
-> Index Scan using filter_selector on public.testqueue (cost=0.42..119018.72 rows=999967 width=14) (actual time=0.027..0.028 rows=1 loops=1)
Output: testqueue.id, testqueue.ctid
Filter: ((testqueue.status = 0) AND (testqueue.notbefore <= now()))
Buffers: shared hit=5
-> Nested Loop (cost=0.42..8.46 rows=1 width=40) (actual time=0.047..0.047 rows=1 loops=1)
Output: '1'::smallint, t.ctid, cte.*
Inner Unique: true
Buffers: shared hit=10
-> CTE Scan on cte (cost=0.00..0.02 rows=1 width=40) (actual time=0.037..0.037 rows=1 loops=1)
Output: cte.*, cte.id
Buffers: shared hit=6
-> Index Scan using testqueue_pkey on public.testqueue t (cost=0.42..8.44 rows=1 width=14) (actual time=0.008..0.008 rows=1 loops=1)
Output: t.ctid, t.id
Index Cond: (t.id = cte.id)
Buffers: shared hit=4
Planning Time: 0.121 ms
Execution Time: 0.088 ms
Explain after updating every status to 0, after ~100000 runs:
Update on public.testqueue t (cost=0.98..9.02 rows=1 width=40) (actual time=25.514..25.517 rows=1 loops=1)
Output: t.id
Buffers: shared hit=97078 dirtied=9
CTE cte
-> Limit (cost=0.42..0.56 rows=1 width=14) (actual time=25.478..25.479 rows=1 loops=1)
Output: testqueue.id, testqueue.ctid
Buffers: shared hit=97072 dirtied=9
-> LockRows (cost=0.42..133659.01 rows=999967 width=14) (actual time=25.477..25.478 rows=1 loops=1)
Output: testqueue.id, testqueue.ctid
Buffers: shared hit=97072 dirtied=9
-> Index Scan using testqueue_pkey on public.testqueue (cost=0.42..123659.34 rows=999967 width=14) (actual time=25.466..25.467 rows=1 loops=1)
Output: testqueue.id, testqueue.ctid
Filter: ((testqueue.status = 0) AND (testqueue.notbefore <= now()))
Rows Removed by Filter: 100038
Buffers: shared hit=97071 dirtied=9
-> Nested Loop (cost=0.42..8.46 rows=1 width=40) (actual time=25.498..25.499 rows=1 loops=1)
Output: '1'::smallint, t.ctid, cte.*
Inner Unique: true
Buffers: shared hit=97076 dirtied=9
-> CTE Scan on cte (cost=0.00..0.02 rows=1 width=40) (actual time=25.487..25.488 rows=1 loops=1)
Output: cte.*, cte.id
Buffers: shared hit=97072 dirtied=9
-> Index Scan using testqueue_pkey on public.testqueue t (cost=0.42..8.44 rows=1 width=14) (actual time=0.007..0.007 rows=1 loops=1)
Output: t.ctid, t.id
Index Cond: (t.id = cte.id)
Buffers: shared hit=4
Planning:
Buffers: shared hit=6 dirtied=1
Planning Time: 0.147 ms
Execution Time: 25.547 ms
Select:
WITH CTE AS (
SELECT "id"
FROM "public"."testqueue"
WHERE "status"=0
AND "notbefore"<=NOW()
ORDER BY "id"
LIMIT 1
FOR UPDATE SKIP LOCKED
)
UPDATE "public"."testqueue" T
SET "status"=1
FROM CTE
WHERE CTE."id"=T."id"
RETURNING T."id"
Update on comments:
I have included the entire explain with verbose, buffer and settings checked. I have changed the guid
back to id
. That was an attempt to trigger postgresql into not using the pkey, but it didn't work. The testqueue6
was a simple token I forgot to remove because I have been running multiple test.
Changing SELECT *
to SELECT "id"
in the CTE
doesn't change anything.
The whole thing is a follow up on the same question on SQL server which included a tip about the filtered index. The index does work very well and does not seem to be the problem on it's own. It's just that postgresql doesn't use it all the time.
When using the filtered index it can handle roughly 2500 tasks per second no matter how many have been handled. With the primary key it slows down over time, starting at 800 a second and ending on about 10 per second when the list gets used more.
This gets reflected in the EXPLAIN
as well. When using the filter_selector
it always says Rows Removed by Filter: 0
. But when using the primary key it changes to Rows Removed by Filter: 100009
when running it over 100000 times.
explain(analyze, verbose, buffers, settings)
for this statement? All in plain text, as an update to your question.explain
. I can do that first thing tomorrow.Index Scan using testqueue6_pkey on testqueue as testqueue6
where istestqueue6
coming from? Are you using some partitioning scheme? 2) In questions relating to software, always include the software version, the OS and the OS version. Also where you got the software from or where you are running it for SaaS versions. Add that information as text update to question text.SELECT *
when you only need theguid
? And why is there no index on theguid
? That would speed up the UPDATE