Commit 84dc6a4

Amit Kapila committed
Doc: Hash Indexes.

A new chapter for Hash Indexes, designed to help users understand how they
work and when to use them.

Backpatch-through: 10, where we have made hash indexes durable.

Author: Simon Riggs
Reviewed-By: Justin Pryzby, Amit Kapila
Discussion: https://postgr.es/m/CANbhV-HRjNPYgHo--P1ewBrFJ-GpZPb9_25P7=Wgu7s7hy_sLQ@mail.gmail.com
1 parent f3b1321 commit 84dc6a4

File tree: 3 files changed (+164 -0 lines)

doc/src/sgml/filelist.sgml

1 addition, 0 deletions
@@ -89,6 +89,7 @@
 <!ENTITY spgist SYSTEM "spgist.sgml">
 <!ENTITY gin SYSTEM "gin.sgml">
 <!ENTITY brin SYSTEM "brin.sgml">
+<!ENTITY hash SYSTEM "hash.sgml">
 <!ENTITY planstats SYSTEM "planstats.sgml">
 <!ENTITY indexam SYSTEM "indexam.sgml">
 <!ENTITY nls SYSTEM "nls.sgml">

doc/src/sgml/hash.sgml

162 additions, 0 deletions (new file)
@@ -0,0 +1,162 @@
<!-- doc/src/sgml/hash.sgml -->

<chapter id="hash-index">
<title>Hash Indexes</title>

<indexterm>
<primary>index</primary>
<secondary>Hash</secondary>
</indexterm>

<sect1 id="hash-intro">
<title>Overview</title>

<para>
<productname>PostgreSQL</productname>
includes an implementation of persistent on-disk hash indexes,
which are fully crash recoverable. Any data type can be indexed by a
hash index, including data types that do not have a well-defined linear
ordering. Hash indexes store only the hash value of the data being
indexed, so there are no restrictions on the size of the data column
being indexed.
</para>

<para>
Hash indexes support only single-column indexes and do not allow
uniqueness checking.
</para>

<para>
Hash indexes support only the <literal>=</literal> operator,
so WHERE clauses that specify range operations will not be able to take
advantage of hash indexes.
</para>
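
A minimal usage sketch (editor's illustration, not part of the committed SGML; the accounts table and token column are hypothetical):

CREATE TABLE accounts (id bigint, token text);
CREATE INDEX accounts_token_hash_idx ON accounts USING hash (token);

-- Equality predicates can use the hash index:
SELECT * FROM accounts WHERE token = 'a1b2c3';
-- Range predicates such as token > 'a1' cannot.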

<para>
Each hash index tuple stores just the 4-byte hash value, not the actual
column value. As a result, hash indexes may be much smaller than B-trees
when indexing longer data items such as UUIDs, URLs, etc. The absence of
the column value also makes all hash index scans lossy. Hash indexes may
take part in bitmap index scans and backward scans.
</para>
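
One way to compare index sizes on a long data type such as uuid (editor's sketch, not from the commit; gen_random_uuid() is built in from PostgreSQL 13, or available via the pgcrypto extension on older releases):

CREATE TABLE uuid_demo AS
  SELECT gen_random_uuid() AS u FROM generate_series(1, 100000);
CREATE INDEX uuid_demo_btree ON uuid_demo USING btree (u);
CREATE INDEX uuid_demo_hash  ON uuid_demo USING hash (u);

-- The B-tree stores full 16-byte uuid values; the hash index stores
-- only 4-byte hash codes per tuple. Compare the resulting on-disk sizes:
SELECT pg_size_pretty(pg_relation_size('uuid_demo_btree')) AS btree_size,
       pg_size_pretty(pg_relation_size('uuid_demo_hash'))  AS hash_size;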

<para>
Hash indexes are best suited to SELECT- and UPDATE-heavy workloads
that use equality scans on larger tables. In a B-tree index, searches must
descend through the tree until the leaf page is found. In tables with
millions of rows, this descent can increase access time to data. In a hash
index, the equivalent of a leaf page is referred to as a bucket page. In
contrast to a B-tree, a hash index allows accessing the bucket pages directly,
thereby potentially reducing index access time in larger tables. This
reduction in "logical I/O" becomes even more pronounced on indexes/data
larger than shared_buffers/RAM.
</para>
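
To check whether the planner actually chooses the hash index for an equality lookup, EXPLAIN can be used (editor's sketch, reusing the hypothetical objects from the earlier example):

EXPLAIN (COSTS OFF)
SELECT * FROM accounts WHERE token = 'a1b2c3';
-- Look for a scan on accounts_token_hash_idx in the resulting plan; with
-- few matching rows on a large, analyzed table the planner will normally
-- prefer the index over a sequential scan.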

<para>
Hash indexes have been designed to cope with uneven distributions of
hash values. Direct access to the bucket pages works well if the hash
values are evenly distributed. When inserts cause a bucket page to
become full, additional overflow pages are chained to that specific
bucket page, locally expanding the storage for index tuples that match
that hash value. When scanning a hash bucket during queries, we need to
scan through all of the overflow pages. Thus an unbalanced hash index
might actually be worse than a B-tree in terms of the number of block
accesses required for some data.
</para>
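
The pgstattuple extension can report how many overflow pages a hash index has accumulated, which helps spot unbalanced indexes (editor's sketch; assumes the hypothetical index from the earlier example):

CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- Comparing bucket_pages with overflow_pages gives a feel for how long
-- the overflow chains have become.
SELECT bucket_pages, overflow_pages, bitmap_pages, live_items, free_percent
FROM pgstathashindex('accounts_token_hash_idx');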

<para>
As a result of the overflow cases, we can say that hash indexes are
most suitable for unique or nearly unique data, or data with a low number
of rows per hash bucket.
One possible way to avoid problems is to exclude highly non-unique
values from the index using a partial index condition, but this may
not be suitable in many cases.
</para>
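
A partial hash index along these lines might look as follows (editor's sketch; the events table, its status column, and the assumption that 'ok' is an extremely common value are hypothetical):

CREATE TABLE events (id bigint, status text);

-- Keep the very common 'ok' value out of the index so that no single
-- hash bucket accumulates a huge overflow chain.
CREATE INDEX events_status_hash_idx ON events USING hash (status)
  WHERE status <> 'ok';

-- Equality lookups on rarer values can still use the partial index:
SELECT * FROM events WHERE status = 'error';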

<para>
Like B-Trees, hash indexes perform simple index tuple deletion. This
is a deferred maintenance operation that deletes index tuples that are
known to be safe to delete (those whose item identifier's LP_DEAD bit
is already set). If an insert finds no space is available on a page, we
try to avoid creating a new overflow page by attempting to remove dead
index tuples. Removal cannot occur if the page is pinned at that time.
Deletion of dead index pointers also occurs during VACUUM.
</para>

<para>
If it can, VACUUM will also try to squeeze the index tuples onto as
few overflow pages as possible, minimizing the overflow chain. If an
overflow page becomes empty, overflow pages can be recycled for reuse
in other buckets, though we never return them to the operating system.
There is currently no provision to shrink a hash index, other than by
rebuilding it with REINDEX.
There is no provision for reducing the number of buckets, either.
</para>
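
Rebuilding is therefore the way to reclaim space from a bloated hash index (editor's note, using the hypothetical index name from the earlier sketch):

REINDEX INDEX accounts_token_hash_idx;
-- On PostgreSQL 12 and later, the rebuild can avoid blocking writers:
REINDEX INDEX CONCURRENTLY accounts_token_hash_idx;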

<para>
Hash indexes may expand the number of bucket pages as the number of
rows indexed grows. The hash key-to-bucket-number mapping is chosen so that
the index can be incrementally expanded. When a new bucket is to be added to
the index, exactly one existing bucket will need to be "split", with some of
its tuples being transferred to the new bucket according to the updated
key-to-bucket-number mapping.
</para>

<para>
The expansion occurs in the foreground, which could increase execution
time for user inserts. Thus, hash indexes may not be suitable for tables
with a rapidly increasing number of rows.
</para>

</sect1>

<sect1 id="hash-implementation">
<title>Implementation</title>

<para>
There are four kinds of pages in a hash index: the meta page (page zero),
which contains statically allocated control information; primary bucket
pages; overflow pages; and bitmap pages, which keep track of overflow
pages that have been freed and are available for re-use. For addressing
purposes, bitmap pages are regarded as a subset of the overflow pages.
</para>
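
The pageinspect extension can show these page types directly (editor's sketch; assumes CREATE EXTENSION pageinspect and the hypothetical index from the Overview examples):

CREATE EXTENSION IF NOT EXISTS pageinspect;

-- Page zero is always the metapage.
SELECT hash_page_type(get_raw_page('accounts_token_hash_idx', 0));

-- The metapage records, among other things, the current bucket count
-- and the masks used by the key-to-bucket mapping.
SELECT maxbucket, highmask, lowmask, ntuples
FROM hash_metapage_info(get_raw_page('accounts_token_hash_idx', 0));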

<para>
Both scanning the index and inserting tuples require locating the bucket
where a given tuple ought to be located. To do this, we need the bucket
count, highmask, and lowmask from the metapage; however, it's undesirable
for performance reasons to have to lock and pin the metapage for
every such operation. Instead, we retain a cached copy of the metapage
in each backend's relcache entry. This will produce the correct bucket
mapping as long as the target bucket hasn't been split since the last
cache refresh.
</para>

<para>
Primary bucket pages and overflow pages are allocated independently since
any given index might need more or fewer overflow pages relative to its
number of buckets. The hash code uses an interesting set of addressing
rules to support a variable number of overflow pages while not having to
move primary bucket pages around after they are created.
</para>

<para>
Each row in the indexed table is represented by a single index tuple in
the hash index. Hash index tuples are stored in bucket pages and, if
they exist, overflow pages. We speed up searches by keeping the index entries
in any one index page sorted by hash code, thus allowing binary search to be
used within an index page. Note however that there is *no* assumption about
the relative ordering of hash codes across different index pages of a bucket.
</para>

<para>
The bucket splitting algorithms used to expand the hash index are too
complex to describe here, though they are covered in more detail in
<filename>src/backend/access/hash/README</filename>.
The split algorithm is crash safe and can be restarted if not completed
successfully.
</para>

</sect1>

</chapter>

doc/src/sgml/postgres.sgml

1 addition, 0 deletions
@@ -258,6 +258,7 @@
 &spgist;
 &gin;
 &brin;
+&hash;
 &storage;
 &bki;
 &planstats;
