## About

ZSON is a PostgreSQL extension for transparent JSONB compression. Compression is
based on a shared dictionary of strings most frequently used in specific JSONB
documents (not only keys, but also values, array elements, etc).

In some cases ZSON can save half of your disk space and give you about 10% more
TPS. Memory is saved as well. See [docs/benchmark.md](docs/benchmark.md).
Everything depends on your data and workload, though. Don't believe any
benchmarks, re-check everything on your data, configuration, hardware, workload
and PostgreSQL version.

ZSON was originally created in 2016 by the [Postgres Professional][pgpro] team:
researched and coded by [Aleksander Alekseev][me]; ideas, code review, testing,
etc by [Alexander Korotkov][ak] and [Teodor Sigaev][ts].

[me]: http://eax.me/
[ak]: http://akorotkov.github.io/
[ts]: http://www.sigaev.ru/
[pgpro]: https://postgrespro.ru/

See also discussions on [pgsql-general@][gen], [Hacker News][hn], [Reddit][rd]
and [HabraHabr][habr].

[gen]: https://www.postgresql.org/message-id/flat/20160930185801.38654a1c%40e754
[hn]: https://news.ycombinator.com/item?id=12633486
[rd]: https://www.reddit.com/r/PostgreSQL/comments/55mr4r/zson_postgresql_extension_for_transparent_jsonb/
[habr]: https://habrahabr.ru/company/postgrespro/blog/312006/

You can create a temporary table and write some common JSONB documents into it
manually or use existing tables. The idea is to provide a subset of real data.
Let's say some document *type* is twice as frequent as another document type.
ZSON expects that there will be twice as many documents of the first type as
those of the second one in the learning set.
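
For instance, a learning set could be assembled roughly like this. This is only
a sketch: `my_table` and its `doc` column are placeholder names, and it assumes
the `zson_learn` procedure provided by the extension.

```
-- collect a representative sample of real documents in a temporary table
create temp table zson_learning_set(x jsonb);

insert into zson_learning_set
  select doc from my_table tablesample bernoulli (5);

-- build a dictionary from the learning set
select zson_learn('{{"zson_learning_set", "x"}}');
```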

The resulting dictionary can be examined using this query:

```
select * from zson_dict;
```

Now the ZSON type can be used as a complete and transparent replacement for the
JSONB type:

```
zson_test=# create table zson_example(x zson);
```
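
For instance, documents are inserted and queried with the usual JSONB operators;
a minimal sketch with a made-up document:

```
zson_test=# insert into zson_example values ('{"aaa": 123}');
INSERT 0 1
zson_test=# select x -> 'aaa' from zson_example;
 ?column?
----------
 123
(1 row)
```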
## Migrating to a new dictionary

When the schema of JSONB documents evolves, ZSON can be *re-learned*:
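
A hedged sketch of such a call; the argument should list whichever tables and
columns currently hold representative documents (here it reuses the
`test_compress` table from the example below):

```
-- re-learn on current data; this creates a second dictionary
-- alongside the original one
select zson_learn('{{"test_compress", "x"}}');
```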

This time a *second* dictionary will be created. Dictionaries are cached in
memory, so it will take about a minute before ZSON realizes that there is a new
dictionary. After that, old documents will be decompressed using the old
dictionary, and new documents will be compressed and decompressed using the new
one.

To find out which dictionary is used for a given ZSON document, use the
zson\_info procedure:

```
zson_test=# select zson_info(x) from test_compress where id = 1;
...
zson_test=# select zson_info(x) from test_compress where id = 2;
zson_info | zson version = 0, dict version = 0, ...
```
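
Before dropping an old dictionary it's worth checking that no documents still
reference it. A rough sketch, assuming the `test_compress` table from the
example above and that the output of `zson_info` can be matched as text:

```
-- documents still compressed with the old dictionary (dict version = 0)
select count(*)
from test_compress
where zson_info(x) like '%dict version = 0%';
```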

If **all** ZSON documents are migrated to the new dictionary, the old one can be
safely removed:

```
delete from zson_dict where dict_id = 0;
```

In general, it's safer to keep old dictionaries just in case. Gaining a few KB
of disk space is not worth the risk of losing data.

## When is it time to re-learn?

A good heuristic could be:

```
select pg_table_size('tt') / (select count(*) from tt)
```

... i.e. the average document size. When it suddenly starts to grow, it's time
to re-learn.

However, developers usually know when they change a schema significantly. It's
also easy to re-check whether the current schema differs a lot from the original
one using the zson\_dict table.
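
For instance, a quick way to eyeball the current dictionary; a sketch that
assumes the most recent dictionary has `dict_id = 1`:

```
-- inspect the strings the current dictionary was built from
select * from zson_dict where dict_id = 1;
```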