Skip to content

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

Commit ecb6965

Browse filesBrowse files
author
Amit Kapila
committed
Allow logical replication to copy tables in binary format.
This patch allows copying tables in the binary format during table synchronization when the binary option for a subscription is enabled. Previously, tables are copied in text format even if the subscription is created with the binary option enabled. Copying tables in binary format may reduce the time spent depending on column types. A binary copy for initial table synchronization is supported only when both publisher and subscriber are v16 or later. Author: Melih Mutlu Reviewed-by: Peter Smith, Shi yu, Euler Taveira, Vignesh C, Kuroda Hayato, Osumi Takamichi, Bharath Rupireddy, Hou Zhijie Discussion: https://postgr.es/m/CAGPVpCQvAziCLknEnygY0v1-KBtg%2BOm-9JHJYZOnNPKFJPompw%40mail.gmail.com
1 parent 4f1882b commit ecb6965
Copy full SHA for ecb6965

File tree

5 files changed

+216
-18
lines changed
Filter options

5 files changed

+216
-18
lines changed

‎doc/src/sgml/logical-replication.sgml

Copy file name to clipboardExpand all lines: doc/src/sgml/logical-replication.sgml
+3-1
Original file line numberDiff line numberDiff line change
@@ -251,7 +251,9 @@
251251
column of type <type>bigint</type>. The target table can also have
252252
additional columns not provided by the published table. Any such columns
253253
will be filled with the default value as specified in the definition of the
254-
target table.
254+
target table. However, logical replication in binary format is more
255+
restrictive. See the <link linkend="sql-createsubscription-binary"><literal>binary</literal>
256+
option</link> of <command>CREATE SUBSCRIPTION</command> for details.
255257
</para>
256258

257259
<sect2 id="logical-replication-subscription-slot">

‎doc/src/sgml/ref/alter_subscription.sgml

Copy file name to clipboardExpand all lines: doc/src/sgml/ref/alter_subscription.sgml
+5
Original file line numberDiff line numberDiff line change
@@ -177,6 +177,11 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
177177
how <literal>copy_data = true</literal> can interact with the
178178
<literal>origin</literal> parameter.
179179
</para>
180+
<para>
181+
See the <link linkend="sql-createsubscription-binary"><literal>binary</literal>
182+
option</link> of <command>CREATE SUBSCRIPTION</command> for details
183+
about copying pre-existing data in binary format.
184+
</para>
180185
</listitem>
181186
</varlistentry>
182187
</variablelist></para>

‎doc/src/sgml/ref/create_subscription.sgml

Copy file name to clipboardExpand all lines: doc/src/sgml/ref/create_subscription.sgml
+22-6
Original file line numberDiff line numberDiff line change
@@ -185,15 +185,25 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
185185

186186
<variablelist>
187187

188-
<varlistentry>
188+
<varlistentry id="sql-createsubscription-binary" xreflabel="binary">
189189
<term><literal>binary</literal> (<type>boolean</type>)</term>
190190
<listitem>
191191
<para>
192-
Specifies whether the subscription will request the publisher to
193-
send the data in binary format (as opposed to text).
194-
The default is <literal>false</literal>.
195-
Even when this option is enabled, only data types having
196-
binary send and receive functions will be transferred in binary.
192+
Specifies whether the subscription will request the publisher to send
193+
the data in binary format (as opposed to text). The default is
194+
<literal>false</literal>. Any initial table synchronization copy
195+
(see <literal>copy_data</literal>) also uses the same format. Binary
196+
format can be faster than the text format, but it is less portable
197+
across machine architectures and <productname>PostgreSQL</productname>
198+
versions. Binary format is very data type specific; for example, it
199+
will not allow copying from a <type>smallint</type> column to an
200+
<type>integer</type> column, even though that would work fine in text
201+
format. Even when this option is enabled, only data types having binary
202+
send and receive functions will be transferred in binary. Note that
203+
the initial synchronization requires all data types to have binary
204+
send and receive functions, otherwise the synchronization will fail
205+
(see <xref linkend="sql-createtype"/> for more about send/receive
206+
functions).
197207
</para>
198208

199209
<para>
@@ -203,6 +213,12 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
203213
such a case, data transfer will fail, and
204214
the <literal>binary</literal> option cannot be used.
205215
</para>
216+
217+
<para>
218+
If the publisher is a <productname>PostgreSQL</productname> version
219+
before 16, then any initial table synchronization will use text format
220+
even if <literal>binary = true</literal>.
221+
</para>
206222
</listitem>
207223
</varlistentry>
208224

‎src/backend/replication/logical/tablesync.c

Copy file name to clipboardExpand all lines: src/backend/replication/logical/tablesync.c
+16-1
Original file line numberDiff line numberDiff line change
@@ -101,6 +101,7 @@
101101
#include "catalog/pg_type.h"
102102
#include "commands/copy.h"
103103
#include "miscadmin.h"
104+
#include "nodes/makefuncs.h"
104105
#include "parser/parse_relation.h"
105106
#include "pgstat.h"
106107
#include "replication/logicallauncher.h"
@@ -1090,6 +1091,7 @@ copy_table(Relation rel)
10901091
CopyFromState cstate;
10911092
List *attnamelist;
10921093
ParseState *pstate;
1094+
List *options = NIL;
10931095

10941096
/* Get the publisher relation info. */
10951097
fetch_remote_table_info(get_namespace_name(RelationGetNamespace(rel)),
@@ -1168,6 +1170,19 @@ copy_table(Relation rel)
11681170

11691171
appendStringInfoString(&cmd, ") TO STDOUT");
11701172
}
1173+
1174+
/*
1175+
* Prior to v16, initial table synchronization will use text format even
1176+
* if the binary option is enabled for a subscription.
1177+
*/
1178+
if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 160000 &&
1179+
MySubscription->binary)
1180+
{
1181+
appendStringInfoString(&cmd, " WITH (FORMAT binary)");
1182+
options = list_make1(makeDefElem("format",
1183+
(Node *) makeString("binary"), -1));
1184+
}
1185+
11711186
res = walrcv_exec(LogRepWorkerWalRcvConn, cmd.data, 0, NULL);
11721187
pfree(cmd.data);
11731188
if (res->status != WALRCV_OK_COPY_OUT)
@@ -1184,7 +1199,7 @@ copy_table(Relation rel)
11841199
NULL, false, false);
11851200

11861201
attnamelist = make_copy_attnamelist(relmapentry);
1187-
cstate = BeginCopyFrom(pstate, rel, NULL, NULL, false, copy_read_data, attnamelist, NIL);
1202+
cstate = BeginCopyFrom(pstate, rel, NULL, NULL, false, copy_read_data, attnamelist, options);
11881203

11891204
/* Do the copy */
11901205
(void) CopyFrom(cstate);

‎src/test/subscription/t/014_binary.pl

Copy file name to clipboardExpand all lines: src/test/subscription/t/014_binary.pl
+170-10
Original file line numberDiff line numberDiff line change
@@ -40,35 +40,72 @@
4040
$node_publisher->safe_psql('postgres',
4141
"CREATE PUBLICATION tpub FOR ALL TABLES");
4242

43+
# ------------------------------------------------------
44+
# Ensure binary mode also executes COPY in binary format
45+
# ------------------------------------------------------
46+
47+
# Insert some content before creating a subscription
48+
$node_publisher->safe_psql(
49+
'postgres', qq(
50+
INSERT INTO public.test_numerical (a, b, c, d) VALUES
51+
(1, 1.2, 1.3, 10),
52+
(2, 2.2, 2.3, 20);
53+
INSERT INTO public.test_arrays (a, b, c) VALUES
54+
('{1,2,3}', '{1.1, 1.2, 1.3}', '{"one", "two", "three"}'),
55+
('{3,1,2}', '{1.3, 1.1, 1.2}', '{"three", "one", "two"}');
56+
));
57+
4358
my $publisher_connstring = $node_publisher->connstr . ' dbname=postgres';
4459
$node_subscriber->safe_psql('postgres',
4560
"CREATE SUBSCRIPTION tsub CONNECTION '$publisher_connstring' "
4661
. "PUBLICATION tpub WITH (slot_name = tpub_slot, binary = true)");
4762

63+
# Ensure the COPY command is executed in binary format on the publisher
64+
$node_publisher->wait_for_log(
65+
qr/LOG: ( [A-Z0-9]+:)? statement: COPY (.+)? TO STDOUT WITH \(FORMAT binary\)/
66+
);
67+
4868
# Ensure nodes are in sync with each other
4969
$node_subscriber->wait_for_subscription_sync($node_publisher, 'tsub');
5070

71+
my $sync_check = qq(
72+
SELECT a, b, c, d FROM test_numerical ORDER BY a;
73+
SELECT a, b, c FROM test_arrays ORDER BY a;
74+
);
75+
76+
# Check the synced data on the subscriber
77+
my $result = $node_subscriber->safe_psql('postgres', $sync_check);
78+
79+
is( $result, '1|1.2|1.3|10
80+
2|2.2|2.3|20
81+
{1,2,3}|{1.1,1.2,1.3}|{one,two,three}
82+
{3,1,2}|{1.3,1.1,1.2}|{three,one,two}', 'check synced data on subscriber');
83+
84+
# ----------------------------------
85+
# Ensure apply works in binary mode
86+
# ----------------------------------
87+
5188
# Insert some content and make sure it's replicated across
5289
$node_publisher->safe_psql(
5390
'postgres', qq(
5491
INSERT INTO public.test_arrays (a, b, c) VALUES
55-
('{1,2,3}', '{1.1, 1.2, 1.3}', '{"one", "two", "three"}'),
56-
('{3,1,2}', '{1.3, 1.1, 1.2}', '{"three", "one", "two"}');
92+
('{2,1,3}', '{1.2, 1.1, 1.3}', '{"two", "one", "three"}'),
93+
('{1,3,2}', '{1.1, 1.3, 1.2}', '{"one", "three", "two"}');
5794
5895
INSERT INTO public.test_numerical (a, b, c, d) VALUES
59-
(1, 1.2, 1.3, 10),
60-
(2, 2.2, 2.3, 20),
61-
(3, 3.2, 3.3, 30);
96+
(3, 3.2, 3.3, 30),
97+
(4, 4.2, 4.3, 40);
6298
));
6399

64100
$node_publisher->wait_for_catchup('tsub');
65101

66-
my $result = $node_subscriber->safe_psql('postgres',
102+
$result = $node_subscriber->safe_psql('postgres',
67103
"SELECT a, b, c, d FROM test_numerical ORDER BY a");
68104

69105
is( $result, '1|1.2|1.3|10
70106
2|2.2|2.3|20
71-
3|3.2|3.3|30', 'check replicated data on subscriber');
107+
3|3.2|3.3|30
108+
4|4.2|4.3|40', 'check replicated data on subscriber');
72109

73110
# Test updates as well
74111
$node_publisher->safe_psql(
@@ -83,14 +120,21 @@
83120
"SELECT a, b, c FROM test_arrays ORDER BY a");
84121

85122
is( $result, '{1,2,3}|{42,1.2,1.3}|
123+
{1,3,2}|{42,1.3,1.2}|
124+
{2,1,3}|{42,1.1,1.3}|
86125
{3,1,2}|{42,1.1,1.2}|', 'check updated replicated data on subscriber');
87126

88127
$result = $node_subscriber->safe_psql('postgres',
89128
"SELECT a, b, c, d FROM test_numerical ORDER BY a");
90129

91130
is( $result, '1|42||10
92131
2|42||20
93-
3|42||30', 'check updated replicated data on subscriber');
132+
3|42||30
133+
4|42||40', 'check updated replicated data on subscriber');
134+
135+
# ------------------------------------------------------------------------------
136+
# Use ALTER SUBSCRIPTION to change to text format and then back to binary format
137+
# ------------------------------------------------------------------------------
94138

95139
# Test to reset back to text formatting, and then to binary again
96140
$node_subscriber->safe_psql('postgres',
@@ -99,7 +143,7 @@
99143
$node_publisher->safe_psql(
100144
'postgres', qq(
101145
INSERT INTO public.test_numerical (a, b, c, d) VALUES
102-
(4, 4.2, 4.3, 40);
146+
(5, 5.2, 5.3, 50);
103147
));
104148

105149
$node_publisher->wait_for_catchup('tsub');
@@ -110,7 +154,8 @@
110154
is( $result, '1|42||10
111155
2|42||20
112156
3|42||30
113-
4|4.2|4.3|40', 'check replicated data on subscriber');
157+
4|42||40
158+
5|5.2|5.3|50', 'check replicated data on subscriber');
114159

115160
$node_subscriber->safe_psql('postgres',
116161
"ALTER SUBSCRIPTION tsub SET (binary = true);");
@@ -127,9 +172,124 @@
127172
"SELECT a, b, c FROM test_arrays ORDER BY a");
128173

129174
is( $result, '{1,2,3}|{42,1.2,1.3}|
175+
{1,3,2}|{42,1.3,1.2}|
176+
{2,1,3}|{42,1.1,1.3}|
130177
{2,3,1}|{1.2,1.3,1.1}|{two,three,one}
131178
{3,1,2}|{42,1.1,1.2}|', 'check replicated data on subscriber');
132179

180+
# ---------------------------------------------------------------
181+
# Test binary replication without and with send/receive functions
182+
# ---------------------------------------------------------------
183+
184+
# Create a custom type without send/rcv functions
185+
$ddl = qq(
186+
CREATE TYPE myvarchar;
187+
CREATE FUNCTION myvarcharin(cstring, oid, integer) RETURNS myvarchar
188+
LANGUAGE internal IMMUTABLE PARALLEL SAFE STRICT AS 'varcharin';
189+
CREATE FUNCTION myvarcharout(myvarchar) RETURNS cstring
190+
LANGUAGE internal IMMUTABLE PARALLEL SAFE STRICT AS 'varcharout';
191+
CREATE TYPE myvarchar (
192+
input = myvarcharin,
193+
output = myvarcharout);
194+
CREATE TABLE public.test_myvarchar (
195+
a myvarchar
196+
););
197+
198+
$node_publisher->safe_psql('postgres', $ddl);
199+
$node_subscriber->safe_psql('postgres', $ddl);
200+
201+
# Insert some initial data
202+
$node_publisher->safe_psql(
203+
'postgres', qq(
204+
INSERT INTO public.test_myvarchar (a) VALUES
205+
('a');
206+
));
207+
208+
# Check the subscriber log from now on.
209+
my $offset = -s $node_subscriber->logfile;
210+
211+
# Refresh the publication to trigger the tablesync
212+
$node_subscriber->safe_psql('postgres',
213+
"ALTER SUBSCRIPTION tsub REFRESH PUBLICATION");
214+
215+
# It should fail
216+
$node_subscriber->wait_for_log(
217+
qr/ERROR: ( [A-Z0-9]+:)? no binary input function available for type/,
218+
$offset);
219+
220+
# Create and set send/rcv functions for the custom type
221+
$ddl = qq(
222+
CREATE FUNCTION myvarcharsend(myvarchar) RETURNS bytea
223+
LANGUAGE internal STABLE PARALLEL SAFE STRICT AS 'varcharsend';
224+
CREATE FUNCTION myvarcharrecv(internal, oid, integer) RETURNS myvarchar
225+
LANGUAGE internal STABLE PARALLEL SAFE STRICT AS 'varcharrecv';
226+
ALTER TYPE myvarchar SET (
227+
send = myvarcharsend,
228+
receive = myvarcharrecv
229+
););
230+
231+
$node_publisher->safe_psql('postgres', $ddl);
232+
$node_subscriber->safe_psql('postgres', $ddl);
233+
234+
# Now tablesync should succeed
235+
$node_subscriber->wait_for_subscription_sync($node_publisher, 'tsub');
236+
237+
# Check the synced data on the subscriber
238+
$result =
239+
$node_subscriber->safe_psql('postgres', 'SELECT a FROM test_myvarchar;');
240+
241+
is($result, 'a', 'check synced data on subscriber with custom type');
242+
243+
# -----------------------------------------------------
244+
# Test mismatched column types with/without binary mode
245+
# -----------------------------------------------------
246+
247+
# Test syncing tables with mismatching column types
248+
$node_publisher->safe_psql(
249+
'postgres', qq(
250+
CREATE TABLE public.test_mismatching_types (
251+
a bigint PRIMARY KEY
252+
);
253+
INSERT INTO public.test_mismatching_types (a)
254+
VALUES (1), (2);
255+
));
256+
257+
# Check the subscriber log from now on.
258+
$offset = -s $node_subscriber->logfile;
259+
260+
$node_subscriber->safe_psql(
261+
'postgres', qq(
262+
CREATE TABLE public.test_mismatching_types (
263+
a int PRIMARY KEY
264+
);
265+
ALTER SUBSCRIPTION tsub REFRESH PUBLICATION;
266+
));
267+
268+
# Cannot sync due to type mismatch
269+
$node_subscriber->wait_for_log(
270+
qr/ERROR: ( [A-Z0-9]+:)? incorrect binary data format/, $offset);
271+
272+
# Check the publisher log from now on.
273+
$offset = -s $node_publisher->logfile;
274+
275+
# Setting binary to false should allow syncing
276+
$node_subscriber->safe_psql(
277+
'postgres', qq(
278+
ALTER SUBSCRIPTION tsub SET (binary = false);));
279+
280+
# Ensure the COPY command is executed in text format on the publisher
281+
$node_publisher->wait_for_log(
282+
qr/LOG: ( [A-Z0-9]+:)? statement: COPY (.+)? TO STDOUT\n/, $offset);
283+
284+
$node_subscriber->wait_for_subscription_sync($node_publisher, 'tsub');
285+
286+
# Check the synced data on the subscriber
287+
$result = $node_subscriber->safe_psql('postgres',
288+
'SELECT a FROM test_mismatching_types ORDER BY a;');
289+
290+
is( $result, '1
291+
2', 'check synced data on subscriber with binary = false');
292+
133293
$node_subscriber->stop('fast');
134294
$node_publisher->stop('fast');
135295

0 commit comments

Comments
0 (0)
Morty Proxy This is a proxified and sanitized view of the page, visit original site.