From 74d50957ee0600bddb63dee0cf42ae95b5541c8e Mon Sep 17 00:00:00 2001 From: hongyunyan <649330952@qq.com> Date: Fri, 16 May 2025 15:46:40 +0800 Subject: [PATCH 01/16] update --- ticdc/ticdc-data-sync-capabilities.md | 31 +++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) create mode 100644 ticdc/ticdc-data-sync-capabilities.md diff --git a/ticdc/ticdc-data-sync-capabilities.md b/ticdc/ticdc-data-sync-capabilities.md new file mode 100644 index 0000000000000..7f6f9ea8053ba --- /dev/null +++ b/ticdc/ticdc-data-sync-capabilities.md @@ -0,0 +1,31 @@ +--- +title: TiCDC's Data Synchronization Capability +summary: Learn about TiCDC's data synchronization capabilities. +--- + +# TiCDC's Data Synchronization Capability + +## Background + +TiCDC (TiDB Change Data Capture) is a core component for real-time data synchronization in the TiDB ecosystem. + +1. TiCDC monitors TiKV's Raft Log to convert row-level data changes (insert/update/delete) into downstream-compatible SQL statements. Unlike Binlog, TiCDC does not rely on parsing SQL statements. Refer to [TiCDC's Implementation Principles for Processing Data Changes](/ticdc/ticdc-overview.md#implementation-of-processing-data-changes). + + +2. TiCDC generates logical operations (such as INSERT/UPDATE/DELETE) that are equivalent to SQL semantics, rather than restoring the original SQL executed upstream one by one. Refer to [TiCDC's Implementation Principles for Processing Data Changes](/ticdc/ticdc-overview.md#implementation-of-processing-data-changes). + +3. TiCDC provides the guarantee of eventual consistency of transactions. [redo log](/ticdc/ticdc-sink-to-mysql.md#eventually-consistent-replication-in-disaster-scenarios) provides the final consistency guarantee in disaster recovery scenarios. [Syncpoint](/ticdc/ticdc-upstream-downstream-check.md#enable-syncpoint) provides consistent snapshot reads and data consistency checks. + +4. 
TiCDC supports synchronizing data to multiple downstreams, including [TiDB and MySQL-compatible databases](/ticdc/ticdc-sink-to-mysql.md), [Kafka](/ticdc/ticdc-sink-to-kafka.md), [Pulsar](/ticdc/ticdc-sink-to-pulsar.md), and [storage services (Amazon S3, GCS, Azure Blob Storage, and NFS)](/ticdc/ticdc-sink-to-cloud-storage.md). + +## Data synchronization capabilities of TiCDC + +1. TiCDC supports synchronizing DDL and DML statements executed upstream, but does not synchronize DDL and DML executed in upstream system tables (including `mysql.*` and `information_schema.*`), nor does it synchronize temporary tables created in the upstream. + +2. TiCDC does not support synchronizing DQL (Data Query Language) statements, nor does it support synchronizing DCL (Data Control Language) statements. + +3. TiCDC supports synchronizing index settings of upstream tables through DDL statements (`add index`, `create index`). To reduce the impact on changefeed replication latency, if the downstream is TiDB, TiCDC will [asynchronously execute the DDL operations that create and add indexes](/ticdc/ticdc-ddl.md#asynchronous-execution-of-add-index-and-create-index-ddls). + +4. For foreign key constraints set in a table, TiCDC synchronizes the corresponding DDL (`add foreign key`) statements, but TiCDC is not responsible for synchronizing upstream system variable settings, such as [foreign_key_checks](/system-variables.md#foreign_key_checks). Therefore, you need to set appropriate system variables in the downstream to determine whether the downstream foreign key constraint check is enabled. + +5. TiCDC only checks the integrity of the upstream changes it receives, and does not check whether the data changes meet downstream constraints. If a data change does not meet the downstream constraints, TiCDC reports an error when writing to the downstream. 
\ No newline at end of file From 17eceeb9c38e24271b9bc5e6ab3a2d8ef83a9f83 Mon Sep 17 00:00:00 2001 From: hongyunyan <649330952@qq.com> Date: Wed, 25 Feb 2026 13:20:59 +0800 Subject: [PATCH 02/16] ticdc: add scheduler config recommendations for table split mode --- ticdc/ticdc-architecture.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index 38a3326f24efc..651429b407c47 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -66,6 +66,16 @@ When this feature is enabled, TiCDC automatically splits and distributes tables > > For MySQL sink changefeeds, only tables that meet one of the preceding conditions and have **exactly one primary key or non-null unique key** can be split and distributed by TiCDC, to ensure the correctness of data replication in table split mode. +### Configuration recommendations for table split mode + +After switching to the TiCDC new architecture, it is not recommended to continue using table split-related settings from the classic architecture. In most scenarios, it is recommended to start with the default values in the new architecture and only make minor adjustments for special cases. + +In table split mode, pay special attention to the following settings: + +- `scheduler.region-threshold`: the default value is `10000`. TiCDC splits a table when its Region count exceeds this threshold. For scenarios where the Region count is relatively low but overall table traffic is high, you can decrease this value appropriately. However, this value must not be less than `scheduler.region-count-per-span`; otherwise, tasks might be scheduled repeatedly, increasing replication latency. +- `scheduler.region-count-per-span`: the default value is `100`. During changefeed initialization, tables that meet the split conditions are split according to this parameter, so that each split sub-table contains at most `region-count-per-span` Regions. 
+- `scheduler.write-key-threshold`: the default value is `0` (disabled by default). TiCDC splits a table when its sink write traffic exceeds this threshold. It is not recommended to set this parameter to a non-`0` value. + ## Compatibility ### DDL progress tracking table From 26033992a9845a79126a6cd981309da1904c12ab Mon Sep 17 00:00:00 2001 From: hongyunyan <649330952@qq.com> Date: Wed, 25 Feb 2026 13:26:55 +0800 Subject: [PATCH 03/16] update --- ticdc/ticdc-data-sync-capabilities.md | 31 --------------------------- 1 file changed, 31 deletions(-) delete mode 100644 ticdc/ticdc-data-sync-capabilities.md diff --git a/ticdc/ticdc-data-sync-capabilities.md b/ticdc/ticdc-data-sync-capabilities.md deleted file mode 100644 index 7f6f9ea8053ba..0000000000000 --- a/ticdc/ticdc-data-sync-capabilities.md +++ /dev/null @@ -1,31 +0,0 @@ ---- -title: TiCDC's Data Synchronization Capability -summary: Learn about TiCDC's data synchronization capabilities. ---- - -# TiCDC's Data Synchronization Capability - -## Background - -TiCDC (TiDB Change Data Capture) is a core component for real-time data synchronization in the TiDB ecosystem. - -1. TiCDC monitors TiKV's Raft Log to convert row-level data changes (insert/update/delete) into downstream-compatible SQL statements. Unlike Binlog, TiCDC does not rely on parsing SQL statements. Refer to [TiCDC's Implementation Principles for Processing Data Changes](/ticdc/ticdc-overview.md#implementation-of-processing-data-changes). - - -2. TiCDC generates logical operations (such as INSERT/UPDATE/DELETE) that are equivalent to SQL semantics, rather than restoring the original SQL executed upstream one by one. Refer to [TiCDC's Implementation Principles for Processing Data Changes](/ticdc/ticdc-overview.md#implementation-of-processing-data-changes). - -3. TiCDC provides the guarantee of eventual consistency of transactions. 
[redo log](/ticdc/ticdc-sink-to-mysql.md#eventually-consistent-replication-in-disaster-scenarios) provides the final consistency guarantee in disaster recovery scenarios. [Syncpoint](/ticdc/ticdc-upstream-downstream-check.md#enable-syncpoint) provides consistent snapshot reads and data consistency checks. - -4. TiCDC supports synchronizing data to multiple downstreams, including [TiDB and MySQL-compatible databases](/ticdc/ticdc-sink-to-mysql.md), [Kafka](/ticdc/ticdc-sink-to-kafka.md), [Pulsar](/ticdc/ticdc-sink-to-pulsar.md), and [storage services (Amazon S3, GCS, Azure Blob Storage, and NFS)](/ticdc/ticdc-sink-to-cloud-storage.md). - -## Data synchronization capabilities of TiCDC - -1. TiCDC supports synchronizing DDL and DML statements executed upstream, but does not synchronize DDL and DML executed in upstream system tables (including `mysql.*` and `information_schema.*`), nor does it synchronize temporary tables created in the upstream. - -2. TiCDC does not support synchronizing DQL (Data Query Language) statements, nor does it support synchronizing DCL (Data Control Language) statements. - -3. TiCDC supports synchronizing index settings of upstream tables through DDL statements (`add index`, `create index`). To reduce the impact on changefeed replication latency, if the downstream is TiDB, TiCDC will [asynchronously execute the DDL operations that create and add indexes](/ticdc/ticdc-ddl.md#asynchronous-execution-of-add-index-and-create-index-ddls). - -4. For foreign key constraints set in a table, TiCDC synchronizes the corresponding DDL (`add foreign key`) statements, but TiCDC is not responsible for synchronizing upstream system variable settings, such as [foreign_key_checks](/system-variables.md#foreign_key_checks). Therefore, you need to set appropriate system variables in the downstream to determine whether the downstream foreign key constraint check is enabled. - -5. 
TiCDC only checks the integrity of the upstream changes it receives, and does not check whether the data changes meet downstream constraints. If a data change does not meet the downstream constraints, TiCDC reports an error when writing to the downstream. \ No newline at end of file From c7645d4887587d7d3bbc64335244d0a07c17f16a Mon Sep 17 00:00:00 2001 From: hongyunyan <649330952@qq.com> Date: Wed, 25 Feb 2026 13:27:50 +0800 Subject: [PATCH 04/16] Apply suggestion from @gemini-code-assist[bot] Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --- ticdc/ticdc-architecture.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index 651429b407c47..01139f0f152b0 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -74,7 +74,7 @@ In table split mode, pay special attention to the following settings: - `scheduler.region-threshold`: the default value is `10000`. TiCDC splits a table when its Region count exceeds this threshold. For scenarios where the Region count is relatively low but overall table traffic is high, you can decrease this value appropriately. However, this value must not be less than `scheduler.region-count-per-span`; otherwise, tasks might be scheduled repeatedly, increasing replication latency. - `scheduler.region-count-per-span`: the default value is `100`. During changefeed initialization, tables that meet the split conditions are split according to this parameter, so that each split sub-table contains at most `region-count-per-span` Regions. -- `scheduler.write-key-threshold`: the default value is `0` (disabled by default). 
TiCDC splits a table when its sink write traffic exceeds this threshold. It is not recommended to set this parameter to a value other than `0`. ## Compatibility From 0272afa194ccce5919bd31422762a1232ac129fe Mon Sep 17 00:00:00 2001 From: hongyunyan <649330952@qq.com> Date: Wed, 25 Feb 2026 13:28:01 +0800 Subject: [PATCH 05/16] Apply suggestion from @gemini-code-assist[bot] Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --- ticdc/ticdc-architecture.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index 01139f0f152b0..a76f8bc1fe9b0 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -68,7 +68,7 @@ When this feature is enabled, TiCDC automatically splits and distributes tables ### Configuration recommendations for table split mode -After switching to the TiCDC new architecture, it is not recommended to continue using table split-related settings from the classic architecture. In most scenarios, it is recommended to start with the default values in the new architecture and only make minor adjustments for special cases. +After switching to the TiCDC new architecture, you should not continue using table split-related settings from the classic architecture. In most scenarios, it is recommended that you start with the default values in the new architecture and only make minor adjustments for special cases. 
In table split mode, pay special attention to the following settings: From 07293c18a2f63c959f9690a31f812a2890ba2281 Mon Sep 17 00:00:00 2001 From: hongyunyan <649330952@qq.com> Date: Wed, 25 Feb 2026 13:29:31 +0800 Subject: [PATCH 06/16] update --- ticdc/ticdc-architecture.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index a76f8bc1fe9b0..a68ce9f43870a 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -72,8 +72,8 @@ After switching to the TiCDC new architecture, you should not continue using tab In table split mode, pay special attention to the following settings: -- `scheduler.region-threshold`: the default value is `10000`. TiCDC splits a table when its Region count exceeds this threshold. For scenarios where the Region count is relatively low but overall table traffic is high, you can decrease this value appropriately. However, this value must not be less than `scheduler.region-count-per-span`; otherwise, tasks might be scheduled repeatedly, increasing replication latency. -- `scheduler.region-count-per-span`: the default value is `100`. During changefeed initialization, tables that meet the split conditions are split according to this parameter, so that each split sub-table contains at most `region-count-per-span` Regions. +- `scheduler.region-threshold`: the default value is `10000`. TiCDC splits a table when its region count exceeds this threshold. For scenarios where the region count is relatively low but overall table traffic is high, you can decrease this value appropriately. However, this value must not be less than `scheduler.region-count-per-span`; otherwise, tasks might be scheduled repeatedly, increasing replication latency. +- `scheduler.region-count-per-span`: the default value is `100`. 
During changefeed initialization, tables that meet the split conditions are split according to this parameter, so that each split sub-table contains at most `region-count-per-span` regions. - `scheduler.write-key-threshold`: the default value is `0` (disabled by default). TiCDC splits a table when its sink write traffic exceeds this threshold. It is not recommended to set this parameter to a value other than `0`. ## Compatibility From 9e1a854153f378e9ed935c62ae927e746b9c630e Mon Sep 17 00:00:00 2001 From: xixirangrang Date: Wed, 25 Feb 2026 20:46:46 +0800 Subject: [PATCH 07/16] Update ticdc/ticdc-architecture.md --- ticdc/ticdc-architecture.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index a68ce9f43870a..95e4dea5584d2 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -68,13 +68,13 @@ When this feature is enabled, TiCDC automatically splits and distributes tables ### Configuration recommendations for table split mode -After switching to the TiCDC new architecture, you should not continue using table split-related settings from the classic architecture. In most scenarios, it is recommended that you start with the default values in the new architecture and only make minor adjustments for special cases. +After switching to the new TiCDC architecture, do not reuse the table-splitting configurations from the classic architecture. In most scenarios, use the default configuration of the new architecture. Adjust parameters only in special cases, and make minor incremental changes based on the defaults. -In table split mode, pay special attention to the following settings: +In table split mode, pay attention to the following settings: -- `scheduler.region-threshold`: the default value is `10000`. TiCDC splits a table when its region count exceeds this threshold. 
For scenarios where the region count is relatively low but overall table traffic is high, you can decrease this value appropriately. However, this value must not be less than `scheduler.region-count-per-span`; otherwise, tasks might be scheduled repeatedly, increasing replication latency. -- `scheduler.region-count-per-span`: the default value is `100`. During changefeed initialization, tables that meet the split conditions are split according to this parameter, so that each split sub-table contains at most `region-count-per-span` regions. -- `scheduler.write-key-threshold`: the default value is `0` (disabled by default). TiCDC splits a table when its sink write traffic exceeds this threshold. It is not recommended to set this parameter to a value other than `0`. +- `scheduler.region-threshold`: the default value is `10000`. When the number of Regions in a table exceeds this threshold, TiCDC splits the table. For tables with relatively few Regions but high overall write throughput, you can reduce this value appropriately. This parameter must be greater than or equal to `scheduler.region-count-per-span`. Otherwise, tasks might be rescheduled repeatedly, which increases replication latency. +- `scheduler.region-count-per-span`: the default value is `100`. During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` regions. +- `scheduler.write-key-threshold`: the default value is `0` (disabled). When the sink write throughput of a table exceeds this threshold, TiCDC triggers table splitting. In most cases, keep this parameter at `0`. 
## Compatibility From e774098d9a77ad1b828efeb20fc987ef02d49278 Mon Sep 17 00:00:00 2001 From: xixirangrang Date: Thu, 26 Feb 2026 15:48:27 +0800 Subject: [PATCH 08/16] Update ticdc/ticdc-architecture.md --- ticdc/ticdc-architecture.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index 95e4dea5584d2..8af02a1b91e94 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -64,9 +64,9 @@ When this feature is enabled, TiCDC automatically splits and distributes tables > **Note:** > -> For MySQL sink changefeeds, only tables that meet one of the preceding conditions and have **exactly one primary key or non-null unique key** can be split and distributed by TiCDC, to ensure the correctness of data replication in table split mode. +> For MySQL sink changefeeds, only tables that meet one of the preceding conditions and have **exactly one primary key or non-null unique key** can be split and distributed by TiCDC, to ensure the correctness of data replication in table-level task splitting mode. -### Configuration recommendations for table split mode +### Recommended configurations for table-level task splitting mode After switching to the new TiCDC architecture, do not reuse the table-splitting configurations from the classic architecture. In most scenarios, use the default configuration of the new architecture. Adjust parameters only in special cases, and make minor incremental changes based on the defaults. 
From 14525eed243e703e96423bfa555dd890d3203355 Mon Sep 17 00:00:00 2001 From: houfaxin Date: Thu, 26 Feb 2026 16:00:47 +0800 Subject: [PATCH 09/16] Update ticdc-changefeed-config.md --- ticdc/ticdc-changefeed-config.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/ticdc/ticdc-changefeed-config.md b/ticdc/ticdc-changefeed-config.md index f71fa7a1950ee..b1a6dd75f4e48 100644 --- a/ticdc/ticdc-changefeed-config.md +++ b/ticdc/ticdc-changefeed-config.md @@ -163,6 +163,11 @@ For more information, see [Event filter rules](/ticdc/ticdc-filter.md#event-filt - The value is `false` by default. Set it to `true` to enable this feature. - Default value: `false` +#### `region-count-per-span` + +- During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` regions. +- Default value: `100` + #### `region-threshold` - Default value: for the [TiCDC new architecture](/ticdc/ticdc-architecture.md), the default value is `10000`; for the [TiCDC classic architecture](/ticdc/ticdc-classic-architecture.md), the default value is `100000`. From 3620624b1e5d14b963b3727eb1448204909f4ca2 Mon Sep 17 00:00:00 2001 From: houfaxin Date: Thu, 26 Feb 2026 16:04:29 +0800 Subject: [PATCH 10/16] Update ticdc-architecture.md --- ticdc/ticdc-architecture.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index 8af02a1b91e94..dcfbd3066f514 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -72,9 +72,9 @@ After switching to the new TiCDC architecture, do not reuse the table-splitting In table split mode, pay attention to the following settings: -- `scheduler.region-threshold`: the default value is `10000`. When the number of Regions in a table exceeds this threshold, TiCDC splits the table. 
For tables with relatively few Regions but high overall write throughput, you can reduce this value appropriately. This parameter must be greater than or equal to `scheduler.region-count-per-span`. Otherwise, tasks might be rescheduled repeatedly, which increases replication latency. -- `scheduler.region-count-per-span`: the default value is `100`. During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` regions. -- `scheduler.write-key-threshold`: the default value is `0` (disabled). When the sink write throughput of a table exceeds this threshold, TiCDC triggers table splitting. In most cases, keep this parameter at `0`. +- [`scheduler.region-threshold`](/ticdc/ticdc-changefeed-config.md#region-threshold): the default value is `10000`. When the number of Regions in a table exceeds this threshold, TiCDC splits the table. For tables with relatively few Regions but high overall write throughput, you can reduce this value appropriately. This parameter must be greater than or equal to `scheduler.region-count-per-span`. Otherwise, tasks might be rescheduled repeatedly, which increases replication latency. +- [`scheduler.region-count-per-span`](/ticdc/ticdc-changefeed-config.md#region-count-per-span): the default value is `100`. During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` regions. +- [`scheduler.write-key-threshold`](/ticdc/ticdc-changefeed-config.md#write-key-threshold): the default value is `0` (disabled). When the sink write throughput of a table exceeds this threshold, TiCDC triggers table splitting. In most cases, keep this parameter at `0`. 
## Compatibility From 28c3dda4d40877340ead821239a416830982d003 Mon Sep 17 00:00:00 2001 From: xixirangrang Date: Thu, 26 Feb 2026 16:06:39 +0800 Subject: [PATCH 11/16] Update ticdc/ticdc-architecture.md --- ticdc/ticdc-architecture.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index dcfbd3066f514..efdecd24ccd72 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -68,7 +68,7 @@ When this feature is enabled, TiCDC automatically splits and distributes tables ### Recommended configurations for table-level task splitting mode -After switching to the new TiCDC architecture, do not reuse the table-splitting configurations from the classic architecture. In most scenarios, use the default configuration of the new architecture. Adjust parameters only in special cases, and make minor incremental changes based on the defaults. +After switching to the new TiCDC architecture, do not reuse the table-splitting configurations from the classic architecture. In most scenarios, use the default configuration of the new architecture. Make incremental adjustments based on the default values only in special scenarios where replication performance bottlenecks or scheduling imbalance occur. 
In table split mode, pay attention to the following settings: From 8e29b9c761cece65808a6ca0726373aaf893e934 Mon Sep 17 00:00:00 2001 From: xixirangrang Date: Thu, 26 Feb 2026 16:20:36 +0800 Subject: [PATCH 12/16] Update ticdc/ticdc-architecture.md --- ticdc/ticdc-architecture.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index efdecd24ccd72..88016ff93d4df 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -74,7 +74,7 @@ In table split mode, pay attention to the following settings: - [`scheduler.region-threshold`](/ticdc/ticdc-changefeed-config.md#region-threshold): the default value is `10000`. When the number of Regions in a table exceeds this threshold, TiCDC splits the table. For tables with relatively few Regions but high overall write throughput, you can reduce this value appropriately. This parameter must be greater than or equal to `scheduler.region-count-per-span`. Otherwise, tasks might be rescheduled repeatedly, which increases replication latency. - [`scheduler.region-count-per-span`](/ticdc/ticdc-changefeed-config.md#region-count-per-span): the default value is `100`. During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` regions. -- [`scheduler.write-key-threshold`](/ticdc/ticdc-changefeed-config.md#write-key-threshold): the default value is `0` (disabled). When the sink write throughput of a table exceeds this threshold, TiCDC triggers table splitting. In most cases, keep this parameter at `0`. +- [`scheduler.write-key-threshold`](/ticdc/ticdc-changefeed-config.md#write-key-threshold): the default value is `0` (disabled). When the sink write throughput of a table exceeds this threshold, TiCDC triggers table splitting. In most cases, keep this parameter at the default value `0`. 
## Compatibility From 04564d046133862c1de38b6af489eb9ad78bf427 Mon Sep 17 00:00:00 2001 From: xixirangrang Date: Thu, 26 Feb 2026 16:29:15 +0800 Subject: [PATCH 13/16] Apply suggestions from code review Co-authored-by: Grace Cai --- ticdc/ticdc-architecture.md | 4 ++-- ticdc/ticdc-changefeed-config.md | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index 88016ff93d4df..df10fc17d8ba5 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -66,14 +66,14 @@ When this feature is enabled, TiCDC automatically splits and distributes tables > > For MySQL sink changefeeds, only tables that meet one of the preceding conditions and have **exactly one primary key or non-null unique key** can be split and distributed by TiCDC, to ensure the correctness of data replication in table-level task splitting mode. -### Recommended configurations for table-level task splitting mode +### Recommended configurations for table-level task splitting After switching to the new TiCDC architecture, do not reuse the table-splitting configurations from the classic architecture. In most scenarios, use the default configuration of the new architecture. Make incremental adjustments based on the default values only in special scenarios where replication performance bottlenecks or scheduling imbalance occur. In table split mode, pay attention to the following settings: - [`scheduler.region-threshold`](/ticdc/ticdc-changefeed-config.md#region-threshold): the default value is `10000`. When the number of Regions in a table exceeds this threshold, TiCDC splits the table. For tables with relatively few Regions but high overall write throughput, you can reduce this value appropriately. This parameter must be greater than or equal to `scheduler.region-count-per-span`. Otherwise, tasks might be rescheduled repeatedly, which increases replication latency. 
-- [`scheduler.region-count-per-span`](/ticdc/ticdc-changefeed-config.md#region-count-per-span): the default value is `100`. During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` regions. +- [`scheduler.region-count-per-span`](/ticdc/ticdc-changefeed-config.md#region-count-per-span): the default value is `100`. During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` Regions. - [`scheduler.write-key-threshold`](/ticdc/ticdc-changefeed-config.md#write-key-threshold): the default value is `0` (disabled). When the sink write throughput of a table exceeds this threshold, TiCDC triggers table splitting. In most cases, keep this parameter to the default value `0`. ## Compatibility diff --git a/ticdc/ticdc-changefeed-config.md b/ticdc/ticdc-changefeed-config.md index b1a6dd75f4e48..3911566a3411a 100644 --- a/ticdc/ticdc-changefeed-config.md +++ b/ticdc/ticdc-changefeed-config.md @@ -165,7 +165,7 @@ For more information, see [Event filter rules](/ticdc/ticdc-filter.md#event-filt #### `region-count-per-span` -- During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` regions. +- During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` Regions. 
- Default value: `100` #### `region-threshold` From b949b045c2979754dd5452d95331c88522101369 Mon Sep 17 00:00:00 2001 From: houfaxin Date: Thu, 26 Feb 2026 16:47:23 +0800 Subject: [PATCH 14/16] Update ticdc-changefeed-config.md --- ticdc/ticdc-changefeed-config.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ticdc/ticdc-changefeed-config.md b/ticdc/ticdc-changefeed-config.md index 3911566a3411a..ee96321913de8 100644 --- a/ticdc/ticdc-changefeed-config.md +++ b/ticdc/ticdc-changefeed-config.md @@ -163,7 +163,7 @@ For more information, see [Event filter rules](/ticdc/ticdc-filter.md#event-filt - The value is `false` by default. Set it to `true` to enable this feature. - Default value: `false` -#### `region-count-per-span` +#### `region-count-per-span` New in v8.5.4 - During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` Regions. - Default value: `100` From 91ddd762836393fd1650b2744a283e10fa016095 Mon Sep 17 00:00:00 2001 From: houfaxin Date: Thu, 26 Feb 2026 16:51:44 +0800 Subject: [PATCH 15/16] Update ticdc-architecture.md --- ticdc/ticdc-architecture.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index df10fc17d8ba5..8438066bf1919 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -73,8 +73,8 @@ After switching to the new TiCDC architecture, do not reuse the table-splitting In table split mode, pay attention to the following settings: - [`scheduler.region-threshold`](/ticdc/ticdc-changefeed-config.md#region-threshold): the default value is `10000`. When the number of Regions in a table exceeds this threshold, TiCDC splits the table. For tables with relatively few Regions but high overall write throughput, you can reduce this value appropriately. 
This parameter must be greater than or equal to `scheduler.region-count-per-span`. Otherwise, tasks might be rescheduled repeatedly, which increases replication latency. -- [`scheduler.region-count-per-span`](/ticdc/ticdc-changefeed-config.md#region-count-per-span-new-in-v854): the default value is `100`. During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` regions. -- [`scheduler.write-key-threshold`](/ticdc/ticdc-changefeed-config.md#write-key-threshold): the default value is `0` (disabled). When the sink write throughput of a table exceeds this threshold, TiCDC triggers table splitting. In most cases, keep this parameter at `0`. 
## Compatibility From 9a0cf9665acfecab3d9fe668e7ccd748d0083613 Mon Sep 17 00:00:00 2001 From: houfaxin Date: Thu, 26 Feb 2026 16:55:23 +0800 Subject: [PATCH 16/16] Update ticdc-changefeed-config.md --- ticdc/ticdc-changefeed-config.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ticdc/ticdc-changefeed-config.md b/ticdc/ticdc-changefeed-config.md index ee96321913de8..6771bca1abcdb 100644 --- a/ticdc/ticdc-changefeed-config.md +++ b/ticdc/ticdc-changefeed-config.md @@ -165,7 +165,7 @@ For more information, see [Event filter rules](/ticdc/ticdc-filter.md#event-filt #### `region-count-per-span` New in v8.5.4 -- During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` Regions. +- Introduced in the [TiCDC new architecture](/ticdc/ticdc-architecture.md). During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` Regions. - Default value: `100` #### `region-threshold`
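Taken together, the `scheduler` parameters that these patches document belong in the `[scheduler]` section of a TiCDC changefeed configuration file. The following is a minimal sketch using the new-architecture defaults described above; note that the `enable-table-across-nodes` key is an assumption inferred from the surrounding `ticdc-changefeed-config.md` context ("Set it to `true` to enable this feature") and is not named anywhere in these patches:

```toml
# Sketch of a changefeed configuration fragment for table-level task splitting.
# Values are the new-architecture defaults discussed in these patches.
[scheduler]
# Assumed name of the switch that enables splitting tables across nodes;
# the patches show only its description, not the key itself.
enable-table-across-nodes = true
# Split a table once its Region count exceeds this threshold.
# Must be greater than or equal to region-count-per-span.
region-threshold = 10000
# Each split sub-table holds at most this many Regions after splitting.
region-count-per-span = 100
# Traffic-based splitting threshold; 0 keeps it disabled, as recommended.
write-key-threshold = 0
```

Such a file would then be passed when creating the changefeed, for example with `cdc cli changefeed create --config changefeed.toml` together with the sink URI.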