From 5c1978526e1999865bfb1c17540e6d9eb0a335ed Mon Sep 17 00:00:00 2001 From: hongyunyan <649330952@qq.com> Date: Fri, 16 May 2025 15:46:40 +0800 Subject: [PATCH 01/16] update --- ticdc/ticdc-data-sync-capabilities.md | 31 +++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) create mode 100644 ticdc/ticdc-data-sync-capabilities.md diff --git a/ticdc/ticdc-data-sync-capabilities.md b/ticdc/ticdc-data-sync-capabilities.md new file mode 100644 index 0000000000000..7f6f9ea8053ba --- /dev/null +++ b/ticdc/ticdc-data-sync-capabilities.md @@ -0,0 +1,31 @@ +--- +title: TiCDC's Data Synchronization Capability +summary: Learn about TiCDC's data synchronization capabilities. +--- + +# TiCDC's Data Synchronization Capability + +## Background + +TiCDC (TiDB Change Data Capture) is a core component for real-time data synchronization in the TiDB ecosystem. + +1. TiCDC monitors TiKV's Raft Log to convert row-level data changes (insert/update/delete) into downstream-compatible SQL statements. Unlike Binlog, TiCDC does not rely on parsing SQL statements. Refer to [TiCDC's Implementation Principles for Processing Data Changes](/ticdc/ticdc-overview.md#implementation-of-processing-data-changes). + + +2. TiCDC generates logical operations (such as INSERT/UPDATE/DELETE) that are equivalent to SQL semantics, rather than restoring the original SQL executed upstream one by one. Refer to [TiCDC's Implementation Principles for Processing Data Changes](/ticdc/ticdc-overview.md#implementation-of-processing-data-changes). + +3. TiCDC guarantees eventual consistency of transactions. [redo log](/ticdc/ticdc-sink-to-mysql.md#eventually-consistent-replication-in-disaster-scenarios) provides the eventual consistency guarantee in disaster recovery scenarios. [Syncpoint](/ticdc/ticdc-upstream-downstream-check.md#enable-syncpoint) provides consistent snapshot reads and data consistency checks. + +4. 
TiCDC supports synchronizing data to multiple downstreams, including [TiDB and MySQL-compatible databases](/ticdc/ticdc-sink-to-mysql.md), [Kafka](/ticdc/ticdc-sink-to-kafka.md), [Pulsar](/ticdc/ticdc-sink-to-pulsar.md), [storage services (Amazon S3, GCS, Azure Blob Storage, and NFS)](/ticdc/ticdc-sink-to-cloud-storage.md). + +## Data synchronization capabilities of TiCDC + +1. TiCDC supports synchronizing DDL and DML statements executed upstream, but does not synchronize DDL and DML executed in upstream system tables (including `mysql.*` and `information_schema.*`), nor does it synchronize temporary tables created in the upstream. + +2. TiCDC does not support synchronizing DQL (Data Query Language) statements, nor does it support synchronizing DCL (Data Control Language) statements. + +3. TiCDC supports synchronizing index changes made to upstream tables through DDL statements (`add index`, `create index`). To reduce the impact on changefeed replication latency, if the downstream is TiDB, TiCDC [asynchronously executes the DDL operations that create and add indexes](/ticdc/ticdc-ddl.md#asynchronous-execution-of-add-index-and-create-index-ddls). + +4. For the foreign key constraints set in a table, TiCDC synchronizes the corresponding DDL (`add foreign key`) statements, but TiCDC is not responsible for synchronizing the settings of upstream system variables, such as [foreign_key_checks](/system-variables.md#foreign_key_checks). Therefore, you need to set appropriate system variables in the downstream to control whether the downstream foreign key constraint check is enabled. + +5. TiCDC only checks the integrity of the upstream changes it receives, and does not check whether the data changes meet downstream constraints. If TiCDC encounters a data change that does not meet downstream constraints, it reports an error when writing to the downstream. 
\ No newline at end of file From accb41bf429decef7a94c6e745580d0f7ac9b263 Mon Sep 17 00:00:00 2001 From: hongyunyan <649330952@qq.com> Date: Wed, 25 Feb 2026 13:20:59 +0800 Subject: [PATCH 02/16] ticdc: add scheduler config recommendations for table split mode --- ticdc/ticdc-architecture.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index 38a3326f24efc..651429b407c47 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -66,6 +66,16 @@ When this feature is enabled, TiCDC automatically splits and distributes tables > > For MySQL sink changefeeds, only tables that meet one of the preceding conditions and have **exactly one primary key or non-null unique key** can be split and distributed by TiCDC, to ensure the correctness of data replication in table split mode. +### Configuration recommendations for table split mode + +After switching to the TiCDC new architecture, it is not recommended to continue using table split-related settings from the classic architecture. In most scenarios, it is recommended to start with the default values in the new architecture and only make minor adjustments for special cases. + +In table split mode, pay special attention to the following settings: + +- `scheduler.region-threshold`: the default value is `10000`. TiCDC splits a table when its Region count exceeds this threshold. For scenarios where the Region count is relatively low but overall table traffic is high, you can decrease this value appropriately. However, this value must not be less than `scheduler.region-count-per-span`; otherwise, tasks might be scheduled repeatedly, increasing replication latency. +- `scheduler.region-count-per-span`: the default value is `100`. During changefeed initialization, tables that meet the split conditions are split according to this parameter, so that each split sub-table contains at most `region-count-per-span` Regions. 
+- `scheduler.write-key-threshold`: the default value is `0` (disabled by default). TiCDC splits a table when its sink write traffic exceeds this threshold. It is not recommended to set this parameter to a non-`0` value. + ## Compatibility ### DDL progress tracking table From 9026229fe50a2f38f89b335ccf0c6ca8c3e86b67 Mon Sep 17 00:00:00 2001 From: hongyunyan <649330952@qq.com> Date: Wed, 25 Feb 2026 13:26:55 +0800 Subject: [PATCH 03/16] update --- ticdc/ticdc-data-sync-capabilities.md | 31 --------------------------- 1 file changed, 31 deletions(-) delete mode 100644 ticdc/ticdc-data-sync-capabilities.md diff --git a/ticdc/ticdc-data-sync-capabilities.md b/ticdc/ticdc-data-sync-capabilities.md deleted file mode 100644 index 7f6f9ea8053ba..0000000000000 --- a/ticdc/ticdc-data-sync-capabilities.md +++ /dev/null @@ -1,31 +0,0 @@ ---- -title: TiCDC's Data Synchronization Capability -summary: Learn about TiCDC's data synchronization capabilities. ---- - -# TiCDC's Data Synchronization Capability - -## Background - -TiCDC (TiDB Change Data Capture) is a core component for real-time data synchronization in the TiDB ecosystem. - -1. TiCDC monitors TiKV's Raft Log to convert row-level data changes (insert/update/delete) into downstream-compatible SQL statements. Unlike Binlog, TiCDC does not rely on parsing SQL statements. Refer to [TiCDC's Implementation Principles for Processing Data Changes](/ticdc/ticdc-overview.md#implementation-of-processing-data-changes). - - -2. TiCDC generates logical operations (such as INSERT/UPDATE/DELETE) that are equivalent to SQL semantics, rather than restoring the original SQL executed upstream one by one. Refer to [TiCDC's Implementation Principles for Processing Data Changes](/ticdc/ticdc-overview.md#implementation-of-processing-data-changes). - -3. TiCDC guarantees eventual consistency of transactions. 
[redo log](/ticdc/ticdc-sink-to-mysql.md#eventually-consistent-replication-in-disaster-scenarios) provides the eventual consistency guarantee in disaster recovery scenarios. [Syncpoint](/ticdc/ticdc-upstream-downstream-check.md#enable-syncpoint) provides consistent snapshot reads and data consistency checks. - -4. TiCDC supports synchronizing data to multiple downstreams, including [TiDB and MySQL-compatible databases](/ticdc/ticdc-sink-to-mysql.md), [Kafka](/ticdc/ticdc-sink-to-kafka.md), [Pulsar](/ticdc/ticdc-sink-to-pulsar.md), [storage services (Amazon S3, GCS, Azure Blob Storage, and NFS)](/ticdc/ticdc-sink-to-cloud-storage.md). - -## Data synchronization capabilities of TiCDC - -1. TiCDC supports synchronizing DDL and DML statements executed upstream, but does not synchronize DDL and DML executed in upstream system tables (including `mysql.*` and `information_schema.*`), nor does it synchronize temporary tables created in the upstream. - -2. TiCDC does not support synchronizing DQL (Data Query Language) statements, nor does it support synchronizing DCL (Data Control Language) statements. - -3. TiCDC supports synchronizing index changes made to upstream tables through DDL statements (`add index`, `create index`). To reduce the impact on changefeed replication latency, if the downstream is TiDB, TiCDC [asynchronously executes the DDL operations that create and add indexes](/ticdc/ticdc-ddl.md#asynchronous-execution-of-add-index-and-create-index-ddls). - -4. For the foreign key constraints set in a table, TiCDC synchronizes the corresponding DDL (`add foreign key`) statements, but TiCDC is not responsible for synchronizing the settings of upstream system variables, such as [foreign_key_checks](/system-variables.md#foreign_key_checks). Therefore, you need to set appropriate system variables in the downstream to control whether the downstream foreign key constraint check is enabled. - -5. 
TiCDC only checks the integrity of the upstream changes it receives, and does not check whether the data changes meet downstream constraints. If TiCDC encounters a data change that does not meet downstream constraints, it reports an error when writing to the downstream. \ No newline at end of file From 9aca5642a04c6c39385c61a239660d1e2873f527 Mon Sep 17 00:00:00 2001 From: hongyunyan <649330952@qq.com> Date: Wed, 25 Feb 2026 13:27:50 +0800 Subject: [PATCH 04/16] Apply suggestion from @gemini-code-assist[bot] Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --- ticdc/ticdc-architecture.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index 651429b407c47..01139f0f152b0 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -74,7 +74,7 @@ In table split mode, pay special attention to the following settings: - `scheduler.region-threshold`: the default value is `10000`. TiCDC splits a table when its Region count exceeds this threshold. For scenarios where the Region count is relatively low but overall table traffic is high, you can decrease this value appropriately. However, this value must not be less than `scheduler.region-count-per-span`; otherwise, tasks might be scheduled repeatedly, increasing replication latency. - `scheduler.region-count-per-span`: the default value is `100`. During changefeed initialization, tables that meet the split conditions are split according to this parameter, so that each split sub-table contains at most `region-count-per-span` Regions. -- `scheduler.write-key-threshold`: the default value is `0` (disabled by default). 
TiCDC splits a table when its sink write traffic exceeds this threshold. It is not recommended to set this parameter to a value other than `0`. ## Compatibility From cb3ea14d3e4f58346c21684238eda6c8ffd23d90 Mon Sep 17 00:00:00 2001 From: hongyunyan <649330952@qq.com> Date: Wed, 25 Feb 2026 13:28:01 +0800 Subject: [PATCH 05/16] Apply suggestion from @gemini-code-assist[bot] Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --- ticdc/ticdc-architecture.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index 01139f0f152b0..a76f8bc1fe9b0 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -68,7 +68,7 @@ When this feature is enabled, TiCDC automatically splits and distributes tables ### Configuration recommendations for table split mode -After switching to the TiCDC new architecture, it is not recommended to continue using table split-related settings from the classic architecture. In most scenarios, it is recommended to start with the default values in the new architecture and only make minor adjustments for special cases. +After switching to the TiCDC new architecture, you should not continue using table split-related settings from the classic architecture. In most scenarios, it is recommended that you start with the default values in the new architecture and only make minor adjustments for special cases. 
In table split mode, pay special attention to the following settings: From 1eb35616b9ad2cf2606cc00ba49352379ffffcf5 Mon Sep 17 00:00:00 2001 From: hongyunyan <649330952@qq.com> Date: Wed, 25 Feb 2026 13:29:31 +0800 Subject: [PATCH 06/16] update --- ticdc/ticdc-architecture.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index a76f8bc1fe9b0..a68ce9f43870a 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -72,8 +72,8 @@ After switching to the TiCDC new architecture, you should not continue using tab In table split mode, pay special attention to the following settings: -- `scheduler.region-threshold`: the default value is `10000`. TiCDC splits a table when its Region count exceeds this threshold. For scenarios where the Region count is relatively low but overall table traffic is high, you can decrease this value appropriately. However, this value must not be less than `scheduler.region-count-per-span`; otherwise, tasks might be scheduled repeatedly, increasing replication latency. -- `scheduler.region-count-per-span`: the default value is `100`. During changefeed initialization, tables that meet the split conditions are split according to this parameter, so that each split sub-table contains at most `region-count-per-span` Regions. +- `scheduler.region-threshold`: the default value is `10000`. TiCDC splits a table when its region count exceeds this threshold. For scenarios where the region count is relatively low but overall table traffic is high, you can decrease this value appropriately. However, this value must not be less than `scheduler.region-count-per-span`; otherwise, tasks might be scheduled repeatedly, increasing replication latency. +- `scheduler.region-count-per-span`: the default value is `100`. 
During changefeed initialization, tables that meet the split conditions are split according to this parameter, so that each split sub-table contains at most `region-count-per-span` regions. - `scheduler.write-key-threshold`: the default value is `0` (disabled by default). TiCDC splits a table when its sink write traffic exceeds this threshold. It is not recommended to set this parameter to a value other than `0`. ## Compatibility From 0192efee8c3d9aa974ffb3ca39a70edf196c0e76 Mon Sep 17 00:00:00 2001 From: xixirangrang Date: Wed, 25 Feb 2026 20:46:46 +0800 Subject: [PATCH 07/16] Update ticdc/ticdc-architecture.md --- ticdc/ticdc-architecture.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index a68ce9f43870a..95e4dea5584d2 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -68,13 +68,13 @@ When this feature is enabled, TiCDC automatically splits and distributes tables ### Configuration recommendations for table split mode -After switching to the TiCDC new architecture, you should not continue using table split-related settings from the classic architecture. In most scenarios, it is recommended that you start with the default values in the new architecture and only make minor adjustments for special cases. +After switching to the new TiCDC architecture, do not reuse the table-splitting configurations from the classic architecture. In most scenarios, use the default configuration of the new architecture. Adjust parameters only in special cases, and make minor incremental changes based on the defaults. -In table split mode, pay special attention to the following settings: +In table split mode, pay attention to the following settings: -- `scheduler.region-threshold`: the default value is `10000`. TiCDC splits a table when its region count exceeds this threshold. 
For scenarios where the region count is relatively low but overall table traffic is high, you can decrease this value appropriately. However, this value must not be less than `scheduler.region-count-per-span`; otherwise, tasks might be scheduled repeatedly, increasing replication latency. -- `scheduler.region-count-per-span`: the default value is `100`. During changefeed initialization, tables that meet the split conditions are split according to this parameter, so that each split sub-table contains at most `region-count-per-span` regions. -- `scheduler.write-key-threshold`: the default value is `0` (disabled by default). TiCDC splits a table when its sink write traffic exceeds this threshold. It is not recommended to set this parameter to a value other than `0`. +- `scheduler.region-threshold`: the default value is `10000`. When the number of Regions in a table exceeds this threshold, TiCDC splits the table. For tables with relatively few Regions but high overall write throughput, you can reduce this value appropriately. This parameter must be greater than or equal to `scheduler.region-count-per-span`. Otherwise, tasks might be rescheduled repeatedly, which increases replication latency. +- `scheduler.region-count-per-span`: the default value is `100`. During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` regions. +- `scheduler.write-key-threshold`: the default value is `0` (disabled). When the sink write throughput of a table exceeds this threshold, TiCDC triggers table splitting. In most cases, keep this parameter at `0`. 
## Compatibility From 8e992d5c2fe9e9f0a5464814c6a25fe68f588f47 Mon Sep 17 00:00:00 2001 From: xixirangrang Date: Thu, 26 Feb 2026 15:48:27 +0800 Subject: [PATCH 08/16] Update ticdc/ticdc-architecture.md --- ticdc/ticdc-architecture.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index 95e4dea5584d2..8af02a1b91e94 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -64,9 +64,9 @@ When this feature is enabled, TiCDC automatically splits and distributes tables > **Note:** > -> For MySQL sink changefeeds, only tables that meet one of the preceding conditions and have **exactly one primary key or non-null unique key** can be split and distributed by TiCDC, to ensure the correctness of data replication in table split mode. +> For MySQL sink changefeeds, only tables that meet one of the preceding conditions and have **exactly one primary key or non-null unique key** can be split and distributed by TiCDC, to ensure the correctness of data replication in table-level task splitting mode. -### Configuration recommendations for table split mode +### Recommended configurations for table-level task splitting mode After switching to the new TiCDC architecture, do not reuse the table-splitting configurations from the classic architecture. In most scenarios, use the default configuration of the new architecture. Adjust parameters only in special cases, and make minor incremental changes based on the defaults. 
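Taken together, the scheduler recommendations in the patches above describe entries of the changefeed configuration file. A minimal TOML sketch, assuming the `[scheduler]` table implied by the `scheduler.` prefix and the `enable-table-across-nodes` switch from the TiCDC changefeed configuration (the switch itself is not named in this patch series):

```toml
# Changefeed configuration fragment: table split mode scheduler settings.
# Values are the new-architecture defaults quoted in the documentation above.
[scheduler]
# Assumed switch that enables table-level task splitting.
enable-table-across-nodes = true
# Split a table once its Region count exceeds this threshold.
region-threshold = 10000
# Each split sub-table contains at most this many Regions after splitting.
region-count-per-span = 100
# Traffic-based splitting; 0 keeps it disabled, as recommended.
write-key-threshold = 0
```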
From 3e597584f615d3dad101d4d4e0f947d0da517695 Mon Sep 17 00:00:00 2001 From: houfaxin Date: Thu, 26 Feb 2026 16:00:47 +0800 Subject: [PATCH 09/16] Update ticdc-changefeed-config.md --- ticdc/ticdc-changefeed-config.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/ticdc/ticdc-changefeed-config.md b/ticdc/ticdc-changefeed-config.md index 32bed5fd6152e..781294f948079 100644 --- a/ticdc/ticdc-changefeed-config.md +++ b/ticdc/ticdc-changefeed-config.md @@ -163,6 +163,11 @@ For more information, see [Event filter rules](/ticdc/ticdc-filter.md#event-filt - The value is `false` by default. Set it to `true` to enable this feature. - Default value: `false` +#### `region-count-per-span` + +- During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` regions. +- Default value: `100` + #### `region-threshold` - Default value: for the [TiCDC new architecture](/ticdc/ticdc-architecture.md), the default value is `10000`; for the [TiCDC classic architecture](/ticdc/ticdc-classic-architecture.md), the default value is `100000`. From 77bd6c1babba95a4fea7fa68f235e6247f3d60f5 Mon Sep 17 00:00:00 2001 From: houfaxin Date: Thu, 26 Feb 2026 16:04:29 +0800 Subject: [PATCH 10/16] Update ticdc-architecture.md --- ticdc/ticdc-architecture.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index 8af02a1b91e94..dcfbd3066f514 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -72,9 +72,9 @@ After switching to the new TiCDC architecture, do not reuse the table-splitting In table split mode, pay attention to the following settings: -- `scheduler.region-threshold`: the default value is `10000`. When the number of Regions in a table exceeds this threshold, TiCDC splits the table. 
For tables with relatively few Regions but high overall write throughput, you can reduce this value appropriately. This parameter must be greater than or equal to `scheduler.region-count-per-span`. Otherwise, tasks might be rescheduled repeatedly, which increases replication latency. -- `scheduler.region-count-per-span`: the default value is `100`. During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` regions. -- `scheduler.write-key-threshold`: the default value is `0` (disabled). When the sink write throughput of a table exceeds this threshold, TiCDC triggers table splitting. In most cases, keep this parameter at `0`. +- [`scheduler.region-threshold`](/ticdc/ticdc-changefeed-config.md#region-threshold): the default value is `10000`. When the number of Regions in a table exceeds this threshold, TiCDC splits the table. For tables with relatively few Regions but high overall write throughput, you can reduce this value appropriately. This parameter must be greater than or equal to `scheduler.region-count-per-span`. Otherwise, tasks might be rescheduled repeatedly, which increases replication latency. +- [`scheduler.region-count-per-span`](/ticdc/ticdc-changefeed-config.md#region-count-per-span): the default value is `100`. During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` regions. +- [`scheduler.write-key-threshold`](/ticdc/ticdc-changefeed-config.md#write-key-threshold): the default value is `0` (disabled). When the sink write throughput of a table exceeds this threshold, TiCDC triggers table splitting. In most cases, keep this parameter at `0`. 
## Compatibility From c4295c3393d325c8f28544b11c0fedfd1ae19280 Mon Sep 17 00:00:00 2001 From: xixirangrang Date: Thu, 26 Feb 2026 16:06:39 +0800 Subject: [PATCH 11/16] Update ticdc/ticdc-architecture.md --- ticdc/ticdc-architecture.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index dcfbd3066f514..efdecd24ccd72 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -68,7 +68,7 @@ When this feature is enabled, TiCDC automatically splits and distributes tables ### Recommended configurations for table-level task splitting mode -After switching to the new TiCDC architecture, do not reuse the table-splitting configurations from the classic architecture. In most scenarios, use the default configuration of the new architecture. Adjust parameters only in special cases, and make minor incremental changes based on the defaults. +After switching to the new TiCDC architecture, do not reuse the table-splitting configurations from the classic architecture. In most scenarios, use the default configuration of the new architecture. Make incremental adjustments based on the default values only in special scenarios where replication performance bottlenecks or scheduling imbalance occur. 
In table split mode, pay attention to the following settings: From 3163361b3cfd606f20af7323a80d8ade1bb9f4df Mon Sep 17 00:00:00 2001 From: xixirangrang Date: Thu, 26 Feb 2026 16:20:36 +0800 Subject: [PATCH 12/16] Update ticdc/ticdc-architecture.md --- ticdc/ticdc-architecture.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index efdecd24ccd72..88016ff93d4df 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -74,7 +74,7 @@ In table split mode, pay attention to the following settings: - [`scheduler.region-threshold`](/ticdc/ticdc-changefeed-config.md#region-threshold): the default value is `10000`. When the number of Regions in a table exceeds this threshold, TiCDC splits the table. For tables with relatively few Regions but high overall write throughput, you can reduce this value appropriately. This parameter must be greater than or equal to `scheduler.region-count-per-span`. Otherwise, tasks might be rescheduled repeatedly, which increases replication latency. - [`scheduler.region-count-per-span`](/ticdc/ticdc-changefeed-config.md#region-count-per-span): the default value is `100`. During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` regions. -- [`scheduler.write-key-threshold`](/ticdc/ticdc-changefeed-config.md#write-key-threshold): the default value is `0` (disabled). When the sink write throughput of a table exceeds this threshold, TiCDC triggers table splitting. In most cases, keep this parameter at `0`. +- [`scheduler.write-key-threshold`](/ticdc/ticdc-changefeed-config.md#write-key-threshold): the default value is `0` (disabled). When the sink write throughput of a table exceeds this threshold, TiCDC triggers table splitting. In most cases, keep this parameter at the default value `0`. 
## Compatibility From 925192050775800634d49efb43fe6773156d71d2 Mon Sep 17 00:00:00 2001 From: xixirangrang Date: Thu, 26 Feb 2026 16:29:15 +0800 Subject: [PATCH 13/16] Apply suggestions from code review Co-authored-by: Grace Cai --- ticdc/ticdc-architecture.md | 4 ++-- ticdc/ticdc-changefeed-config.md | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index 88016ff93d4df..df10fc17d8ba5 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -66,14 +66,14 @@ When this feature is enabled, TiCDC automatically splits and distributes tables > > For MySQL sink changefeeds, only tables that meet one of the preceding conditions and have **exactly one primary key or non-null unique key** can be split and distributed by TiCDC, to ensure the correctness of data replication in table-level task splitting mode. -### Recommended configurations for table-level task splitting mode +### Recommended configurations for table-level task splitting After switching to the new TiCDC architecture, do not reuse the table-splitting configurations from the classic architecture. In most scenarios, use the default configuration of the new architecture. Make incremental adjustments based on the default values only in special scenarios where replication performance bottlenecks or scheduling imbalance occur. In table split mode, pay attention to the following settings: - [`scheduler.region-threshold`](/ticdc/ticdc-changefeed-config.md#region-threshold): the default value is `10000`. When the number of Regions in a table exceeds this threshold, TiCDC splits the table. For tables with relatively few Regions but high overall write throughput, you can reduce this value appropriately. This parameter must be greater than or equal to `scheduler.region-count-per-span`. Otherwise, tasks might be rescheduled repeatedly, which increases replication latency. 
-- [`scheduler.region-count-per-span`](/ticdc/ticdc-changefeed-config.md#region-count-per-span): the default value is `100`. During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` regions. +- [`scheduler.region-count-per-span`](/ticdc/ticdc-changefeed-config.md#region-count-per-span): the default value is `100`. During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` Regions. - [`scheduler.write-key-threshold`](/ticdc/ticdc-changefeed-config.md#write-key-threshold): the default value is `0` (disabled). When the sink write throughput of a table exceeds this threshold, TiCDC triggers table splitting. In most cases, keep this parameter to the default value `0`. ## Compatibility diff --git a/ticdc/ticdc-changefeed-config.md b/ticdc/ticdc-changefeed-config.md index 781294f948079..dbc2bc4e648b5 100644 --- a/ticdc/ticdc-changefeed-config.md +++ b/ticdc/ticdc-changefeed-config.md @@ -165,7 +165,7 @@ For more information, see [Event filter rules](/ticdc/ticdc-filter.md#event-filt #### `region-count-per-span` -- During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` regions. +- During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` Regions. 
- Default value: `100` #### `region-threshold` From e44a46ad784ec2411d67e9abb0908158b5a79b4d Mon Sep 17 00:00:00 2001 From: houfaxin Date: Thu, 26 Feb 2026 16:47:23 +0800 Subject: [PATCH 14/16] Update ticdc-changefeed-config.md --- ticdc/ticdc-changefeed-config.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ticdc/ticdc-changefeed-config.md b/ticdc/ticdc-changefeed-config.md index dbc2bc4e648b5..003d2d8e4a386 100644 --- a/ticdc/ticdc-changefeed-config.md +++ b/ticdc/ticdc-changefeed-config.md @@ -163,7 +163,7 @@ For more information, see [Event filter rules](/ticdc/ticdc-filter.md#event-filt - The value is `false` by default. Set it to `true` to enable this feature. - Default value: `false` -#### `region-count-per-span` +#### `region-count-per-span` New in v8.5.4 - During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` Regions. - Default value: `100` From 50322f8a94d597bdecf557975dbef92ab4bba462 Mon Sep 17 00:00:00 2001 From: houfaxin Date: Thu, 26 Feb 2026 16:51:44 +0800 Subject: [PATCH 15/16] Update ticdc-architecture.md --- ticdc/ticdc-architecture.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index df10fc17d8ba5..8438066bf1919 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -73,8 +73,8 @@ After switching to the new TiCDC architecture, do not reuse the table-splitting In table split mode, pay attention to the following settings: - [`scheduler.region-threshold`](/ticdc/ticdc-changefeed-config.md#region-threshold): the default value is `10000`. When the number of Regions in a table exceeds this threshold, TiCDC splits the table. For tables with relatively few Regions but high overall write throughput, you can reduce this value appropriately. 
This parameter must be greater than or equal to `scheduler.region-count-per-span`. Otherwise, tasks might be rescheduled repeatedly, which increases replication latency. -- [`scheduler.region-count-per-span`](/ticdc/ticdc-changefeed-config.md#region-count-per-span-new-in-v854): the default value is `100`. During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` regions. +- [`scheduler.write-key-threshold`](/ticdc/ticdc-changefeed-config.md#write-key-threshold): the default value is `0` (disabled). When the sink write throughput of a table exceeds this threshold, TiCDC triggers table splitting. In most cases, keep this parameter at `0`. 
## Compatibility From becbc32267ef9a0d6e764ca8d652810f3edd499c Mon Sep 17 00:00:00 2001 From: houfaxin Date: Thu, 26 Feb 2026 16:55:23 +0800 Subject: [PATCH 16/16] Update ticdc-changefeed-config.md --- ticdc/ticdc-changefeed-config.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ticdc/ticdc-changefeed-config.md b/ticdc/ticdc-changefeed-config.md index 003d2d8e4a386..37afc633077ce 100644 --- a/ticdc/ticdc-changefeed-config.md +++ b/ticdc/ticdc-changefeed-config.md @@ -165,7 +165,7 @@ For more information, see [Event filter rules](/ticdc/ticdc-filter.md#event-filt #### `region-count-per-span` New in v8.5.4 -- During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` Regions. +- Introduced in the [TiCDC new architecture](/ticdc/ticdc-architecture.md). During changefeed initialization, tables that meet the split conditions are split according to this parameter. After splitting, each split sub-table contains at most `region-count-per-span` Regions. - Default value: `100` #### `region-threshold`
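As a concrete illustration of the final state these patches converge on, a changefeed configuration that tunes the split granularity for tables with relatively few Regions but high write throughput might look like the following sketch (the specific values are illustrative assumptions, not recommendations taken from the documentation):

```toml
# Illustrative table split mode tuning for the TiCDC new architecture.
[scheduler]
# Lowered from the default 10000 for tables with few Regions but high write throughput.
# Must stay >= region-count-per-span to avoid repeated rescheduling.
region-threshold = 2000
# Default; each split sub-table holds at most 100 Regions (new in v8.5.4).
region-count-per-span = 100
# Keep the default 0 (disabled), as the documentation advises.
write-key-threshold = 0
```

Such a file would typically be passed when creating the changefeed, for example via the `--config` option of `cdc cli changefeed create`.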