Key Features
Change Data Capture
Flink CDC supports distributed scanning of historical data of database and then automatically switches to change data capturing. The switch uses the incremental snapshot algorithm which ensure the switch action does not lock the database.
Schema Evolution
Flink CDC has the ability of automatically creating downstream table using the inferred table structure based on upstream table, and applying upstream DDL to downstream systems during change data capturing.
Streaming Pipeline
Flink CDC jobs run in streaming mode by default, providing sub-second end-to-end latency in real-time binlog synchronization scenarios, effectively ensuring data freshness for downstream businesses.
Data Transformation
Flink CDC will soon support data transform operations of ETL, including column projection, computed column, filter expression and classical scalar functions.
Full Database Sync
Flink CDC supports synchronizing all tables of source database instance to downstream in one job by configuring the captured database list and table list.
Exactly-Once Semantics
Flink CDC supports reading database historical data and continues to read CDC events with exactly-once processing, even after job failures.