Master MySQL Load: Control Snapshot Sync
Ever worried about your MySQL master server groaning under the weight of automatic snapshot building during MygramDB startup? We've all been there! That sudden spike in load can disrupt your operations and cause performance hiccups. But what if you could gain manual control over this crucial process? This article dives deep into the Manual Snapshot Synchronization (SYNC Command) Design Document, a game-changer for managing your MygramDB instances with operational safety and flexibility as top priorities. Get ready to say goodbye to unexpected MySQL load and hello to seamless, operator-driven snapshot management!
1. Overview: Taking Charge of Your Snapshots
Traditionally, MygramDB has been quite proactive, automatically building snapshots from your MySQL database right when the server starts up. While convenient in some scenarios, this default behavior can put unexpected load on your MySQL master server, potentially causing performance issues or even downtime during critical startup sequences. The Manual Snapshot Synchronization (SYNC Command) design aims to put the power back in your hands. It lets you, the operator, decide when snapshot synchronization occurs. Our primary goals are clear: ensure operational safety by preventing those jarring load spikes, provide flexibility in scheduling these operations, offer transparency through clear feedback on progress and replication status, support independent synchronization of multiple tables, and most importantly, maintain data safety by preventing conflicts. For this initial phase, we're focusing on the core functionality, so incremental synchronization and real-time progress streaming to clients are non-goals. We're building a solid foundation first, with more advanced features planned for later iterations. This means you can manage your snapshots without disrupting your database's performance.
2. Requirements: What We Need to Achieve
To make manual snapshot synchronization a reality, we've outlined a set of functional and non-functional requirements. Functionally, we need to introduce an auto_initial_snapshot flag in the configuration to disable the automatic snapshot build on startup. This is key to regaining control. Then, we need a clear way to trigger this process manually: the SYNC command. Once triggered, we need to provide a SYNC STATUS command so you can check synchronization progress on demand and know exactly what's happening. Crucially, the SYNC command must execute asynchronously, returning immediately so it doesn't block your operations, while the synchronization happens in the background. After a snapshot is built successfully, the system must automatically start or restart binlog replication from the correct point, using the GTID captured during the snapshot. To protect data integrity, we must implement conflict prevention, blocking conflicting operations like DUMP LOAD or REPLICATION START while a SYNC is in progress. Finally, the system must support multi-table synchronization, allowing operators to synchronize different tables independently without interference.
On the non-functional side, the SYNC operation must be non-blocking for other client connections, maintaining overall server responsiveness. It should be safe by default, meaning the default configuration prevents unexpected MySQL load. Graceful shutdown is essential: any ongoing SYNC operations must be cancelled cleanly. We also need to ensure memory safety by checking available memory before starting a SYNC, and the entire process must be thread-safe to support concurrent operations on different tables. Lastly, the SYNC command should be idempotent, meaning that running it multiple times safely achieves the desired state.
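
To make the conflict-prevention requirement concrete, here is a minimal Python sketch of the kind of check a server could run before executing a command. It is an illustration only: the function name check_command, the BLOCKED_DURING_SYNC tuple, and the wording of the error string are assumptions made for this example, not MygramDB's actual implementation.

```python
from typing import Optional, Set

# Commands the requirements say must be blocked while a SYNC is in progress.
BLOCKED_DURING_SYNC = ("DUMP LOAD", "REPLICATION START")


def check_command(command: str, syncing_tables: Set[str]) -> Optional[str]:
    """Return an error string if the command conflicts with a running SYNC, else None.

    Hypothetical helper: the error text below is illustrative only.
    """
    if not syncing_tables:
        return None  # nothing is syncing, so nothing can conflict
    # Normalize case and whitespace so "dump  load x" matches "DUMP LOAD".
    normalized = " ".join(command.upper().split())
    for prefix in BLOCKED_DURING_SYNC:
        if normalized.startswith(prefix):
            tables = ", ".join(sorted(syncing_tables))
            return f"ERROR {prefix} blocked: SYNC in progress for {tables}"
    return None
```

Read-only commands such as SYNC STATUS pass through this check untouched, which matches the requirement that a running SYNC must not block other client connections.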
3. Architecture: How It All Fits Together
The architecture of the manual snapshot synchronization is designed for clarity and robustness. The most significant change is to the default behavior. Previously, MygramDB would automatically build snapshots on startup. Now, with auto_initial_snapshot set to false (the new default), it skips this step. This means the MySQL load only occurs when an operator explicitly triggers it using the SYNC command. Replication, which used to auto-start after the startup snapshot, now also starts only after a manual SYNC.
The high-level SYNC flow starts when a client sends the SYNC command, for instance, SYNC articles. The TcpServer receives this and performs two checks: is the table already syncing, and is memory healthy? If both checks pass, it marks the table as syncing and launches a background thread. This thread executes BuildSnapshotAsync(), which connects to MySQL, builds the snapshot using the SnapshotBuilder, captures the Global Transaction Identifier (GTID), and crucially, starts or restarts the BinlogReader from that specific GTID. Once complete, the table is unmarked as syncing. The component interactions follow the same path: a client initiates SYNC, the TcpServer checks memory and existing sync state, and if clear, the background thread works with the SnapshotBuilder and MySQL, captures the GTID, and finally starts the BinlogReader. The design also covers how other commands like DUMP LOAD and SYNC STATUS interact with this mechanism, so that conflicts are handled and status is reported.
Key components include the SyncHandler for command processing, SyncState for tracking progress, BuildSnapshotAsync for the heavy lifting, a Conflict Detector to enforce rules, and a Shutdown Canceller to ensure clean exits. This modular design keeps each responsibility separate while the parts work together to provide a reliable and controllable snapshot synchronization experience.
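
The flow described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not MygramDB's code: SyncCoordinator is a hypothetical stand-in for the TcpServer-side handling, the three callables stand in for the snapshot build, the BinlogReader restart, and the memory check, and the low-memory error wording is invented for illustration. Failure reporting and shutdown cancellation are left out.

```python
import itertools
import threading
from typing import Callable, Dict, Set


class SyncCoordinator:
    """Hypothetical model of the SYNC flow: check, mark, run in background, unmark."""

    def __init__(
        self,
        build_snapshot: Callable[[str], str],      # builds a snapshot, returns the GTID
        start_replication: Callable[[str], None],  # (re)starts reading from that GTID
        memory_ok: Callable[[], bool],             # health check run before each SYNC
    ) -> None:
        self._build_snapshot = build_snapshot
        self._start_replication = start_replication
        self._memory_ok = memory_ok
        self._lock = threading.Lock()
        self._syncing: Set[str] = set()
        self._threads: Dict[str, threading.Thread] = {}
        self._job_ids = itertools.count(1)

    def handle_sync(self, table: str) -> str:
        """Run the pre-checks, then return at once while the work continues."""
        if not self._memory_ok():
            return "ERROR SYNC rejected: memory critically low"  # hypothetical wording
        with self._lock:
            if table in self._syncing:
                return f"ERROR SYNC already in progress for table '{table}'"
            self._syncing.add(table)  # mark the table as syncing
            job_id = next(self._job_ids)
            worker = threading.Thread(target=self._run, args=(table,), daemon=True)
            self._threads[table] = worker
            worker.start()
        return f"OK SYNC STARTED table={table} job_id={job_id}"

    def _run(self, table: str) -> None:
        try:
            gtid = self._build_snapshot(table)  # snapshot build captures the GTID
            self._start_replication(gtid)       # replication resumes from that GTID
        finally:
            with self._lock:
                self._syncing.discard(table)    # unmark, even if the build failed

    def is_syncing(self, table: str) -> bool:
        with self._lock:
            return table in self._syncing

    def wait(self, table: str) -> None:
        """Block until the most recent SYNC for this table has finished."""
        with self._lock:
            worker = self._threads.get(table)
        if worker is not None:
            worker.join()
```

Because the syncing set is keyed by table name, a SYNC on one table does not prevent a SYNC on another, while a second SYNC on the same table is rejected until the first one finishes.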
4. Command Specification: Your New Toolkit
This new feature introduces two essential commands: SYNC and SYNC STATUS. The SYNC command allows you to initiate the snapshot building process. You can sync all tables by simply typing SYNC, or target a specific table with SYNC [table_name], like SYNC articles. Upon success, the server responds with OK SYNC STARTED table=articles job_id=1, confirming that the operation has begun in the background and providing a unique identifier. If there's an issue, like a SYNC already in progress for that table or critically low memory, you'll receive a clear error message, such as ERROR SYNC already in progress for table 'articles'. The SYNC STATUS command is your window into the ongoing process. When you run SYNC STATUS, you'll get detailed feedback. If a sync is in progress, it might look like table=articles status=IN_PROGRESS progress=10000/25000 rows (40%) rate=5000 rows/s. Once completed, it will show table=articles status=COMPLETED rows=25000 time=5.2s gtid=xxxx:123 replication=STARTED. If the sync fails, you'll get `table=articles status=FAILED rows=5000 error=