Here are some notes on “SQL Server 2008 Replication Overview” I took while attending an advanced class on SQL Server taught by Greg Low (from http://sqlblog.com/blogs/greg_low/ and http://www.sqldownunder.com/).
Please note that, although these notes were taken during the class, I might have added some of my own (mis)interpretation :-). Always check your facts on Books Online (I try to provide links when applicable). As with anything else you get from a blog, never use any of this in production before you thoroughly validate it a test environment. Please keep in mind that some of those will be hard to follow without some pre-requisite knowledge and the right context. Reading the post from top to bottom will help.
- Can copy data and objects
- Synchronize copies to maintain consistency
- Can be seen as a data distribution technology
- See http://msdn.microsoft.com/en-us/library/ms151198.aspx
- Publisher-Subscriber – like a magazine
- Articles are tables, views, stored procedures
- Publication is a set of articles
- Publisher, Distributor and Subscriber
- Scenario: Live Reporting Server – usually a read-only copy
- Scenario: Data Locality, HA, Rolling upgrades – take individual server offline, keep running
- Scenario: Point of sale application – download reference data, upload transactions, conflicts are rare, SQL CE
- Scenario: Sales force automation – offline user, download reference data, upload transactions
- Oldest type, started as read-only.
- Over time, improved scalability and availability
- Later introduced immediately updating subscribers, bi-directional updates
- Later introduced queued updating subscribers (should transition to Broker now)
- Server to server, high throughput, based on the log
- Scenarios: Reporting, reporting consolidation, read scale out, offload batch processing
- A form of transactional replication, bi-directional
- Scenarios: HA, Scale Out, Upgrade
- SQL Server 2008 added conflict detection system
- SQL Server 2008 can add nodes on the fly
- Mobile Applications, Distributed Server apps with conflicts
- Scenarios: Exchanging data with mobile users, consumer POS apps, Integration of data from multiple sites
- Topologies: Central with nodes, Distributed with multiple levels (tree-like)
- Changes are tracked by table, not as transactions in a set of tables
- Consider performance and scale – based on triggers on tables
- Merge agents does most of the work – needs to figure out which rows needs replicating
- Conflict resolution
- Copy the entire data set every time
- Used to create the initial data set for transactional or merge replication
- Scenario: Complete refreshes of data, volume of data is low compared to transactions
- Other ways to create initial set: backup/restore, transported via other means
Sync Services (ADO.NET)
- Client-centric technology, part of Microsoft Sync Framework
- Addressing common scenarios in mobile devices
- Row-level conflict detection for offline updates, simple filtering, extensible design
- SQL Server 2005 – Finds changes with Triggers
- SQL Server 2008 – Finds changes with Change Tracking
- It does all the plumbing work for offline applications
Sync Services – Demo
- Create VB application with a form
- Add sync object, connect to SDF database (SQL CE), configure tables for offline use
- Option to use SQL Server Change Tracking
- Creates a data source based on the sync object
- Dragged data source to the form to create the UI
- Add button to actually perform the sync operation (paste code example from sync object)
- Takes care of the work to connect, sync, keep track of everything
- Can set sync direction (server always win, client always win, bi-direction with conflict resolution)
- Framework is pretty straightforward and ties nicely with SQL Server
- Careful – SQL CE data types are not the same as full SQL Server
- Oracle publishing introduced in SQL Server 2005
- Previous support for DLL-based non-SQL Server publishing removed
- RMO classes are quite generic
- SQL Server Agent: Not a replication agent, but fires the replication jobs.
- Agents: Can run via RMO or command line (lets you specify custom parameters)
- Snapshot Agent: All types. Prepares schema and initial data. Stores snapshots, sync information.
- Log Reader Agent: Transactional. Copies transactions into distribution database. One per published DB.
- Distribution Agent: Transactional/Snapshot. Applies initial replica, transactions to the subscribers.
- Merge Agent: Merge. One agent per subscriptions. Updates both (upload first by default, option for parallel)
- Push/Push: Running Distribution/Mege Agent on distributor (Push) or Subscriber (Pull)
- Anonymous vs. Named Subscriptions: Named keeps data until subscriber receives it, anonymous has a time limit
- Queue Reader Agent: Transaction with Queued update. Runs on distributor. Deprecated. Use broker instead.
- Clean up Agents
- Reinitialize subscriptions with Validation Failures. Detects failures, marks them for re-initialization.
- Replication agents checkup Agent. Agent to check the agents.
- Replication Monitor Refresher.
- Replication Monitor: Configurable replication monitor grids, agent tab to view all agents and jobs
- Behavior of the agent configured by settings
- sp_help_agent_profile (1=snapshot, 2=log reader, 3=distribution, 4=merge, 9=queue reader)
- See http://msdn.microsoft.com/en-us/library/ms179923.aspx
- Agent runs in the context of an account.
- Integrated security is preferred, could use SQL Server security.
- Implement least previledge principle.
- Books Online: Replication Agent Security Model
- See http://msdn.microsoft.com/en-us/library/ms151868.aspx
- Careful – permission to access to the snapshot folder
- See http://msdn.microsoft.com/en-us/library/ms151151.aspx