Transcription of Data Replication Options in AWS - Amazon Web Services
Data Replication Options in AWS. Thomas Park, Manager, Solutions Architecture. November 15, 2013.
Thomas Jefferson first acquired the letter-copying device he called "the finest invention of the present age" in March of 1804.
Agenda: data replication design options in AWS; replication design factors and challenges; use cases; do-it-yourself options; managed and built-in AWS options; demos.
Data Replication Capabilities: AWS global infrastructure; application services; networking; deployment and administration; database; storage; compute; AWS data replication capabilities; partner data replication capabilities.
Data Replication Solution Architecture. Business drivers: business continuity; disaster recovery; customer experience; productivity; mismatched SLA; compliance; reducing cost; information security risks; global expansion; performance/availability. AWS capabilities: multiple Availability Zones; multiple regions; AMI; Amazon EBS and DB snapshots; AMI copy; Amazon EBS and DB snapshot copy; Multi-AZ DBs; read replica DBs; Provisioned IOPS; offline backups and archives; data lifecycle policies; highly durable storage. Solutions architecture: measure metrics; evaluate options.
Focus of Our Discussion Today. Data type: databases; storage/content (files and objects). Business drivers: data preservation; performance. Replication options: design factors; design options.
Design Options in AWS: Multi-AZ; cross region; hybrid IT; single AZ. Databases; files and objects. Diagram labels: AZ, AZ, AZ; Region, Region.
DR Metrics (RTO/RPO). Timeline: last backup at 2:00am; event at 6:00am; data restored at 11:00am. RPO: 4 hours. RTO: 5 hours.
Physical vs. logical. Synchronous vs. asynchronous.
Data Replication Options in AWS: Multi-AZ; cross region; hybrid IT; single AZ. Databases; files and objects. Preservation.
Performance. Performance metric: total time. Estimated DB size ~35 TB; estimated DB size ~48 TB. Daily and weekly updates. ETL from source DB to target DB. Can we do this in 10 hours?
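The RTO/RPO timeline on the DR Metrics slide reduces to two subtractions. The sketch below is illustrative only: the function names and the calendar date are assumptions, and only the three clock times come from the slide.

```python
from datetime import datetime, timedelta


def recovery_metrics(last_backup, event, restored):
    """Return (recovery point, recovery time) for one incident.

    Recovery point: how much data is lost (event minus last backup).
    Recovery time: how long until data is back (restored minus event).
    """
    rpo = event - last_backup
    rto = restored - event
    return rpo, rto


def within_target(actual, target_hours):
    """True if a measured duration does not exceed a target given in hours."""
    return actual <= timedelta(hours=target_hours)


# Clock times from the slide; the date itself is arbitrary (assumption).
last_backup = datetime(2013, 11, 15, 2, 0)
event = datetime(2013, 11, 15, 6, 0)
restored = datetime(2013, 11, 15, 11, 0)

rpo, rto = recovery_metrics(last_backup, event, restored)
print(f"RPO: {rpo.total_seconds() / 3600:.0f} hours")  # RPO: 4 hours
print(f"RTO: {rto.total_seconds() / 3600:.0f} hours")  # RTO: 5 hours
```

A helper like within_target can then compare the measured values against whatever objectives the business has set.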
600M records and 320 GB in size. Performance metric: total time. Estimated DB size ~70 TB; estimated DB size ~48 TB. ETL from source DB to target DB. Can we STILL do this in 10 hours?
Data Replication Options in AWS: Multi-AZ; cross region; hybrid IT; single AZ. Databases; files and objects. Preservation; performance.
Factors Affecting Replication Designs (source replicates to target, with read/write on both sides): latency; bandwidth; throughput; data change rate; size of data; consistency.
Challenges in Replication: availability and performance; data size; consistency; change rate. Infrastructure capabilities: database; compute; network; storage.
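The "can we do this in 10 hours?" question, together with the size, throughput, and change-rate factors listed above, comes down to simple arithmetic. The sketch below is a rough estimate under stated assumptions: decimal units (1 GB = 1000 MB), a steady transfer rate, and the 320 GB figure treated as the volume moved per run, which the slides do not state explicitly. All function names are hypothetical.

```python
def required_throughput_mb_s(size_gb, window_hours):
    """Sustained MB/s needed to move size_gb within window_hours."""
    return size_gb * 1000 / (window_hours * 3600)


def transfer_hours(size_gb, throughput_mb_s):
    """Hours needed to move size_gb at a steady throughput."""
    return size_gb * 1000 / throughput_mb_s / 3600


def catch_up_hours(backlog_gb, throughput_mb_s, change_rate_mb_s):
    """Hours for a replica to drain a backlog while the source keeps changing.

    If the source changes at least as fast as we can replicate,
    the replica never catches up.
    """
    if throughput_mb_s <= change_rate_mb_s:
        return float("inf")
    return backlog_gb * 1000 / (throughput_mb_s - change_rate_mb_s) / 3600


# Figures from the slides: 320 GB and a 10-hour window.
needed = required_throughput_mb_s(320, 10)
print(f"Sustained rate needed: {needed:.1f} MB/s")
```

The same functions make the effect of data growth visible: doubling the volume doubles the sustained rate required for the same window.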
Challenges in Replication: availability and performance; data size; change rate; database; compute; network; storage.
Replication Design Options in AWS: flexibility; options; the right tool for the right job.
Common Data Replication Scenarios: hybrid IT; database migration; HA databases; increase throughput; cross regions; data warehousing.
Please Meet Bob: DBA for a large enterprise company; 10 years of IT experience; "What is AWS?"
Disaster Recovery. Bob, DBA; Sue, DBA. "I can't find archlog_002 file!!!!!!"
We Need a Better. MySQL; SQL Server; Oracle. Daily; 5 - 6 hours. RTO is 8 hours; RPO is 1 hour. Bob, DBA.
Demo: AWS S3 upload. Corporate data center; generic database; DB full backup; Amazon S3 bucket.
Think Parallel: 2 seconds; multipart.
Think Parallel: 2 seconds + 2 seconds + 2 seconds + 2 seconds = 8 seconds.
foreach ($file in $files) { Write-S3Object -BucketName mybucket -Key $ }
Think Parallel: nearly 3 days.
Think Parallel: twelve uploads of 2 seconds each, running side by side. 120,000 files @ 15,000 TPS = 8 seconds. Multiple machines, multiple threads, multiple parts.
Demo: AWS S3 multipart upload. Corporate data center; generic database; DB full backup; Amazon S3 bucket.
Think Parallel. Use multipart upload (API/SDK or command line tools); minimum part size is 5 MB. Use multiple threads: GNU parallel (parallel -j0 -N2 --progress /usr/bin/s3cmd put {1} {2}), Python multiprocessing, .NET parallel extensions, etc. Use multiple machines. Limited by host CPU / memory / network / I/O.
Database Replication Options. Bob, DBA; Tom, Sys Admin: "How are you going to replicate databases?"
Database Replication Options in AWS: MySQL; SQL Server; Oracle. Amazon RDS replication. Bob's office.
Non-RDS to RDS Database Replication (MySQL in the corporate data center in Bob's office; Amazon S3 bucket; Amazon RDS in Availability Zone A). 1: configure to be a master. 2: dump (mysqldump). 3: AWS S3 cp. 4: initialize (mysqldump). 5: run. 6: run.
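The "Think Parallel" figures earlier in the deck (2 seconds per upload, 120,000 files, 15,000 TPS, 5 MB minimum part size) can be checked with a few lines of arithmetic, and the "use multiple threads" advice can be sketched with a thread pool. This is a minimal sketch, not the presenter's demo: upload_one is a hypothetical stand-in that only sleeps to simulate network latency, and it uses threads from concurrent.futures rather than the multiprocessing module the slide names, on the assumption that uploads are I/O bound.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor

FILES = 120_000
SECONDS_PER_UPLOAD = 2

# One file at a time: 240,000 seconds, the "nearly 3 days" on the slide.
sequential_days = FILES * SECONDS_PER_UPLOAD / 86_400
# At 15,000 requests per second the same files take 8 seconds.
parallel_seconds = FILES / 15_000


def part_count(size_bytes, part_size=5 * 1024 * 1024):
    """Number of multipart parts for an object, given a part size in bytes."""
    return max(1, math.ceil(size_bytes / part_size))


def upload_one(name, delay=0.02):
    """Hypothetical stand-in for a real S3 PUT; sleeps to simulate latency."""
    time.sleep(delay)
    return name


def upload_all(names, workers):
    """Upload every name using a pool of worker threads; keeps input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(upload_one, names))


names = [f"backup-{i:03d}.bak" for i in range(8)]
for workers in (1, 8):
    start = time.perf_counter()
    upload_all(names, workers)
    print(f"{workers} worker(s): {time.perf_counter() - start:.2f} s")
```

With a real client, upload_one would call the SDK's put or multipart API instead of sleeping. As the slide notes, the achievable speedup is then limited by host CPU, memory, network, and I/O.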
Log shipping (SQL Server): Amazon S3 bucket; Amazon RDS; SQL Server; restore; Bob's office.
Database Replication Options (MySQL; SQL Server; Oracle): Oracle; OSB; OSB Cloud Module; Amazon S3 bucket; RMAN restore; Amazon RDS; SQL Server; Bob's office.
Database Replication Options. Amazon RDS: MySQL replication. SQL Server and Oracle on EC2: SQL Server log shipping, AlwaysOn, mirroring, etc.; Oracle RMAN/OSB, Active Data Guard, etc. Bob, DBA.
HA Database Replication Options. Bob, DBA; Katie, Director: "We need a highly available solution."
HA DB Replication Options: Oracle; SQL Server. Availability Zone A; Availability Zone B. SQL Server: Amazon RDS DB instance with standby (Multi-AZ). Oracle: standby with Data Guard.
Data Guard Configuration.
Prepare primary database: logging; standby redo logs; Data Guard parameters to and.
Prepare standby database environment: or clone the Oracle home; password file (orapwdSID) from primary database; Data Guard parameters to and.
Create standby database using RMAN: target database for standby.
Configure Data Guard broker: database parameters on primary and standby database; Data Guard configuration for primary and standby using dgmgrl; StaticConnectIdentifier for primary and standby; Data Guard configuration; configuration should return success.
HA DB Replication Options: Oracle; SQL Server. Availability Zone A; Availability Zone B. Amazon RDS DB instance with standby (Multi-AZ); physical synchronous replication. Amazon RDS MySQL Multi-AZ; Amazon RDS Oracle Multi-AZ.
Increase Throughput. Bob, DBA Manager; Hannah, Finance: "The order system is running slowly."
"Disk I/O?"
Increase Throughput Options: Amazon EC2 instance type; Amazon RDS MySQL; PIOPS; Amazon RDS MySQL; read replicas; Amazon RDS MySQL. Bob, DBA Manager.
Amazon RDS Performance Options: Amazon RDS DB instance in Availability Zone A; read replica in Availability Zone B; Provisioned IOPS; asynchronous.
Diagram (us-east-1): Availability Zones A, B, and C; web servers; AS; Provisioned IOPS; SQL Server/Oracle; standby SQL Server/Oracle.
Increase Throughput Options: Amazon CloudFront for large objects. Bob, DBA Manager.
Diagram (us-east-1): Availability Zones A, B, and C; web servers; AS; CloudFront; SQL Server/Oracle; standby SQL Server/Oracle; logs.
Increase Throughput Options: Amazon DynamoDB for sessions and orders. Bob, DBA Manager.
Diagram (us-east-1): Availability Zones A, B, and C; web servers; AS; automatic replication; CloudFront; SQL Server/Oracle; standby SQL Server/Oracle; Amazon DynamoDB; logs.
Cross-region Replication Options. Bob, Architect; Bella.