Summary
In this chapter, we’ve reviewed several essential elements of operations management for
cloud data centers. We discussed the importance of monitoring system and component
performance and performing routine maintenance (including patching) as well as certain
risks and benefits associated with each. Issues related to environmental conditions such as
temperature, humidity, and backup power supply were included. We also detailed specific
approaches and methods for BC/DR planning and testing.
Exam Essentials
Understand the role of business continuity and disaster recovery programs in the cloud. Business continuity efforts are concerned with maintaining critical operations during any interruption in service, whereas disaster recovery efforts are focused on the resumption of operations after an interruption due to disaster. The two are related and in many organizations are rolled into one effort, often named continuity management.
Explain key business continuity terms, including RTO, RPO, and RSL. The recovery time objective (RTO) is the goal for recovery of operational capability after an interruption in service, measured in time. The recovery point objective (RPO) is the goal for limiting the loss of data from an unplanned event, and it is also measured in time. The recovery service level (RSL) is the proportion of a service, expressed as a percentage, that is necessary for continued operations during a disaster.
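The three terms can be made concrete with a small worked check. The following is a minimal sketch, not a prescribed method: the function name `meets_objectives`, its parameters, and the sample timestamps and targets are all hypothetical, chosen only to show that RTO and RPO are compared as durations while RSL is compared as a percentage.

```python
from datetime import datetime, timedelta

def meets_objectives(outage_start, service_restored, last_good_backup,
                     capacity_during_outage, rto, rpo, rsl_percent):
    """Compare one (hypothetical) incident against RTO, RPO, and RSL targets."""
    # RTO: how long operations were down, measured in time.
    downtime = service_restored - outage_start
    # RPO: how much data (expressed as time since the last good copy) was lost.
    data_loss_window = outage_start - last_good_backup
    return {
        "rto_met": downtime <= rto,
        "rpo_met": data_loss_window <= rpo,
        # RSL: percentage of normal service that was available during the event.
        "rsl_met": capacity_during_outage >= rsl_percent,
    }

# Illustrative values only (assumptions, not taken from the chapter).
result = meets_objectives(
    outage_start=datetime(2024, 1, 1, 12, 0),
    service_restored=datetime(2024, 1, 1, 15, 0),   # 3 hours of downtime
    last_good_backup=datetime(2024, 1, 1, 11, 30),  # 30 minutes of data at risk
    capacity_during_outage=60.0,                    # percent of normal service
    rto=timedelta(hours=4),
    rpo=timedelta(minutes=15),
    rsl_percent=50.0,
)
print(result)
```

In this made-up incident, the three-hour outage fits within the four-hour RTO and the 60 percent capacity exceeds the 50 percent RSL, but the 30 minutes of unprotected data exceeds the 15-minute RPO.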
Explain the importance of security hygiene practices, including patching and baselining. Applying security patches, either manually or automatically, protects the organization from newly emerging security vulnerabilities. Baselining provides a standard, secure configuration from which systems may be built. Organizations should monitor systems and applications for deviations from security baselines, and any deviation found should be investigated and documented.
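Monitoring for baseline deviations amounts to comparing a system's observed settings with the approved ones. The sketch below illustrates the idea under simplifying assumptions: settings are modeled as flat key/value pairs, and the function `find_deviations`, the setting names, and the values are hypothetical rather than drawn from any particular tool.

```python
def find_deviations(baseline, observed):
    """Return settings whose observed value differs from the approved baseline."""
    deviations = {}
    for key, expected in baseline.items():
        # A setting absent from the observed system also counts as a deviation.
        actual = observed.get(key, "<missing>")
        if actual != expected:
            deviations[key] = (expected, actual)
    return deviations

# Hypothetical approved baseline and one system's observed configuration.
baseline = {
    "ssh_root_login": "no",
    "password_min_length": 14,
    "firewall_enabled": True,
}
observed = {
    "ssh_root_login": "yes",
    "password_min_length": 14,
}

drift = find_deviations(baseline, observed)
for setting, (expected, actual) in drift.items():
    print(f"{setting}: expected {expected!r}, found {actual!r}")
```

Each reported entry is a candidate for investigation: it may be an unauthorized change, or an approved exception that needs to be documented.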
Explain the standard processes used for IT service management in an organization. Organizations often adopt standards for IT service management, including ITIL and ISO/IEC 20000-1. The processes covered in service management programs include change management, continuity management, information security management, continual service improvement management, incident management, problem management, release management, deployment management, configuration management, service-level management, and availability management.
Understand the role of change and configuration management. Change management is the process used to review, approve, and document any modifications to the environment. Configuration management entails documenting the approved settings for systems and software, which helps establish baselines within the organization.