6.6. Storage mirroring

Data mirroring

Data mirroring is the copying of data from one location to storage in another location in real time. Because the data is copied in real time, the storage in the secondary location always holds an exact copy of the primary storage. Data mirroring is primarily used in disaster recovery applications.

Levels of Data Mirroring

Data Mirroring can be achieved at different levels.

  • Operating System layer - e.g. LVM mirroring, clustered file systems (GPFS replication).
  • Storage layer (Peer to Peer Remote Copy) - e.g. Metro Mirror, Global Mirror.
  • Application layer - e.g. database mirroring.

Operating System layer

LVM mirroring

An LVM mirror is conceptually a RAID 1 in which the mirrored volumes are located in two different locations, with a redundant LVM controller duplicated in both locations. The mirroring applies only to logical volumes. If one of the locations is unavailable, one half of the mirror is still available in the second location.

Advantages

  • Both mirrors are used for read/write
  • LVM reads from the local (preferred) copy, so reads incur no performance degradation
  • LVM writes to both copies in parallel, so the impact on write performance is limited
  • No outage in case of storage failure; the failure is handled by LVM

Disadvantages

  • Limited to specific operating systems (Linux, AIX)
  • Distance limited by latency / bandwidth
  • Short outage in case of node failure
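
The read and write behavior listed above can be illustrated with a small Python sketch. This is a toy model, not LVM code: the MirroredVolume class, its method names and the dictionary-based "disks" are all assumptions made for illustration.

```python
class MirroredVolume:
    """Toy RAID 1 style mirror with one copy per site (hypothetical model)."""

    def __init__(self):
        # Each copy maps block number -> data; "local" is the preferred read copy.
        self.copies = {"local": {}, "remote": {}}
        self.failed = set()

    def write(self, block, data):
        # A write goes to every copy that is still available.
        targets = [name for name in self.copies if name not in self.failed]
        if not targets:
            raise IOError("no mirror copy available")
        for name in targets:
            self.copies[name][block] = data

    def read(self, block):
        # Reads prefer the local copy and fall back to the remote one.
        for name in ("local", "remote"):
            if name not in self.failed:
                return self.copies[name][block]
        raise IOError("no mirror copy available")

    def fail(self, name):
        # Mark one copy as unavailable (storage or site failure).
        self.failed.add(name)


vol = MirroredVolume()
vol.write(0, b"payroll")
vol.fail("local")      # storage failure at the local site
print(vol.read(0))     # the block is still served from the remote copy
```

A real mirror would also resynchronize the failed copy once it returns; that step is left out here.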

Clustered file system (GPFS replication)

A clustered file system is made up of two or more distinct, geographically separate hardware sites operating in a coordinated fashion. Two or more of the sites consist of storage nodes and storage resources holding a complete replica of the file system. By maintaining a replica of the file system's data at a geographically separate location, the system sustains its processing capabilities by using the secondary replica of the file system in the event of a total failure in the primary environment.

Advantages

  • Storage failures are managed automatically by GPFS, with no database outage
  • No outage in case of storage or site failure

Disadvantages

  • At least three sites are required

Storage layer - general description

Peer to Peer Remote Copy

Peer to Peer Remote Copy is a hardware solution that replicates storage between two storage devices in remote locations to ensure business continuity. The data on the storage device in the primary site is synchronized with the storage device in the secondary site automatically, at the storage device level.

  • Metro Mirror: synchronous mirroring, meaning that each update to the source storage unit must also be applied to the target storage unit before another update can be processed.
  • Global Copy: a non-synchronous long-distance copy option whereby write operations to a storage unit at your production site are considered complete before they are transmitted to a storage unit at your recovery site.
  • Global Mirror: a long-distance remote copy solution across two sites using asynchronous technology.

Storage layer - Metro Mirror

Metro Mirror functionality

Metro Mirror functions offer a synchronous long-distance copy option that constantly updates a secondary copy of a volume to match changes made to a source volume. Synchronous mirroring means that each update to the source storage unit must also be applied to the target storage unit before another update can be processed.

When Metro Mirror receives a host update to the source volume, it completes the corresponding update to the target volume. Data consistency is guaranteed because the host application receives the write-completion status only after the update has been committed to the target storage unit and acknowledged by both the source and target storage units. This results in near-perfect data consistency but can introduce lag time between transactions.

With Metro Mirror, consistency is guaranteed across all volumes on which an application does write operations. When error conditions affect some of the volume pairs (or different volume pairs at different times), this consistency might be lost. Metro Mirror copying supports a maximum distance of 300 km (186 mi). Delays in response times for Metro Mirror are proportional to the distance between the volumes. However, 100% of the source data is available at the recovery site when the copy operation ends.
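
To get a feel for how response time grows with distance, the following Python sketch estimates the minimum delay that a synchronous write picks up from signal propagation alone. It assumes a propagation speed of about 200,000 km/s in optical fiber (roughly two thirds of the speed of light in vacuum) and ignores switching, protocol and device processing time, so real delays are higher. The function name and constant are hypothetical.

```python
FIBER_KM_PER_MS = 200.0  # assumed propagation speed: ~200,000 km/s = 200 km/ms


def round_trip_ms(distance_km: float) -> float:
    """Minimum added delay per synchronous write: one trip out, one trip back."""
    return 2 * distance_km / FIBER_KM_PER_MS


for km in (10, 100, 300):
    print(f"{km:>4} km -> at least {round_trip_ms(km):.1f} ms per write")
```

Under these assumptions, a write at the 300 km limit waits at least 3 ms for the round trip, compared with 0.1 ms at 10 km.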

Metro Mirror operation

A copy to the target storage unit is synchronous with the source volume's I/O operation.

  1. An application requests a write I/O to the source storage unit. The write I/O is written into cache and nonvolatile storage (NVS).
  2. Metro Mirror sends the write I/O to the target storage unit cache and NVS.
  3. The storage unit at the recovery site signals that the write operation has completed when the updated data is in its cache and NVS.
  4. When the storage unit at the production site receives notification from the target storage unit that the write operation has completed, it returns the I/O completed status to your application.
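
The four steps above can be sketched in Python as follows. This is a simplified illustration, not the storage unit's actual implementation: the StorageUnit class and the metro_mirror_write function are hypothetical names, and a plain dictionary stands in for cache and NVS.

```python
class StorageUnit:
    """Toy storage unit; `cache` stands in for cache plus NVS."""

    def __init__(self, name):
        self.name = name
        self.cache = {}

    def store(self, track, data):
        self.cache[track] = data
        return True  # acknowledgement that the data is in cache/NVS


def metro_mirror_write(source, target, track, data):
    """Perform one synchronous write and return the ordered steps taken."""
    steps = []
    source.store(track, data)
    steps.append("1. write stored in source cache/NVS")
    steps.append("2. write sent to target storage unit")
    acked = target.store(track, data)
    if not acked:
        raise IOError("target did not acknowledge the write")
    steps.append("3. target signals that the write has completed")
    # Only now does the host see the write as complete.
    steps.append("4. I/O completed status returned to the application")
    return steps


production = StorageUnit("production")
recovery = StorageUnit("recovery")
for step in metro_mirror_write(production, recovery, track=7, data=b"order-42"):
    print(step)
```

The point of the sketch is the ordering: the completion status in step 4 is never returned before the target has acknowledged the data in step 3.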

Storage layer - Global Copy

Global Copy functionality

Global Copy functions offer a non-synchronous long-distance copy option whereby write operations to a storage unit at your production site are considered complete before they are transmitted to a storage unit at your recovery site. Global Copy is a non-synchronous mirroring function.

Host updates to the source volume are not delayed by waiting for the update to be confirmed by a storage unit at your recovery site. The source volume sends a periodic, incremental copy of updated tracks to the target volume instead of a constant stream of updates. There is no guarantee that dependent write operations are transferred in the same sequence that they have been applied to the source volume. This non-synchronous operation results in a fuzzy copy at the recovery site; however, through operational procedures, a point-in-time consistent copy can be created at the recovery site that is suitable for data migration, backup, and disaster recovery purposes.

The Global Copy function can operate at very long distances, well beyond the 300 km distance that is supported for Metro Mirror, and with minimal impact to applications; the distance is limited only by the network and the channel extension technology. During a disaster, data can be restored only to the last known consistent increment that was created. This means that data that has been written to the production site but is still waiting to be transferred to the recovery site is lost whenever the two storage units can no longer communicate.

Global Copy operation

The following describes the Global Copy write sequence:

  1. During a Global Copy operation, the storage unit at the production site captures information about updates to the source and periodically sends those updates to the target volume at the recovery site.
  2. After the initial copy of tracks, the storage unit periodically starts a synchronization cycle in which all updated tracks are copied from the source volume to the target volume in ascending order, starting from the lowest numbered track. The storage unit updates each target track with the current content of that track, regardless of how many updates occurred since the track was last copied and regardless of the order in which they occurred.
  3. When this process completes, the cycle is repeated. There is little response time degradation on application write operations in extended distance mode.
  4. Write updates to the source volume receive an immediate completion because the synchronization cycle is independent of the updates to the source volume.
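
The write sequence above can be sketched in Python as follows. This is a toy model under stated assumptions: the GlobalCopyPair class and its method names are hypothetical, and a set of out-of-sync track numbers stands in for whatever bookkeeping the storage unit really uses.

```python
class GlobalCopyPair:
    """Toy model of one source/target volume pair in Global Copy mode."""

    def __init__(self):
        self.source = {}
        self.target = {}
        self.out_of_sync = set()  # tracks updated since they were last copied

    def host_write(self, track, data):
        # The write completes immediately; nothing is sent to the target yet.
        self.source[track] = data
        self.out_of_sync.add(track)
        return "complete"

    def sync_cycle(self):
        # Copy every updated track in ascending track order. Only the
        # current content is sent, however many updates a track received.
        copied = 0
        for track in sorted(self.out_of_sync):
            self.target[track] = self.source[track]
            copied += 1
        self.out_of_sync.clear()
        return copied


pair = GlobalCopyPair()
pair.host_write(5, "v1")
pair.host_write(2, "v1")
pair.host_write(5, "v2")      # second update to track 5 before the cycle runs
print(pair.target)            # the target has received nothing yet
print(pair.sync_cycle())      # number of tracks copied in this cycle
print(pair.target)            # the target now holds the latest content only
```

Because track 5 is sent once with its latest content, the intermediate value "v1" never reaches the target. This is why the target is a fuzzy copy rather than a replay of the writes in their original order.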

FlashCopy

A FlashCopy is a point-in-time, full volume copy of data, with the copy immediately available for read or write access. It creates a copy of a source volume on the target volume. This copy is called a point-in-time copy.
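
The text does not say how the copy becomes available immediately. One common technique for point-in-time copies is copy-on-write, sketched below in Python purely as an illustration of the concept; it is an assumption, not a description of the FlashCopy implementation. The PointInTimeCopy class and its methods are hypothetical, and the sketch covers only reads of the copy.

```python
class PointInTimeCopy:
    """Copy-on-write view of a source volume, frozen at creation time."""

    def __init__(self, source):
        self.source = source        # live source volume (dict: track -> data)
        self.tracks = set(source)   # tracks that existed at copy time
        self.saved = {}             # old content of tracks changed since then

    def write_source(self, track, data):
        # Preserve the point-in-time content before the first overwrite.
        if track in self.tracks and track not in self.saved:
            self.saved[track] = self.source[track]
        self.source[track] = data

    def read_copy(self, track):
        if track not in self.tracks:
            raise KeyError(track)
        if track in self.saved:
            return self.saved[track]
        return self.source[track]   # unchanged since the copy was taken


volume = {0: "a", 1: "b"}
snapshot = PointInTimeCopy(volume)   # available at once, no data moved yet
snapshot.write_source(0, "A")
print(volume[0])                     # the source sees the new content
print(snapshot.read_copy(0))         # the copy still sees the old content
```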

Storage layer - Global Mirror

Global Mirror functionality

Global Mirror processing provides a long-distance remote copy solution across two sites using asynchronous technology. Global Mirror processing is most often associated with disaster recovery or preparing for disaster recovery. However, it can also be used for everyday processing and data migration.

The Global Mirror function is designed to mirror data between volume pairs of a storage unit over greater distances without affecting overall performance. It is also designed to provide application-consistent data at a recovery (or remote) site in case of a disaster at the local site. By creating a set of remote volumes every few seconds, this function addresses the consistency problem that can arise when large databases and volumes span multiple storage units. With Global Mirror, the data at the remote site is maintained as a point-in-time consistent copy of the data at the local site.

Global Mirror is based on existing Copy Services functions: Global Copy and FlashCopy. Global Mirror operations invoke a point-in-time FlashCopy at the recovery site at regular intervals, without disrupting the I/O to the source volume, thus giving a continuously updated, nearly up-to-date data backup. By grouping many volumes into a session, which is managed by the master storage unit, Global Mirror can copy multiple volumes to the recovery site simultaneously while maintaining point-in-time consistency across those volumes.

Global Mirror benefits

  • Support for virtually unlimited distances between the local and remote sites, with the distance typically limited only by the capabilities of your network and the channel extension technology.
  • A consistent and restartable copy of the data at the remote site, created with minimal impact to applications at your local site.
  • Data currency, where your remote site might lag behind your local site by 3 to 5 seconds, minimizing the amount of data exposure in the event of an unplanned outage.
  • Session support whereby data consistency at the remote site is internally managed across up to eight storage units that are located across the local and remote sites.
  • Efficient synchronization of the local and remote sites with fail-over and fail-back modes, helping to reduce the time to switch back to the local site after an outage.

Global Mirror automatic cycle

  1. Consistency groups of volumes are created at the local site.
  2. Increments of consistent data are sent to the remote site.
  3. FlashCopy operations are performed at the remote site.
  4. Global Copy operations are resumed between the local and remote sites to copy out-of-sync tracks.
  5. The steps are repeated according to the defined time intervals.
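
The automatic cycle above can be sketched in Python as follows. This is a deliberately simplified model: the GlobalMirrorSession class, its attributes and the use of in-memory dictionaries are assumptions for illustration, and the FlashCopy step is reduced to a plain deep copy.

```python
import copy


class GlobalMirrorSession:
    """Toy model of one Global Mirror session spanning several volumes."""

    def __init__(self, volume_names):
        self.local = {n: {} for n in volume_names}       # production volumes
        self.remote = {n: {} for n in volume_names}      # Global Copy targets
        self.consistent = {n: {} for n in volume_names}  # FlashCopy targets
        self.pending = []  # (volume, track) updates not yet sent

    def host_write(self, volume, track, data):
        self.local[volume][track] = data
        self.pending.append((volume, track))

    def run_cycle(self):
        # 1. Form a consistency group: freeze one set of updates across
        #    all volumes of the session at the same instant.
        group = {(v, t): self.local[v][t] for v, t in self.pending}
        self.pending.clear()
        # 2. Send the increment of consistent data to the remote site.
        for (v, t), data in group.items():
            self.remote[v][t] = data
        # 3. FlashCopy at the remote site: save the consistent state.
        self.consistent = copy.deepcopy(self.remote)
        # 4. Global Copy resumes: later writes accumulate in self.pending
        #    until the next cycle (step 5 is simply calling run_cycle again).
        return len(group)


session = GlobalMirrorSession(["db", "log"])
session.host_write("log", 0, "begin tx")
session.host_write("db", 3, "row update")
print(session.consistent)    # nothing is recoverable before the first cycle
session.run_cycle()
print(session.consistent)    # both volumes reflect the same point in time
```

The recovery site restarts from the consistent copy, so it either sees both the log entry and the row update or neither of them, never only one.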




The project "Cloud Computing: new technologies in the teaching offer of Politechnika Wrocławska" (UDA.POKL.04.03.00-00-135/12) is carried out under the Human Capital Operational Programme, Priority IV: Higher education and science, Measure 4.3: Strengthening the teaching potential of universities in areas of key importance to the goals of the Europe 2020 Strategy, and is co-financed by the European Social Fund and the state budget.