Chapter 87: Overviewing Windows NT/Database Backup and Recovery
Introduction
Possibly the single most important aspect of the basis implementation is the establishment of an effective backup strategy. You must be able to restore the database in the event of any type of error. The strategy you implement for backups should be manageable. For example, it should be clear to operators what tapes should be mounted and when, so that mistakes are unlikely. The chosen backup strategy should not have an adverse impact on daily business.

SAP provides two easy-to-use interfaces for performing a backup: SAPDBA on the operating system level, and CCMS-DBA on the R/3 level. ORACLE also offers two types of backup: online and offline.

Backups can be performed at any of the following levels:
- database: full backup, all data files are saved
- tablespace: partial backup, data files of one or more tablespaces are saved
- data file: one or more data files are saved
1996 SAP Technology, Inc.
Our recommendation: Whenever downtime has to be minimized, use online backups (only offline backups cause system downtime).
Recommendations
- Determine whether offline or online backups or both will be used
- Back up at the tablespace level
- Increase frequency of backups
- Increase backup frequency of heavily used tablespaces
- Do backups in parallel
- Adjust backup frequency by carrying out tests
- Skip index tablespaces to reduce backup downtime
- Back up to disk first
- Backups with mirrored disks can be made offline
- Use hardware compression with the backup
- Assess your tape devices
- Verify that your backup tapes are readable

Determine whether offline or online backups or both will be used
You should carefully consider whether you need to use offline or online backup. Use online backup if downtime cannot be tolerated or if your database is very large and would take too long to back up offline.

Back up at the tablespace level
If you have decided that a complete backup takes too long, use backups at the tablespace level. For example, if the database consists of 20 tablespaces, back up 4 every night Monday through Friday (which is equivalent to one full backup a week). Here, the worst case would be that a tablespace damaged on Sunday would have to be recovered from last Monday's backup.

Increase the frequency of backups
More frequent backups lead to a shorter recovery time and therefore shorter downtime. One compromise would be to make less frequent full backups (for example, one full backup on Sunday) and more frequent partial backups (for example, back up one third of all tablespaces on Monday, another third on Tuesday, the last third on Wednesday, and start over for Thursday, Friday, Saturday). Here, the worst case would be to have to use a backup which is 3 days old.

Increase backup frequency of heavily used tablespaces
It is wise to back up heavily used tablespaces more often (for example, by including them twice in a backup cycle). Heavily used tablespaces put more load on the disk drives, which increases the frequency with which those drives fail. The more often such tablespaces are backed up, the more likely it is that a recent backup is available for a recovery.
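The tablespace-level rotation described above (20 tablespaces, 4 per night, Monday through Friday) can be sketched as follows. This is only an illustration of the scheduling idea: the function name `rotation_schedule` and the tablespace names TS01 to TS20 are hypothetical and do not come from SAPDBA or BRBACKUP.

```python
from typing import Dict, List


def rotation_schedule(tablespaces: List[str], nights: List[str]) -> Dict[str, List[str]]:
    """Spread tablespaces over the given nights in contiguous, near-equal groups."""
    if not nights:
        raise ValueError("at least one backup night is required")
    base, extra = divmod(len(tablespaces), len(nights))
    schedule: Dict[str, List[str]] = {}
    start = 0
    for i, night in enumerate(nights):
        # The first `extra` nights take one additional tablespace.
        size = base + (1 if i < extra else 0)
        schedule[night] = tablespaces[start:start + size]
        start += size
    return schedule


# Hypothetical tablespace names; a real list would come from the database.
tablespaces = [f"TS{n:02d}" for n in range(1, 21)]
plan = rotation_schedule(tablespaces, ["Mon", "Tue", "Wed", "Thu", "Fri"])
for night, group in plan.items():
    print(night, group)
```

A heavily used tablespace could be given a shorter cycle by simply listing it in more than one group.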
Do backups in parallel
Offline and online backups can be done in parallel to increase throughput. For example, if partial online backups of tablespaces are done (see example above), you can also schedule the backups of several groups of tablespaces to run in parallel, utilizing multiple devices. BRBACKUP supports parallel backups.

Adjust backup frequency by carrying out tests
There is no rule of thumb to determine the backup frequency. Suppose, for example, that a test showed that applying 3 redo log files to a restored full backup took 15 minutes (that is, 5 minutes recovery time per archived redo log file). Assuming that 20 redo log files are archived on average per day, a recovery from a 3-day-old full backup would take 3 * 20 * 5 minutes = 300 minutes (or five hours). If, as in this example, a recovery time of five hours is too long, more frequent backups can be taken (or other techniques such as mirrored disks can be used).

Skip index tablespaces to reduce backup downtime
To improve backup performance, consider omitting index tablespaces from your backups completely. An index tablespace contains only indexes, which can always be rebuilt if required. If index tablespaces are not backed up, the Reorganize tablespace and data files function of SAPDBA should be used to generate the scripts needed to create the tablespaces and indexes if they have to be rebuilt (this function can be stopped after the scripts have been generated, before the actual reorganization takes place). The scripts should be recreated every time that index structures or index tablespace structures have changed.
You should monitor to make sure that no tables are accidentally residing in an index tablespace. If a table resides in an index tablespace, it should be reported by SAPDBA during the creation of the reorganization scripts. If a table has been 'misplaced', you should move it back to the appropriate data tablespace. The recommendation to omit index tablespaces from backups is only valid if no tables reside in these tablespaces.

Back up to disk first
If sufficient disk space is available, consider backing up to disk first. A disk backup is usually faster than a tape backup because disk devices are generally faster. You can then copy your backups from disk to tape without incurring downtime. If possible, retain the disk backup copy, since a restore from disk is faster than a restore from tape. Note that this assumes that the disks are not mirrored: with mirrored disks other options are available (see the next recommendation).

Backups with mirrored disks can be made offline
With mirrored disks, online backups are no different from backups of non-mirrored disks.
If you wish to do offline backups in a mirrored disk environment, we recommend the following approach: First, shut down the database and split the disk mirror. Then restart the database on just one disk set. Finally, mount the second disk set onto another system on which you will perform the offline backup. If the database is backed up this way, it is not recommended to use BRBACKUP, because the BRBACKUP log information will be incomplete on the original system. After the backup is done, the two disk sets have to be re-synchronized.
The downtime incurred with this method is the time taken to shut down the DBMS, split the mirror and then restart the DBMS. Re-synchronizing the two disk sets may take a substantial amount of time, during which performance can be severely degraded. This method also has the disadvantage that the database is not mirrored during the backup. If it is considered a high priority to have disk mirroring all the time (for protection against disk failures), this technique can be safely deployed if three-way mirroring is used.

Use hardware compression with backups
You should consider using hardware compression with backups. It cuts down backup time by as much as 50%. If the backup must go to tape directly and be done online, consider using multiple tape drives for parallel backups to shorten backup time.
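The recovery-time arithmetic from the recommendation "Adjust backup frequency by carrying out tests" above can be written down as a small calculation. The sketch below assumes, as the example in the text does, that the time to apply redo logs grows linearly with the number of archived log files; it does not include the time needed to restore the data files themselves. All function names are hypothetical.

```python
def minutes_per_log(test_logs: int, test_minutes: float) -> float:
    """Average time to apply one archived redo log file, measured in a test recovery."""
    if test_logs <= 0:
        raise ValueError("the test must apply at least one redo log file")
    return test_minutes / test_logs


def estimated_recovery_minutes(backup_age_days: float, logs_per_day: float,
                               per_log_minutes: float) -> float:
    """Linear estimate: every log archived since the backup must be applied."""
    return backup_age_days * logs_per_day * per_log_minutes


def max_backup_age_days(target_minutes: float, logs_per_day: float,
                        per_log_minutes: float) -> float:
    """Oldest backup that still meets a target redo-apply time."""
    return target_minutes / (logs_per_day * per_log_minutes)


# Figures from the example in the text: 3 logs in 15 minutes, 20 logs per day.
per_log = minutes_per_log(3, 15)
print(estimated_recovery_minutes(3, 20, per_log))  # 300.0 minutes, i.e. five hours
# Hypothetical target of two hours for applying redo logs.
print(max_backup_age_days(120, 20, per_log))
```

Inverting the estimate in this way gives a starting point for the backup interval, which should then be confirmed by an actual test recovery.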
Assess your tape devices
It pays to think about what kind of tape devices you are using for backing up your database, since this will determine the downtime in the case of offline backups. To give you some idea of this, the capacity of tapes currently ranges from 2 to 30 GB and the speed of data transfer ranges from 1 to 10 GB per hour.

Verify that your backup tapes are readable
You should verify that your backups can indeed be successfully used for a restore. Use a separate system from the live production system. To carry out a test, you may want to restore the data files of the system tablespace, rollback segment tablespace, temporary tablespace and a tablespace of your choice, plus control files, online redo log files and archived redo log files. Mount the database, take offline all data files that were not restored, recover the database and then open it. If successful, this is proof that the restored files can be used.
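To see how the tape figures quoted above translate into an offline backup window, a rough estimate can be computed as follows. This is a sketch under simplifying assumptions: a constant transfer rate, no compression, and a hypothetical `parallel_efficiency` factor (1.0 means each additional drive adds its full rate, which real hardware will not reach). The 60 GB database size is an illustrative value, not a figure from this chapter's systems.

```python
import math


def backup_hours(db_size_gb: float, rate_gb_per_hour: float,
                 drives: int = 1, parallel_efficiency: float = 1.0) -> float:
    """Estimated duration of a backup written to one or more tape drives."""
    if rate_gb_per_hour <= 0 or drives < 1 or not 0 < parallel_efficiency <= 1:
        raise ValueError("invalid rate, drive count or efficiency")
    # Each drive beyond the first contributes only a fraction of its rate.
    effective_rate = rate_gb_per_hour * (1 + (drives - 1) * parallel_efficiency)
    return db_size_gb / effective_rate


def tapes_needed(db_size_gb: float, tape_capacity_gb: float) -> int:
    """Number of tapes required, rounded up to whole tapes."""
    return math.ceil(db_size_gb / tape_capacity_gb)


# Hypothetical 60 GB database at the fast and slow ends of the quoted range.
print(backup_hours(60, 10), tapes_needed(60, 30))  # 6.0 hours on 2 tapes
print(backup_hours(60, 1), tapes_needed(60, 2))    # 60.0 hours on 30 tapes
```

The spread between the two results shows why the choice of tape device largely decides whether an offline backup fits into the available window.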
Archive Backup
You should regularly (weekly, for example) perform a full backup of the offline archived log files. These files are crucial to the recoverability of the database should a database failure occur. Because these files are so important, SAP recommends that two copies of the archive logs be saved to tape. The archive utility (called BRARCHIVE) supports this process.

If you have configured your system according to the SAP recommendations, the ORACLE system automatically saves the online redo log files (where the changes made to the database are recorded) as offline redo log files. If you have not changed the standard profile init<DBSID>.ora, the offline redo log files are contained in the archive directory <ORACLE_HOME>\saparch.

To perform a recovery of the database, the redo log entries must be available in their original, uninterrupted sequence. Therefore, it is essential that you protect the offline redo log files written by the ORACLE system against loss. Moreover, a database shutdown may occur when no more space is available for new offline redo log files in directory <ORACLE_HOME>\saparch (archive stuck). For this reason, you should regularly archive the offline redo log files to a volume (tape). The SAP utility BRARCHIVE is provided for this purpose.

When you select Backup archive logs, SAPDBA calls the SAP program BRARCHIVE. You can, of course, archive the offline redo log files directly by calling BRARCHIVE. You can also use the Computing Center Management System of R/3 to plan an archive of the offline redo log files, start it, and then view its log. See the online help for the Computing Center Management System.
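Because a full archive directory stops the database (archive stuck), it can help to watch the free space in that directory between BRARCHIVE runs. The following is a minimal sketch of such a check and is not part of SAPDBA or BRARCHIVE. The function names and the 50% warning threshold are assumptions chosen for illustration; the directory path must be supplied by the caller.

```python
import shutil


def archive_dir_status(free_bytes: int, total_bytes: int,
                       warn_free_fraction: float = 0.5) -> str:
    """Classify the archive directory by the share of free space that remains."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    free_fraction = free_bytes / total_bytes
    # Below the threshold, offline redo logs should be archived to tape soon.
    return "ARCHIVE_NOW" if free_fraction < warn_free_fraction else "OK"


def check_archive_dir(path: str, warn_free_fraction: float = 0.5) -> str:
    """Read the file system usage for `path` and classify it."""
    usage = shutil.disk_usage(path)
    return archive_dir_status(usage.free, usage.total, warn_free_fraction)


# Example with made-up numbers: 10 GB free out of 100 GB.
print(archive_dir_status(10 * 1024**3, 100 * 1024**3))  # ARCHIVE_NOW
```

A check like this only warns; the actual archiving of the offline redo log files is still done with BRARCHIVE.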
Restoring with SAPDBA: Check (and Repair) Database
The menu item Check (and repair) database only enables recovery of the database up to the current time. If you want to restore an older version of the database or perform a point in time recovery, please use the Restore/Recovery function.
Requirements
The SAPDBA recovery function works closely with the BRBACKUP and BRARCHIVE programs. In the restore phase, the copies of the database files and archive logs created by these programs are restored by BRRESTORE. The BRBACKUP and BRARCHIVE logs contain information on the directories or data media in which these files are stored. In addition, if you used a non-SAP backup utility (connection via BRBACKUP/BRARCHIVE), SAPDBA is also provided with the backups required for a recovery.

The following requirements must be met to run an SAPDBA-assisted database recovery successfully after media or user errors:
- The complete, undamaged BRBACKUP and BRARCHIVE logs must be available for SAPDBA to determine the location of the database files and archived redo log files. (Always make sure that a sufficiently large number of logs are available, i.e., do not delete the logs too soon.)
- Undamaged copies of the lost database files must be available.
- Undamaged copies of all redo logs that were written between the point when the database copy used for the recovery was created and the point when the media error occurred must be available.
- The control file and all copies of this file must be available in undamaged form. If this is not the case, you will have to perform the actions necessary to bring about the required situation.
- At least one member of each online redo log group must be available in undamaged form. Otherwise, you must first perform the corresponding actions to rebuild the necessary redo logs.

SAPDBA determines whether any copies of the online redo log files are missing or damaged. If so, you receive a corresponding warning message, but can still execute the recovery. It is important to deal with this at an appropriate time. Refer to the ORACLE documentation.

If these requirements are met, recovery can often be executed automatically using the SAPDBA recovery function.
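The requirement that all redo logs written since the backup be available amounts to checking for gaps in the log sequence numbers. The sketch below illustrates that check only. It assumes the available sequence numbers have already been collected as integers (for example, from a list of archived files); how they are obtained is outside its scope, and the function name is hypothetical. SAPDBA performs its own checks using the BRBACKUP and BRARCHIVE logs.

```python
from typing import Iterable, List


def missing_sequences(available: Iterable[int], first_needed: int,
                      last_needed: int) -> List[int]:
    """Return the log sequence numbers in [first_needed, last_needed] that are absent."""
    have = set(available)
    return [seq for seq in range(first_needed, last_needed + 1) if seq not in have]


# Made-up sequence numbers: log 103 was lost, so recovery would stop after 102.
gaps = missing_sequences([101, 102, 104, 105], first_needed=101, last_needed=105)
print(gaps)  # [103]
```

An empty result means the sequence is uninterrupted over the requested range; any gap limits recovery to the point just before the first missing log.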
The recovery option should only be used by an experienced database administrator on his or her own responsibility. Make sure that only the database administrator is authorized to use the expert mode (see SAPDBA: Expert mode). If in doubt, please contact an experienced coworker or SAP directly if your database requires recovery.
Any recovery options that lie outside the scope of the CHECK AND REPAIR scenario should be attempted only after careful checking and preparation.
Note that due to various system limits (system bus, CPU, controllers), you cannot expect perfect scalability from these devices (that is, you cannot expect 10 DLT drives to back up 60 GB/h). Hardware speed is certainly one area in which the customer should work with hardware vendors to obtain an optimal solution.
Defining Oracle Redo Log Management and Archives
- Do I have the media capacity for unattended backups?
- How much redo log volume is expected between backups?
- Do I have ample (at least double) disk capacity for handling this redo log volume in my archive directory?
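The last two questions can be answered with a simple sizing calculation. This sketch assumes that all redo log files have the same size and applies the "at least double" rule as a headroom factor of 2. The function name and the example figures (140 log files of 20 MB each) are illustrative assumptions, not values from this chapter.

```python
def archive_dir_capacity_mb(logs_between_backups: int, log_size_mb: float,
                            headroom_factor: float = 2.0) -> float:
    """Disk space to reserve in the archive directory, including headroom."""
    if logs_between_backups < 0 or log_size_mb <= 0:
        raise ValueError("log count must be >= 0 and log size must be positive")
    if headroom_factor < 1:
        raise ValueError("headroom factor must be at least 1")
    return logs_between_backups * log_size_mb * headroom_factor


# Hypothetical case: 20 logs per day, archived to tape once a week, 20 MB per log.
print(archive_dir_capacity_mb(7 * 20, 20.0))  # 5600.0 MB
```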
2. Volume labeling
A good strategy for volume labeling can provide you with increased security, especially if relatively untrained operators will be mounting tapes. With proper planning, for example, you can label your tapes with the day of the month (C11B24A, C11B24B, and C11B24C, for example, for the C11 Backup on the 24th of each month, volumes A, B and C). Such a strategy helps guarantee, for example, that the correct tapes are retrieved from off-site for the daily backup. (These considerations do not apply to users of backup management software, such as ADSM, Omniback, Legato and others.)

3. Off-site storage
Plan off-site storage for the case of fire or other major catastrophe in your computer room.
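The day-of-month labeling scheme from item 2 (C11B24A, C11B24B, C11B24C) can be generated mechanically, which helps keep printed tape labels consistent. The sketch below follows the pattern of the example: system ID, the letter B for backup, the day of the month, and a volume letter. Zero-padding of days below 10 and the limit of 26 volumes are assumptions, since the text only shows the 24th; the function name is hypothetical.

```python
import string
from typing import List


def volume_labels(sid: str, day: int, volumes: int) -> List[str]:
    """Build labels such as C11B24A for each volume of one day's backup."""
    if not 1 <= day <= 31:
        raise ValueError("day must be a day of the month (1-31)")
    if not 1 <= volumes <= 26:
        raise ValueError("this scheme supports volumes A through Z only")
    # Assumption: days are written with two digits (e.g. 05) so labels sort cleanly.
    return [f"{sid}B{day:02d}{string.ascii_uppercase[i]}" for i in range(volumes)]


print(volume_labels("C11", 24, 3))  # ['C11B24A', 'C11B24B', 'C11B24C']
```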
Backup and Archive Procedures and Policies
- Decide on how to schedule redo log archives
- Create a volume labeling scheme to ensure smooth operations
- Decide on backup retention period (recommended at least 28 days)
- Determine tape pool size
- Initialize tapes
- Determine physical tape storage strategy (off-site)
- Decide whether or not to use unattended operations
- If unattended operations, in CCMS or elsewhere?
- Document backup procedures in operations manual
- Train operators in backup procedures
- Implement a backup strategy
- Perform a test restore and recovery
Backup
System Environment
Software Components
The following tools are used to perform the backup/restore tasks:

System Name   Backup Software
DEV           SAPDBA 3.0E
QAS           SAPDBA 3.0E/HIBACK 3.02
PD1           SAPDBA 3.0E/HIBACK 3.02
PD2           SAPDBA 3.0E/HIBACK 3.02
Hardware Components
The hardware listed in the table below is used for backup and restore:

System Name   Backup Hardware
DEV           1 x 4mm DAT 4/8 GB DDS-2
QAS           1 x DLT 15/30 GB
PD1           1 x ADIC Autoloader (5 DLT parallel)
PD2           2 x DLT 15/30 GB
Policies
Tape Retention Period
The tape retention period is chosen in such a way that even if one tape (backup/archive) is damaged or lost, the ability to recover the database is assured.

System Name   Regular Backup   Month End Backup   Year End Backup   Archives
DEV           15 days          -                  -                 15 days
QAS           30 days          -                  -                 30 days
PD1           30 days          36 months          5 years           30 days
PD2           30 days          36 months          5 years           30 days
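Use schedules similar to the ones below in order to ensure that you will be able to quickly and easily restore the database.

Database Schedule

[Table: database backup schedule by weekday for the systems DEV, QAS, PD1 and PD2. Only the column labels Tue, Thu, Sat and Sun survive; the schedule entries are not recoverable.]

Archives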
A backup of the archives (former online redo logs) is necessary to be able to perform either a recovery of the database from an online backup or a point-in-time recovery. The CDS backup option of the BRARCHIVE program ensures that two copies of each archive are stored on two different tapes before the archive is automatically deleted.

System Name   Mon   Tue   Wed   Thu   Fri
DEV           CDS   CDS   CDS   CDS   CDS
QAS           CDS   CDS   CDS   CDS   CDS
PD1           CDS   CDS   CDS   CDS   CDS
PD2           CDS   CDS   CDS   CDS   CDS

Sat, Sun: CDS CDS

S: Save
Supplementary Backups
Supplementary backups are made at special dates (month end, year end) so that you can restore the database to a previous state if needed.

System Name   Month End Backup                 Year End Backup
DEV           None                             None
QAS           None                             None
PD1           Full Offline with Verification   Full Offline with Verification
PD2           Full Offline with Verification   Full Offline with Verification

Storage Location

For safety reasons, the backup media must be stored in a safe place. The production system copies of the tapes should be stored in a remote (external) location.

System Name DEV QAS PD1 PD2
Tape Labeling
Choose self-explanatory names to indicate the type and source of the backup. For further differentiation, use sequential numbers.

System Name   Regular Backup   Month End Backup   Year End Backup   Archives
DEV           DEV_R_<NNN>      -                  -                 DEV_A_<NNN>
QAS           QAS_R_<NNN>      -                  -                 QAS_A_<NNN>
PD1           PD1_R_<NNN>      PD1_M_<NNN>        PD1_Y_<NNN>       PD1_A_<NNN>
PD2           PD2_R_<NNN>      PD2_M_<NNN>        PD2_Y_<NNN>       PD2_A_<NNN>
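The labeling convention above (system name, a type letter R, M, Y or A, and a sequential number) can be captured in a small helper so that labels are always formed the same way. This is an illustrative sketch: the function name and the keyword names for the backup types are hypothetical, and the three-digit zero-padded number is an assumption read from the <NNN> placeholder. The restriction of month-end and year-end labels to PD1 and PD2 follows the policy tables in this chapter.

```python
KIND_CODES = {"regular": "R", "month_end": "M", "year_end": "Y", "archive": "A"}

# Per the policy tables, only PD1 and PD2 take month-end and year-end backups.
SUPPLEMENTARY_SYSTEMS = {"PD1", "PD2"}


def tape_label(system: str, kind: str, number: int) -> str:
    """Build a label such as PD1_M_007 from system, backup type and sequence number."""
    if kind not in KIND_CODES:
        raise ValueError(f"unknown backup type: {kind}")
    if kind in ("month_end", "year_end") and system not in SUPPLEMENTARY_SYSTEMS:
        raise ValueError(f"{system} has no {kind} backups in this policy")
    if not 0 <= number <= 999:
        raise ValueError("sequence number must fit into three digits")
    return f"{system}_{KIND_CODES[kind]}_{number:03d}"


print(tape_label("PD1", "month_end", 7))  # PD1_M_007
print(tape_label("DEV", "archive", 12))   # DEV_A_012
```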
Verifying Backups
To guarantee the integrity of the backups, perform checks on the tapes according to the schedule below.

System Name   Frequency of Backup Verification
DEV           Every 2 weeks
QAS           Every 2 weeks
PD1           Every 2 weeks
PD2           Every 2 weeks
To avoid backing up a database with hidden inconsistencies (e.g. bad blocks), the database must be checked at least once within one retention period.

System Name   Frequency of DB-checks
DEV           Every 2 weeks
QAS           Every 2 weeks
PD1           Every 4 weeks
PD2           Every 4 weeks
Procedures
Backup
The backup is performed unattended according to the backup frequency table. The scheduling functionality of the R/3 CCMS is used for scheduling the backup. For systems running plain SAPDBA, the required tapes can be listed with the VOLUMES Needed button on the backup scheduling screen within the CCMS. On systems running SAPDBA in combination with HIBACK, the HIBACK Volumemanager must be used to list the required tapes. Extra backups, such as the monthly and yearly backups, must be performed offline, either interactively with SAPDBA or with prepared BRBACKUP script files, using a special tape pool.
Archiving
Archiving (backing up the archives) is performed after the database has been backed up successfully, according to the archiving frequency table. Archiving is performed during normal operation of the system (no performance impact). With plain SAPDBA, the needed volumes can be found using the query only option in the backup archive menu of SAPDBA. If HIBACK is installed, the HIBACK Volumemanager must be used for finding the correct tape. For the extra backups no special archiving is required, because these backups are performed offline, so the database is in a consistent state.
Verifying of Backups
Backups must be verified following the schedule. The verify and list tape contents option of SAPDBA is used to perform this task. In addition, checks can be performed with mt and cpio on the operating system level. On systems running with Hiback, Hiback is used to verify the tapes.
Monitoring/Controlling
After the backup of the database and the archives is finished, all logs must be printed and placed in the folder for each system.
Database Check
An integrity check of the database must be performed within one retention period, in order to ensure that there are no corrupted blocks in the database. Such corruption is not detected during backup. An export to the null device (dummy export) is one way to perform this check; running an ORACLE analyze script is another way to check the database.
Recovery
System Environment
Software Components
For a database recovery, use the same software tools that you used for the backup.

System Name   Backup Software
DEV           SAPDBA 3.0E
QAS           SAPDBA 3.0E/HIBACK 3.02
PD1           SAPDBA 3.0E/HIBACK 3.02
PD2           SAPDBA 3.0E/HIBACK 3.02
Hardware Components
System Name   Backup Hardware
DEV           1 x 4mm DAT 4/8 GB DDS-2
QAS           1 x DLT 15/30 GB
PD1           1 x ADIC Autoloader (5 DLT parallel)
PD2           2 x DLT 15/30 GB
Policies
Testing Recovery
The restore procedure is one of the key issues of the R/3 system. Therefore, the procedures for recovering a database must be maintained and tested regularly.

System Name   Regular Tests    Supplemental Tests
DEV           every 2 months   when any of the involved components changes (e.g. SAPDBA)
QAS           every 2 months   when any of the involved components changes (e.g. SAPDBA)
PD1                            when any of the involved components changes (e.g. SAPDBA)
PD2                            when any of the involved components changes (e.g. SAPDBA)
Procedures
Recovery procedures are based on SAPDBA or on SAPDBA plus Hiback. It is also strongly advisable to be able to recover the database by hand. This means retrieving the backed-up data files with native tools such as mt and cpio or, if you are using a Backint-compliant tool such as Hiback, with plain Hiback. After the data files have been retrieved, the database must be recovered with database tools. It is essential to be familiar with this procedure too.