GLAST Infrastructure How-to-Fix : Recovering from Oralce Data Loss

We currently use Oracle 9i for the production server, and Oracle 10g for the development and test servers. There are different procedures for recovering data from an Oracle disaster (for example, accientally dropping tables) depending on whether the server is Oracle 9i or 10g.

Ian MacGregor and I recently sat down together to figure out what GLAST should do in the event of Oracle data loss, and outline what is involved in recovering from such an event.

It is most difficult to recover from an Oracle 9i disaster. It is much easier to recover from an Oracle 10g disaster by making use of the new "Recycle Bin" feature.

Oracle 9i Recovery Procedure

SCS only backs up the production server, which is currently Oracle 9i. SCS does daily backups. However, it is still possible to recover data up to the point of a "drop table" event by making use of the daily backup from the previous day, plus the Oracle journaling features. What happens is that SCS restores the datatbase from the most recent backup, then uses Oracle journaling (similiar to a journaling file system) to recover data up to, but not including, the "drop table". Ian estimated it would take 2-3 hours to do such a recovery.

Oracle 10g Recovery Procedure

The Oracle 10g Recycle Bin allows us to recover the data instantly, but we have to write a script to do the recovery. Recovering a dropped table from the recycle bin restores the table's column names, column data types, and the data itself, but nothing else. Specifically, it doesn't recover any constraints (primary key constraints, unique constraints, foreign key constraints, sequences, etc.). These all have to be restored manually after the data is recovered usng ALTER TABLE statements.

Note that it is always an option to have SCS do the "Oracle 9i style" recovery for us, but it means a 2-3 hours outage (or perhaps longer during off hours). The "Recycle Bin" method is an instant reovery procedure, but requires that we write a script to do the reovery using the procedure outlined below. Since we advertise 1 hour turn-around of our data processing, it seemed important to me that we consider the "Recycle Bin" method.

To recover a droped table, you first have to find its "object name" in the recycle bin. The following ColdFusion query will get everything you need to recover the GINO main tables.

ColdFusion Not Required

You don't need ColdFusion to do this query. It was just easiest for me to show it using a ColdFusion query since I wanted to easily show the GINO master table list, and I don't know PL/SQL well enough to show the same thing in a PL/SQL script. What is important is knowing the exact list of tables to recover, which is what the following example illustrates.

<cfset gino_tables = ArrayNew(1) />

<cfset ArrayAppend(gino_tables, "TASK") />
<cfset ArrayAppend(gino_tables, "TASKTYPE") />

<cfset ArrayAppend(gino_tables, "TASKPROCESS") />
<cfset ArrayAppend(gino_tables, "BATCHGROUP") />
<cfset ArrayAppend(gino_tables, "BATCHQUEUE") />

<cfset ArrayAppend(gino_tables, "DATASET") />
<cfset ArrayAppend(gino_tables, "DSTYPE") />
<cfset ArrayAppend(gino_tables, "DSFILETYPE") />

<cfset ArrayAppend(gino_tables, "TP_DS") />

<cfset ArrayAppend(gino_tables, "TPINSTNACE") />
<cfset ArrayAppend(gino_tables, "PROCESSINGSTATUS") />

<cfset ArrayAppend(gino_tables, "DSINSTNACE") />

<cfset ArrayAppend(gino_tables, "TPI_DSI") />

<cfset ArrayAppend(gino_tables, "RUN") />
<cfset ArrayAppend(gino_tables, "RUNSTATUS") />

<cfset ArrayAppend(gino_tables, "USER") />
<cfset ArrayAppend(gino_tables, "ROLE") />
<cfset ArrayAppend(gino_tables, "USER_ROLE") />

<cfset ArrayAppend(gino_tables, "RECORDINFO") />
<cfset ArrayAppend(gino_tables, "RI_TASK") />
<cfset ArrayAppend(gino_tables, "RI_TASKPROCESS") />
<cfset ArrayAppend(gino_tables, "RI_DATASET") />
<cfset ArrayAppend(gino_tables, "RI_RUN") />

<cfquery name="qry" datasource="pipeline-dev">
select
	object_name,
	original_name,
	droptime,
	dropscn
from
	recyclebin
where
	original_name in (<cfqueryparam cfsqltype="cf_sql_varchar" list="yes" value="#ArrayToList(gino_tables)#" />)
order by original_name asc, droptime desc
</cfquery>

It is important to only select the tables in the "where clause", since many other things are dropped when table is dropped that aren't easily restored (like all of the contraints listed above).

Notice that there are duplicate tables for most (but not all) of the primary GINO tables. This is becuase, for this particular example, the pdbschema.sql script was run twice, which did two "drop table" commands for each table. We want the earlier timestamp, which will be the tables that include the data we want to recover.

flashback table "BIN$8V58EOw8AjzgNAADuj+ROw==$0" to before drop rename to task_scn_1550952795

The quotes around the "object_name" are important (otherwise you will get an error). The number in the name of the restored table (task_scn_1550952795 in the example above), comes from the "Drop SCN" column. This command will restore the TASK table to the uniqely named table named TASK_SCN_1550952795. Inspect this table to make sure it contains the data you think should be there. A simple "select * from task_scn_1550952795" should suffice. This will tell you if you reovered the correct version of the table, as the table could be listed multiple times depending on when the last time a purge was done.

rename task_scn_1550952795 to task

Finally, apply whatver contraints are listed in pdbschema.sql. For this example, they would be something similiar to this:

alter table Task (
	constraint PK_Task primary key (Task_PK),
	constraint FK_TaskType_Task foreign key (TaskType_FK)
		references TaskType (TaskType_PK),
	constraint NEW_TaskName unique (TaskName)
);

As I followed the above procedure for the GLAST_DP dev database, executing the commnads manually from the PL/SQL prompt, I recovered the TASK table as show in the following figure:

This should all be easy to automate in a new sql script, perhaps naming it pdb_data_recovery.sql

Introduction

Oracle 9i Recovery Procedure

Oracle 10g Recovery Procedure