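Recovering from an accidental rm -rf
Most DBAs surely dread rm -rf: one careless moment and the database is wiped out, and that is the end of the story. But if this misfortune really does happen, is there truly no cure? Not necessarily. There is a way out, and if you ever run into this situation you can use it to save yourself. The recovery described here assumes that no usable RMAN backup, cold backup, or any other backup of the database exists.
1. Log in to SQL*Plus and start the database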
[oracle@ora10g ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.1.0 - Production on Mon Aug 25 12:37:50 2014
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area 285212672 bytes
Fixed Size 1218992 bytes
Variable Size 96470608 bytes
Database Buffers 184549376 bytes
Redo Buffers 2973696 bytes
Database mounted.
Database opened.
-- Check the instance status
SQL> select status from v$instance;
STATUS
------------
OPEN
-- Check the instance name
SQL> show parameter name;
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_file_name_convert string
db_name string ora10g
db_unique_name string ora10g
global_names boolean FALSE
instance_name string ora10g
lock_name_space string
log_file_name_convert string
service_names string ora10g
SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options
2. Simulate the accidental rm -rf
[oracle@ora10g ~]$ cd /u01/app/oracle/oradata
[oracle@ora10g oradata]$ ll
total 4
drwxr-x--- 2 oracle oinstall 4096 Aug 25 11:15 ora10g
[oracle@ora10g oradata]$ pwd
/u01/app/oracle/oradata
[oracle@ora10g oradata]$ rm -rf ora10g
[oracle@ora10g oradata]$ exit
logout
[root@ora10g ~]# su - oracle
[oracle@ora10g ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.1.0 - Production on Mon Aug 25 12:43:58 2014
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options
SQL> select count(*) from dba_objects;
select count(*) from dba_objects
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-01116: error in opening database file 1
ORA-01110: data file 1: '/u01/app/oracle/oradata/ora10g/system01.dbf'
ORA-27041: unable to open file
Linux Error: 2: No such file or directory
Additional information: 3
SQL> select count(*) from dba_segments;
select count(*) from dba_segments
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-01116: error in opening database file 1
ORA-01110: data file 1: '/u01/app/oracle/oradata/ora10g/system01.dbf'
ORA-27041: unable to open file
Linux Error: 2: No such file or directory
Additional information: 3
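Because all the data files were deleted, including system01.dbf, which stores the data dictionary, the data dictionary views can no longer be queried. The statements therefore fail with a file-not-found error: the failure has occurred.
-- Check the alert.log file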
[root@ora10g ~]# tailf /u01/app/oracle/admin/ora10g/bdump/alert_ora10g.log
ARCH shutting down
ARC2: Archival stopped
Mon Aug 25 12:45:38 2014
Errors in file /u01/app/oracle/admin/ora10g/bdump/ora10g_j000_3037.trc:
ORA-12012: error on auto execute of job 1
ORA-01116: error in opening database file 2
ORA-01110: data file 2: '/u01/app/oracle/oradata/ora10g/undotbs01.dbf'
ORA-27041: unable to open file
Linux Error: 2: No such file or directory
Additional information: 3
Mon Aug 25 12:46:43 2014
Errors in file /u01/app/oracle/admin/ora10g/bdump/ora10g_j000_3070.trc:
ORA-12012: error on auto execute of job 1
ORA-01116: error in opening database file 2
ORA-01110: data file 2: '/u01/app/oracle/oradata/ora10g/undotbs01.dbf'
ORA-27041: unable to open file
Linux Error: 2: No such file or directory
Additional information: 3
-- Find the DBWR process to determine which directory holds the file handles to recover
SQL> !ps -ef|grep ora_dbw
oracle 2912 1 0 12:37 ? 00:00:00 ora_dbw0_ora10g
oracle 3078 3032 0 12:48 pts/3 00:00:00 /bin/bash -c ps -ef|grep ora_dbw
oracle 3080 3078 0 12:48 pts/3 00:00:00 grep ora_dbw
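The PID can also be looked up without filtering grep's own entries out of the ps output. The following is a minimal sketch, assuming bash on Linux with /proc mounted; the function name find_ora_pid is hypothetical, and it assumes the first field of /proc/<pid>/cmdline matches the process name that ps prints (ora_dbw0_ora10g in this example).

```shell
#!/usr/bin/env bash
# find_ora_pid PROC SID
# Prints the PID of every process whose first command-line argument
# is exactly ora_PROC_SID (for example ora_dbw0_ora10g).
find_ora_pid() {
    local want="ora_$1_$2" dir name
    for dir in /proc/[0-9]*; do
        # cmdline is NUL-separated; keep only the first argument.
        # Errors from processes that exit mid-scan are discarded.
        name=$( { tr '\0' '\n' < "$dir/cmdline" | head -n 1; } 2>/dev/null )
        if [ "$name" = "$want" ]; then
            echo "${dir##*/}"
        fi
    done
    return 0
}
```

For the instance in this article the call would be find_ora_pid dbw0 ora10g.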
In fact, all Oracle processes are still running at this point; every process whose name starts with ora_ is an Oracle background process:
SQL> !ps -ef|grep ora_
oracle 2906 1 0 12:37 ? 00:00:00 ora_pmon_ora10g
oracle 2908 1 0 12:37 ? 00:00:00 ora_psp0_ora10g
oracle 2910 1 0 12:37 ? 00:00:00 ora_mman_ora10g
oracle 2912 1 0 12:37 ? 00:00:00 ora_dbw0_ora10g
oracle 2914 1 0 12:37 ? 00:00:00 ora_lgwr_ora10g
oracle 2916 1 0 12:37 ? 00:00:00 ora_ckpt_ora10g
oracle 2918 1 0 12:38 ? 00:00:01 ora_smon_ora10g
oracle 2920 1 0 12:38 ? 00:00:00 ora_reco_ora10g
oracle 2922 1 0 12:38 ? 00:00:00 ora_cjq0_ora10g
oracle 2924 1 0 12:38 ? 00:00:01 ora_mmon_ora10g
oracle 2926 1 0 12:38 ? 00:00:00 ora_mmnl_ora10g
oracle 2928 1 0 12:38 ? 00:00:00 ora_d000_ora10g
oracle 2930 1 0 12:38 ? 00:00:00 ora_s000_ora10g
oracle 2934 1 0 12:38 ? 00:00:00 ora_arc0_ora10g
oracle 2936 1 0 12:38 ? 00:00:00 ora_arc1_ora10g
oracle 2941 1 0 12:38 ? 00:00:00 ora_qmnc_ora10g
oracle 2943 1 0 12:38 ? 00:00:00 ora_q000_ora10g
oracle 2945 1 0 12:38 ? 00:00:00 ora_q001_ora10g
oracle 3077 1 0 12:48 ? 00:00:00 ora_j000_ora10g
oracle 3085 3032 0 12:49 pts/3 00:00:00 /bin/bash -c ps -ef|grep ora_
oracle 3087 3085 0 12:49 pts/3 00:00:00 grep ora_
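From this we can tell that the handles of the deleted files we need are under /proc/2912/fd.
3. Recover the accidentally deleted files
-- Recover the data files and control files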
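On Linux, the link target of a descriptor whose file has been unlinked carries a trailing (deleted) marker, so the handles of interest can be picked out mechanically. The following is a minimal sketch, assuming bash and a Linux /proc filesystem; the function name list_deleted_fds is hypothetical and not part of any Oracle or system tool. It has to run as the owner of the process (oracle here) or as root.

```shell
#!/usr/bin/env bash
# list_deleted_fds PID
# Prints one "FD<TAB>ORIGINAL_PATH" line for every descriptor of PID
# whose target file has been deleted.
list_deleted_fds() {
    local pid=$1 fd target
    for fd in /proc/"$pid"/fd/*; do
        # readlink returns the path the file was opened under; the
        # kernel appends " (deleted)" once the file has been unlinked.
        target=$(readlink "$fd") || continue
        case $target in
            *' (deleted)')
                printf '%s\t%s\n' "${fd##*/}" "${target% (deleted)}"
                ;;
        esac
    done
    return 0
}
```

For the DBWR process found above, the call would be list_deleted_fds 2912.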
SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options
[oracle@ora10g ~]$ cd /proc/2912
[oracle@ora10g 2912]$ ll
total 0
dr-xr-xr-x 2 oracle oinstall 0 Aug 25 12:51 attr
-r-------- 1 oracle oinstall 0 Aug 25 12:51 auxv
-r--r--r-- 1 oracle oinstall 0 Aug 25 12:48 cmdline
-rw-r--r-- 1 oracle oinstall 0 Aug 25 12:51 coredump_filter
-r--r--r-- 1 oracle oinstall 0 Aug 25 12:51 cpuset
lrwxrwxrwx 1 oracle oinstall 0 Aug 25 12:51 cwd -> /u01/app/oracle/product/10.2.0/db_1/dbs
-r-------- 1 oracle oinstall 0 Aug 25 12:51 environ
lrwxrwxrwx 1 oracle oinstall 0 Aug 25 12:51 exe -> /u01/app/oracle/product/10.2.0/db_1/bin/oracle
dr-x------ 2 oracle oinstall 0 Aug 25 12:51 fd
-r-------- 1 oracle oinstall 0 Aug 25 12:51 limits
-rw-r--r-- 1 oracle oinstall 0 Aug 25 12:51 loginuid
-r--r--r-- 1 oracle oinstall 0 Aug 25 12:37 maps
-rw------- 1 oracle oinstall 0 Aug 25 12:51 mem
-r--r--r-- 1 oracle oinstall 0 Aug 25 12:51 mounts
-r-------- 1 oracle oinstall 0 Aug 25 12:51 mountstats
-rw-r--r-- 1 oracle oinstall 0 Aug 25 12:51 oom_adj
-r--r--r-- 1 oracle oinstall 0 Aug 25 12:51 oom_score
lrwxrwxrwx 1 oracle oinstall 0 Aug 25 12:51 root -> /
-r--r--r-- 1 oracle oinstall 0 Aug 25 12:51 schedstat
-r-------- 1 oracle oinstall 0 Aug 25 12:51 smaps
-r--r--r-- 1 oracle oinstall 0 Aug 25 12:37 stat
-r--r--r-- 1 oracle oinstall 0 Aug 25 12:51 statm
-r--r--r-- 1 oracle oinstall 0 Aug 25 12:48 status
dr-xr-xr-x 3 oracle oinstall 0 Aug 25 12:51 task
-r--r--r-- 1 oracle oinstall 0 Aug 25 12:51 wchan
[oracle@ora10g 2912]$ cd fd
[oracle@ora10g fd]$ ls -ltr
total 0
lr-x------ 1 oracle oinstall 64 Aug 25 12:51 0 -> /dev/null
l-wx------ 1 oracle oinstall 64 Aug 25 12:51 6 -> /u01/app/oracle/admin/ora10g/bdump/alert_ora10g.log
l-wx------ 1 oracle oinstall 64 Aug 25 12:51 5 -> /u01/app/oracle/admin/ora10g/udump/ora10g_ora_2904.trc
lr-x------ 1 oracle oinstall 64 Aug 25 12:51 4 -> /dev/null
lr-x------ 1 oracle oinstall 64 Aug 25 12:51 3 -> /dev/null
lr-x------ 1 oracle oinstall 64 Aug 25 12:51 2 -> /dev/null
lr-x------ 1 oracle oinstall 64 Aug 25 12:51 1 -> /dev/null
lrwx------ 1 oracle oinstall 64 Aug 25 12:51 9 -> /u01/app/oracle/product/10.2.0/db_1/dbs/hc_ora10g.dat
l-wx------ 1 oracle oinstall 64 Aug 25 12:51 8 -> /u01/app/oracle/admin/ora10g/bdump/alert_ora10g.log
lrwx------ 1 oracle oinstall 64 Aug 25 12:51 7 -> /u01/app/oracle/product/10.2.0/db_1/dbs/lkinstora10g (deleted)
lrwx------ 1 oracle oinstall 64 Aug 25 12:51 23 -> /u01/app/oracle/oradata/ora10g/temp01.dbf (deleted)
lrwx------ 1 oracle oinstall 64 Aug 25 12:51 22 -> /u01/app/oracle/oradata/ora10g/example01.dbf (deleted)
lrwx------ 1 oracle oinstall 64 Aug 25 12:51 21 -> /u01/app/oracle/oradata/ora10g/users01.dbf (deleted)
lrwx------ 1 oracle oinstall 64 Aug 25 12:51 20 -> /u01/app/oracle/oradata/ora10g/sysaux01.dbf (deleted)
lrwx------ 1 oracle oinstall 64 Aug 25 12:51 19 -> /u01/app/oracle/oradata/ora10g/undotbs01.dbf (deleted)
lrwx------ 1 oracle oinstall 64 Aug 25 12:51 18 -> /u01/app/oracle/oradata/ora10g/system01.dbf (deleted)
lrwx------ 1 oracle oinstall 64 Aug 25 12:51 17 -> /u01/app/oracle/oradata/ora10g/control03.ctl (deleted)
lrwx------ 1 oracle oinstall 64 Aug 25 12:51 16 -> /u01/app/oracle/oradata/ora10g/control02.ctl (deleted)
lrwx------ 1 oracle oinstall 64 Aug 25 12:51 15 -> /u01/app/oracle/oradata/ora10g/control01.ctl (deleted)
lrwx------ 1 oracle oinstall 64 Aug 25 12:51 14 -> /u01/app/oracle/product/10.2.0/db_1/dbs/lkORA10G
lrwx------ 1 oracle oinstall 64 Aug 25 12:51 13 -> /u01/app/oracle/product/10.2.0/db_1/dbs/hc_ora10g.dat
lr-x------ 1 oracle oinstall 64 Aug 25 12:51 12 -> /dev/zero
lr-x------ 1 oracle oinstall 64 Aug 25 12:51 11 -> /dev/zero
lrwx------ 1 oracle oinstall 64 Aug 25 12:51 10 -> /u01/app/oracle/admin/ora10g/adump/ora_2904.aud
[oracle@ora10g fd]$
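Analysis: the targets of handles 7 and 15-23 are marked (deleted) at the end, which is the result of the rm -rf above. As long as the Oracle instance has not been restarted after the deletion, its background processes keep running and keep these files open, so the deleted files can be recovered through the file handle numbers in /proc/<Oracle process ID>/fd. The method is to cp each file handle back to its original path. Note that if you are not inside the fd directory, you must refer to the file handle by its absolute path. If the database was shut down after the files were deleted, the handles are closed, the kernel releases the file data, and this method no longer works.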
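The copy step can be scripted instead of typing one cp per handle. The following is a minimal sketch, assuming bash, a Linux /proc filesystem, and a shell running as oracle or root; restore_deleted_fds and its optional path-prefix argument are hypothetical names introduced for illustration. It recreates the removed directory, refuses to overwrite files that already exist, and copies each deleted handle back to the path it was opened under. Oracle may still be writing to these files while they are copied, so the copies may need recovery afterwards.

```shell
#!/usr/bin/env bash
# restore_deleted_fds PID [PREFIX]
# Copies every deleted file still held open by PID back to its
# original path. Only paths starting with PREFIX are restored
# (default "/", i.e. every deleted file).
restore_deleted_fds() {
    local pid=$1 prefix=${2:-/} fd target dest rc=0
    for fd in /proc/"$pid"/fd/*; do
        target=$(readlink "$fd") || continue
        case $target in
            "$prefix"*' (deleted)') ;;   # deleted file under PREFIX
            *) continue ;;               # anything else is left alone
        esac
        dest=${target% (deleted)}
        if [ -e "$dest" ]; then
            echo "skip (already exists): $dest" >&2
            continue
        fi
        # rm -rf removed the directory as well, so recreate it first.
        mkdir -p "$(dirname "$dest")"
        # cp follows the /proc link and reads the still-open file.
        if cp "$fd" "$dest"; then
            echo "restored fd ${fd##*/} -> $dest"
        else
            rc=1
        fi
    done
    return "$rc"
}
```

For the listing above, restore_deleted_fds 2912 /u01/app/oracle/oradata/ora10g/ would cover handles 15-23; handle 7 lives under the dbs directory and falls outside that prefix.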