
Several Problems Encountered with Oracle 19c RAC

JiekeXu, JiekeXu DBA之路, 2024-03-03

Author | JiekeXu

Source | WeChat official account JiekeXu DBA之路 (ID: JiekeXu_IT)

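Hello everyone, I'm JiekeXu, and I'm glad to be with you again. Today let's look at several problems I ran into with Oracle 19c RAC.

As a long-term support release, Oracle 19c is the mainstream database version chosen by many companies and individuals. Many companies build their newly launched systems on 19c, many enterprises are gradually migrating their databases to it, and 11.2.0.4 is steadily leaving the stage. Below is a brief introduction to the most significant problems I have encountered on Oracle 19c.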

 

1. fork() bomb

 

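A 19c RAC database could not be started with srvctl, and after a reboot the database instances did not start automatically along with the OS.

The symptom is shown below:

[Screenshot: the failed startup]

Errors in the alert log:

[Screenshot: alert log errors]

The corresponding trace:

[Screenshot: trace file]

My first guess was a resource limit problem, but everything I checked looked normal.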

 

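Check the Oracle shared memory segments with ipcs -ma:

[Screenshot: ipcs -ma output]

Check the message queues with ipcs -q:

[Screenshot: ipcs -q output]

OS messages for the same time window:

[Screenshot: OS message log]

Resource limits for the grid user, essentially all unlimited:

[Screenshot: grid user limits]

Resource limits for the oracle user, likewise essentially all unlimited:

[Screenshot: oracle user limits]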


 

Trace the startup with strace:

strace -o /tmp/strace2.log srvctl start database -d teststb -i teststb2

strace -o /tmp/strace1.log srvctl start database -d teststb -i teststb1
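Since the MOS note for this problem describes a fork failure with status 11 (EAGAIN on Linux), one way to read such a trace is to look for process-creation calls that returned EAGAIN. The following is a minimal sketch: the function name count_fork_eagain is hypothetical, and the -f flag in the usage comment (to follow child processes) is my own addition rather than something the original commands used.

```shell
#!/bin/sh
# Count process-creation syscalls that failed with EAGAIN (errno 11 on Linux)
# in an strace log. Prints the count; prints 0 when nothing matches.
count_fork_eagain() {
    # $1: path to a log written by "strace -o"
    grep -E -c '(clone3?|v?fork)\(.*= -1 EAGAIN' "$1" || true
}

# Hypothetical usage (-f also traces child processes; it is an assumption, not from the article):
#   strace -f -o /tmp/strace1.log srvctl start database -d teststb -i teststb1
#   count_fork_eagain /tmp/strace1.log
```

A non-zero count would point toward a task or process limit rather than a memory or semaphore problem.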

 

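Other resource limits checked:

[Screenshot: other resource limits]

cat /etc/security/limits.conf

[Screenshot: limits.conf contents]

cat /etc/systemd/system.conf

[Screenshot: system.conf contents]

Uncommenting the DefaultTasksMax parameter in this file, setting it to infinity, and then rebooting the host restored normal operation.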

 

cat /etc/systemd/system.conf | grep DefaultTasksMax
#DefaultTasksMax=512
DefaultTasksMax=infinity


Database And ASM Instance Ora-27300 OS System Dependent Operation Fork Failed With Status 11 (Doc ID 2331884.1)

systemd limited maximum number of tasks that may be created on the node. This setting will also affect maxpid value on the OS.


The main factor is MAX_PID.



Modifying DefaultTasksMax


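Starting with SLES 12 SP2, a PID cgroup controller was introduced that limits how many tasks fork() can create, to guard against fork() bombs. The limit is controlled by the DefaultTasksMax setting, whose default of 512 is too small for a database environment. It needs to be raised to at least 65536; here it is set to infinity (unlimited).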
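The "at least 65536, or infinity" rule can be turned into a small check. This is only a sketch: the function name tasks_max_ok is hypothetical, and it assumes the value is either the word infinity or a plain decimal number without leading zeros.

```shell
#!/bin/sh
# Succeeds when a DefaultTasksMax value is "infinity" or at least 65536.
# The length check sidesteps integer overflow: systemd reports infinity as
# 18446744073709551615, which does not fit a signed 64-bit shell integer.
tasks_max_ok() {
    case "$1" in
        infinity) return 0 ;;
        ''|*[!0-9]*) return 2 ;;      # empty or not a plain number
    esac
    if [ "${#1}" -gt 5 ]; then        # six or more digits means >= 100000
        return 0
    fi
    [ "$1" -ge 65536 ]
}

# Hypothetical usage:
#   tasks_max_ok "$(systemctl show --property DefaultTasksMax | cut -d= -f2)" \
#       || echo "DefaultTasksMax is too low for a database host"
```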

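If DBCA throws the error shown in the screenshot below, a DefaultTasksMax value that is too small may be the cause. To avoid DBCA errors, change this parameter beforehand.

[Screenshot: DBCA error]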



Solution:

1) Check the current value:

test1:~ # systemctl show --property DefaultTasksMax

DefaultTasksMax=512

 

2) Edit /etc/systemd/system.conf, uncomment the DefaultTasksMax line, and set its value to "infinity".

# vi /etc/systemd/system.conf

DefaultTasksMax=infinity

 

3) Reboot the OS.

 

4) Check the value after the change:

oracle@test1:~> systemctl show --property DefaultTasksMax

DefaultTasksMax=18446744073709551615
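Because the change only takes effect after the reboot, it can be worth confirming on every RAC node that the file really carries the uncommented setting before restarting. A minimal sketch follows; the function name configured_tasks_max is hypothetical, and it reads only the one file it is given, so any drop-in files under /etc/systemd/system.conf.d/ are not considered.

```shell
#!/bin/sh
# Print the DefaultTasksMax value configured in a systemd config file,
# ignoring commented-out lines; prints nothing if the setting is absent.
configured_tasks_max() {
    # $1: config file, normally /etc/systemd/system.conf
    sed -n 's/^[[:space:]]*DefaultTasksMax[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}

# Hypothetical usage:
#   configured_tasks_max /etc/systemd/system.conf
```

An empty result means the line is still commented out and the built-in default applies.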

 

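2. Client problem: ORA-28040

Precondition: the Oracle 11g client is installed, its environment variables are configured, and PL/SQL Developer is used to log in to the database.

Solution:

① On the Oracle 19c server, as the oracle user, create the file sqlnet.ora under $ORACLE_HOME/network/admin:


cd $ORACLE_HOME/network/admin
vi sqlnet.ora
SQLNET.ALLOWED_LOGON_VERSION_SERVER=10
SQLNET.ALLOWED_LOGON_VERSION_CLIENT=10


② On the server, log in as an administrator and reset the password.


sqlplus / as sysdba
alter session set container=<PDB name>; -- when using a PDB, switch to the PDB before changing the password
alter user <username> identified by <password>;


Note: after configuring sqlnet.ora on the server, be sure to reset the password; otherwise the login still fails.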
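In a RAC with a separate Oracle home on each node, the sqlnet.ora change has to be present on every node. The sketch below reads a parameter back from a given sqlnet.ora; the function name sqlnet_param is hypothetical, and it assumes a simple one-line name=value entry.

```shell
#!/bin/sh
# Print the value of a sqlnet.ora parameter (matched case-insensitively),
# skipping comment lines and stripping whitespace and any stray trailing semicolon.
sqlnet_param() {
    # $1: path to sqlnet.ora   $2: parameter name
    grep -i -E "^[[:space:]]*$2[[:space:]]*=" "$1" | tail -n 1 |
        sed -e 's/^[^=]*=//' -e 's/[[:space:];]*$//' -e 's/^[[:space:]]*//'
}

# Hypothetical usage:
#   sqlnet_param "$ORACLE_HOME/network/admin/sqlnet.ora" SQLNET.ALLOWED_LOGON_VERSION_SERVER
```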

 

3. Patching error


Patching failed with the error: Prerequisite check "CheckActiveFilesAndExecutables" failed.

 

testrac1:~# /app/app/19.0.0.0/grid/OPatch/opatchauto apply /app/soft/32226239
OPatchauto session is initiated at Thu Mar 4 15:27:50 2022
System initialization log file is /app/app/19.0.0.0/grid/cfgtoollogs/opatchautodb/systemconfig2022-03-04_03-27-53PM.log.
Session log file is /app/app/19.0.0.0/grid/cfgtoollogs/opatchauto/opatchauto2022-03-04_03-28-19PM.log
The id for this session is U4JF
Executing OPatch prereq operations to verify patch applicability on home /app/app/oracle/product/19.0.0.0/dbhome_1
Executing OPatch prereq operations to verify patch applicability on home /app/app/19.0.0.0/grid
Patch applicability verified successfully on home /app/app/19.0.0.0/grid
Patch applicability verified successfully on home /app/app/oracle/product/19.0.0.0/dbhome_1
Executing patch validation checks on home /app/app/19.0.0.0/grid
Patch validation checks successfully completed on home /app/app/19.0.0.0/grid
Executing patch validation checks on home /app/app/oracle/product/19.0.0.0/dbhome_1
Patch validation checks successfully completed on home /app/app/oracle/product/19.0.0.0/dbhome_1
Verifying SQL patch applicability on home /app/app/oracle/product/19.0.0.0/dbhome_1
"/bin/sh -c 'cd /app/app/oracle/product/19.0.0.0/dbhome_1; ORACLE_HOME=/app/app/oracle/product/19.0.0.0/dbhome_1 ORACLE_SID=piccyx1 /app/app/oracle/product/19.0.0.0/dbhome_1/OPatch/datapatch -prereq -verbose'" command failed with errors. Please refer to logs for more details. SQL changes, if any, can be analyzed by manually retrying the same command.
SQL patch applicability verified successfully on home /app/app/oracle/product/19.0.0.0/dbhome_1
Preparing to bring down database service on home /app/app/oracle/product/19.0.0.0/dbhome_1
Successfully prepared home /app/app/oracle/product/19.0.0.0/dbhome_1 to bring down database service
Bringing down CRS service on home /app/app/19.0.0.0/grid
Prepatch operation log file location: /app/app/grid/crsdata/testrac1/crsconfig/crs_prepatch_apply_inplace_testrac1_2022-03-04_03-29-01PM.log
CRS service brought down successfully on home /app/app/19.0.0.0/grid
Performing prepatch operation on home /app/app/oracle/product/19.0.0.0/dbhome_1
Perpatch operation completed successfully on home /app/app/oracle/product/19.0.0.0/dbhome_1
Start applying binary patch on home /app/app/oracle/product/19.0.0.0/dbhome_1
Failed while applying binary patches on home /app/app/oracle/product/19.0.0.0/dbhome_1
Execution of [OPatchAutoBinaryAction] patch action failed, check log for more details. Failures:
Patch Target : testrac1->/app/app/oracle/product/19.0.0.0/dbhome_1 Type[rac]
Details: [
---------------------------Patching Failed---------------------------------
Command execution failed during patching in home: /app/app/oracle/product/19.0.0.0/dbhome_1, host: testrac1.
Command failed: /app/app/oracle/product/19.0.0.0/dbhome_1/OPatch/opatchauto apply /app/soft/32226239 -oh /app/app/oracle/product/19.0.0.0/dbhome_1 -target_type rac_database -binary -invPtrLoc /app/app/19.0.0.0/grid/oraInst.loc -jre /app/app/19.0.0.0/grid/OPatch/jre -persistresult /app/app/oracle/product/19.0.0.0/dbhome_1/opatchautocfg/db/sessioninfo/sessionresult_testrac1_rac_2.ser -analyzedresult /app/app/oracle/product/19.0.0.0/dbhome_1/opatchautocfg/db/sessioninfo/sessionresult_analyze_testrac1_rac_2.ser
Command failure output:
==Following patches FAILED in apply:
Patch: /app/soft/32226239/32218454
Log: /app/app/oracle/product/19.0.0.0/dbhome_1/cfgtoollogs/opatchauto/core/opatch/opatch2022-03-04_15-30-20PM_1.log
Reason: Failed during Patching: oracle.opatch.opatchsdk.OPatchException: Prerequisite check "CheckActiveFilesAndExecutables" failed.
After fixing the cause of failure Run opatchauto resume
]
OPATCHAUTO-68061: The orchestration engine failed.
OPATCHAUTO-68061: The orchestration engine failed with return code 1
OPATCHAUTO-68061: Check the log for more details.
OPatchAuto failed.
OPatchauto session completed at Thu Mar 4 15:31:15 2022
Time taken to complete the session 3 minutes, 25 seconds
opatchauto failed with error code 42


Check the log:

 

tail -100f /app/app/oracle/product/19.0.0.0/dbhome_1/cfgtoollogs/opatchauto/core/opatch/opatch2022-03-04_15-30-20PM_1.log
[Mar 4, 2022 3:31:14 PM] [INFO] Finish fuser command /bin/fuser /app/app/oracle/product/19.0.0.0/dbhome_1/bin/expdp at Thu Mar 04 15:31:14 CST 2021
[Mar 4, 2022 3:31:14 PM] [INFO] Following active executables are not used by opatch process :
/app/app/oracle/product/19.0.0.0/dbhome_1/lib/libclntsh.so.19.1
/app/app/oracle/product/19.0.0.0/dbhome_1/lib/libsqlplus.so
Following active executables are used by opatch process :
[Mar 4, 2022 3:31:14 PM] [INFO] Prerequisite check "CheckActiveFilesAndExecutables" failed.
The details are:
Following active executables are not used by opatch process :
/app/app/oracle/product/19.0.0.0/dbhome_1/lib/libclntsh.so.19.1
/app/app/oracle/product/19.0.0.0/dbhome_1/lib/libsqlplus.so
Following active executables are used by opatch process :
[Mar 4, 2022 3:31:15 PM] [SEVERE] OUI-67073:UtilSession failed: Prerequisite check "CheckActiveFilesAndExecutables" failed.
[Mar 4, 2022 3:31:15 PM] [INFO] Finishing UtilSession at Thu Mar 04 15:31:15 CST 2022
[Mar 4, 2021 3:31:15 PM] [INFO] Log file location: /app/app/oracle/product/19.0.0.0/dbhome_1/cfgtoollogs/opatchauto/core/opatch/opatch2022-03-04_15-30-20PM_1.log

 

Check which processes are holding the files:

 

fuser /app/app/oracle/product/19.0.0.0/dbhome_1/lib/libclntsh.so.19.1
/app/app/oracle/product/19.0.0.0/dbhome_1/lib/libclntsh.so.19.1: 89500m 109054m


Kill the processes holding the files:

kill -9 89500

kill -9 109054

 

Verify that no process is holding the files any more:


fuser /app/app/oracle/product/19.0.0.0/dbhome_1/lib/libsqlplus.so
fuser /app/app/oracle/product/19.0.0.0/dbhome_1/lib/libclntsh.so.19.1


Resume patching:

testrac1:~# /app/app/19.0.0.0/grid/OPatch/opatchauto resume
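Before reaching for kill -9, it may be worth seeing what the holding processes actually are (for example a forgotten sqlplus session). The sketch below turns fuser-style output into bare PIDs that can be passed to ps; the function name pids_from_fuser is hypothetical, and it assumes the "file: PID[flags] ..." layout shown above.

```shell
#!/bin/sh
# Turn fuser-style output ("<file>: 89500m 109054m") into bare PIDs, one per line.
pids_from_fuser() {
    sed -e 's/^[^:]*://' | tr -s ' \t' '\n\n' | sed -e 's/[A-Za-z]*$//' | grep -E '^[0-9]+$'
}

# Hypothetical usage: show each holder before deciding whether to stop it.
#   lib=/app/app/oracle/product/19.0.0.0/dbhome_1/lib/libclntsh.so.19.1
#   fuser "$lib" 2>&1 | pids_from_fuser | while read -r pid; do
#       ps -o pid,user,etime,args -p "$pid"
#   done
```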

 

4. CRS-6706: patch level mismatch

 

Patching failed on node 2 of a 19.3 RAC, after which the clusterware on node 2 could not start and reported:

 

[root@testrac2 soft]# /u01/app/19.0.0.0/grid/bin/crsctl start crs
CRS-6706: Oracle Clusterware Release patch level ('4203896349') does not match Software patch level ('724960844'). Oracle Clusterware cannot be started.
CRS-4000: Command Start failed, or completed with errors.


The error shows that the cause is a patch level mismatch. MOS gives the following steps:

 

1. Run the following command as the root user to complete the patching set up behind the scenes:
#GI_HOME/bin:> ./clscfg -localpatch
2. Run the following command as the root user to lock the GI home:
#GI_HOME/crs/install:> ./rootcrs.sh -lock
3. Run the following command as the root user to start the GI:
#GI_HOME/bin:> ./crsctl start crs

Executed:
[root@testrac2 bin]# ./clscfg -localpatch
[root@testrac2 install]# ./rootcrs.sh -lock
[root@testrac2 bin]# ./crsctl start crs


After that, the clusterware started.
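When comparing nodes or keeping a record of the incident, it can help to pull the two patch levels out of the CRS-6706 message. This is only a text-parsing sketch with a hypothetical function name, crs6706_levels; it does not query the clusterware itself.

```shell
#!/bin/sh
# Extract the two patch levels quoted in a CRS-6706 message, in order of
# appearance: release patch level first, software patch level second.
crs6706_levels() {
    grep 'CRS-6706' | grep -o "('[0-9]*')" | tr -d "()'"
}

# Hypothetical usage:
#   /u01/app/19.0.0.0/grid/bin/crsctl start crs 2>&1 | crs6706_levels
```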

 

Reference documents:

CRS-6706: Oracle Clusterware Release patch level ('nnn') does not match Software patch level ('mmm') (Doc ID 1639285.1)
Patching 12.2.0.1 Grid Infrastructure gives error CRS-6706: Oracle Clusterware Release Patch Level ('748994161') Does Not Match Software Patch Level (Doc ID 2348013.1)


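5. HugePages configured too low: ORA-27106

The database could not start because the configured HugePages were smaller than the SGA. Either increase HugePages or reduce the SGA. HugePages configuration will be covered separately in a later article.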
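As a rough sanity check, the number of huge pages needed for one SGA is the SGA size divided by the huge page size, rounded up. The sketch below does only that arithmetic and reads a field from /proc/meminfo; the function names are hypothetical, and it ignores other instances on the host (such as ASM) and any safety margin.

```shell
#!/bin/sh
# Number of huge pages needed to hold an SGA, rounded up.
hugepages_needed() {
    # $1: SGA size in MB   $2: huge page size in kB (Hugepagesize in /proc/meminfo)
    echo $(( ($1 * 1024 + $2 - 1) / $2 ))
}

# Read one numeric field (for example HugePages_Total) from a meminfo-style file.
meminfo_field() {
    # $1: field name   $2: file, normally /proc/meminfo
    awk -v key="$1:" '$1 == key { print $2 }' "$2"
}

# Hypothetical usage for a 16 GB SGA:
#   need=$(hugepages_needed 16384 "$(meminfo_field Hugepagesize /proc/meminfo)")
#   have=$(meminfo_field HugePages_Total /proc/meminfo)
#   [ "$have" -ge "$need" ] || echo "HugePages_Total=$have is below the $need pages needed"
```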



Recommended MOS documents:

Top 5 Issues That Cause RAC Instance Crashes (Doc ID 1549191.1)

Diagnosing Grid Infrastructure Startup Issues (Doc ID 1623340.1)

Top 5 Database and/or Instance Performance Issues in RAC Environments (Doc ID 1602076.1)

How to Diagnose 11.2 Cluster Node Eviction Issues (Doc ID 1674872.1)

Top 5 Grid Infrastructure Startup Issues (Doc ID 1526147.1)

Crs Components Are Not Starting After Server Reboot (Doc ID 1152653.1)

Troubleshooting 10g and 11.1 Clusterware Reboots (Doc ID 265769.1)

Troubleshooting 11.2 or 12.1 Grid Infrastructure root.sh Issues (Doc ID 1053970.1)

11gR2 Clusterware and Grid Home: What You Need to Know (Doc ID 2225748.1)

VIP, SCAN VIP/Listener Fails Over and Listener Stops After Short Public Network Hiccup (Doc ID 1333165.1)

How to change Hostname / IP for a Grid Infrastructure Oracle Restart Standalone Configuration (SIHA) (Doc ID 1552810.1)

How to disable Automatic VIP failback (Doc ID 1280218.1)

Top 5 Issues That Cause SCAN VIP and SCAN Listener Failures (Doc ID 1602038.1)

How To Configure Server Side Transparent Application Failover (Doc ID 460982.1)

Rebalance of one diskgroup dismounted another diskgroup as disks were offlined (Doc ID 1525330.1)




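That's all. I hope this helps whoever is reading; if you found it useful, feel free to share it with friends so we can learn and improve together.


You are welcome to follow my WeChat official account.



----------------------------------------
WeChat official account: JiekeXu DBA之路
Modb (墨天轮): https://www.modb.pro/u/4347
CSDN: https://blog.csdn.net/JiekeXu
Tencent Cloud: https://cloud.tencent.com/developer/user/5645107
----------------------------------------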


