Exadata Patching on Storage Servers, InfiniBand Switches and Compute Nodes (18c/19c)
CELL NODES/Storage Server Patching Plan
Exadata Master Note (Doc ID 888828.1) – Always start with Oracle Support Document ID 888828.1; it is the mandatory first read for any Exadata Database Machine patching activity.
As part of patching a storage cell, the following components are updated:
A) Oracle Linux operating system
B) Firmware (Flash, Disk, RAID Controller, ILOM)
C) Exadata storage software
There are two types of cell patching:
a) ROLLING – No downtime required; only one cell is affected in case of an issue; it takes up to ~2 hours per cell.
b) NON-ROLLING – All cells are patched in parallel and all databases must remain shut down; it is much faster than rolling.
1) Download and stage the patch on compute node 1
2) Create a file (cell_group) listing all cell server hostnames
3) Unpack the patch and check SSH connectivity to all cells (see the sketch below)
4) Run exachk and fix any reported issues
5) Configure a blackout in Grid Control/OEM
6) Perform a backup of all compute nodes
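A minimal sketch of steps 2 and 3 above, using the cell hostnames listed later in this note; the zip file name is illustrative, and dcli -k is used to push root SSH keys to the cells:
# Build the cell_group file with one cell hostname per line
cat > ~/cell_group <<EOF
easyxdadc002cl01
easyxdadc002cl02
easyxdadc002cl03
easyxdadc002cl04
easyxdadc002cl05
easyxdadc002cl06
easyxdadc002cl07
EOF
# Unpack the staged cell patch (zip name is illustrative)
cd /u01/patch_2022/cell
unzip -q p*_Linux-x86-64.zip
# Push root SSH keys to the cells and verify connectivity
dcli -g ~/cell_group -l root -k
dcli -g ~/cell_group -l root 'hostname; uptime'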
Summary
./patchmgr -cells ~/cell_group -reset_force
./patchmgr -cells ~/cell_group -cleanup
Non-Rolling update
1) dcli -g dbs_group -l root "Grid_home/bin/crsctl stop crs"
2) dcli -g cell_group -l root "cellcli -e alter cell shutdown services all"
3) ./patchmgr -cells ~/cell_group -patch_check_prereq
4) ./patchmgr -cells ~/cell_group -patch
Rolling update
./patchmgr -cells ~/cell_group -patch_check_prereq -rolling
./patchmgr -cells ~/cell_group -patch -rolling
=========================================================================
Important locations
=========================================================================
Patch home: /u01/patch_2022/cell/patch_20.1.19.0.0.220216
Cell_group: /u01/patch_2022/GI/cell_group
----------------------------------------------------------------------------------------------------------------------------
Cell nodes: easyxdadc002cl01, easyxdadc002cl02, easyxdadc002cl03, easyxdadc002cl04, easyxdadc002cl05, easyxdadc002cl06, easyxdadc002cl07
10.33.131.229 easyxdadc002cl01.bbc.local easyxdadc002cl01
10.33.131.230 easyxdadc002cl02.bbc.local easyxdadc002cl02
10.33.131.231 easyxdadc002cl03.bbc.local easyxdadc002cl03
10.33.131.232 easyxdadc002cl04.bbc.local easyxdadc002cl04
10.33.131.233 easyxdadc002cl05.bbc.local easyxdadc002cl05
10.33.131.234 easyxdadc002cl06.bbc.local easyxdadc002cl06
10.33.131.235 easyxdadc002cl07.bbc.local easyxdadc002cl07
Cell Nodes Current image version => Pre-Patching
=========================================================================
[root@easyxdadc002db01 cell]# dcli -g /u01/patch_2022/GI/cell_group -l root "imageinfo | grep 'image version'"
easyxdadc002cl01: Active image version: 18.1.31.0.0.201013
easyxdadc002cl01: Inactive image version: 12.2.1.1.8.180818
easyxdadc002cl02: Active image version: 18.1.31.0.0.201013
easyxdadc002cl02: Inactive image version: 12.2.1.1.8.180818
easyxdadc002cl03: Active image version: 18.1.31.0.0.201013
easyxdadc002cl03: Inactive image version: 12.2.1.1.8.180818
easyxdadc002cl04: Active image version: 18.1.31.0.0.201013
easyxdadc002cl04: Inactive image version: 12.2.1.1.8.180818
easyxdadc002cl05: Active image version: 18.1.31.0.0.201013
easyxdadc002cl05: Inactive image version: 12.2.1.1.8.180818
easyxdadc002cl06: Active image version: 18.1.31.0.0.201013
easyxdadc002cl06: Inactive image version: 12.2.1.1.8.180818
easyxdadc002cl07: Active image version: 18.1.31.0.0.201013
easyxdadc002cl07: Inactive image version: 12.2.1.1.8.180818
[root@easyxdadc002db01 cell]#
CELL NODE PATCHING STEPS
=========================================================================
=> CELL NODE PATCHING PRECHECKS
=========================================================================
=> ALL HAS TO BE PERFORMED FROM DBNODE 1
=========================================================================
-----------------
PRECHECK 1
-----------------
Step 1: [root@easyxdadc002db01 ~]# cd /u01/patch_2022/cell/patch_20.1.19.0.0.220216
Step 2: [root@easyxdadc002db01 patch_20.1.19.0.0.220216]# dcli -g /u01/patch_2022/GI/cell_group -l root 'df -h /'
easyxdadc002cl01: Filesystem      Size  Used Avail Use% Mounted on
easyxdadc002cl01: /dev/md5        9.8G  4.9G  4.4G  53% /
easyxdadc002cl02: Filesystem      Size  Used Avail Use% Mounted on
easyxdadc002cl02: /dev/md5        9.8G  6.0G  3.3G  65% /
easyxdadc002cl03: Filesystem      Size  Used Avail Use% Mounted on
easyxdadc002cl03: /dev/md5        9.8G  4.8G  4.5G  52% /
easyxdadc002cl04: Filesystem      Size  Used Avail Use% Mounted on
easyxdadc002cl04: /dev/md5        9.8G  4.6G  4.7G  50% /
easyxdadc002cl05: Filesystem      Size  Used Avail Use% Mounted on
easyxdadc002cl05: /dev/md5        9.8G  4.6G  4.7G  50% /
easyxdadc002cl06: Filesystem      Size  Used Avail Use% Mounted on
easyxdadc002cl06: /dev/md5        9.8G  4.6G  4.7G  50% /
easyxdadc002cl07: Filesystem      Size  Used Avail Use% Mounted on
easyxdadc002cl07: /dev/md5        9.8G  4.6G  4.7G  50% /
[root@easyxdadc002db01 patch_20.1.19.0.0.220216]#
Imp: Make sure all cell nodes have more than 3 GB of free space on /; if not, run the cleanup and check again.
./patchmgr -cells ~/cell_group -reset_force – Oracle recommends running this the first time the storage servers are updated.
./patchmgr -cells ~/cell_group -cleanup – cleans up patch files and temporary contents on the cell servers; before cleaning up it collects all problem diagnostics for analysis.
Step 3: Disk Check
Imp: Execute the commands below; in case of any issues, check with Oracle Support and proceed further only once they are fixed.
[root@easyxdadc002db01 patch_20.1.19.0.0.220216]# dcli -g /u01/patch_2022/GI/cell_group -l root 'cellcli -e list physicaldisk where status!=normal'
[root@easyxdadc002db01 patch_20.1.19.0.0.220216]# date
Wed Oct 12 14:19:58 BST 2022
[root@easyxdadc002db01 patch_20.1.19.0.0.220216]# dcli -l root -g /u01/patch_2022/GI/cell_group "cellcli -e list physicaldisk where diskType=FlashDisk and status not = normal"
[root@easyxdadc002db01 patch_20.1.19.0.0.220216]# date
Wed Oct 12 14:20:44 BST 2022
[root@easyxdadc002db01 patch_20.1.19.0.0.220216]# dcli -g ~/cell_group -l root 'ipmitool sunoem cli "show -d properties -level all /SYS fault_state==Faulted"'
easyxdadc002cl01: Connected. Use ^D to exit.
easyxdadc002cl01: -> show -d properties -level all /SYS fault_state==Faulted
easyxdadc002cl01: show: Query found no matches.
easyxdadc002cl01: -> Session closed
easyxdadc002cl01: Disconnected
easyxdadc002cl02: Connected. Use ^D to exit.
easyxdadc002cl02: -> show -d properties -level all /SYS fault_state==Faulted
easyxdadc002cl02: show: Query found no matches.
easyxdadc002cl02: -> Session closed
easyxdadc002cl02: Disconnected
easyxdadc002cl03: Connected. Use ^D to exit.
easyxdadc002cl03: -> show -d properties -level all /SYS fault_state==Faulted
easyxdadc002cl03: show: Query found no matches.
easyxdadc002cl03: -> Session closed
easyxdadc002cl03: Disconnected
easyxdadc002cl04: Connected. Use ^D to exit.
easyxdadc002cl04: -> show -d properties -level all /SYS fault_state==Faulted
easyxdadc002cl04: show: Query found no matches.
easyxdadc002cl04: -> Session closed
easyxdadc002cl04: Disconnected
easyxdadc002cl05: Connected. Use ^D to exit.
easyxdadc002cl05: -> show -d properties -level all /SYS fault_state==Faulted
easyxdadc002cl05: show: Query found no matches.
easyxdadc002cl05: -> Session closed
easyxdadc002cl05: Disconnected
easyxdadc002cl06: Connected. Use ^D to exit.
easyxdadc002cl06: -> show -d properties -level all /SYS fault_state==Faulted
easyxdadc002cl06: show: Query found no matches.
easyxdadc002cl06: -> Session closed
easyxdadc002cl06: Disconnected
easyxdadc002cl07: Connected. Use ^D to exit.
easyxdadc002cl07: -> show -d properties -level all /SYS fault_state==Faulted
easyxdadc002cl07: show: Query found no matches.
easyxdadc002cl07: -> Session closed
easyxdadc002cl07: Disconnected
[root@easyxdadc002db01 patch_20.1.19.0.0.220216]# dcli -l root -g ~/cell_group "cellcli -e drop alerthistory all"
[root@easyxdadc002db01 patch_20.1.19.0.0.220216]# dcli -l root -g /u01/patch_2022/GI/cell_group 'cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome' | grep -vi yes
[root@easyxdadc002db01 patch_20.1.19.0.0.220216]# date
Wed Oct 12 14:23:56 BST 2022
[root@easyxdadc002db01 patch_20.1.19.0.0.220216]#
Step 4: The configurations below are generally already in place, so run only the SELECT/SHOW queries to verify them; alter values only where the checks indicate it is needed.
[root@easyxdrtw001db01 ~]#
---------------------------------------------------
ASM Check
---------------------------------------------------
From DBNODE1
.switchDB +ASM1
sqlplus / as sysasm
Verify that there is no rebalance running
select * from gv$asm_operation;
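If you prefer to script this check from the shell, a minimal sketch using a sqlplus here-document is shown below (it assumes the Grid Infrastructure environment for the +ASM1 instance is already set in the shell):
# Run as the grid owner with ORACLE_SID=+ASM1 and ORACLE_HOME set to the GI home
sqlplus -s / as sysasm <<'EOF'
set lines 200 pages 100
-- Any rows returned here mean a rebalance is still running; wait for it to finish
select inst_id, operation, state, power, est_minutes from gv$asm_operation;
EOF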
---------------------------------------------------
Rolling patch checks
---------------------------------------------------
Check ASM_POWER_LIMIT and adjust if needed
SQL> show parameter ASM_POWER_LIMIT

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_power_limit                      integer     1

Set the ASM_POWER_LIMIT parameter value to at least 4:
alter system set ASM_POWER_LIMIT = 4 scope=both;
show parameter ASM_POWER_LIMIT
------------------------------------------------------------------------------------------------------
Check 'disk_repair_time' for all mounted disk groups in the Oracle ASM instance and adjust if needed
SQL> column dg format a15
column attribute format a30
column value format a15
select dg.name dg, a.name attribute, a.value
from v$asm_diskgroup dg, v$asm_attribute a
where dg.group_number = a.group_number and a.name = 'disk_repair_time'
order by dg;

DG              ATTRIBUTE                      VALUE
--------------- ------------------------------ ---------------
DATA            disk_repair_time               8.5H
DBFS_DG         disk_repair_time               8.5H
RECO            disk_repair_time               8.5H

SQL>
Set this to 48 hours for the duration of the patching:
alter diskgroup DATA set attribute 'disk_repair_time'='48h';
alter diskgroup DBFS_DG set attribute 'disk_repair_time'='48h';
alter diskgroup DEVDATA set attribute 'disk_repair_time'='48h';
alter diskgroup DEVRECO set attribute 'disk_repair_time'='48h';
alter diskgroup RECO set attribute 'disk_repair_time'='48h';
--------------------------------------------------------------------
Check 'compatible.advm' and adjust if needed
--------------------------------------------------------------------
column dg format a15
column attribute format a30
column value format a15
select dg.name dg, a.name attribute, a.value
from v$asm_diskgroup dg, v$asm_attribute a
where dg.group_number = a.group_number and a.name = 'compatible.advm'
order by dg;

alter diskgroup <diskgroup_name> set attribute 'compatible.advm'='<clusterware-active-version>';
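The value to plug in for <clusterware-active-version> can be read with crsctl; a minimal sketch (the version string in the comment is only an example):
# Run from a compute node as the grid owner or root
$GI_HOME/bin/crsctl query crs activeversion
# Example output: Oracle Clusterware active version on the cluster is [19.0.0.0.0]
# Then, only if the diskgroup attribute is lower than required, e.g.:
# alter diskgroup DATA set attribute 'compatible.advm'='19.0.0.0.0';
# Note: compatibility attributes cannot be lowered once raised.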
Step 5 :
--------------------------------------------------------------------
ILOM Checks
--------------------------------------------------------------------
Check ILOM access to all cell ILOM(s)
Cell ILOM(s)
ssh easyxdrtw001cl01-ilom
ssh easyxdrtw001cl02-ilom
ssh easyxdrtw001cl03-ilom
ssh easyxdrtw001cl04-ilom
ssh easyxdrtw001cl05-ilom
ssh easyxdrtw001cl06-ilom
ssh easyxdrtw001cl07-ilom
DB ILOM(s)
ssh easyxdrtw001da01-ilom
ssh easyxdrtw001da02-ilom
ssh easyxdrtw001da03-ilom
ssh easyxdrtw001da04-ilom
Step 6:
-------------------------------------------------------------------------------------
Stop agents if running (Rolling)
-------------------------------------------------------------------------------------
su -l -c "/bin/emctl stop agent"
dcli -l root -g dbs_group 'su -l -c "/bin/emctl status agent"' | grep 'Agent is'
-------------------------------------------------------------------------------------
Cell Node Uptime check
-------------------------------------------------------------------------------------
Check uptime and reboot in ROLLING FASHION if needed
dcli -l root -g /u01/patch_2022/GI/cell_group uptime
Step 7: Execute the command below to run the prechecks and make sure no errors are reported; in case of any errors, check with Oracle Support and rerun the prechecks once they are fixed.
./patchmgr -cells ~/cell_group -patch_check_prereq -rolling -smtp_from "Cellnodes_Precheck" -smtp_to umesh.roy@easyreliable.com
=========================================================================
ACTUAL CELL NODE PATCHING STEPS NOW
=========================================================================
----------------------------------------------------------------------------------------------------------------------------
Stop services on local cell nodes
----------------------------------------------------------------------------------------------------------------------------
cd /u01/patch_2022/cell/patch_20.1.19.0.0.220216
Login to the cell node –
[root@easyxdrtw001cl01 ~]# cellcli -e list cell attributes rsStatus, msStatus, cellsrvStatus detail
rsStatus: running
msStatus: running
cellsrvStatus: running
[root@easyxdrtw001cl01 ~]#
cellcli -e alter cell shutdown services all
----------------------------------------------------------------------------------------------------------------------------
Cleanup space from any previous runs
----------------------------------------------------------------------------------------------------------------------------
./patchmgr -cells cell_group -reset_force
./patchmgr -cells cell_group -cleanup
Apply patch in rolling fashion
Patch Precheck
nohup ./patchmgr -cells /home/oracle/prechk/cell01_file -patch_check_prereq -rolling -smtp_from "Patching_Update_Cell01" -smtp_to support@easyreliable.com &
Patch Command
nohup ./patchmgr -cells /home/oracle/prechk/cell01_file -patch -rolling -smtp_from "Patching_Update_Cell01" -smtp_to support@easyreliable.com &
Note : Repeat the same steps for all cell nodes
----------------------------------------------------------------------------------------------------------------------------
Check patching complete with imageinfo and imagehistory.
----------------------------------------------------------------------------------------------------------------------------
dcli -l root -g /home/oracle/prechk/cell01_file imageinfo | egrep 'Active image version|Cell boot usb version'
OR
dcli -l root -g ~/cell_group imageinfo | egrep 'Active image version|Cell boot usb version'
----------------------------------------------------------------------------------------------------------------------------
Check cells are online
----------------------------------------------------------------------------------------------------------------------------
dcli -g /home/oracle/prechk/cell01_file -l root "cellcli -e list cell"
dcli -g /home/oracle/prechk/cell02_file -l root "cellcli -e list cell"
dcli -g /home/oracle/prechk/cell03_file -l root "cellcli -e list cell"
dcli -g /home/oracle/prechk/cell04_file -l root "cellcli -e list cell"
dcli -g /home/oracle/prechk/cell05_file -l root "cellcli -e list cell"
dcli -g /home/oracle/prechk/cell06_file -l root "cellcli -e list cell"
dcli -g /home/oracle/prechk/cell07_file -l root "cellcli -e list cell"
---------------------------------------------------------------------------------------------------------------
Verify Status POST PATCHING of CELL NODES
--------------------------------------------------------------------------------------------------------------
dcli -l root -g /home/oracle/prechk/cell01_file service celld status
dcli -l root -g /home/oracle/prechk/cell01_file "cellcli -e list griddisk attributes name,status"
dcli -l root -g /home/oracle/prechk/cell01_file "cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome"
dcli -l root -g /home/oracle/prechk/cell01_file "cellcli -e alter griddisk all active"
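In a rolling patch it is important not to move to the next cell until every grid disk on the cell just patched reports asmmodestatus=ONLINE. A minimal sketch of a wait loop (the group file name and sleep interval are illustrative):
# Poll the cell until no grid disk remains in SYNCING/OFFLINE state
while dcli -l root -g /home/oracle/prechk/cell01_file \
    "cellcli -e list griddisk attributes name,asmmodestatus" | grep -qiv online
do
    echo "$(date): grid disks still resyncing, waiting..."
    sleep 60
done
echo "All grid disks report asmmodestatus=ONLINE"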
----------------------------------------------------------------------------------------------------------------------------
./patchmgr -cells /home/oracle/prechk/cell01_file -cleanup
./patchmgr -cells /home/oracle/prechk/cell02_file -cleanup
./patchmgr -cells /home/oracle/prechk/cell03_file -cleanup
./patchmgr -cells /home/oracle/prechk/cell04_file -cleanup
./patchmgr -cells /home/oracle/prechk/cell05_file -cleanup
./patchmgr -cells /home/oracle/prechk/cell06_file -cleanup
./patchmgr -cells /home/oracle/prechk/cell07_file -cleanup
----------------------------------------------------------------------------------------------------------------------------
Reboot the cell nodes once to ensure the ILOM patches are applied.
----------------------------------------------------------------------------------------------------------------------------
dcli -g /home/oracle/prechk/cell01_file -l root "shutdown -r now"
dcli -g /home/oracle/prechk/cell02_file -l root "shutdown -r now"
dcli -g /home/oracle/prechk/cell03_file -l root "shutdown -r now"
dcli -g /home/oracle/prechk/cell04_file -l root "shutdown -r now"
dcli -g /home/oracle/prechk/cell05_file -l root "shutdown -r now"
dcli -g /home/oracle/prechk/cell06_file -l root "shutdown -r now"
dcli -g /home/oracle/prechk/cell07_file -l root "shutdown -r now"
Again wait for cells to come back online. Typically 10 minutes.
[root@easyxdadc002cl01 ~]# imageinfo
Kernel version: 4.1.12-124.42.4.el6uek.x86_64 #2 SMP Thu Sep 3 16:03:23 PDT 2020 x86_64
Cell version: OSS_18.1.31.0.0_LINUX.X64_201013
Cell rpm version: cell-18.1.31.0.0_LINUX.X64_201013-1.x86_64
Active image version: 18.1.31.0.0.201013
Active image kernel version: 4.1.12-124.42.4.el6uek
Active image activated: 2021-01-30 02:40:24 +0000
Active image status: success
Active system partition on device: /dev/md5
Active software partition on device: /dev/md7
Cell boot usb partition: /dev/sdm1
Cell boot usb version: 18.1.31.0.0.201013
Inactive image version: 12.2.1.1.8.180818
Inactive image activated: 2019-05-18 09:32:18 +0100
Inactive image status: success
Inactive system partition on device: /dev/md6
Inactive software partition on device: /dev/md8
Inactive marker for the rollback: /boot/I_am_hd_boot.inactive
Inactive grub config for the rollback: /boot/grub/grub.conf.inactive
Inactive usb grub config for the rollback: /boot/grub/grub.conf.usb.inactive
Inactive kernel version for the rollback: 4.1.12-94.8.4.el6uek.x86_64
Rollback to the inactive partitions: Possible
[root@easyxdadc002cl01 ~]#
[root@easyxdrtw001cl01 ~]# imageinfo
Kernel version: 4.14.35-1902.306.2.14.el7uek.x86_64 #2 SMP Fri Jan 28 09:46:24 PST 2022 x86_64
Cell version: OSS_20.1.19.0.0_LINUX.X64_220216
Cell rpm version: cell-20.1.19.0.0_LINUX.X64_220216-1.x86_64
Active image version: 20.1.19.0.0.220216
Active image kernel version: 4.14.35-1902.306.2.14.el7uek
Active image activated: 2022-08-12 22:34:41 +0100
Active image status: success
Active node type: STORAGE
Active system partition on device: /dev/md5
Active software partition on device: /dev/md7
Cell boot usb partition: /dev/sdm1
Cell boot usb version: 20.1.19.0.0.220216
Inactive image version: 18.1.31.0.0.201013
Inactive image activated: 2020-11-28 21:07:26 +0000
Inactive image status: success
Inactive node type: STORAGE
Inactive system partition on device: /dev/md6
Inactive software partition on device: /dev/md8
Inactive marker for the rollback: /boot/I_am_hd_boot.inactive
Inactive grub config for the rollback: /boot/grub2/grub.cfg.inactive
Inactive usb grub config for the rollback: /boot/grub2/grub.cfg.usb.inactive
Inactive kernel version for the rollback: 4.1.12-124.42.4.el6uek.x86_64
Rollback to the inactive partitions: Possible
[root@easyxdrtw001cl01 ~]#
[root@easyxdrtw001cl01 ~]# imagehistory
Version                              : 12.2.1.1.0.170126.2
Image activation date                : 2017-03-22 23:21:56 +0000
Imaging mode                         : fresh
Imaging status                       : success

Version                              : 12.2.1.1.3.171017
Image activation date                : 2018-10-27 08:46:11 +0100
Imaging mode                         : out of partition upgrade
Imaging status                       : success

Version                              : 12.2.1.1.8.180818
Image activation date                : 2018-12-07 23:35:24 +0000
Imaging mode                         : out of partition upgrade
Imaging status                       : success

Version                              : 18.1.31.0.0.201013
Image activation date                : 2020-11-28 21:07:26 +0000
Imaging mode                         : out of partition upgrade
Imaging status                       : success

Version                              : 20.1.19.0.0.220216
Image activation date                : 2022-08-12 22:34:41 +0100
Imaging mode                         : out of partition upgrade
Imaging status                       : success

[root@easyxdrtw001cl01 ~]#
Infiniband Switch Patch
There are two types of switches: one spine switch and two leaf switches. Switch firmware is upgraded in a rolling manner; if a spine switch is present in the rack, the spine switch is upgraded first.
Summary
1) Run the pre-check
./patchmgr -ibswitches ibswitches.lst -upgrade -ibswitch_precheck
2) Run the upgrade
./patchmgr -ibswitches ibswitches.lst -upgrade
3) Verify the new switch firmware version
Real-Time Example
1] All actions need to be performed as the root user.
2] The InfiniBand switch upgrade is a 100% online activity.
3] Apply the InfiniBand switch patch from compute node 01.
Log in to Exadata compute node 1 (easyexdadc002da01.bbc.local) as the root user and navigate to the Exadata storage software staging area.
Patch Location == /u01/patch_2022/IB/patch_switch_20.1.19.0.0.220311
Check the current version of the IB switches
dcli -g /u01/patch_2022/GI/ibswitch_group -l root version | grep "version"
Steps to apply the Patch
Login to easyexdrtw001db01.bbc.local as root user
1. Execute the command below to precheck the IB switches. Cross-check that no errors are reported; proceed to Step 2 only if there are none.
#./patchmgr -ibswitches /u01/patch_2022/GI/ibswitch_group -upgrade -ibswitch_precheck -smtp_from "IBSwitch_Precheck" -smtp_to support@easyreliable.com
2. Execute the command below to upgrade the IB switches.
# nohup ./patchmgr -ibswitches /u01/conf_files/ibswitch_group -upgrade -smtp_from "IBSwitch_Patch_Update" -smtp_to support@easyreliable.com &
3. Tail nohup.out and monitor the progress.
4. Check version of each IB Switch after patch
dcli -g /u01/conf_files/ibswitch_group -l root version | grep "version"
Rollback Steps:
- Manually download the InfiniBand switch firmware package to the patch directory
- Set the environment variable "EXADATA_IMAGE_IBSWITCH_ROLLBACK_VERSION" to the appropriate version
- Run the patchmgr command to initiate the rollback (a hedged sketch follows below)
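A minimal sketch of those rollback steps, assuming patchmgr's -downgrade option for InfiniBand switches; the exported version string is illustrative, so confirm both against the patch README before use:
cd /u01/patch_2022/IB/patch_switch_20.1.19.0.0.220311
# The older switch firmware package must already be staged in this directory
export EXADATA_IMAGE_IBSWITCH_ROLLBACK_VERSION=2.2.13_2    # illustrative version
./patchmgr -ibswitches /u01/patch_2022/GI/ibswitch_group -downgrade -ibswitch_precheck
./patchmgr -ibswitches /u01/patch_2022/GI/ibswitch_group -downgrade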
DB Nodes/Compute Nodes Patch Steps
As part of patching a compute node, the following components are updated:
A) Oracle Linux operating system
B) Firmware (Flash, Disk, RAID Controller, ILOM)
C) Exadata software
There are two methods for patching.
There is no need to unzip the patch software; we only need to provide the path to the patch zip.
Old Method
./dbnodeupdate.sh -u -l p20746761_121211_linux-x86-64_new.zip – needs to be run on each compute node individually.
New Method (starting with 12.2.1.1.0)
./patchmgr -dbnodes /home/oracle/dbs_group -upgrade -iso_repo p25463013_12211-_Linux-x86-64.zip -target_version 12.2.1.1.0.170126.2 – can be run against multiple nodes in parallel, which was not possible with the old method.
A Powerful Utility: dbnodeupdate.sh
1) Validates the provided media (zip, http)
2) Validates user input
3) Creates a log file to track script execution and changes
4) Creates a diag file capturing the 'before patching' state
5) Creates and runs the backup utility
6) Checks space requirements of the /boot filesystem
7) Includes a 'check-only' option (see the sketch after this list)
8) Relinks all Database and Grid Infrastructure (GI) homes
9) Enables/disables GI stop/start
10) Provides a rollback option
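A minimal sketch of the 'check-only' style run mentioned in item 7, mirroring the precheck used later in this note (the -v flag runs verification only; the zip path and target version are the ones from this environment and should be adjusted):
cd /u01/patch_2022/dbnodeupdate
./dbnodeupdate.sh -u -l /u01/patch_2022/dbpatch/dbserver_patch_220723/p33757259_201000_Linux-x86-64.zip -t 20.1.19.0.0.220216 -a -v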
We can patch compute nodes in a rolling or non-rolling fashion.
Behavior of NON-ROLLING upgrades:
If a node fails at the pre-check stage, the whole process fails.
If a node fails at the patch or reboot stage, patchmgr skips the remaining steps for that node and the upgrade continues for the other nodes. The pre-check stage is done serially; the patch/reboot and completion stages are done in parallel.
Summary for Patching Compute Nodes
1) Run the pre-check
2) Fix all reported errors
3) Run the upgrade using the patchmgr utility
4) Check the image version
5) Reinstall any RPMs required for 3rd-party products
REAL TIME EXAMPLE
dcli -l root -g /root/dbs_group imageinfo -version
dcli -l root -g /root/dbs_group imageinfo -status
dcli -l root -g /root/dbs_group uname -r
Ensure backups are completed
Comment out cron jobs
Stop the OEM agents (see the sketch below)
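A minimal sketch of the cron and OEM agent preparation on each DB node; the agent home path is an assumption and should be replaced with the actual agent installation path:
# Save root's crontab and then empty it so no jobs fire during patching
crontab -l > /root/crontab.backup.$(date +%Y%m%d)
crontab -r
# Stop the OEM agent (agent home path is illustrative)
su - oracle -c "/u01/app/oracle/agent/agent_inst/bin/emctl stop agent"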
ACTION ON NODE BEING PATCHED
Bring down dbs gracefully
------------------------
script to bring down dbs
On each DB Node:
----------------
$GI_HOME/bin/crsctl status crs
************* UNMOUNT the NFS mounts before rebooting, otherwise the reboot takes a long time ******************************
comment fstab entry for NFS mount points
df -h
umount -a -t nfs4,smbfs,nfs,cifs -f -l
df -h
uptime
reboot
dbmcli -e list alerthistory
$GI_HOME/bin/crsctl disable crs
$GI_HOME/bin/crsctl stop crs
/u01/app/12.2.0.1/grid/bin
$GI_HOME/bin/crsctl check crs | grep online | wc -l | while read retval; do if [[ $retval -eq 0 ]]; then echo CRS Stopped; elif [[ $retval -eq 4 ]]; then echo CRS Running; else echo CRS Not Ready; fi; done;
uptime
Reset ILOM (SP):
-----------------
as root from node 1
#ipmitool bmc reset cold
Sent cold reset command to MC <<<<< output
#
Precheck:
---------
cd /u01/patch_2022/dbnodeupdate
./dbnodeupdate.sh -u -l /u01/patch_2022/dbpatch/dbserver_patch_220723/p33757259_201000_Linux-x86-64.zip -t 20.1.19.0.0.220216 -a -v
Say Y and proceed.
It did not report any home issues ...
Precheck - Skip GI & DB Homes Validation:
-----------------------------------------
cd /u01/patch_2022/dbnodeupdate
./dbnodeupdate.sh -u -l /u01/patch_2022/dbpatch/dbserver_patch_220723/p33757259_201000_Linux-x86-64.zip -t 20.1.19.0.0.220216 -S -a -v
Check inodes to make sure Backup will success:
----------------------------------------------
inodes=$(df -i -P / | awk 'END{print $3}'); if [ $inodes -gt 500000 ] && [ $inodes -le 1000000 ]; then echo -e "\nWARN... $inodes files\n"; elif [ $inodes -gt 1000000 ]; then echo -e "\nFAIL... $inodes files\n"; else echo -e "\nPASS... $inodes files\n"; fi
Backup Active LVM Sys1 to Inactive LVM Sys2:
--------------------------------------------
cd /u01/patch_2022/dbnodeupdate
./dbnodeupdate.sh -b -s -a
Remove Custom RPMs: (identify the custom RPMs from the precheck output above and remove them)
------------------
UNMOUNTING THE NFS
df -h
umount -a -t nfs4,smbfs,nfs,cifs -f -l
df -h
comment out the NFS entries in fstab and mtab
Precheck again after removing Custom RPMs:
------------------------------------------
cd /u01/patch_2022/dbnodeupdate
./dbnodeupdate.sh -u -l /u01/patch_2022/dbpatch/dbserver_patch_220723/p33757259_201000_Linux-x86-64.zip -t 20.1.19.0.0.220216 -S -a -v
Perform Upgrade:
----------------
cd /u01/patch_2022/dbnodeupdate
nohup ./dbnodeupdate.sh -u -l /u01/patch_2022/dbpatch/dbserver_patch_220723/p33757259_201000_Linux-x86-64.zip -t 20.1.19.0.0.220216 -S -a -n -q &
cd /u01/patch_2022/dbnodeupdate
nohup ./dbnodeupdate.sh -u -l /u01/patch_2022/dbpatch/dbserver_patch_220723/p33757259_201000_Linux-x86-64.zip -t 20.1.19.0.0.220216 -S -a -n -q -w &
Checks After node coming up:
-----------------------------
#/opt/oracle.cellos/CheckHWnFWProfile
[SUCCESS] The hardware and ****
Check with imageinfo:
dcli -l root -g /root/dbs_group imageinfo -version
df -h
umount -a -t nfs4,smbfs,nfs,cifs -f -l
*** check whether these filesystems should be mounted again at this point
cd /u01/patch_2022/dbnodeupdate
./dbnodeupdate.sh -t 20.1.19.0.0.220216 -a -c -q
yum list installed | grep fuse
$GI_HOME/bin/crsctl check crs | grep online | wc -l | while read retval; do if [[ $retval -eq 0 ]]; then echo CRS Stopped; elif [[ $retval -eq 4 ]]; then echo CRS Running; else echo CRS Not Ready; fi; done;
$GI_HOME/bin/crsctl enable crs (the dbnodeupdate.sh completion step should enable and start CRS)
Start the databases (see the sketch below)
Start the applications
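A minimal sketch of restarting the databases once CRS is enabled and running, assuming the databases are registered with Clusterware; ORCL is a placeholder name:
# As the oracle user on the patched node
srvctl start database -d ORCL        # repeat for each database
srvctl status database -d ORCL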
************
Exadata Database server patching (GI+DB)
Note – patch details
Earlier, Oracle used to release patches as:
1) BP (Bundle Patch)
2) PSU (Patch Set Update)
3) SPU (Security Patch Update)
RU & RUR
RU: Release Update. It is comparable to the old BP.
1) RUs are released on a quarterly basis: January, April, July and October.
2) An RU contains optimizer fixes + functional fixes + regression fixes + security fixes.
RUR: Release Update Revision. It is comparable to the old PSU.
1) RURs are released on a quarterly basis: January, April, July and October.
2) An RUR contains regression fixes + security fixes.
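The quarterly GI+DB RU itself is typically applied with opatchauto from the unzipped patch location. A minimal sketch, with placeholder patch and home paths (adjust to the actual RU, GI home and database home in use):
# As root on each DB node, with OPatch from the GI home in the PATH (paths are placeholders)
export PATH=$PATH:<GI_HOME>/OPatch
opatchauto apply /u01/patch_2022/GIRU/<RU_patch_number> -oh <GI_HOME>
opatchauto apply /u01/patch_2022/GIRU/<RU_patch_number> -oh <DB_HOME>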