If you are ever tasked with recovering a CUCM IM & Presence server (the newer name for CUP), perhaps this post will help you.
Disclaimer: The following recovery process worked for me (and I have a number of successful recoveries under my belt). While every effort has been made to provide my readers with accurate information, please use your discretion before making any decisions based on the contents of this post. You may want to validate some or all of the steps with a Cisco TAC engineer.
Step 1: Preserve your existing backups! If you have DRF backups in place, copy the backup files to a safe place. Why? Existing backup copies can be overwritten by newer backup jobs, for example if the restore takes longer than expected and the Backup Device in DRS was configured to keep only a couple of the most recent backups.
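If your backup target is an SFTP server you control, the copy can be scripted. Below is a minimal Python sketch, assuming the DRS backup files sit flat in one directory on that server; the function names and the paths in the usage note are hypothetical, so adjust them to your environment. It copies every file into a new timestamped folder and verifies each copy with a checksum.

```python
import hashlib
import shutil
import time
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def preserve_backups(backup_dir: str, safe_root: str) -> Path:
    """Copy every file in backup_dir into a new timestamped folder under safe_root.

    Assumption: the backup files sit directly in backup_dir; subdirectories
    are skipped. Raises an error if a copy does not match its source.
    """
    src = Path(backup_dir)
    dest = Path(safe_root) / time.strftime("drs-snapshot-%Y%m%d-%H%M%S")
    # exist_ok=False: never write into (and overwrite) an earlier snapshot
    dest.mkdir(parents=True, exist_ok=False)
    for item in sorted(src.iterdir()):
        if not item.is_file():
            continue
        target = dest / item.name
        shutil.copy2(item, target)  # copy2 also preserves file timestamps
        if sha256_of(item) != sha256_of(target):
            raise IOError(f"checksum mismatch for {item.name}")
    return dest
```

Usage would look like `preserve_backups("/srv/sftp/cucm-backups", "/srv/safe")`, where both paths are placeholders. Writing to a fresh timestamped folder each time means a later run cannot clobber the copy you made before starting the restore.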
Step 2: In CUCM, unassign all users from the existing Presence Group.
Step 3: Delete the Presence Group, then delete the failed CUP server under System -> Server in CUCM.
Step 4: Add the CUP node back to CUCM with the same name under System -> Server. A default Presence Group is created and the CUP node is added to it – that’s fine.
Step 5: Proceed with a fresh install of the CUP node. Note: the version must exactly match that of the failed node.
Hint: all CUP ISOs available for download on CCO are bootable, so you do not need any tricks to turn a non-bootable ISO into a bootable one.
Step 6: Proceed with DRS recovery. Now, this is important: you must perform a full cluster recovery (restore both CUP and CUCM) from your backup. Why? Since the CUP node was deleted and re-added in steps 3 and 4 above, the CUP server now has a new PKID in the CUCM database. If you recover only the CUP node without recovering the CUCM database, the restored node will carry its old PKID, which no longer matches the new PKID recorded in CUCM. As a result, certain services will not start on CUP and you will see the following error in CUP: “The IM&P Publisher node was deleted from the CUCM server list. This node needs to be reinstalled.”
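To make the PKID reasoning concrete, here is a toy Python model. It is purely illustrative: the dictionary standing in for the CUCM database, the node name imp-pub, and the helper functions are all made up for this sketch and say nothing about the real schema.

```python
import uuid


def add_server(cucm_db: dict, name: str) -> str:
    """Toy stand-in for adding a node under System -> Server: assigns a fresh PKID."""
    pkid = str(uuid.uuid4())
    cucm_db[name] = pkid
    return pkid


def pkids_match(cucm_db: dict, name: str, node_pkid: str) -> bool:
    """True when the PKID the node carries equals the one recorded in CUCM."""
    return cucm_db.get(name) == node_pkid


# Original cluster, captured in a backup (hypothetical node name).
cucm_db = {}
old_pkid = add_server(cucm_db, "imp-pub")
backup = {"cucm_db": dict(cucm_db), "imp_pkid": old_pkid}

# Steps 3 and 4: delete the node and re-add it, which assigns a new PKID.
del cucm_db["imp-pub"]
add_server(cucm_db, "imp-pub")

# Restoring only the IM&P node brings back the old PKID: mismatch.
imp_only_ok = pkids_match(cucm_db, "imp-pub", backup["imp_pkid"])

# Restoring the whole cluster also rolls the CUCM database back: match.
full_restore_ok = pkids_match(backup["cucm_db"], "imp-pub", backup["imp_pkid"])

print(imp_only_ok)      # False
print(full_restore_ok)  # True
```

The point of the model is only that both sides of the comparison have to come from the same backup.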
Step 7: Once the restore process completes, restart CUCM Pub first (utils system restart), wait for it to come up, then restart CUCM Sub and CUP Pub.
Step 8: Perform typical health checks of your CUCM and CUP nodes:
- utils dbreplication status, followed by utils dbreplication runtimestate on your CUCM Pub to verify database replication between Pub and Sub nodes;
- Launch RTMT, connect to CUCM Pub and review the alarms;
- Run diagnostics on CUP Pub (Diagnostics -> System Troubleshooter).
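When reading the utils dbreplication runtimestate output, the number to look for is the replication setup state of each node. The Python sketch below is a small helper for interpreting those numbers; it assumes the state codes as they are commonly documented by Cisco (2 means replication is good), so verify them against the documentation for your version. The node names are hypothetical, and the states are typed in by hand rather than parsed from the CLI output.

```python
from typing import Dict

# Replication setup states as commonly documented for CUCM (assumption:
# confirm against the documentation for your release).
REPLICATION_STATES = {
    0: "initializing",
    1: "number of replicates is incorrect",
    2: "good",
    3: "tables are suspect or mismatched",
    4: "setup failed or dropped",
}


def unhealthy_nodes(states: Dict[str, int]) -> Dict[str, str]:
    """Return every node whose replication state is not 2, with a short description."""
    return {
        node: REPLICATION_STATES.get(code, "unknown state")
        for node, code in states.items()
        if code != 2
    }


# States read off the runtimestate table by hand (hypothetical node names).
observed = {"cucm-pub": 2, "cucm-sub": 3}
print(unhealthy_nodes(observed))
# {'cucm-sub': 'tables are suspect or mismatched'}
```

An empty result means every node you entered reports state 2.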
That’s it! Hope this helps someone.