Jabber for Windows: “Cannot communicate with the server”

Hello folks!

I have been putting this post off for a while, hoping to find a resolution to an issue that I’ve been working on for over a month now. This is a fairly unusual case that many Cisco customers will never run into, but there is a chance that others are hitting the same defect. Below is a quick overview of the environment, a description of the actual problem, and the current workarounds, as discovered through independent troubleshooting and through Cisco TAC.

Overview/Conditions:

  • The client is a multinational company with a presence in some very remote locations. Internal communication between sites runs over MPLS with highly heterogeneous connectivity (a whole variety of fiber, copper, microwave and satellite links).
  • With such a widely distributed environment and varying network latencies, there is little opportunity to centralize call processing into a few regional clusters. Hence, the client has a number of CUCM clusters, some of them in very close geographic proximity to one another. This is especially true for one region where the only form of communication is a high-latency satellite connection.
  • Cisco Jabber is used throughout the organization, so each CUCM cluster would have a CUP server (or two) to support IM & Presence capabilities. The ILS is used for Inter-cluster lookup and a centralized UDS for user directory. (BTW, if anyone is interested in seeing a separate post on an end-to-end configuration of a multi-cluster environment (with MRA!) to support Jabber – please drop me a line in the comments section).
  • The majority of Cisco Jabber clients are running version 11.0 or above, and all CUCM clusters were recently upgraded to version 11.0.1. Before that upgrade, the client was running CUCM version 10.5.2.

Problem Description:

  • The issue affects users located in remote areas where traffic between the site and the rest of the corporate network runs over a high-latency satellite link. Locally, Cisco Jabber users connect to their home clusters just fine. When a user with a working Cisco Jabber travels to another remote location and tries to connect, the client shows the all-too-common “Cannot communicate with the server” error.
  • It has been observed that the maximum latency Jabber tolerates between the client and the user’s Home Cluster is somewhere between 600 and 700 ms (round-trip delay). With a latency of 1000 ms or more, Jabber fails to connect and shows the “Cannot communicate with the server” error described above.
  • The PRT may show the following errors:
    • 2016-06-21 07:18:26,226 INFO  [0x000018ac] [ls\src\http\BasicHttpClientImpl.cpp(448)] [csf.httpclient] [csf::http::executeImpl] - *-----* HTTP response code 0 for request #21 to https://cucm.example.com:6972/CSFdevice.cnf.xml
    • 2016-06-21 07:18:26,345 WARN  [0x000018ac] [mpl\ucm-config\tftp\TftpFileSet.cpp(113)] [csf.config] [csf::ucm90::TftpFileSet::fetchInitialTftpFile] - Failed to connect to Tftp server : result : UNKNOWN_ERROR
      Note how the above reveals that the Jabber client is requesting its configuration file from TFTP on port 6972 rather than 6970. This change was introduced with CUCM version 11.x and Jabber 11.x.
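
If you have a pile of PRT logs to go through, a small script can pick out the failing requests for you. The following is only a rough Python sketch, assuming the log lines look like the ones quoted above; the function name find_failed_config_requests is my own, and treating "HTTP response code 0" as "no HTTP response came back" is my reading of the log rather than something documented by Cisco.

```python
import re
from urllib.parse import urlsplit

# Matches the csf.httpclient result line quoted above, e.g.
# "... HTTP response code 0 for request #21 to https://host:6972/file.xml"
HTTP_RESULT_RE = re.compile(
    r"HTTP response code (\d+) for request #(\d+) to (\S+)"
)


def find_failed_config_requests(log_lines):
    """Return one dict per request that finished with response code 0.

    Assumption: code 0 means the request never got an HTTP response.
    """
    failures = []
    for line in log_lines:
        match = HTTP_RESULT_RE.search(line)
        if match is None:
            continue
        if int(match.group(1)) != 0:
            continue
        url = match.group(3)
        failures.append(
            {
                "request": int(match.group(2)),
                "url": url,
                # Port from the URL; 6972 vs 6970 is the detail of interest.
                "port": urlsplit(url).port,
            }
        )
    return failures
```

Feed it the lines of a log file from the PRT (for example, list(open(path, encoding="utf-8", errors="replace"))) and look at the "port" value of each entry to see which TFTP port the client was trying.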

Problem Resolution/Workaround:

Currently, there is no solution to resolve this issue, but as always with Cisco, there are workarounds:

  • Downgrade affected Cisco Jabber clients to any 10.x version (e.g. the latest 10.6 build currently offered on CCO is 10.6(7)). Prior to version 11.x, Jabber used port 6970 to grab the configuration file from the TFTP server. CUCM 11.0 is backward compatible with older versions of Cisco Jabber and allows them to connect on that port. Don’t ask me how a different port for the same service (TFTP) could alter Cisco Jabber’s behaviour, but this workaround actually works.
  • If users who experience the problem do not care about phone services and just want IM & Presence functionality, provide instructions on how to connect Cisco Jabber to the CUP server manually (in Cisco Jabber for Windows, click “Advanced Settings”, choose “Cisco IM & Presence” for Account Type, select “Use the following server” for Login Server and type the FQDN of the home CUP server).
    Note: since the Jabber client is not connecting to CUCM’s TFTP to grab its config files, any customizations specified in the jabber-config.xml file will not apply.
  • Downgrade your CUCM environment to 10.5.2 (I wouldn’t).
  • Upgrade your CUCM environment to version 11.5 (apparently, it has just become available for download on CCO).
    Note that although this last option was suggested by Cisco TAC, it has yet to be verified by yours truly.
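
To work out which sites actually need one of these workarounds, it helps to know roughly where each one sits relative to the latencies mentioned earlier. Below is a minimal Python sketch, not a Cisco tool: it times a TCP handshake as a rough stand-in for round-trip delay and sorts the result into three buckets. The function names and the cut-off constants are my own, and the 600 ms and 1000 ms figures are simply the observations from this post, not documented limits.

```python
import socket
import time

# Observations from this post, not official Cisco limits:
# the tolerated maximum was somewhere between 600 and 700 ms,
# and 1000 ms or more failed.
WORKS_BELOW_MS = 600.0
FAILS_FROM_MS = 1000.0


def classify_rtt(rtt_ms):
    """Sort a round-trip time (in ms) into one of three rough buckets."""
    if rtt_ms < WORKS_BELOW_MS:
        return "expected to connect"
    if rtt_ms >= FAILS_FROM_MS:
        return "expected to fail"
    return "borderline"


def tcp_connect_ms(host, port, timeout=5.0):
    """Time a TCP handshake to host:port and return it in milliseconds.

    A handshake takes about one round trip, so this is only an
    approximation of the delay Jabber sees. Raises OSError if the
    connection cannot be made within the timeout.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0
```

From a machine at the remote site, something like classify_rtt(tcp_connect_ms("cucm.example.com", 6972)) gives a first indication; substitute the FQDN of the user’s home cluster for the example host.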

This post will be updated once there is a formal resolution. I would also expect Cisco TAC to file the bug in its Bug Tracker. When they do, I will publish an update with a link to the bug ID.

Hope this helps someone.

Issues with TMSPE after upgrading to TMS version 15.2.1

OK, folks, so you want to keep your Cisco UC systems up-to-date and have decided to upgrade your TelePresence Management Suite (and its extensions) to the latest-and-greatest. You’ve done your due diligence and followed the Install and Upgrade guides and Release Notes for all systems that could potentially be affected (TMS, TMSPE, TMSXE, VCS, etc.) to ensure you cover all your bases with regard to inter-dependencies (there are plenty). However, after the upgrade, you notice a few new alarms on the VCS and TMS. The errors may look something like the following:

On TMS:

"((-1) Importer Error : TypeError('__init__() takes exactly 4 arguments (2 given)',))"

On VCS Control:

"The VCS is unable to communicate with the TMS Provisioning Extension services. Phone book service failures can also occur if TMS does not have any users provisioned against this cluster."

In the event log on the VCS, you would see some additional details:

"...provisioning: Level="ERROR" Detail="Import from TMS Provisioning Extension services failed" Service="device" Status="{"reason": "Importer Error : TypeError('__init__() takes exactly 4 arguments (2 given)',)", "reason_code": -1, "detail": "Traceback (most recent call last):\n File \"/share/python/site-packages/ni/externalmanagerinterface/control/importcontrol.py\", line 766, in run\n File \"/share/python/site-packages/ni/utils/web/restclient.py\", line 345, in send_get\n File \"/share/python/site-packages/ni/utils/web/restclient.py\", line 308, in send_request\n File \"/share/python/site-packages/ni/utils/web/restclient.py\", line 320, in http_request\n File \"/share/python/site-packages/ni/utils/web/httplib2ssl.py\", line 399, in request\n File \"/lib64/python2.7/site-packages/httplib2/__init__.py\", line 1608, in request\n File \"/lib64/python2.7/site-packages/httplib2/__init__.py\", line 1359, in _request\n File \"/lib64/python2.7/site-packages/httplib2/__init__.py\", line 1247, in _auth_from_challenge\n File \"/lib64/python2.7/site-packages/httplib2/__init__.py\", line 523, in __init__\nTypeError: __init__() takes exactly 4 arguments (2 given)\n", "success": false, "error": "InternalServerError"}"
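
If you want to check a batch of VCS event log entries for this particular failure, the Status field can be pulled apart with a few lines of Python. This is just a sketch under the assumption that the entry is formatted like the one above, with a JSON object inside Status="..."; the function names are mine, and matching on "_auth_from_challenge" in the traceback is my own heuristic for recognising this error.

```python
import json
import re

# Grab everything between Status="{ and the last }" in the entry.
STATUS_RE = re.compile(r'Status="(\{.*\})"', re.DOTALL)


def parse_status(event_entry):
    """Return the Status JSON of an event log entry as a dict, or None."""
    match = STATUS_RE.search(event_entry)
    if match is None:
        return None
    try:
        return json.loads(match.group(1))
    except ValueError:
        # The Status payload was not valid JSON.
        return None


def matches_tmspe_import_signature(event_entry):
    """Heuristic: True if the entry looks like the import failure above."""
    status = parse_status(event_entry)
    if not status:
        return False
    reason = status.get("reason", "")
    detail = status.get("detail", "")
    return "TypeError" in reason and "_auth_from_challenge" in detail
```

Run matches_tmspe_import_signature() over each entry; a True result means the entry carries the same TypeError raised while httplib2 was handling an authentication challenge.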

What gives? Well, apparently, the way TMSPE authenticates with TMS has changed in the newest version of the suite. Navigate to your TMS server and open IIS Manager. Expand Sites -> Default Web Site, click on ‘tmsagent’, then select ‘Authentication’. Ensure that ‘Digest Authentication’ is disabled.

Figure: TMS Agent settings in IIS

If it is enabled, disable it and then restart your web server (iisreset /noforce). Next, verify that the Provisioning Extension is operating successfully (you may need to restart the TMS Provisioning Extension service).

Hope this helps someone.