TheGeekery

The Usual Tech Ramblings

Remote Registry, handle leaks, and objects not found

Earlier today, while updating some documentation, I noticed two of the servers being monitored in SolarWinds SAM were reporting applications in an “unknown” state. When I pulled up the display and looked at the details of the state, it was throwing an error:

Bad input parameter. HResult: The specified object is not found on the system.

I thought this was a little weird, as the monitors used to work, and the server hadn’t been patched or had any changes made recently.

First step was verifying that the counters it was looking for really existed. Logging onto the server, I opened Performance Monitor and tracked down the supposedly missing performance counters. They were there. Maybe it was an issue with accessing them remotely, so I jumped on the SolarWinds server and tested remote counter access from there, repeating the same process but specifying the remote server name. Again, no issue.
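
If you’d rather script that check than click through Performance Monitor, Get-Counter can query the same counters remotely. A quick sketch; the server and counter names here are placeholders, not the actual ones from the monitor:

# Query a performance counter on a remote box; this goes through the same
# remote registry/PDH plumbing the monitoring tools use, so it's a fair test.
Get-Counter -Counter '\Process(_Total)\Handle Count' -ComputerName SERVER01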

This is when I decided to do some searching, and stumbled across a Thwack post that mentioned the same error, but related to Exchange. They had basically done the same testing as I had, but were urged to open a support ticket with SolarWinds support for more troubleshooting help.

The last post in that thread, before mine, was the answer I was looking for. They were experiencing an issue with the Remote Registry service, and a simple restart fixed it. I took a look at the services on both of the servers I was seeing issues with, and the problem jumped out immediately: on both servers the Remote Registry service was consuming about 600MB of memory and 50k handles, which is very unusual for that service. This tripped a light bulb, as I had been working on an issue with a coworker, and he had identified a bug and hotfix for memory leaks in the Remote Registry service. KB2699780 details the same behavior, and we were already scheduled to deploy this hotfix on a different set of servers for a similar issue.

A quick restart of the Remote Registry service had the applications successfully polling again; now to just schedule some maintenance to get the hotfix applied to these servers.
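
For reference, a rough sketch of checking the service’s appetite and bouncing it remotely. The server name is a placeholder, and note Remote Registry shares a svchost.exe with other services on some systems, so the handle count may not be the service’s alone:

Invoke-Command -ComputerName SERVER01 -ScriptBlock {
    # Find the process hosting the Remote Registry service
    $svc = Get-WmiObject Win32_Service -Filter "Name='RemoteRegistry'"
    # Report its handle count and memory usage
    Get-Process -Id $svc.ProcessId | Select-Object Handles, WorkingSet
    # Restart the service to clear the leak until the hotfix is applied
    Restart-Service -Name RemoteRegistry
}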

Exchange 2007/2010 and Hiding Mailboxes

Like any large organization, we have automated processes that go and happily disable user accounts on termination. This process looks in our HR database for certain flags, and reacts accordingly. As part of the termination/disabling process, it’ll also flag their email account to be hidden from the Exchange Global Address List (GAL).

In Exchange 2003, hiding accounts from the GAL used to be handled by an Active Directory (AD) user attribute called msExchHideFromAddressLists. When this was set to TRUE, the user would be hidden from the GAL. Our HR applications toggle this flag for disabled users to hide them away from other users.

This process worked fine for a long time, until Exchange 2007 rolled around. I guess there was plenty of push to allow you to hide a user from all the GALs while still allowing specific GALs to include those users, so Microsoft introduced a new AD user attribute called showInAddressBook. The problem is that if you toggle msExchHideFromAddressLists but have a value set for showInAddressBook, the user is no longer hidden from the GALs listed in the latter attribute.

Can anybody see where this is going? Yup, it appears that all the user accounts were getting the default GALs assigned to the showInAddressBook attribute, so even with the hide flag set, they were still showing up1. This was causing problems, as people that were disabled/terminated were still showing up in the GAL, causing some confusion and concern.
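
To see whether a given account was affected, you can inspect both attributes directly with ADSI. The DN below is a made-up example:

# Bind to a disabled user and check the two attributes in play
$user = [ADSI]"LDAP://CN=Jane Doe,OU=Disabled Accounts,DC=DOMAIN,DC=TLD,DC=PVT"
$user.msExchHideFromAddressLists   # TRUE means the HR process flagged them
$user.showInAddressBook            # any GALs listed here keep them visible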

I started to poke around and bashed together a quick PowerShell script that walks through all disabled users that have a showInAddressBook attribute and wipes out that attribute.

# Disabled users (userAccountControl bit 2) that still have showInAddressBook set
$ldapFilter = '(&(objectClass=user)(userAccountControl:1.2.840.113556.1.4.803:=2)(showInAddressBook=*))'
$ldap = [ADSI]"LDAP://OU=Disabled Accounts,DC=DOMAIN,DC=TLD,DC=PVT"

$srch = New-Object DirectoryServices.DirectorySearcher($ldap)
$srch.Filter = $ldapFilter
$foundUsers = $srch.FindAll()

$foundUsers | % {
       $props = $_.Properties
       # Bind to the user object so it can be modified
       $objUser = [ADSI]$("LDAP://{0}" -f $props['distinguishedname'][0])
       # PutEx with ADS_PROPERTY_CLEAR (1) removes the attribute entirely
       $objUser.PutEx(1, 'showInAddressBook', $null)
       $objUser.SetInfo()
}

If you’ve not seen LDAP queries before, they work by starting with the operator (AND, OR, etc.), followed by the conditions it applies to. So the example above reads as:

(objectClass=user) AND (userAccountControl:1.2.840.113556.1.4.803:=2) AND (showInAddressBook=*)

It can get a little more complicated when you start stringing together multiple AND and OR operators in various combinations, but in this example we’re keeping it pretty simple.
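
For example (the department values are made up), a filter for disabled users in either of two departments nests an OR inside the AND:

(&(objectClass=user)(userAccountControl:1.2.840.113556.1.4.803:=2)(|(department=Sales)(department=Marketing)))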

I then used the .NET class System.DirectoryServices.DirectorySearcher, which runs the specified LDAP query and returns all matching results. Next it was a case of walking through the results and fetching a DirectoryEntry object to edit the properties. In this case we’re clearing showInAddressBook by setting it to $null, which removes the attribute.

After letting this script run over about 25k disabled users, it cleared up the fluff in the GAL and made HR happy.

  1. As a weird side note to this, if you check the box to hide the user in the Exchange management tools, it removes the showInAddressBook values on its own; the same goes for the PowerShell options too. 
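
For completeness, since the footnote mentions the PowerShell options: the Exchange Management Shell equivalent of ticking that box is Set-Mailbox (the mailbox identity here is a placeholder):

Set-Mailbox -Identity 'jdoe' -HiddenFromAddressListsEnabled $true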

Octopress and OpenID

One of the things I had completely forgotten about during my migration from WordPress to Octopress was OpenID. I had used one of the few OpenID plugins that tied into WordPress and allowed you to use WordPress as an OpenID provider, giving me the ability to log in to sites using my WordPress site.

This was great, and I’d completely forgotten about it because I rarely used it. That was until yesterday, when somebody on the #Nagios IRC channel asked a question, and then posted the same question to Stack Overflow. I decided to answer the question over there, and remembered I had signed up for an account using OpenID, so I dutifully typed in my site URL and was stumped when I wasn’t redirected.

This is where I did a little face-meets-desk action. I’d killed my OpenID account by killing off my WordPress site. I tried to think of a way around this, did some quick searching, and stumbled upon a post by Darrin Mison on the exact same topic. Darrin had left his WordPress site active over on WordPress.com but had migrated to his own URL elsewhere. Because of this, Darrin was able to use what is called a delegate, telling anybody making a request to look elsewhere to authenticate.

This sparked a vague memory, and reminded me that when I first started tinkering with OpenID, I used a different site for the authentication, so a quick check, and I was able to login there. Now I just needed to edit my Octopress site to provide the delegate information.

I used myOpenID.com as my delegate, and they have a help article on how to handle using your own URL. Following what Darrin had done, I edited source/_includes/custom/head.html and added the lines mentioned in the help doc. So now my head.html template looks like this:

<link rel="openid.server" href="http://www.myopenid.com/server" />
<link rel="openid.delegate" href="http://jonangliss.myopenid.com/" />
<link rel="openid2.local_id" href="http://jonangliss.myopenid.com" />
<link rel="openid2.provider" href="http://www.myopenid.com/server" />
<meta http-equiv="X-XRDS-Location" content="http://www.myopenid.com/xrds?username=jonangliss.myopenid.com" />
<!--Fonts from Google's Web font directory at http://google.com/webfonts -->
<link href="http://fonts.googleapis.com/css?family=PT+Serif:regular,italic,bold,bolditalic" rel="stylesheet" type="text/css">
<link href="http://fonts.googleapis.com/css?family=PT+Sans:regular,italic,bold,bolditalic" rel="stylesheet" type="text/css">

Pretty simple; after a rebuild of the blog, my page now includes the delegate headers required to redirect OpenID requests.

Adding XML Elements using PowerShell

In a follow-up to my previous post on removing XML elements using PowerShell, I decided to figure out how to add elements using PowerShell. I’m working with the same file from Remote Desktop Manager (RDM) and adding remote management configurations based on DNS checks.

In the enterprise licensed version of RDM you are given the ability to add “remote management” interface details to a host configuration. In our environment, that remote management interface is iLO, which is available on a dedicated IP address over HTTPS, giving you access to a remote console as well as power management features. RDM handles this with a small tweak to the XML file, adding another element under the connection meta information.

The XML that RDM is looking for is like this:

<Connection>
  <MetaInformation>
    <ServerRemoteManagementUrl>https://hostname-ilo</ServerRemoteManagementUrl>
  </MetaInformation>
</Connection>

I’ve removed most of the information, which you can see in the previous post.

As we’re trying to be careful with the file, we first need to validate that the XML has a MetaInformation element, and then check for an existing ServerRemoteManagementUrl element; if one, or both, are missing, they get created. Not all hosts have iLO interfaces (virtual machines, for example), so we also need to verify the presence of a DNS record first, and only create the entry if it exists.

$ilo_str = "https://{0}-ilo"
$ilo_host = "{0}-ilo"

[XML]$conns = Get-Content c:\temp\connections.xml

$nodes = $conns.ArrayOfConnection.SelectNodes("Connection[ConnectionType[.='RDPConfigured']]")

if ($nodes -eq $null) {
     continue;
}

$nodes | %{
     $node_name = $_.Name
     $meta = $_.SelectSingleNode("./MetaInformation")
     if ($meta -eq $null) {
          $meta = $conns.CreateElement("MetaInformation")
          $_.AppendChild($meta)
     }

     $ilo = $meta.SelectSingleNode("./ServerRemoteManagementUrl")
    
     if ($ilo -eq $null) {
    
          $dns = $null;
          try {
               $dns = [System.Net.Dns]::GetHostAddresses($($ilo_host -f $node_name))
          }
          catch [System.Exception]
          {
            ## Doing Nothing ##
          }
    
          if ($dns -ne $null) {
               $ilo = $conns.CreateElement("ServerRemoteManagementUrl")
               $ilo.InnerText = $ilo_str -f $node_name
               $meta.AppendChild($ilo)
          }
         
     }

}

$conns.Save("C:\temp\connections_2.xml")

Again working with a copy of the original file, I use some crafty XPath queries to select only the connections that are RDP. I then loop through the connections/nodes and extract the name. The first if block tests for the presence of the MetaInformation element, and creates it if it doesn’t exist. The next check looks for the ServerRemoteManagementUrl element; if it’s not there, the script proceeds with DNS validation before creating it.

The try/catch block performs a DNS lookup; unfortunately GetHostAddresses throws an exception rather than returning $null or an empty object, so I had to throw in a dummy catch block that doesn’t really do anything. If a DNS record is returned, the script creates the new element and adds it to the MetaInformation element. For the final step, I saved it to a second file so I could do a comparison between the files to make sure it did as I expected.

One thing to note about adding elements to an XML document is that the CreateElement calls are not executed against the node you are adding the element to; they are executed against the document root. This is so that the element gets all the correct namespace information. You then append your element to the existing element.

Removing XML Elements using PowerShell

Every now and again I have to strip out elements from an XML file. In this case, I was doing some cleanup of my Remote Desktop Manager configuration file. When I first started my current job, to save a lot of discovery, my boss shared his configuration file. Unfortunately it had a lot of hosts with duplicate configuration information that wasn’t relevant, because the “duplicate” option had been used to copy existing hosts, which copies fields like the host description along with everything else.

Remote Desktop Manager (RDM) uses an XML file for its configuration, which makes editing it really easy. To clean up the invalid descriptions, I used a little PowerShell and some XML know-how. Here is an example entry I need to clean up…

  <Connection>
    <ConnectionType>RDPConfigured</ConnectionType>
    <Events />
    <Group>MyDomain\App Servers\DEV</Group>
    <ID>73146eeb-caf9-4579-a146-41f7330261a6</ID>
    <MetaInformation />
    <Name>SERVER1</Name>
    <ScreenSize>R1280x800</ScreenSize>
    <Stamp>5f8a9830-fc2e-440e-a72b-f889d5b17a5b</Stamp>
    <Url>SERVER1</Url>
    <Description>HP Command View EVA</Description>
  </Connection>

And here is the PowerShell used to clean up the file.


[xml]$xml = Get-Content C:\Temp\Connections.XML

# Keep removing matching Description elements until none are left
$node = $xml.SelectSingleNode("//Description[.='HP Command View EVA']")
while ($node -ne $null) {
    [void]$node.ParentNode.RemoveChild($node)
    $node = $xml.SelectSingleNode("//Description[.='HP Command View EVA']")
}

$xml.Save("C:\Temp\Connections.XML")

Pretty simple, but here is how it works. The first line is pretty obvious: it gets the content of the file1, and the [xml] cast explicitly converts the resulting array into an XML document. The next bit is where it gets a little harder, and requires a little knowledge of XPath syntax. The code selects a single node named “Description” whose value is ‘HP Command View EVA’. If one is found, it returns an XmlElement object; otherwise $node ends up being $null. This gives us the ability to wrap the search in a loop and remove the elements we don’t need. To remove an element, you have to tell the parent node to remove it, so you ask the node to go back to its parent to remove itself; a little weird, but it works. The final step is to save it back to a file.

The hardest bit about handling XML is knowing how XPath works; once that is understood, the rest is usually pretty easy. PowerShell treats XML as an object, so it’s easy to figure out what you can do with the objects using Get-Member.
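
As a quick sketch of that exploration, using the same connections file as above:

[xml]$xml = Get-Content C:\Temp\Connections.XML

# List the properties PowerShell exposes for each Connection element;
# they mirror the child element names in the XML
$xml.ArrayOfConnection.Connection | Get-Member -MemberType Property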

  1. Which I had copied to C:\Temp to make a backup of, instead of working on the real file. 

Updating iLO firmware using hponcfg and XML

In the course of updating all of our HP BladeSystem (BL465c) blade servers over the last few weeks, I’ve stumbled across some interesting things. For example, you can update all the iLO cards at once if you have an Onboard Administrator (OA), a TFTP server, and a little XML knowhow…

<RIBCL VERSION="2.0">
        <LOGIN USER_LOGIN="Administrator" PASSWORD="UsingAutoLogin">
                <RIB_INFO MODE="write">
                        <UPDATE_RIB_FIRMWARE IMAGE_LOCATION="tftp://TFTP_SERVER/ilo3_150.bin" />
                </RIB_INFO>
        </LOGIN>
</RIBCL>

This gets saved as an XML file on the TFTP server; I named it update_firmware.xml. The USER_LOGIN and PASSWORD fields do not matter, as single sign-on is used from the OA. The iLO update binary is put on the TFTP server as well (you should use the version applicable to the hardware you’re updating). Then comes the easy bit: SSH to the Onboard Administrator and execute the hponcfg command as such:

hponcfg ALL tftp://TFTP_SERVER/update_firmware.xml

If you only need to update a single blade, change ALL to the blade number. Otherwise, this will download the iLO firmware update, push it to each of the iLO cards in the BladeSystem chassis, and then restart them. This will not impact the running server. You should see output like this once it has started:

<!-- Transfering image: 0% complete -->
<!-- Transfering image: 10% complete -->
<!-- Transfering image: 20% complete -->
<!-- Transfering image: 30% complete -->
<!-- Transfering image: 40% complete -->
<!-- Transfering image: 50% complete -->
<!-- Transfering image: 60% complete -->
<!-- Transfering image: 70% complete -->
<!-- Transfering image: 80% complete -->
<!-- Transfering image: 90% complete -->
<!-- Transfering image: 100% complete -->

Bay 15: RIBCL results retrieved.
<!-- ======== START RIBCL RESULTS ======== -->

<!--more output here-->

<?xml version="1.0"?>
<RIBCL VERSION="2.22">
<RESPONSE
    STATUS="0x0000"
    MESSAGE='No error'
     />
<INFORM>Firmware flash in progress [100%].</INFORM>
</RIBCL>
<?xml version="1.0"?>
<RIBCL VERSION="2.22">
<RESPONSE
    STATUS="0x0000"
    MESSAGE='No error'
     />
<INFORM>Firmware flash completed successfully. iLO 3 reset
initiated.</INFORM>
</RIBCL>

And that’s it, the magic is done. Using hponcfg is possible from Windows as well when updating the local machines, so it’s quite possible to use the same XML (though I’ve not tested it).
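
If memory serves, the Windows version of hponcfg takes the script file with the /f switch, something like the sketch below (untested, per the caveat above; the log file name is my own choice):

hponcfg /f update_firmware.xml /l hponcfg.log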

Five Saturdays

Note: I started writing this post at the beginning of December, but due to time issues, and working on blog migrations, I never got around to posting. I’ve still decided to post because it throws in some PowerShell goodies.

The internet is such a gullible place. Really, it is. People post to Facebook nearly everything they see anywhere because it sounds like it’s quite possible, usually accompanied by some cool picture to make it seem more important.

An example of one that keeps coming up…

This year, December has 5 Saturdays, 5 Sundays, and 5 Mondays. This only happens once every 824 years.

Along with some blah blah crap about money, and Chinese superstitions. I’m not sure why people don’t stop and think for just a second, and wonder how that could be possible.

Let’s do some mental math and see what happens. December has 31 days, so regardless of what year it is, there will always be 3 days of the week that occur 5 times that month. What are the chances that any month with 31 days would start on a Saturday? You’d think pretty high, and I’m guessing a little more frequently than once every 824 years.

To prove the point, I threw together some PowerShell to figure out how many times this might occur within the next 25 years.

# Start at November 1st 2012; the loop adds a month before checking,
# so the first month tested is December 2012
$date = Get-Date "00:00:00 11/01/12"

1..300 | %{
     $date = $date.AddMonths(1)

     $mo = $date.Month
     $yr = $date.Year
     $dy = $date.DayOfWeek

     $dim = [System.DateTime]::DaysInMonth($yr, $mo)

     # A 31-day month starting on a Saturday has 5 Saturdays, Sundays, and Mondays
     if (($dim -eq 31) -and ($dy -eq [System.DayOfWeek]::Saturday)) {
          "{0}`t{1}" -f $yr, $mo
     }
}

So what is this doing? The first line grabs November 1st, and then it loops 300 times. Each loop adds a month, figures out what day of the week the 1st falls on, and gets the number of days in the month. If there are 31 days in the month and the 1st is a Saturday, it outputs the year and the month. So how did it look? Did I get no results because I’m inside the 824 years? Far from it…

2012     12
2014     3
2015     8
2016     10
2017     7
2018     12
2020     8
2021     5
2022     1
2022     10
2023     7
2025     3
2026     8
2027     5
2028     1
2028     7
2029     12
2031     3
2032     5
2033     1
2033     10
2034     7
2035     12
2036     3
2037     8

So it looks like it occurs quite frequently. Let’s also assume, just for a second, that they meant only December: looking at the results, it’s still pretty consistent. 6 years until the next event, then 11 years, then another 6 years; much more frequent than every 824 years.
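
If you want to limit the script above to Decembers only, it’s a one-line tweak to the condition:

# Only report 31-day months that are December and start on a Saturday
if (($dim -eq 31) -and ($mo -eq 12) -and ($dy -eq [System.DayOfWeek]::Saturday)) {
     "{0}`t{1}" -f $yr, $mo
}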

As a side note, Snopes covers this issue as well.

So my tip of the day, if you feel the urge to repost somebody’s random image and something doesn’t seem right, hit Google or Bing, and search for part of the phrase and see what you come up with.

WordPress to Octopress

Well, it’s only taken me about 3 months of on-off messing about, but I’ve finally cut my blog over to using Octopress. It’s been a long ride with WordPress, but the overhead and all the fluff were more than I needed. Octopress is nice and simple, and just gets stuff done.

The hardest part was exporting and converting. Six years of WordPress usage had me tinkering with all sorts of plugins to get various fun things working, most of which ultimately ended up abandoned. Unfortunately, after abandoning them, I never played clean-up and fixed all the data they left floating around. One major example is the 4+ different code formatting plugins I’ve used.

I don’t remember all the steps I took to get it done, but have most of it documented. I’ll write it up some day. For now, let me know if you spot anything terribly wrong.

Lync, Federation, and DNS

Over the last few weeks, I’ve been working on a Microsoft Lync pilot at work. One of the requirements was external federation. This feature basically allows instant messaging (IM) between users in two different organizations. So for example, you are CompanyA, you do business on a regular basis with CompanyB, and both of you are using Lync. Federation allows you to add each other to your Lync clients and talk to each other.

The configuration and implementation went pretty smoothly, but I was having intermittent issues with federation. The problem came up when adding an external company that hadn’t explicitly added us to their federated domains list. Initially we had dismissed it as a firewall issue because we got federation working with some consultants; however, I was later asked to add a vendor and started seeing the same issues.

After some testing, I wasn’t getting any closer, so I enabled the client logging options in the Lync client. Those are found under Options, General, and “Turn on Logging in Lync”. This writes a log file to a Tracing folder under your user profile directory (C:\Users\username\Tracing). When I started digging into the logs, some errors popped out at me.

ms-diagnostics: 1027;reason="Cannot route this type of SIP request to or from federated partners";

The other error that popped out at me was:

ms-diagnostics:  1034;reason="Previous hop federated peer did not report diagnostic information";

Without doing much digging, the first suggests the request I’d sent to the vendor couldn’t be routed, and the second reports that the remote side returned no diagnostic explaining why. This made me think it was a potential firewall issue again. After doing the basic testing of validating that our Edge server was accepting incoming connections and that I could connect to the vendor’s Edge server, I eliminated the firewall as the issue. This got me really scratching my brain.

I ran a SIP stack trace from the Lync Edge server and saw more unusual errors, such as “504 Server time-out”. This was beginning to frustrate me: I had confirmed that both servers could talk to each other, so why were they getting timeouts?

I decided to go back to basics and start at the very bottom. First was connectivity, which we had already established by using telnet to the servers. Next was DNS. Lync, like a lot of Microsoft products, takes advantage of Service Records (SRV) in DNS. This record tells the requesting client the protocol, port, and host to connect to. In this case, the Edge server is looking for the SRV record _sipfederationtls._tcp.sipdomain.com. The response should look something like this:

_sipfederationtls._tcp.sipdomain.com. 300 IN SRV 0 0 5061 sipexternal.sipdomain.com.
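
The lookup itself is easy to reproduce with nslookup (substitute your own SIP domain), which returns a record like the one above:

nslookup -type=SRV _sipfederationtls._tcp.sipdomain.com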

So the protocol is TCP, the port is 5061, and the server I need to connect to is sipexternal.sipdomain.com. I ran a check against our domain and the vendor’s domain, and both came back with records. Except, with them being so close together on the screen, I immediately spotted an issue.

_sipfederationtls._tcp.sipdomain.com. 300 IN SRV 0 0 5601 sipexternal.mysipdomain.com.
_sipfederationtls._tcp.sipdomain.com. 3600 IN SRV 0 0 5061 sipexternal.vendorsipdomain.com.

Ignoring the difference between 300 and 3600 (the Time-To-Live of the record), the next difference was the port numbers. It looks like I made a simple transposition of numbers. I did a quick test from outside the firewall and confirmed that 5601 was not open. I went back through the firewall change tickets and confirmed I had requested 5061, and the Lync configuration was also set to 5061.
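
On newer systems, Test-NetConnection makes that outside-the-firewall port check a one-liner (the hostname is the placeholder from above; at the time I likely just used telnet):

# 5061 should connect; the transposed 5601 should fail
Test-NetConnection -ComputerName sipexternal.mysipdomain.com -Port 5061
Test-NetConnection -ComputerName sipexternal.mysipdomain.com -Port 5601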

A quick DNS change for the SRV record, fixing the port, and within 10 minutes I received 3 notifications from the vendor that they had staff adding me to their contact lists.

One of the things I’ve come to learn over the years: whenever there is something awfully quirky going on and you cannot quite figure out what’s causing it, take a look at DNS. A number of issues I’ve chased have turned out to be simple DNS problems. In this case it was simple human error, but it boiled down to DNS saying one thing when it should have been saying another.

Lync, Exchange Unified Messaging, and TLS

One of the cool things about Exchange is a role called Unified Messaging (UM). This role allows you to bridge voice messaging, call routing, and emails, all into a convenient package. What’s even better, you can get Lync to integrate right into that feature set too, giving your Lync system a voicemail system.

Part of our pilot program was to set up Lync and Exchange UM integration. This was to demonstrate that Lync and our existing Exchange infrastructure could potentially replace the older PBX-style system, while giving us cost savings across the board. Configuring the Lync and UM integration is pretty easy; this site gives a great step-by-step walkthrough of the process. As we were working with consultants during the installation, they walked us through the steps, with a minor deviation from the instructions on that site.

When it came time to test, we called the UM extension number and got a fast busy. Because it was close to end of day, we decided to give it time to replicate at the AD level, as contacts and AD records are created as part of the process. Unfortunately, the next morning the issue was still around. Digging about, we confirmed all the settings were correct; however, we were seeing weird errors regarding TLS and certificate names in the Application log on the UM server…

The Unified Messaging server failed to exchange the required certificates with an IP gateway to enable Transport Layer Security
(TLS) for an incoming call. Please check that this is a configured TLS peer and that the certificates being used are correct. More
information: A TLS failure occurred because the target name specified in the certificate isn't correct. The error code = 0x1 and the message = Incorrect function.. Remote certificate:
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
(Lyncpool.internaldomain.local). Remote end point: 10.10.10.17:50574.
Local end point: 10.10.102.27:5061.

The error suggested we were seeing an issue with the SSL certificate on the Lync pool front end server. The first thing to check was that the Subject Name (SN) for the SSL certificate was the same as the name used for the UM IP gateway (Lync pool).

We verified that the SN matched the UM IP gateway, and that the fingerprint reported by the error message above (which I changed, if you didn’t guess) matched. Next was to make sure the certificate chain wasn’t broken. Both servers had a correct certificate chain, and because they were internal certificates issued from our own CA server, the full chain of trust was available. We verified that each server could open the other’s certificate and see the full chain.

We then went back and revalidated all the configurations again, and we didn’t spot anything obvious, the consultants were stumped.

After several hours of bashing our heads against it, the consultants started throwing out some wild ideas, such as us using an unrecommended TLD in the cert, and some other bits and pieces too. I found those a little hard to follow, but continued to research. Going back over the post we’d discovered on UM configuration, I scoured the images looking for something we may have missed. It wasn’t until I got to the very last image that I spotted the issue…

The output of the script in that last image reports the pool name, associated dial plan, and the UM IP gateway. When I looked in the Exchange Management Console, the “Associated UM IP gateways” field was empty, suggesting that the dial plan had no gateway assigned. When I discussed it with the other team member, we remembered that the ExchUCUtil.ps1 script had been run before executing the OcsUmUtil.exe process. We believe that order of execution resulted in some entries not being updated with the correct information, which in turn made Exchange UM unable to match a UM IP gateway to a dial plan. After re-executing the ExchUCUtil.ps1 script on the UM server, the UM IP gateway field was populated with “1:1”, which matched the UM IP Gateway tab for the Lync Front End server.
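
The re-run itself is just the stock script from the Exchange scripts directory, executed in the Exchange Management Shell ($exscripts is the path variable EMS defines; no parameters were needed in our case):

cd $exscripts
.\ExchUCUtil.ps1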

Less than a minute after we confirmed the field had been populated, we ran a test call to the UM extension and were greeted with the friendly UM voicemail system.

This is a weird case where an error message took us down the wrong path; if it had been a little clearer, it might have steered us in the right direction. A better error, obviously, would report that the inbound call could not be matched to a UM IP gateway. We probably would have had the issue resolved in about 10 minutes, instead of 2 days.

As a side note, the SSL certificates must be configured correctly as well, otherwise you will get exactly the same error message; this is why we went down the path of examining the certificates so closely. The UM server certificate SN must be the Fully Qualified Domain Name (FQDN) of the server, and the UM IP Gateway SN has to be that of the Lync Front End server.