Monday, March 12, 2018

The Evolution of the Skype Optimizer: From Locally-Run VBScript to Azure Web App

I was recently amazed when I realized the seeds of what is now the Skype Optimizer was created 10 YEARS AGO, back when Office Communications Server 2007 R2 was starting to make headway in the unified communications space.

The Beginning

The genesis of the program grew out of a need to know when to strip the +1 from a North American local number and when not to. Rather than rehash the creation myth, you can read all about it from one of my earliest blog posts where I announced the Dialing Rule Optimizer to the world (at that time, the 30-odd subscribers to my blog).

The very first version was a straight VBScript that I had to manually input the proper variables and run it by hand. Rather than give the code away, I told people to email me the phone numbers they wanted to get optimized dial rules for.  I would run the script and send them the results, which was a simple text file with either a bunch of regex, or text formatted to be applied to AudioCodes or Dialogic gateways.  Word got around within Microsoft, and I found myself busy sending stuff to various Microsoft consultants.

When Lync 2010 came around, which was in its earliest days known as Communications Server "14", I added the capability to create simple routing rules that consisted of a few lines of PowerShell code.  I also wrapped the code around a simple UI in something called an HTA (short for HTML application).  It made generating rulesets easier for me, but it was still something that I was running from my local machine.
The earliest known copy of the original Dialing Rule Optimizer. I obtained this from the Smithsonian Museum. The text-only v1.0 has been lost to the sands of time.
I soon figured out that it would be relatively straightforward to move the HTA into an actual web page.  I put the code on a web server hosted by the company I was working for at the time and opened up the tool to the entire world. I actually put this code on the computer that was running our OCS 2007 R2 server!

The very first web-based iteration of the Dialing Rule Optimizer. Note the Communications Server "14" logo on the top-right.

Once Communications Server "14" became Lync 2010, I realized that I could go beyond simple optimized route creation, and modified the Optimizer to create everything required for a simple Enterprise Voice setup for US and Canada deployments.

Shortly after, I realized that I could do the same for other countries as well. The Optimizer interface grew somewhat to accommodate the requirements for different countries.
Dramatic differences abound! Communications Server "14" has changed to "Microsoft Lync". Also, UK dial plans!
I slowly added other countries to the Optimizer. I also added other features such as extension dialing rules, least-cost/failover routing, among many others.

Over time, the back-end code base was starting to become difficult to support. I was using a series of XML files to deal with languages and country-specific dialrules, and the sheer number of them was becoming cumbersome to manage.  I decided to move everything from the company-hosted platform to Amazon Web Services. I built a single Windows VM with SQL Express and ported the XML files to a database. It worked well, but AWS was starting to cost a fair bit to run for a free service. Donations were not keeping pace with costs.

I then discovered that Microsoft MVPs got a monthly allotment of funds in Azure. I immediately moved my infrastructure to Azure, where it ran mostly trouble-free for the next several years.

The Optimizer featureset grew and grew, but the interface was still as ugly as the day I first created it.  One person even suggested it looked like a GeoCities page. Hey, my argument was always that I was not (and am still not) a web developer.

The Modern Era

I decided to try to give the Optimizer a more modern look. I completely re-wrote the front-end code using Notepad++ as my trusty editor.  I replaced the clunky extension builder with a better Javascript framework that emulated an Excel spreadsheet, and made other significant under-the-hood improvements. After much trial and error, I was pleased to unveil the new look.

The website looked modern, clean and easy-to-use.  However, it bugged me that while the site was running in Azure, it was still just a single Windows VM with a local SQL Server instance. It was also costing most of my monthly Azure MVP credits with not a lot of headroom. I decided to try to make the Skype Optimizer simpler and cheaper to manage, figuring that there would probably come a day when I would not be a Microsoft MVP (the horror!!!) and I'd be expected to pay a monthly bill (Oh the humanity!!!).

My first step was to dump the local SQL and move to Azure SQL Database. First, I had to copy the gigabytes of data to my SQL instance.  I opted to use transactional replication, which would allow me to keep both my local and Azure-based SQL instances up-to-date while I tested things out.  It turned out to be ridiculously easy. I had to modernize my code a bit to allow it to still read/write data to Azure SQL, but this was pretty straightforward as well.

With that hurdle out of the way, I looked at a few different ways to further reduce my costs and administrative burden. 

Docker Containers

I'd heard about Docker containers and how it was an easy way to reduce the overall complexity and costs over a traditional virtual machine.  I installed Docker on my home machine and started messing around with it. I used the Image2Docker tool to make a copy of my Windows VM-based website and installed it locally.  I had to do quite a bit of modifications to my dockerfile to support some of the added features that Image2Docker didn't capture, but after a while, I managed to make the Skype Optimizer work in a Docker container. 

Moving my newly-created container to Azure wasn't too much work, but there was definitely a learning curve involved. It started up fine, and I pointed my DNS entries to the container and away we went!  However, all was not perfect:
  1. My container stopped working a few times over the span of a few weeks. Troubleshooting this proved to be nearly impossible due to the nature of Docker containers and how you lose the previous state every time you restart it. Either that or I just don't know how these things really work.
  2. Making code modifications wasn't simple either. I'd have to make the change in my local Docker image and publish that image to Azure. The startup process took 5-10 minutes, which was probably due to how I built my container.  I looked at ways to improve the startup time, but it was already eating up lots of my time.
  3. The costs to run the container wasn't much cheaper than a full VM
Because of those reasons, I decided that Docker containers weren't well-suited to my needs and I finally turned to....

Azure Web App

All the back-end code changes I made to the Optimizer to support both Azure SQL Database and Docker containers actually had an unexpected side benefit: it allowed me to easily move the Optimizer to Azure Web App, which is Azure's web hosting framework.

To make the process of managing this easier, I finally moved away from Notepad++ as a development environment and embraced Visual Studio. Visual Studio made it trivial to take my entire website and migrate it to Azure. I had a few challenges with making my code-signing certificate work, but in the end it all worked flawlessly. 

The Skype Optimizer has been running as an Azure Web App for several months now. Its extremely reliable, simple to manage, and about 3 times cheaper to run than the original Windows VM. 

The Future

So, there you have it. The entire history of the Skype Optimizer posted here for posterity. Where do things go from here? Well, with Microsoft Teams eventually taking over the Enterprise Voice role from Skype for Business, I may decide to look into what it would take to turn the Skype Optimizer into a Teams Direct Routing and Calling Plans management platform. 

Until then, I will continue to keep updating the Skype Optimizer to make sure Skype and Teams administrators worldwide have a single place to get accurate, up-to-date dialing rules for every country in the world.

Friday, March 9, 2018

Edge Topology Replication Failures Caused by Mismatched Windows Updates

While getting a new Skype for Business edge server ready for production, I generally make sure all the latest Windows Updates are applied. I was doing exactly this for the company I work for (Nectar Services Corp).  Everything seemed to go just fine, but I noticed the edge server's replication status was showing as False when I ran Get-CsManagementStoreReplicationStatus.  The usual procedure of running Invoke-CsManagementStoreReplication -ReplicaFQDN servername did nothing. Even wiping out the C:\RtcReplicaRoot\xds-replica folder as described in this dusty old blog post didn't make a difference.

The Event Logs on the front-end server that was the master replication partner showed these fairly frequent error events:
Log Name:      Lync Server
Source:        LS File Transfer Agent Service
Date:          2/22/2018 10:01:29 AM
Event ID:      1046
Task Category: (1121)
Level:         Error
Keywords:      Classic
User:          N/A
Skype for Business Server 2015, File Transfer Agent cannot send replication data to Replica Replicator Agent on Edge
Edge machine:
Exception: System.ServiceModel.EndpointNotFoundException: There was no endpoint listening at that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details. ---> System.Net.WebException: Unable to connect to the remote server ---> System.Net.Sockets.SocketException: No connection could be made because the target machine actively refused it
I could reach the edge server's replication web service URL via port 4443 as described in the event log, so there wasn't a firewall issue or an issue with the web service that I could determine.

The edge server was also throwing errors, saying that it hadn't heard from any of the replication servers in a while and its feelings were very hurt.
Log Name:      Lync Server
Source:        LS Replica Replicator Agent Service
Date:          2/22/2018 12:04:32 PM
Event ID:      3045
Task Category: (3003)
Level:         Error
Keywords:      Classic
User:          N/A
The replication synthetic transaction has not been updated in a significant time period.
Time since the last update: 01.03:43:09
Cause: The Master Replicator Agent has not updated the replication transaction document in a significant time period.
If other replicas are experiencing similar issues, check the Master Replicator Agent and File Transfer Agent service health.  Verify access to the DFS files shares and replica file shares
The weird thing was that replication WAS working prior to me installing the final batch of Windows Updates. So, it seemed to make sense to uninstall each of the last Windows Update until things start working again. Unfortunately, after uninstalling those updates, along with a bunch of other ones in an increasingly desperate attempt to find the issue, the issue still persisted. Back to the drawing board.

I noticed that I was unable to copy binary files to the edge server via RDP copy/paste. Text files would copy fine, but any other filetype would cause the RDP session to crash with "Unexpected Server Error".  This seemed to be related to the replication issue, because this also worked fine prior to installing those updates. This is a good time to note that to access the edge server, I had to first RDP to a front-end server, then RDP to the edge from there, because the firewall was blocking all other access. This would prove to be relevant later.

I went as far as uninstalling and reinstalling Skype for Business along with all the supporting components, but it STILL didn't work.

Frustration level: STRATOSPHERIC

Finally, I nuked the entire edge server from orbit (It's the only way to be sure) and had the edge server rebuilt from scratch. Once again, replication worked.

End of story, right? Wrong. I had to figure out what the issue was, so I re-applied each of the original suspected updates until the issue re-appeared. And re-appear it did, along with the RDP copy/paste issue. The offensive update that caused the issue was 4072650, which is vaguely described as Hyper-V integration components update for Windows virtual machines. The description wasn't much help, but it certainly had an effect. Once again, removing that patch didn't make the issue disappear.

Finally, I checked to see if the front-end servers had this truly despicable update, AND THEY DIDN'T.  It was buried in Windows Update as an optional update, which weren't being automatically downloaded and installed. On a hunch, I installed the update (no restart required, thankfully), and VOILA, replication started working again, and I could copy/paste files between front-ends and edge via RDP.

So, this vague Windows Update was the source of all my issues. Sigh... Presumably, this would only show up in the following circumstances:

  • Using Hyper-V hosted virtual machines
  • Running Windows 2012 R2
  • Update 4072650 is installed on some VMs but not all

TL;DR version

If you're having replication errors on your edge server, and are getting LS File Transfer Agent Service error 1046 on your master replication server, then make sure that all front-end and edge servers have Update 4072650 if you see it applied on any one server.

Tuesday, April 25, 2017

Tenant Dial Plans in Skype for Business Online

As others have noted, all Skype for Business Online tenants that are capable of PSTN calling are now able to create their own custom dial plans. This allows administrators to create custom normalization rules for local and extension dialing and other cases the default normalization rules don't handle.

Dial plans work a bit differently in Skype Online than they do in on-premise deployments. In on-prem deployments, dial plans are completely independent of one another. Users assigned either a specific site or user-level dial plan will only use the normalization rules contained within that dial plan. Normalization rules assigned at the global level do not apply.

In Skype Online, users can be assigned to either a tenant global dial plan or a tenant user dial plan. As with Skype for Business on-prem, users will get normalization rules from the global tenant dial plan, unless they are explicitly assigned to a user tenant dial plan. There is no inheritance from tenant global to tenant user dial plans.

However, users WILL inherit dial plan normalization rules from a Skype Online global dial plan specific to the country where their phone numbers are hosted. This is the default dial plan that all users are assigned to in the absence of a tenant global or user dial plan.

If you were to run Get-CsDialPlan on your tenant, you will see dial plans for most of the countries in the world (220 to be exact. 12 short of what's in the Skype Optimizer, but who's counting?). These dial plans are pretty basic, accounting for national and international dialing prefixes but not much else.  The US and UK dial plan normalization rules are shown below:

US Intl Dialing
US Long Distance
US Default
US Extension Rule
GB Intl Dialing
GB Long Distance
GB Default
GB Extension Rule

The rules are quite rudimentary, and don't allow for capturing and fixing user-initiated dialing errors such as overdialing, incorrect adding of national prefixes for international calls, or local calls. The extension rule is quite interesting because it appears to have been written by someone who really doesn't know how to use regular expressions. Unless your extension rule matches the very specific cases of ext, extn, EXT, EXT, x or X, then it won't work. If they used my standard for pattern and translation, it would be both simpler and would capture pretty much ANY style of writing phone numbers with extensions:
Pattern: ^\+?(\d+)\D+(\d+)$
Translation: +$1;ext=$2
If you look at the dial rules for other countries, you'll see the only difference between them is the long distance and international dialing prefix and the country code.  I get that it can be tricky to maintain dial rules for 230+ countries, but hey if one simple Canadian dude can do it, couldn't Microsoft???

Aaaanyways, the global dial plan for your given country is applied along with any tenant level dial plans.  It appears that the tenant dial plans are applied first, followed by the global dial plan for your country.

This means you can use the Skype Optimizer to create more nuanced normalization rules that will work better than the default ones.  You can't disable the global dial plan, so if a user dials a number not covered by your tenant dial plans, then the global ones will apply. So, even if someone tries to dial a completely nonsensical number such as 00000023423422348987987998770, it would get captured by the global Default normalization rule and would allow the user to dial that number. Of course, it wouldn't get far, but I would think that if you can prevent users from dialing invalid numbers at the client level, it would keep Skype Online servers from having to use precious system resources to parse that number and notify the user it is invalid.

Thankfully, this means the Skype Optimizer is still relevant in the Skype Online world. You can still use it to create dial plans and normalization rules that will make the user dialing experience better, including support for:

Just select the desired options for your country, press the Generate Rules button, and run the script. It will ask for your credentials for your Skype Online tenant and will then ask if you want to create a tenant global or tenant user dial plan. It will then apply said normalization rules to your tenant. That's all it takes!

Buy now! Operators are standing by!

Tuesday, August 23, 2016

Lync/S4B Client Can Normalize Numbers with Plus Sign?!?!?

One of my central tenets of number normalization in Lync/Skype for Business has always been:
"Lync/Skype for Business will not attempt to normalize a number that already has a plus sign at the beginning"
There are signs that this is no longer true, at least for customers running Skype for Business Server 2015 and the Skype for Business 2016 client (and even the Lync 2010 client). I was messing around with normalization rules last night and realized that my Skype for Business 2016 client (version 16.0.7030.1021 32-bit) WAS normalizing numbers that started with a plus sign.  I later validated this with a Lync 2010 client against the same Skype for Business 2016 server.  And a commenter below said he's pretty sure he did this with Lync Server 2013.  So this means that either it has ALWAYS worked this way and I never realized it, or its something relatively new on the server-side.

I should clarify here and state that the normalization rule in question was specifically constructed to include a plus sign in the normalization pattern (as highlighted below):
This DOES NOT mean that the Skype for Business client will suddenly normalize all numbers that start with a plus. If your normalization rules do not include a plus sign, it will not normalize a number with a plus sign.

This has all sorts of ramifications, all of them positive. That means that dealing with improperly formatted click-to-dial numbers in Internet Explorer is much simpler.  We can now fix this with a simple normalization rule, instead of doing it in a trunk translation rule or route.

This also means that dialing contact phone numbers from Outlook can also be much more reliable, especially if an extension is entered.  Before, if a user entered a contact phone number with an extension in Outlook using the "Phone number builder", it would format the number like "+1 (212) 555-1212 x 345".  If you clicked this number, Lync would parse the "x" as a "9" and the number would come out in Lync like "+121255512129345" and would likely fail.

I have no idea how long this new behaviour has been available.  It works on a Skype for Business Server 2015 environment with both Skype for Business 2016 clients and Lync 2010 clients.  I don't have a Lync 2013 or older environment to test with.  I tried to see if this behaviour extended to server-side normalization scenarios with no luck. I tried to change the destination phone number on an incoming call that started with a plus sign.  It didn't work, which implies that this is a client-only thing. The MSPL workaround is still required.

If this is only new to me, then I apologize for wasting everybody's time.  I've incorporated my new findings into the Lync Optimizer (as you would expect), because it does allow for several dialing scenarios to work better than before.

If anybody has any information on this or can test other versions of Lync server/client, drop me a line.

UPDATE (2016-Aug-24): Updated to clarify that a normalization rule has to explicitly look for a plus sign for this to work. Original post implied that any valid, matching normalization rule would be applied to a number that had a plus sign on the front.  Also made changes based on further observations with older clients.

Wednesday, August 10, 2016

North American Call Authorization Tricks

Almost every country in the world has a dedicated country code assigned to it for telephony purposes (see for a full list). There is one very large exception to this rule and that is countries in North America (US and Canada) and a good portion of the Caribbean.  These countries all share the same country code of +1 and are collectively part of the North American Numbering Plan (NANP), which is managed by the North American Numbering Plan Association (NANPA).

A NANP number consists of the country code +1, a 3-digit area code, and a 7-digit subscriber number, often formatted like +1 (212) 222-3333.  You'll sometimes see the +1 omitted from the number on business cards and websites that don't think globally.

Dialing numbers in North America is quite different from most places in the world.  In the rest of the world, if you want to dial a long-distance or national number, you have to dial a national access code.  In a lot of countries, this number is 0.  However, in NANP countries, you have to dial 1 for long-distance/national numbers. That's right, you have to dial the actual NANP country code as the long-distance access code, which is unique among all other countries.  Incidentally, the international access code is 011 (in many other countries, its 00).
"Connect me with the German Chancellor immediately! I have an idea for a new album!"
If you want to dial Canada from the US, or the US from Canada, or any of the Caribbean countries in NANP, you don't dial the international access code 011, you just dial the number as you would any other long-distance number.  So, if I want to dial Toronto, Canada from Vancouver, Canada I would dial +1 416 555 1111.  If I wanted to dial New York City from Vancouver, I'd dial +1 212 555 1111.

It gets even better.  NANP area codes are allocated by a trio of over-caffeinated spider monkeys who press numbers on a giant keypad at random to assign a new area code (well, that's how it appears).  There is no logical separation between area codes between any of the countries.  For example, area code 646 is in New York, USA, 647 is in Ontario, Canada, and 649 is in the Turks & Caicos (Caribbean island nation). Click here for a current list of area code allocations in NANP.

These days, calls between the US and Canada are often priced identically to a domestic long-distance call.  But calls to the NANP Caribbean countries are usually priced much higher, akin to dialing some very remote international destinations.  Continuing the previous example, a US user that has signed up for Microsoft's Skype for Business Online PSTN Calling Plan can expect to pay $0.013 per minute  (just over 1 cent a minute) to call area code 647 in Ontario, but would end up paying $0.583 (almost 45x more expensive) to $0.741 per minute to call area code 649 in the Turks & Caicos.

Since it is nigh impossible for a regular person to distinguish a Caribbean country from US/Canada based solely on the area code, it is easy for people to accidentally incur high phone charges when calling these countries.
He had an Apple watch long before anybody else. True trendsetter, even if it took 30 years.
Skype for Business administrators often want to prevent certain users from dialing these Caribbean countries, and it can be easily done with some fancy routing rules.  Assuming you've already configured voice polices for national and international access, then you just have to modify the National level voice route to block Caribbean destinations.  You could do some research to determine the Caribbean area codes, and create your own regular expression to block them, but I figure I should do SOMETHING useful with this post and provide you with that information.

This Skype for Business route number pattern will block all Caribbean countries that are part of NANP, except for the US Virgin Islands and Puerto Rico (which are often not priced much differently than US numbers, when calling from the US):
If you want to also block US Virgin Islands and Puerto Rico, then use this one:
You may note that I've also made sure to exclude premium area codes like 900 and 976.

To make sure that people assigned an International voice policy can dial those Caribbean countries, your international route should include a pattern for North American numbers and look something like this:
If you REALLY wanted to get draconian and block calls to Canada from the US, and vice versa, you could use the following rules:

Block Canadian and Caribbean calls:
Block American and Caribbean calls:
The above rules are formatted slightly differently from the ones that block Caribbean countries.  The Caribbean rules show which area codes to BLOCK.  The other rules show which area codes to ALLOW.  Of course, I didn't roll these rules by hand, since there are tens of dozens of area codes allocated to each country, and do change from time to time.   I used the super handy-dandy Lync Optimizer.  As you might expect, these options are available in the Lync Optimizer via the "Treat as National" option (only shows for North American dial rules).  You can select one of the following options with regards to dialing other countries within NANP:
  1. In-Country Only - Treat all calls to NANP countries other than your own as international, even though users don't have to dial 011 to reach them. As such, users will have to be a member of a voice policy that allows international dialing.
  2. US/Canada - Treat calls to anywhere in US/Canada as national calls, excluding the Caribbean. To dial Caribbean countries, users will have to be a member of a voice policy that allows international dialing. Not available for Caribbean rulesets.
  3. US/Canada/Caribbean - Treats calls to anywhere in NANPA (US/Canada/Caribbean) as national calls (along with the potentially higher call costs).

If you are creating a ruleset for a Caribbean country that uses +1 as the country code, you won't be able to select US/Canada, since this would make dialing within that Caribbean country difficult.

Selecting the Simple Ruleset option prevents usage of this feature, and will default to US/Canada/Caribbean. You won't be able to control how people dial other NANP countries when there is only a single simple routing rule.

So, there you go.  A lesson in the finer details of dialing numbers in North America that the rest of the world might not be aware of. Try to contain your excitement.

Monday, July 18, 2016

Using the Lync Optimizer to Configure a Single Multi-Country SIP Trunk

As SIP trunking becomes more reliable and trusted, companies are taking advantage of the ability to consolidate standard PSTN phone lines from multiple office locations into a single SIP trunk.  Most of the larger SIP providers can provide phone numbers for multiple countries, allowing for even further consolidation of telephony resources.

Skype for Business is very well suited to this sort of consolidation.  Typically, larger companies deploy large centralized Enterprise Edition pools in a few central datacentres that serve offices spread across entire continents.  These centralized pools in turn connect to a SIP trunk over a trusted, secure connection.

As mentioned, these SIP trunks can provide phone numbers for multiple countries, and price those calls as if they originated from those countries.  So, if a company has a SIP trunk in London, UK, they can give out phone numbers to other countries.  For example, users could have Germany-based phone numbers served out of the London SIP trunk, and calls from those numbers would be priced as if they originated from Germany. If a German user placed a call to another German phone number, they would be charged as a local or national call, even though the SIP trunk may physically reside outside of Germany.  Conversely, if that same German user called a London phone number, the call would be charged as an international call.

Setting up Skype for Business dial plans and voice policies in this situation can get tricky, but fortunately I've come up with a very clean and easy to execute plan on how this can be accomplished using the Lync (Skype) Dialing Rule Optimizer.

Let's use an example of a company with a central Skype for Business deployment in London, UK with a single SIP trunk assigned to that location.  This company has offices in the following locations:
  • London, UK
  • Berlin, Germany
  • Paris, France
All user phone numbers for all three locations are hosted out of the London SIP trunk.

The company has the following requirements:
  • Users must be able to dial as they are accustomed to in their respective country
  • Must be able to limit dialing for certain groups to the local, national or international level for their respective country
Let's assume this is a greenfield deployment, and nothing has been done yet in terms of Enterprise Voice configuration outside of getting the London SIP trunk connected to Skype for Business.

The first thing to do is to generate rulesets for each of the three locations using the Dialing Rule Optimizer.

First, London:

Then Berlin:

Note that I selected the "Force English Rulenames", which I did just because I can't read German or French and would prefer English rule names and descriptions throughout my Skype for Business deployment.

And finally Paris...

The resulting .PS1 rulesets are then copied to one of the Skype for Business servers. 

First, the London ruleset is applied to the deployment, creating user-level dial plans and voice policies when prompted.  When complete, there will be a single London dial plan and 3 voice policies. All routes use the single SIP trunk (as you would expect, since its the only one available).
Dial Plan page after running London ruleset
Voice Policy page after running London ruleset

In this state, we can easily assign a UK dial plan and voice policies to UK users as appropriate.

Next, run the rulesets for Berlin and Paris.  This is done for multiple reasons. Firstly, it will create German and French dial plans, so users in those countries can dial numbers as they do in those countries.  Secondly, it will allow administrators to assign country-specific local, national and international policies.  Someone assigned to a German national policy will only be able to dial phone numbers in Germany and nowhere else, regardless of the physical location of the SIP trunk.

When running the rulesets, make sure NOT to select the option to use least-cost routing. Least-cost routing only works with multiple SIP trunks from different providers. With a single SIP trunk, as in this example, implementing least-cost routing adds unnecessary PSTN usages to voice policies that ultimately don't accomplish anything.  If we make a call from a German number to London using a London route, the call will still be billed as an international call from Germany, as the SIP provider will see the German number as the source and bill accordingly.

To make things easier and faster to run, I recommend the use of the multiple PowerShell command-line switches available with rulesets.  It makes applying multiple rulesets faster and less prone to errors.  For the rulesets in this example, I used the following switches (using the Paris ruleset as an example):
 .\FR-Paris-Lync.ps1 -SiteID 1 -DialPlanType user -LeastCostRouting:$FALSE -OverwriteSiteVoicePolicy:$FALSE -LocationBasedRouting:$FALSE -PSTNGateway gatewayname -MediationPool mediationpoolname
For a full listing of the available command-line options, use the command:
Get-Help .\FR-Paris-Lync.ps1 -Full

Once done, the dial plans and voice policies will look as below.
Dial Plan page after running Berlin and Paris rulesets

Voice Policy page after running Berlin and Paris rulesets

This might seem more complicated than absolutely necessary, and you would be right. You could certainly just use London-based rules for all locations, but it would limit your options severely.

Setting up things in this way will allow users to dial numbers exactly as they do at home, and also allows administrators very granular control over dialing capabilities.  Common area phones in Paris can be limited to local dialing in Paris.  First level helpdesk employees in Germany can be limited to dialing only German numbers.

This shows how easy it is to use the Dialing Rule Optimizer to quickly setup flexible dial plans and voice policies in even the largest voice deployments.  If you have any questions about this method, please leave a comment.

Tuesday, March 8, 2016

Dealing with Trunk Prefixes in Phone Numbers in Skype for Business

First of all, I spend far too much time thinking about dial rules.

For those who don't know, a trunk prefix is a number (or numbers) that typically has to be dialled prior to dialling a phone number within the subscriber's country, but outside the subscriber's home area code.  The trunk prefix is a way to signal the telephone network that the dialed number is a long-distance or national-level call.

For example, the trunk prefix for UK national calls is 0.  If I were sitting in Liverpool and wanted to dial someone in London, I would have to dial 0, then the area code for London (20), then the subscriber phone number (say 3456 7890).  For example, 02034567890.

Not every country uses a trunk prefix, but those that do generally use 0 as a trunk prefix. The actual national trunk prefix usage breakdown (based on my research) is below:
  • 101 countries don't use a trunk prefix at all
  • 95 countries use 0
  • 26 countries use 1 (these are countries that are part of NANPA, like US/Canada and many Caribbean countries)
  • Russia and 5 other former USSR states use 8
  • Mexico uses 01
  • Hungary uses 06

NANPA countries are unique in that the country code 1 is also used as a sort of trunk prefix for long-distance calls. Italy is another unique case in that they don't have a national trunk prefix, but they DO use 0 as the beginning of the area code, which can confuse people.

Trunk prefixes are NOT used outside of the country. Continuing my earlier example, dialling that same London, UK phone number from the US would start with my international call prefix (011), then the UK country code (44), the London area code (20), and finally the subscriber number (3456 7890), resulting in 011442034567890.  If that US person tried to insert the UK national call prefix in the number (0114402034567890), it would fail to connect because trunk prefixes are not used outside the home country.

Now, the UK user shouldn't be expected to know the international trunk prefix for countries other than his own, so when presenting their number on business cards, web pages or emails, they should use the internationally recognized standard for phone number presentation, which is (DRUMROLL)......E.164!!!!  Regular readers of my blog and anyone who knows Lync/Skype for Business should be intimately familiar with E.164 formatted numbers.  If you need a refresher, look back at a very old (from 2010!), but still useful blog post on the subject.

The UK user should present their number like this:
+44 20 3456 7890

However, in many cases the UK user presents their number with the trunk prefix, like this:
+44 (0) 20 3456 7890
+44 (0) 203 456 7890

People within the UK understand what the (0) represents, but people outside the UK would assume the 0 is part of the phone number and try to dial it as shown, resulting in call failure. This particular practice is very common in the UK, France and to a lesser extent, Australia. There are likely others (please let me know). Below are some screenshots from various webpages from GLOBAL companies that exhibit this problem:

Astra Zeneca

BAE Systems

Rolls Royce

Since this blog sadly only reaches a very small percentage of the world's population, there's no way I can expect everyone to get their act together and format their phone numbers properly.  Others have ranted about this exact practice before, and it hasn't seemed to help.

As you may be aware, when you're using Internet Explorer with the Skype for Business/Lync client plug-in, many phone numbers on web pages can be directly dialled by clicking the wee little Skype/Lync icon beside the phone number on the right.  You can see this on the Astra Zeneca and Rolls Royce page samples above. You can click those icons, and it will dial the phone number as its presented on the page.  Clicking the number for Astra Zeneca's media department will give you this:

This is not going to work for anybody, since the trunk prefix 0 will be dialled as part of the number and will fail. You might be thinking that you could write a normalization rule that could deal with this, but Lync/Skype for Business will not normalize a number that has a + in front, because it assumes that any number starting with a + is already normalized.  No matter what you do, a phone number that already has the + can't be modified at the source.

If you're anywhere other than the UK, your international routes probably aren't so strict that it would "know" that putting the UK trunk prefix in the number is wrong, so Lync/SfB will happily send the call on through, only to get dumped by your PSTN carrier as an invalid number.

What you CAN do, is create a trunk translation rule that will catch any instances where a 0 is immediately after the country code and strip it before sending it to the PSTN.  There are only a few countries where 0 is actually a part of the phone number (land line numbers in Italy and mobile numbers in the Republic of the Congo), so the regular expression we create can deal with this.

The below rule should work nicely:
Pattern: ^\+(1|7|2[07]|3[0-46]|39\d|4[013-9]|5[1-8]|6[0-6]|8[1246]|9[0-58]|2[1235689]\d|24[013-9]|242\d|3[578]\d|42|5[09]\d|6[789]\d|8[035789]\d|9[679]\d)(?:0)?(\d{6,14})(;ext=\d+)?$
Translation: +$1$2
The first part of the expression with all the digits is the regex for every possible country code in the world. The (?:0) means that if 0 is present immediately after the country code, it will be dropped. Then we accept anything from 6 to 14 other digits.  We separate out Italy (39\d), Republic of the Congo (242\d) to allow for 0 in the cases where they are allowed in those countries.

If you are located in a country that is guilty of the crime of not using E.164 formatted numbers everywhere (I'm looking at you, United Kingdom), then there is the additional step of modifying the appropriate voice routes to allow for numbers coming through the system with a 0 in the wrong spot.  Using UK's national route as an example:

Old route pattern: ^\+44(1[1-9]\d{7,8}|2[03489]\d{8}|3[0347]\d{8}|5[56]\d{8}|8((4[2-5]|70)\d{7}|45464\d))$
New route pattern: ^\+440?(1[1-9]\d{7,8}|2[03489]\d{8}|3[0347]\d{8}|5[56]\d{8}|8((4[2-5]|70)\d{7}|45464\d))$

Note the presence of the 0? after the +44, which indicates that 0 may or may not be present for the route pattern to match. As of right now, the Optimizer will create modified routes only for countries that I'm aware of that flaunt the E.164 rules on a regular basis:

  • UK
  • France
  • Australia

If there are others, please let me know, either directly or via blog comments.

All the above logic has been incorporated into the Lync/Skype4B Dialing Rule Optimizer, so if you are using that tool for creating your EV dialplans/routing, then you'll be covered.

Friday, October 16, 2015

Capturing Network Traceroutes in Lync/Skype for Business

If you've ever trolled through the QoEMetrics database (a great way to while away a lazy Saturday afternoon) you may have come across a few tables that made you go "What the....?".  One table that might catch your eye is the TraceRoute table.  In all likelihood,  this table is probably empty, which might make you wonder what it's for.

Microsoft publishes pretty detailed information on the structure of the QoEMetrics database where you can find information about the TraceRoute table in this Technet article.

As you can surmise by the name, this table is meant to capture trace route information for calls. However, the table description gives no information on how to enable this feature.

You can enable this by adding a custom policy entry to a Lync/SfB client policy.  Custom policy entries are used to enable features that Microsoft has decided not to make too obvious for users for one reason or another.  If you've got custom policy entries, you'll see them at the top of the list when you run Get-CsClientPolicy.  If you have several of those, you can see it in a more readable format by typing (Get-CsClientPolicy policyname).PolicyEntry

The relevant policy entry for enabling tracerouting is called "EnableTraceRouteReporting", and you can add it to a client policy by running the following commands:
$x = New-CsClientPolicyEntry -Name "EnableTraceRouteReporting" -Value "TRUE"

$y = Get-CsClientPolicy -Identity policyname

Set-CsClientPolicy -Instance $y
Whoever has that policy applied to them will now publish trace route reports to the QoEMetrics database. However, the built-in Lync/SfB reports do not expose this anywhere, so you would only want to turn this setting on if you are using a 3rd party reporting and analytics tool such as Event Zero's UC Commander (FYI, if you don't already know, I work for them).

Sample screenshot of traceroute data as it appears in UC Commander

This can be very useful to help track down network issues in the call path. This won't necessarily point the finger at a specific switch or router in every circumstance, but it can help.

Be warned that enabling this will add a bit of size to your QoEMetrics database. The additional data isn't huge but its not negligible either.  You should carefully evaluate the impact before turning this on.

Monday, September 28, 2015

Response Groups Stop Responding

Our company (Event Zero - makers of the best Skype for Business analytics software out there BTW) relies on Skype for Business response groups for our sales and support queues.  Last week, I noticed that every single one of them on two separate pools were no longer accepting calls.  It wasn't a SIP trunk or mediation issue, because I couldn't get to them by directly entering in their SIP address in the Skype for Business client.  They appeared available (green presence) but would not accept calls. Snooper logs showed they were throwing 480 Temporarily Unavailable errors.

It was especially odd that it happened on two separate S4B pools at roughly the same time.  I tried numerous things, from restarting the RGS service on the affected servers to restarting the servers.  So, yeah, not a lot of tools in my RGS troubleshooting arsenal apparently.

What I found DID work, was to change the Tel URI of one of the workflows to a slightly different number, then changing it back.  Within a few minutes, that particular workflow started working again. 

Rather than doing the same thing to all 20-odd response groups (which would take a LOOONG time because the RGS Workflow web page is so slow), I created a Powershell script to do the same thing.

WARNING: Use script at your own risk. It worked fine for me, but hey I'm not a programmer. Also, this script will use the Description field of the workflow to store the original Tel URI of the response group.  Whatever is there now will get wiped out. You can do this script in other ways, but I was strapped for time and we weren't using the Description field for anything.
$Workflows = Get-CsRgsWorkflow 
Foreach ($WF in $Workflows)
 Write-Host "Adding dummy extension to " $WF.Name
 $WF.Description = $WF.LineURI
 $WF.LineURI = $WF.LineURI + ";ext=0"
 Set-CSRgsWorkflow -Instance $WF
Foreach ($WF in $Workflows)
 Write-Host "Reverting back to original number for " $WF.Name
 $WF.LineURI = $WF.Description
 Set-CSRgsWorkflow -Instance $WF

I don't know what caused the problem to start with, but at least this fixed it. Presumably, something happened to "break" the connection between RGS and the associated contact objects, and resetting the Tel URI re-linked them.

Hopefully, this might help others who come across the same thing. If anybody has any insight into why it broke, and why this fix worked, please enlighten me!

Friday, September 18, 2015

Testing Inbound PSTN Connectivity Directly from Lync/Skype for Business

File this one under "It's obvious once you think about it for even a second"...

I was recently attempting to get a SIP trunk working to a new Sonus gateway located in Event Zero's head office in Brisbane, Australia.  I wanted to test inbound calls to the gateway by calling one of our response group workflows.

Normally, I would pick up my mobile or home phone and dial the number to test, but since I was dialing Australia, and I'm a cheap bastard, I didn't want to do that.  It was also the middle of the night in Australia, so I couldn't get someone local to do it for me.  If I dialed the number from Skype for Business, then it would do a reverse-number lookup, find an internal match and route me to the response group internally without going through the PSTN, which wouldn't achieve my goal.

My goal was to route a call to our Australian response group via Skype for Business via the PSTN in the easiest, laziest, least-expensive-to-my-own-wallet way possible (this is sounding like an exam question).  Pick one of the following answers:
  1. Suck it up princess and eat the $2.05 it will cost to call Australia for a few seconds.
  2. Wake up someone in Australia at 2 in the morning, claiming its an emergency.
  3. Convince yourself that since outbound calls work fine, inbound should too.
  4. Use a trunk translation rule to route a dummy phone number to the correct destination.
  5. Give up and start drinking, because its Friday afternoon dammit!
The correct answer is #4 (although the judges will also accept #5).

If you've ever been at one of Doug Lawty's Enterprise Voice sessions at LyncConf or Ignite, you have probably seen his infamous diagram showing how a PSTN call works its way through the system.  To save you the pain, here the relevant details in a very small nutshell:
  1. Dialed number gets normalized to E.164
  2. Skype for Business does a reverse number lookup to see if there's an internal user or service that is assigned that number. 
  3. If there's a match, routes the call internally directly to the user/service.
  4. If there isn't an internal match, it finds a valid route through a PSTN trunk
  5. It applies any outbound number translation to make the number conform to local standards and sends it on its merry way out the gateway to the PSTN.
Since reverse number lookup happens before handing off the call to the routing engine, we can simply call a dummy PSTN number of our choosing, and once it reaches the trunk translation rule stage, change that dummy number back to the proper number assigned to the Response Group workflow.

In the above example, all I do is create a trunk translation rule to change the non-existent North American phone number +1 (555) 999-0000, and translate it to the proper number.  Once the call has progressed to the point where outbound translation is happening, Skype for Business won't be doing another reverse number lookup, so it will continue to route that number out to the PSTN, at which point it should route to the correct destination (which could be right back where it came from). 

So, a simple solution to overcome my unwillingness to waste money on stuff I don't have to.