Laying Down The Hammer

One of the exciting things about my day job is that I get to work in a rather large environment by most standards.  I consult for a Fortune 500 company that has around 5500 Cisco UCCE and CVP agents in a dozen countries and another 10,000 normal Cisco IP Phone users in another 24 locations around the globe.  On a daily basis, this presents us with interesting and often never-before-seen problems to tackle.

Last week’s problem involved load testing our new UCS B-Series virtualized platform, set to go live in April.  Now, 5500 agents can produce one hell of a load, not the load your typical call center sees.  We have four (4) voice DS3s between our 2 main facilities.  That’s 672 channels each, or 28 T1s.  We do nearly 1 million minutes a month between these circuits, and almost 2 million a month on a global scale.  Pretty awesome.

So when we talk about this much traffic, it’s important to performance test any new hardware before going live.  The tool used for this kind of performance test is generally called a ‘hammer’.  We typically hire a company for this.  There are a few out there that I will leave nameless, but in general it can cost as much as $50-100k for 8 hours of testing at a decent load for what we need.  That’s not something you get a lot of shots at, so the goal is to get it to work the first time.  The main points of focus for our first performance test are CVP 9.x, UCCE 9.x, AS5400 ingress gateways (DS3 terminators), and new 3945E VXML gateways (~800 VXML sessions each).

Two weekends ago we got on our first call with the vendor.  They fired up the hammer, pumped in 200 calls to start... and it died after about 130.  Our 3945E locked up in a datacenter in the frigid north at around 50 VXML sessions.  To make this more awesome, no one was on site, it didn’t crash, it just hung, and there were no logs.  Finally, after 90 minutes, someone got on site and rebooted it.  We considered that it may have been our debugs and fixed that.  We ran it again, and it died again.  2 hours and $15k later, we had literally accomplished nothing, and we followed up with TAC.

TAC can’t find anything because there are no logs, and, well, you need to get lucky to have a TAC engineer who knows their stuff when talking about call center related problems.  They want us to run the test again with them on the horn, which would cost us a mini fortune.  We are running 15.2(4)M5 on the 3945E, and during the test we realized our lab 2921 took a bunch of these calls (accidentally) and hit 100 VXML sessions on 15.2(4)M3.  We have a hunch we are running a bad IOS, but TAC can’t confirm.  We need a cheap retest to figure all this out.

In comes the interesting solution of the day.  We don’t want to pay another $15k, but we need a ton of calls.  These are the problems I live for.  There are a few options for generating a ton of traffic that I came across, and a few ideas I had myself.  First off, SIPp is a SIP traffic generator.  The problem for us is that we need to test the DS3, so this guy is out.  Now we are almost immediately into building our own custom software, so that’s what we will do.

Solution

Our first option is TCL on the router, used to generate a ton of calls.  I am pretty good at writing TCL, but it’s hard to debug, and I am pretty sure this would make me start crashing more routers, so it’s the last choice here.  Next, we could build a small CTI app and have it control some ports on CUCM like the outbound dialer does, but that’s tons of work and not going to happen in a few hours.  Next we consider using the Cisco Outbound Dialer we already have installed in production.  However, we only have 50 ports and need hundreds of calls.  Furthermore, it’s an awful product, and I would literally do anything to avoid using it, so I am back to TCL or CTI.

My last option is something I have never used before.  It’s called Twilio (www.twilio.com).  Twilio is an API-driven, completely programmable ‘phone system’ in the cloud.  They provide a host of REST APIs with which you can build outbound campaigns, custom IVRs, voicemail systems, and SMS platforms.  I take a quick look at Twilio, their pricing (2 cents per minute), and their features, and decide this is the obvious option.  Twilio provides libraries for Ruby, C#, Java, PHP, and many other languages for easy interfacing.  I am a total Ruby buff, so this is an easy call for me.

This is the part where I warn you: if you don’t know what you’re doing, make sure you DO NOT pump 350 calls into your system.  There is a good chance you will destroy something.  START WITH A LAB SYSTEM.

Before we build what do we need?

  • An application that can put a ton of calls into an IVR (unlimited?!?)
  • The call is answered by the IVR and put into an infinite MOH loop, since no agents are online.
  • The hammer notices the call has been answered, and generates some sound to keep the call up as well as test quality.
  • It needs to get done quick.

Alright, well, good news: after we sign up, Twilio gives us some Ruby code to easily generate a single call.  https://www.twilio.com/user/account/developer-tools/api-explorer/call-create

 


require 'rubygems' # not necessary with ruby 1.9 but included for completeness
require 'twilio-ruby'

# put your own credentials here
account_sid = 'Ibnsn034gnvuh9HierubviubIUGIUHH'
auth_token = '[AuthToken]'

# set up a client to talk to the Twilio REST API
@client = Twilio::REST::Client.new account_sid, auth_token

# create a single call (as generated, this is still missing some fields)
@client.account.calls.create({
  :from => '+11234567890',
  :method => 'GET',
  :fallback_method => 'GET',
  :status_callback_method => 'GET',
  :record => 'false'
})

This is pretty simple: load the library, create a Twilio client with the associated security tokens, and generate a call.  One crucial thing missing here is that you need some more fields to make this work.

One is the URL of the TwiML file.  TwiML is basically a custom XML format made by Twilio, but it is very close to VXML.  We are going to build a small XML file and toss it on Dropbox. https://dl.dropboxusercontent.com/u/56846391/playRecording.xml

The file contains the content below, which is pretty simple.  It says to play this audio file X times.  In our case this cowbell.mp3 is 52 seconds long, and we are playing it 6 times.  This means the call is up for about 5 minutes.  This is important because Twilio is not made to be a hammer, so it takes a while for hundreds or thousands of calls to start.  So sometimes you need to keep them up a while to get all the calls you want into the system.  We got about 350 into our system using the configuration I am showing you below.

<?xml version="1.0" encoding="UTF-8"?>
<Response>
<Play loop="6">https://api.twilio.com/cowbell.mp3</Play>
</Response>
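
If you want a different hold time, the loop count is just the target duration divided by the clip length, rounded up.  Here is a minimal sketch of that arithmetic in Ruby.  The helper name hold_twiml and its parameters are my own illustration, not anything Twilio provides, and it assumes you already know how long your audio clip is.

```ruby
# Build a TwiML document that loops one audio clip long enough to keep a call
# up for at least hold_seconds. hold_twiml is a hypothetical helper name.
def hold_twiml(audio_url, clip_seconds, hold_seconds)
  # Round up so the call stays up at least as long as requested.
  loops = (hold_seconds.to_f / clip_seconds).ceil
  # Escape the few characters that would break the XML.
  escaped = audio_url.gsub('&', '&amp;').gsub('<', '&lt;').gsub('>', '&gt;')
  <<~XML
    <?xml version="1.0" encoding="UTF-8"?>
    <Response>
      <Play loop="#{loops}">#{escaped}</Play>
    </Response>
  XML
end

# A 52 second clip held for 5 minutes (300 seconds) needs 6 loops.
puts hold_twiml('https://api.twilio.com/cowbell.mp3', 52, 300)
```

With the 52 second cowbell clip and a 300 second target this comes out to 6 loops, matching the file above.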

The next parameter is the to_number, which is the number you want the call delivered to.  The last important step is to generate more than one call, which we will do by tossing the call into a simple loop.  Final code for generating 350 calls is shown below.

require 'rubygems' # not necessary with ruby 1.9 but included for completeness
require 'twilio-ruby'

# put your own credentials here
account_sid = 'IUG8b8kG9ig87g1&GibIUHiuGIvb'
auth_token = '9ubi8ybuHBuyguVugfytfJg7h8i89'

# set up a client to talk to the Twilio REST API
@client = Twilio::REST::Client.new account_sid, auth_token

# 1..350 generates exactly 350 calls (0..350 would generate 351)
for i in 1..350
  @client.account.calls.create({
    :from => '+11234567890',
    :to => '+19876543210',
    :url => 'https://dl.dropboxusercontent.com/u/56846391/playRecording.xml',
    :method => 'GET',
    :fallback_method => 'GET',
    :status_callback_method => 'GET',
    :record => 'false'
  })
end
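
The loop above fires and forgets.  If you want to keep the call SIDs around and not have one failed API request kill the whole run, something like the sketch below could work.  This is not what we ran.  The place_calls helper and the FakeCallsApi stand-in are hypothetical names of mine, and I am assuming the object returned by calls.create exposes a sid.  In a real run you would pass @client.account.calls instead of the fake.

```ruby
# Place `count` calls through any object that responds to create(params),
# pausing between requests and collecting call SIDs and error messages.
def place_calls(calls_api, params, count, pause_seconds = 0.0)
  sids = []
  errors = []
  count.times do |i|
    begin
      call = calls_api.create(params)
      sids << call.sid
    rescue StandardError => e
      # Record the failure and keep going with the remaining calls.
      errors << "call #{i + 1}: #{e.message}"
    end
    sleep(pause_seconds) if pause_seconds > 0
  end
  { sids: sids, errors: errors }
end

# Stand-in for @client.account.calls so the sketch runs without credentials.
FakeCall = Struct.new(:sid)

class FakeCallsApi
  def initialize
    @n = 0
  end

  def create(_params)
    @n += 1
    raise 'simulated API failure' if @n == 3
    FakeCall.new("CA#{@n.to_s.rjust(4, '0')}")
  end
end

result = place_calls(FakeCallsApi.new, { :to => '+19876543210' }, 5)
puts "placed #{result[:sids].length}, failed #{result[:errors].length}"
```

Passing the calls object in as a parameter also makes it easy to try the loop against a fake before pointing it at a real system.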

Verification

 

In this case we are testing the VXML gateway, since it was the component that failed.  We log into the 3945E and see what it looks like with no calls using the ‘show voip rtp conn’ command.

VXML3945RTR#sh voip rtp conn
VoIP RTP Port Usage Information:
Max Ports Available: 8091, Ports Reserved: 101, Ports in Use: 0
Port range not configured, Min: 16384, Max: 32767
                                        Ports       Ports       Ports
Media-Address Range                     Available   Reserved    In-use
Default Address-Range                   8091        101         0
No active connections found

Now we will take a look at it with our Hammer generating 4 calls.

VXML3945RTR#sh voip rtp conn
VoIP RTP Port Usage Information:
Max Ports Available: 8091, Ports Reserved: 101, Ports in Use: 4
Port range not configured, Min: 16384, Max: 32767
                                        Ports       Ports       Ports
Media-Address Range                     Available   Reserved    In-use
Default Address-Range                   8091        101         4
VoIP RTP active connections :
No. CallId     dstCallId  LocalRTP RmtRTP LocalIP                                RemoteIP
1     2956       2962       17566    21308  10.180.153.3                            10.180.153.1
2     2959       2964       17568    22254  10.180.153.3                            10.180.153.1
3     2966       2969       17570    18582  10.180.153.3                            10.180.153.1
4     2971       -1         17572    20810  10.180.153.3                            10.180.153.1
Found 4 active RTP connections
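
If you are capturing this output repeatedly during a test, pulling the numbers out with a script beats eyeballing it.  Below is a rough sketch that only parses text already captured from the router.  How you collect it (SSH, console log, copy and paste) is up to you.  The function names are my own, and the patterns are based solely on the output shown above, so other IOS versions may format things differently.

```ruby
# Extract the "Ports in Use" counter from 'show voip rtp conn' output.
# Returns nil if the line is not present.
def rtp_ports_in_use(output)
  match = output.match(/Ports in Use:\s*(\d+)/)
  match && match[1].to_i
end

# Extract the active connection count from the summary line.
# Returns 0 when the router reports no active connections.
def active_rtp_connections(output)
  match = output.match(/Found (\d+) active RTP connections/)
  match ? match[1].to_i : 0
end

sample = <<~OUT
  VoIP RTP Port Usage Information:
  Max Ports Available: 8091, Ports Reserved: 101, Ports in Use: 4
  Port range not configured, Min: 16384, Max: 32767
  Found 4 active RTP connections
OUT

puts "ports in use: #{rtp_ports_in_use(sample)}"
puts "active connections: #{active_rtp_connections(sample)}"
```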

Conclusion

 

Twilio made a great quick-and-dirty hammer for generating some traffic.  The 350 calls for 5 minutes cost us about 6 dollars, as opposed to the $7,500 that one hour with a traditional vendor would have cost.  It took us about 3 hours from start to finish, not knowing anything beforehand and just taking shots in the dark.

It also proved that 15.2(4)M5 is the culprit, as it did crash again with this hammer.  We loaded up 15.2(4)M3 and it ran like a champion up to 350 calls.  Thanks for all the help TAC.

There are a ton of improvements to consider before using this tool on a regular basis or in any sort of intense situation.  Here are some of the things anyone should think about long-term when building a hammer.

  • Analytics – people want to know how many calls failed, were busy, which numbers these were, call durations, etc.  All this is super easy with Twilio’s API, but you have to build all the rest after you get the data.
  • Call Termination – There is seriously no worse situation with Twilio than having generated hundreds of never-ending calls and having no way to kill them.  Twilio has an API call to do this per call, but you need a serious tracking system for this and for analytics on a large scale.  I built this before I decided to potentially destroy my client’s datacenter routers.
  • IVR navigation – most clients like us have deep and complex IVRs and also want to test self service etc.  Twilio does have functionality for IVR navigation, but it is its own ballgame.
  • Test Cases – Most clients don’t want you to call one number 350 times, they want you to call a smattering of numbers randomly with random input.
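
To give a feel for the analytics and call termination items, here is a bare-bones sketch of the bookkeeping.  It is not the tracking system I built, just an illustration.  It assumes you have already collected a SID and a status string for each call (the status names below are the ones Twilio reports for calls), and it leaves the actual hangup request to a block you supply, since that part depends on your client library.  The names summarize_calls and hang_up_remaining are hypothetical.

```ruby
# Statuses that mean a call has not ended yet.
LIVE_STATUSES = %w[queued ringing in-progress].freeze

# Tally calls by status and list the SIDs that are still up.
def summarize_calls(records)
  by_status = Hash.new(0)
  records.each { |r| by_status[r[:status]] += 1 }
  still_up = records.select { |r| LIVE_STATUSES.include?(r[:status]) }
  { by_status: by_status, still_up: still_up.map { |r| r[:sid] } }
end

# Yield each live SID to the caller's block, which should issue the real
# hangup request. Returns the SIDs it attempted to end.
def hang_up_remaining(records)
  summarize_calls(records)[:still_up].each { |sid| yield sid }
end

# Made-up sample records for illustration only.
records = [
  { sid: 'CA0001', status: 'completed' },
  { sid: 'CA0002', status: 'in-progress' },
  { sid: 'CA0003', status: 'busy' },
  { sid: 'CA0004', status: 'ringing' }
]

summary = summarize_calls(records)
puts summary[:by_status].inspect
hang_up_remaining(records) { |sid| puts "would hang up #{sid}" }
```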

Thanks to Josh Kittle (@ciscovoicedude on Twitter) for spurring me to write this. I hope you enjoyed it.

Chad Stachowicz
cstachowicz@cloverhound.com
Twitter: https://www.twitter.com/chadstachowicz
LinkedIn: http://www.linkedin.com/pub/chad-stachowicz/1/981/a6