Drinking From The Fire Hose – The Challenge Of ‘Big Data’ In The Contact Center

July 2012

Data produced by virtual/cloud-based contact centers has the potential to grow massively in a short time, and is virtually unlimited. Managers must carefully consider how to cope with, and extract value from, this big data load.

As more contact centers escape the confines of physical location and become virtual/cloud-based, the customer interaction data they produce starts to climb. It has the potential to grow massively, even exponentially, in a short time, and is virtually unlimited. This phenomenon has been dubbed ‘big data’, and if it hasn’t reached you yet, it may be closer than you think.

Managers must carefully consider how to cope with, and extract value from, this big data load; high-performance tools are needed that, as some have put it, can drink from the fire hose.

Infinite Scalability

In the good old days, call center systems were largely finite: ‘X’ agents over period ‘Y’ would produce ‘Z’ specific data items, creating ‘V’ volume of data. X and Y could be predicted and Z was static, so V could be calculated fairly accurately, and storage and retrieval systems for that data could be built accordingly.
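To make that arithmetic concrete, here is a minimal back-of-the-envelope sizing sketch in Python; the agent count, event rate and record size are hypothetical figures chosen purely for illustration.

    # Back-of-the-envelope sizing for a traditional, finite call center.
    # Every figure below is an illustrative assumption, not a measurement.
    agents = 100                       # X: number of agents
    days = 365                         # Y: reporting period in days
    items_per_agent_per_day = 400      # Z: fixed data items logged per agent per day
    bytes_per_item = 512               # average size of one stored record

    volume_bytes = agents * days * items_per_agent_per_day * bytes_per_item  # V
    print("Estimated volume for the period: %.1f GB" % (volume_bytes / 1e9))

With X, Y and Z fixed, V is a single predictable number; in an elastic cloud deployment every one of those inputs becomes a moving target.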

In the new world of cloud/hosted contact centers, systems have become almost infinitely elastic and therefore inherently unpredictable. As the variables within these systems multiply, their data becomes ‘big data’. For example:

  • Number of agents
    Hosted/cloud systems make scaling easy, and agent numbers will increase or decrease as demand dictates: 100 today might be many times that tomorrow.
  • Specific data items
    Systems once collected only a known set of per-call KPIs – talk time, wrap time, etc. The new world calls for infinite extensibility, even the definition of new metrics, and certainly the ability to customise and extend (an illustration follows below).

The period a report covers has not changed: it must still be configurable, spanning anything from minutes to years.
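As an illustration of that extensibility, a document store lets each interaction record carry arbitrary extra fields without any schema change. The sketch below uses pymongo against a hypothetical ‘interactions’ collection; the connection details, field names and the custom sentiment_score metric are invented for the example.

    from datetime import datetime, timezone
    from pymongo import MongoClient

    # Hypothetical connection details and collection name.
    client = MongoClient("mongodb://localhost:27017")
    interactions = client["contact_center"]["interactions"]

    # Standard per-call KPIs plus a customer-defined metric; no schema
    # migration is needed to add the extra field.
    interactions.insert_one({
        "agent_id": "agent-0042",
        "started_at": datetime.now(timezone.utc),
        "talk_time_secs": 312,
        "wrap_time_secs": 45,
        "outcome": "resolved",
        "sentiment_score": 0.73,   # tenant-specific custom metric
    })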

Tools That Can Cope

The speed of storage and retrieval at these data volumes is a major issue. Traditional SQL databases do not lend themselves to ultra-fast read/write, so users are looking elsewhere.

The breed of tools known as ‘noSQL’ (e.g. mongoDB) is built to deliver in this environment. These tools keep many familiar SQL concepts while gaining speed from functionality of their own, e.g. the ability to process many transactions simultaneously via contention-free, non-locking updates, and fast handling of unstructured data in a file system rather than a database.
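To show what a contention-free update can look like in practice, the sketch below uses MongoDB’s atomic $inc operator through pymongo: many agent sessions can bump the same interval counters concurrently without the application taking a lock. The database, collection and counter names are assumptions for the example.

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    stats = client["contact_center"]["interval_stats"]

    def record_call(interval_id, talk_time_secs):
        # $inc is atomic per document, so concurrent writers from many
        # agent sessions need no application-level locking.
        stats.update_one(
            {"_id": interval_id},                    # e.g. "2012-07-16T14:00"
            {"$inc": {"calls": 1, "talk_time_secs": talk_time_secs}},
            upsert=True,                             # create the bucket on first write
        )

    record_call("2012-07-16T14:00", 312)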

But capture of big data is only one side of the coin. The other is that the data is practically useless unless it can be ordered, analysed and processed. Only then can comparisons be made, patterns emerge and big data become business data.

To start making sense of the data explosion, the right tools are needed, and they are coming on stream. To speed up the production of meaningful results, many of them employ several methods of data aggregation.

  1. Real-time aggregation
    With this method, real-time data is used to update counts of events (e.g. live connects, abandoned calls) and other calculated metrics (e.g. average talk time). Running totals can then be used in reports without any further processing or database interaction, producing results much faster than otherwise possible. This method also reduces the need for the more time-consuming periodic aggregations described below.
  2. Periodic aggregation (MapReduce)
    Periodic aggregation (a.k.a. MapReduce) is commonly available off the shelf within noSQL systems. Using this method, the work of aggregation is carried out simultaneously by many processors, perhaps within the same server, perhaps across a widely distributed cluster. The results are then fed back to a master process, which presents them to a reporting system or writes them to a database. Because this method is more processor-intensive, it cannot be done in real time with big data. Sketches of both methods follow this list.
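As a sketch of the first method, the running totals kept by atomic counters like those shown earlier can be turned into derived metrics at read time, with no scan of individual interaction records; the ‘interval_stats’ bucket layout is the same hypothetical one used above.

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    stats = client["contact_center"]["interval_stats"]   # same hypothetical bucket as above

    # Real-time aggregation: derive a metric straight from running totals.
    doc = stats.find_one({"_id": "2012-07-16T14:00"}) or {}
    calls = doc.get("calls", 0)
    avg_talk = doc.get("talk_time_secs", 0) / calls if calls else 0.0
    print("Calls this hour: %d, average talk time: %.1f s" % (calls, avg_talk))

The second method is illustrated with a deliberately product-neutral toy: Python’s multiprocessing stands in for the many workers of a MapReduce job, each reducing its own shard of interaction records before a master process combines the partial results. In a real deployment this role would be played by the noSQL system’s own map-reduce facility rather than hand-rolled code, and the shard data here is invented.

    from multiprocessing import Pool

    # Toy shards of interaction records; in practice these would live on
    # different nodes of a distributed cluster.
    SHARDS = [
        [{"agent": "a1", "talk": 300}, {"agent": "a2", "talk": 180}],
        [{"agent": "a1", "talk": 240}, {"agent": "a3", "talk": 420}],
    ]

    def reduce_shard(records):
        # Map each record to (agent, talk time) and reduce within the shard.
        totals = {}
        for rec in records:
            totals[rec["agent"]] = totals.get(rec["agent"], 0) + rec["talk"]
        return totals

    def merge(partials):
        # Master process: combine the partial results from every worker.
        combined = {}
        for partial in partials:
            for agent, talk in partial.items():
                combined[agent] = combined.get(agent, 0) + talk
        return combined

    if __name__ == "__main__":
        with Pool(processes=2) as pool:
            partial_results = pool.map(reduce_shard, SHARDS)
        print(merge(partial_results))   # {'a1': 540, 'a2': 180, 'a3': 420}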

Using a combination of real-time and periodic processes, various levels of aggregation are possible, depending on the view of the data required: an hourly report, or a weekly, monthly or even yearly one. As the resolution zooms out, the level of detail required falls and a higher level of aggregation is possible. Because the metrics are already available, fast load times are maintained. Of course, no two users have the same requirements, so the levels of aggregation must be customisable.
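One way to build such a roll-up, sketched below under the same assumptions as the earlier examples, is to periodically collapse the hourly counter buckets into daily ones with an aggregation pipeline, so that a monthly or yearly report touches far fewer documents; a scheduled map-reduce job as described above would serve equally well.

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    db = client["contact_center"]

    # Collapse hourly buckets (keyed "YYYY-MM-DDTHH:00") into daily totals.
    daily = db["interval_stats"].aggregate([
        {"$group": {
            "_id": {"$substr": ["$_id", 0, 10]},          # "YYYY-MM-DD"
            "calls": {"$sum": "$calls"},
            "talk_time_secs": {"$sum": "$talk_time_secs"},
        }}
    ])
    for bucket in daily:
        print(bucket["_id"], bucket["calls"], bucket["talk_time_secs"])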

But care must be taken to maintain accuracy when the user needs it. Aggregation must occur alongside the capture of individual interaction events, so that data is not lost and then has to be approximated later.

Maybe you have not had to make any decisions about a ‘big data’ strategy yet; the fire hose may be safely trickling right now. But we suspect that any call center of 100+ agent seats will have to deal with this pretty soon. When the time comes, make sure you choose technology that is flexible enough to cope with future demand, so you will be waving, not drowning, when the fire hose is opened up.