
Banks, Declutter Your Data Architecture!


Banks do not need to be wedded to complexity, says Navin Suri, Percipient’s CEO 

Marie Kondō’s bestseller, The Life-Changing Magic of Tidying Up: The Japanese Art of Decluttering and Organizing, is sweeping the world. Her message that simplicity pays off applies as much to a bank’s data architecture as it does to a person’s wardrobe.

Few bankers would argue with the notion that the IT architecture in banks is overly complex and as a result, far less productive than it could be. So how did we get here? Rather than a single blueprint, most banks’ IT evolved out of the global financial industry’s changing consumer demands, regulatory requirements, geographic expansion, and M&As. This has led to a tangled web of diverse operational systems, databases and data tools.

Rapid Digitisation

But rapid digitisation has put this complex architecture under further stress. Amid dire warnings, such as the one from Francisco González, then at the helm of BBVA, that non-tech ready banks “face certain death”, many rushed to pick up the pace of their digital transformation.

Banks rolled out their mobile apps and digital services by adopting a so-called “two-speed infrastructure”, that is, enhanced capabilities at the front, built on a patchwork of legacy systems at the back. Now over a third of all banks, according to a 2015 Capgemini survey, say “building the architecture/application infrastructure supporting transformation of the apps landscape” is their topmost priority.

Fragmented Infrastructure

Meanwhile a key reward of digitisation – high value business intelligence – remains elusive. Banking circles may be abuzz with talk of big data, but the lack of interoperability across systems makes this difficult to achieve. In some cases, cost effective big data processing technologies like Hadoop have actually deepened the problem by introducing yet more elements to an already unwieldy architecture.

To address the problem, financial institutions have opted for two vastly contrasting approaches. Either paper over the cracks with a growing number of manual processes, or bite the bullet, as UBS is doing. The world’s largest private bank announced in October last year that it would spend US$1 billion on an IT overhaul to integrate its “historically fragmented infrastructure”.

Attack On Complexity

However, for those banks unable or unwilling to rip out and replace their existing systems, there is a third way. The availability of highly innovative open source software offers banks the option of using middleware to declutter and integrate what they have.

Percipient’s data technology solutions, for example, enable banks to pull together all their data without the need for data duplication, enterprise data warehouses, an array of data transformation tools, or new processes and skills. These solutions are, at their core, an attack on the architectural complexity that banks have come to grudgingly accept.
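To make the idea concrete, here is a minimal sketch, not of Percipient’s product, but of the general middleware pattern it describes: query data where it already lives and join it in memory, with no copy into a new warehouse. It assumes a PySpark runtime, and the JDBC connection details and file path are hypothetical placeholders.

```python
# Minimal sketch of warehouse-free data federation with PySpark.
# The JDBC URL, credentials and file path below are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("declutter-sketch").getOrCreate()

# Read customer records directly from an operational database (no extract-and-copy step).
customers = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://core-banking:5432/bank")
    .option("dbtable", "customers")
    .option("user", "reader")
    .option("password", "change-me")
    .load()
)

# Read transaction history straight from the existing Hadoop data lake.
transactions = spark.read.parquet("hdfs:///lake/transactions/2016/")

# Join and aggregate in memory: no duplication into an enterprise data warehouse.
spend_by_customer = (
    transactions.groupBy("customer_id")
    .sum("amount")
    .join(customers, on="customer_id")
)
spend_by_customer.show()
```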

Visible Order

As Marie Kondō points out, “Visible mess helps distract us from the true source of the disorder.” In the case of most banks, the true source of the disorder appears to be an IT infrastructure derived, rather than designed, to meet the huge demands placed on it by digitisation. There is now a real opportunity to turn this visible mess into visible order.

This article was a contribution to, and originally appeared in, finews.asia

Packing Real Punch Into Customer 360s

In marketing circles, the buzzphrase for the first quarter of 2017 was Customer Data Platform (CDP).

Although the term was coined in 2013, it was Gartner’s decision in July 2016 to introduce it as a new industry category within its digital marketing “hype cycle” that has given it real legs.

Un-holistic

To date, enterprises have relied on their CRM, channel or transaction systems to provide them with customer views. But these have been far from “holistic”, with the ambition to build a Customer 360 platform largely hampered by data silos and technology bottlenecks.

According to advocates, CDPs elevate UYC (Understand Your Customer) initiatives to a whole new level by unifying all customer data from marketing, sales and service channels into one database or interface. This is then made available to the entire organisation as an integrated view of each customer, rather than as an anonymised view of broad customer segments, as is the case with other data platforms.

Hence your platform isn’t a CDP unless it boasts the following features:

  • The ability to track a customer’s activity within an enterprise. This must apply to all touch points, whether traditional or digital, and capture the when, what, how and why of every transaction.
  • The ability to plot the customer’s complete and personalised journey by piecing together data gathered from the customer’s devices, channels, and engagements. By so doing, enterprises are able to discern the customer’s choices, experiences, and ultimately, sentiment.
  • The ability to support marketers across multiple customer facing applications. That includes helping such teams design their product recommendations, conduct cross sell optimisations, track customer retention and attrition, and manage their advertising and branding.
  • The ability to present a single source of truth by maintaining a persisted and updated profile of each customer. This profile should be usable across the entire enterprise and hence drive real time insight, decision making and execution relevant to the individual customer (a minimal profile sketch follows this list).
  • The ability to ensure data privacy and governance standards are maintained despite the shift from segmented to individualised customer data. This includes strict limits on the number of data copies and minimising the risks of data leakage.
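As a concrete illustration of the persisted, continuously updated profile described above, here is a minimal sketch. It is not modelled on any particular CDP product; the channels, field names and events are assumptions chosen purely for illustration.

```python
# Minimal sketch of a unified customer profile that folds events from several
# channels into one persisted record. Fields, channels and events are assumptions.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CustomerProfile:
    customer_id: str
    identifiers: Dict[str, str] = field(default_factory=dict)   # e.g. email, device id
    events: List[dict] = field(default_factory=list)            # full activity history
    consent: Dict[str, bool] = field(default_factory=dict)      # governance flags

    def ingest(self, event: dict) -> None:
        """Fold one touch-point event (web, branch, call centre...) into the profile."""
        self.identifiers.update(event.get("identifiers", {}))
        self.events.append(event)

    def last_channel(self) -> str:
        return self.events[-1]["channel"] if self.events else "none"


profile = CustomerProfile(customer_id="C-001")
profile.ingest({"channel": "mobile_app", "action": "login",
                "identifiers": {"device": "ios-123"}})
profile.ingest({"channel": "branch", "action": "mortgage_enquiry",
                "identifiers": {"email": "jane@example.com"}})
print(profile.last_channel(), profile.identifiers)
```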

More than just a Single Customer View

Importantly, CDPs are not just about embellishing a customer’s profile data or even establishing a single customer view. Embedded in the concept of CDPs is the ability to act on the intelligence that CDPs provide.

The ownership of the CDP is also important. According to technologist David Raab, founder of the Customer Data Platform (CDP) Institute, the CDP represents “one of the few fundamental changes in marketing technology in the past decade, because it shifts control of the customer database from IT to marketers.”

At the core of a CDP is a marketer-managed database that is accessible to external systems, designed to support, for example, web-marketing campaigns that go beyond simple targeted promotions. CDPs must be capable of delivering pinpoint customer specificity: web content, product recommendations and service alerts that are entirely customised to the individual.

The making of a Customer Data Platform

Enterprises seeking to build their own CDP, or use one or more vendors to do so, can leverage the many open source technologies now readily available. Elements of a CDP include connectors to a variety of data sources, a data store for structured and unstructured data, tools for data preparation flows, identity resolution processes, artificial intelligence systems, and integration to customer applications.
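Of these elements, identity resolution is often the least familiar, so here is a minimal, deterministic sketch of the idea: records sharing an email address or phone number are assigned the same customer key. The field names are assumptions, and production systems typically add probabilistic matching and retroactive merging of keys.

```python
# Minimal sketch of deterministic identity resolution.
# Records that share an email or phone number receive the same customer key.
def resolve_identities(records):
    """Return a mapping of record index -> customer key, using shared identifiers."""
    key_for_identifier = {}
    assignments = {}
    next_key = 0
    for i, rec in enumerate(records):
        identifiers = [v for v in (rec.get("email"), rec.get("phone")) if v]
        # Reuse an existing key if any identifier has been seen before.
        known = [key_for_identifier[x] for x in identifiers if x in key_for_identifier]
        key = known[0] if known else next_key
        if not known:
            next_key += 1
        for x in identifiers:
            key_for_identifier[x] = key
        assignments[i] = key
    return assignments


records = [
    {"email": "jane@example.com", "phone": None, "channel": "web"},
    {"email": None, "phone": "+65-5550-0001", "channel": "call_centre"},
    {"email": "jane@example.com", "phone": "+65-5550-0001", "channel": "store"},
]
# Single greedy pass: {0: 0, 1: 1, 2: 0}. A production system would also merge
# key 1 into key 0 once record 2 reveals that both belong to the same person.
print(resolve_identities(records))
```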

While some or all of these elements may already be in operation at an enterprise, CDPs force technology and data science teams to support a key digital shift: personalised interactions based on a holistic, data-driven view of every customer. For most marketers today, this remains wholly out of reach.

2016 Revelations

 

Big data can be funny too…no…seriously! 


As the year draws to a close, and we treat ourselves to some well-earned merry making, here is a look back at some 2016 big data events that prove big data isn’t all dull and serious.

A World of How, What and Why


Google’s annual list of most googled terms is always revealing and this year’s is no different. Australia’s Top Ten “How to” searches in 2016 suggest that Australians are still concerned with the challenges of daily life: “How to tie a tie?” and “How to get rid of pimples?”

UK-based Googlers, on the other hand, were concerned with a somewhat more esoteric question: “How to make slime?”. Their Top Ten also included the more sadly philosophical: “How to accept myself for who I am?”.

Meanwhile, the “Why” question uppermost with Swedish Googlers in 2016 was “Why are eggs brown or white?”. Clearly, the world is still full of innocents.

(Extra) Ordinary Gifts


This year’s Singles Day has been deemed Alibaba’s most successful yet, with sales reaching USD 17.79 billion, compared to USD 14.3 billion last year. Unsurprisingly, top sellers were phones and appliances.

But everyday household items are also big hits on Singles Day. Hera BB cream from Korea and Laurier sanitary napkins from Japan are traditional favourites, as is milk.

Last year, one German supplier alone accounted for 2.35 million litres (USD14.3 million) of Singles Day sales of liquid milk. On the same auspicious day this year, an Australian manufacturer was able to shift 350,000 goat soap items worth over USD 1 million.

Deep Learners?


Alongside top trending searches, Google’s 2016 Breakout Searches (ie year-on-year searches that rose by 5,000% or more) were equally eye-catching.

While “Severe Weather” featured in Germany’s list, “Nobel Prize” in Sweden, and “Melania Trump” in Slovenia, China’s list contained only two: “Deep Learning” and “Machine Learning”.

This presumably has something to do with ordinary people wanting to understand what organisations – from the government to corporates – are doing with their data. For example, e-retailer Shangpin.com has determined that Chinese consumers prefer to shop for underwear in the late evening. While it is still unclear how this will drive future marketing, the retailer says such “subtleties” provide valuable insights into the Chinese psyche.

Better Restroom Service


2016 also provided some insight into Japanese restroom habits and preferences. A report by the Japan Times revealed that highway operator, Central Nippon Expressway Co. (NEXCO Central) had installed 3,000 sensors, including motion detectors for toilet bowls, in 51 restroom locations along the Shin-Tomei Expressway.

Analysis of the data collected suggests that cubicle use is gaining on urinals, and that the average cubicle visit now lasts four minutes and four seconds, up 35 seconds in seven years, a rise which NEXCO puts down to the increased use of mobile phones in restrooms.

NEXCO expects this data to help it improve its service. For male commuters faced with long cubicle waiting times, this will be a relief.

It’s Been Weird


Spotify dug deep into its data stores to come up with one of the year’s most interesting end-of-year campaigns. Displayed on billboards at select locations around the world, the campaign highlighted the behaviour of some of the site’s users, along with Spotify’s own wry reactions to it.

Under a general tagline of “Thanks, 2016. It’s been weird”, examples included:

  • Dear 3,749 people who streamed “It’s The End Of The World As We Know It” the day of the Brexit vote. Hang in there.
  • To the 1,235 guys who loved the Girls’ Night playlist this year. We Love You.
  • Dear person who played “Sorry” 42 times on Valentine’s Day. What did you do?

The campaign was developed by Spotify’s in-house team. Chief Marketing Officer Seth Farbman said the data was inspiring and gave an insight into people’s emotions.

We agree – where can you detect more emotion than in a person’s playlist?

To personalise or not: WhatsApp re-ignites the data sharing debate

WhatsApp announced last month that it would begin sharing its user data with its parent company Facebook, including for the targeting of advertisements. The news was met with widespread consternation.

The timing, too, could not have been more unfortunate, coming just a fortnight after a landmark US Court of Appeals ruling in favour of Microsoft’s bid not to hand over customer emails to the US federal government.

Also playing on many minds is the EU’s General Data Protection Regulation (GDPR), adopted in April this year after four years of debate, and due to come into force by mid-2018. Corporations located in EU member states are already scrambling to ensure compliance with requirements that are variously described as “onerous”, “radical” or “doesn’t go far enough”.

As both the private and public sectors struggle to draw the line between data privacy and data utility, it is perhaps time for individuals to ask themselves the same questions. What do I stand to gain from government and corporate use of my data? And when does this cross the line?

This is not just a philosophical exercise. In fact, data privacy advocates strongly espouse the concept that an individual’s personal data belongs to the individual. Governments and corporations are deemed “temporary custodians” of the data they collect, and can choose to offer personalised services, but based only on the information that individuals are willing to share with them.

Billed as a first of its kind, the high profile MyData 2016 conference taking place in Helsinki this week promotes this view of “human-centric data management”. Organisers of the event take pains to stress that their intention is not to stifle innovation, but rather to lay the ground rules for the ethical use of data. However, implicit in the conference themes is the notion that the scales may now have tilted too far in favour of the organisation versus the individual.

Aside from the issues of data security (ie keeping data safe from hackers and unauthorised persons), research suggests that individuals greatly fear the improper use of big data to drive key decisions made about them. Media, internet, telecommunication and insurance companies are said to face the greatest “data trust deficit” and need to make the most effort to ensure that their brand is associated with data transparency and accountability.

On the other hand, individuals should not under-estimate the extent to which big data mining has become an expectation. Research by the Aberdeen Group suggests that 74 percent of online consumers actually get frustrated with website offers and promotions that have nothing to do with their interests. The research also found that more than half of all consumers are now more inclined to use a retailer if it offers a good personalised experience.

The key appears to be control. According to CMO.com, more than 60% of online users, while valuing personalisation, sought to understand how websites select such content. A similar number wanted the ability to influence the final results by proactively providing or editing personal information about themselves.

However, it is not marketing websites but IoT devices and wearables that will be the biggest test of users’ embrace of, or disaffection with, big data. Today’s low cost wearables are effectively subsidised by the potential monetisation of the data that these devices are able to generate. This data is vast and highly personal, and while the IoT trend is not new, experts expect the adoption of wearable technology to escalate dramatically over the next few years.

It is therefore increasingly incumbent on the industry to put in place measures that protect this data from abuse. Manulife’s MOVE and AIA’s Vitality programmes are classic examples of how it is possible to align individual and corporate objectives to the benefit of both. Individual policy holders are provided with fitness trackers that record workout data, and are rewarded for healthy behaviour through discounts or points. This is done by ensuring workout data is explicitly collected and consent explicitly provided, with a firm undertaking that the data will not be shared or used for other purposes (a minimal consent-gating sketch follows).
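The consent-gating idea can be shown in a few lines. This is only an illustrative sketch of the general pattern, not how MOVE or Vitality are actually implemented; the reward rule and data fields are assumptions.

```python
# Illustrative sketch of consent-gated collection of wearable data.
# The reward threshold and data fields are assumptions, not any insurer's actual rules.
class WorkoutCollector:
    def __init__(self):
        self.consent = {}   # policy_holder_id -> explicit opt-in
        self.store = {}     # data retained for the stated purpose only

    def grant_consent(self, holder_id):
        self.consent[holder_id] = True

    def record_workout(self, holder_id, steps):
        if not self.consent.get(holder_id, False):
            raise PermissionError("No explicit consent: workout data not collected")
        self.store.setdefault(holder_id, []).append(steps)
        # Reward healthy behaviour without passing the raw data to any other party.
        return "discount_point" if steps >= 10_000 else "no_reward"


collector = WorkoutCollector()
collector.grant_consent("PH-42")
print(collector.record_workout("PH-42", 12_500))  # discount_point
```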

It is said that with great power comes great responsibility. Big data has the power to transform organisations and disrupt industry practices. But in delivering the personalisation that individuals now demand, organisations cannot lose sight of their responsibility to maintain the individual’s innate desire for self-determination, even in the digital world.

Four Myths About Real Time Analytics


Five years ago, businesses beyond the e-commerce world were only dipping their toes into real time analytics. Today, brick and mortar businesses have embraced real time analytics in a range of applications, including error detection, price adjustment, inventory tracking, customer experience management, and more.

So what has changed? Certainly, advances in both hardware and software have helped. But at the heart of this reassessment is the data industry’s ability to dispel a number of myths surrounding real time data processing and analytics.

Here are four of the most deeply entrenched:

  1. Real time intelligence is only a nice-to-have

 A common misunderstanding about real time analytics is that it is the same as batch analytics, just conducted over a shorter time frame. Therefore, as long as important business reports are meeting required deadlines, intelligence in real time is regarded as less than critical.

In fact, in contrast to the analytics conducted on batch (ie static) data, real time analytics is not designed to diagnose key business trends. Rather, real time analytics is performed continually on streamed data in order to detect unusual events as they happen, and to trigger an immediate or quick response.

As such, real time analytics is best suited for processes which stream large volumes of data. Where the analysed data is signalling something amiss, fast action – usually dictated by pre-set business rules – can prevent it from escalating into a full blown problem.

For this reason, real time analytics’ greatest value is its ability to avert losses, for example, by detecting financial fraud or manufacturing defects. There is also the potential to improve routine operational efficiencies, such as through retail inventory management or telecoms network optimisation. Real time analytics is therefore deemed to deliver “operational intelligence”, and is complementary to, rather than a substitute for, “business intelligence”. A minimal sketch of this detect-and-respond pattern follows.
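The sketch below is only an illustration of the pattern just described: continuous analysis of a stream with a pre-set business rule that triggers an immediate response. The simulated transaction stream, threshold and alert action are all assumptions.

```python
# Minimal sketch of rule-based real time detection on a stream of events.
# The simulated stream, threshold and alert action are illustrative assumptions.
import random
import time

FRAUD_THRESHOLD = 500.0   # pre-set business rule


def transaction_stream():
    """Stand-in for a real event stream, e.g. card transactions arriving continuously."""
    while True:
        yield {"card_id": random.randint(1, 5), "amount": random.expovariate(1 / 80)}
        time.sleep(0.01)


def monitor(stream, max_events=500):
    """Analyse each event as it arrives, rather than waiting for a nightly batch."""
    for event in stream:
        if event["amount"] > FRAUD_THRESHOLD:
            print(f"ALERT: card {event['card_id']} spent {event['amount']:.2f}, "
                  f"above the {FRAUD_THRESHOLD} threshold")
        max_events -= 1
        if max_events == 0:
            break


monitor(transaction_stream())
```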

  2. The prices of my products/services do not change very often

Businesses are increasingly using real time analytics for another innovation – real time pricing. E-commerce sites have long used real time factors and complex algorithms to drive active price adjustments. What is perhaps less well known is the magnitude and velocity of these adjustments. Amazon, for example, changes the prices of about 40 million products many times during a single day, according to Internet Retailer Magazine.

The confidence that e-commerce sites have in their pricing models is now reflected in many traditional businesses. A report in the Wall Street Journal noted, for example, that adult passes to the Indianapolis Zoo can range between USD 8 and USD 30, with prices responding in real time to weather changes, school group bookings, the surprise closure of an attraction, and a variety of other factors.

Many other facilities now adopt similar real time pricing mechanisms, including highway tolls, parking lots, taxi services, golf courses, ski resorts, theme parks, and entertainment events. In the case of highway tolls, prices can fluctuate by as much as 500% in a single week. This real time approach to pricing is said to be capable of lifting revenues by 10 to 20%.

However, real time pricing offers benefits beyond that of higher revenues. Bottom line consistency and customer experience are often as important, with dynamic price adjustments used to control demand peaks and troughs, and the associated resourcing costs. The question traditional businesses need to ask themselves is therefore not ‘How often should we review our prices?’ but rather ‘How can we approach pricing differently?’ (a toy pricing sketch follows).
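As a toy illustration of the kind of rule-driven adjustment described above, the sketch below adjusts a base price for weather, expected demand and attraction closures. The base price, factors and weights are invented for illustration and do not reflect any operator’s actual model.

```python
# Toy sketch of rule-based dynamic pricing.
# The base price, factors and weights are illustrative assumptions only.
def dynamic_price(base_price, weather_score, expected_demand, capacity, closures=0):
    """Return a price adjusted in real time by simple multiplicative rules."""
    price = base_price
    price *= 1.0 + 0.3 * weather_score              # good forecast: up to +30%
    utilisation = min(expected_demand / capacity, 1.5)
    price *= 0.8 + 0.4 * utilisation                # quiet day discount, busy day premium
    price *= 1.0 - 0.1 * closures                   # 10% off per closed attraction
    return round(max(price, 0.5 * base_price), 2)   # never fall below half the base price


# A sunny, over-subscribed Saturday versus a rainy Monday with one attraction shut.
print(dynamic_price(20.0, weather_score=0.9, expected_demand=9000, capacity=8000))              # 31.75
print(dynamic_price(20.0, weather_score=0.1, expected_demand=2000, capacity=8000, closures=1))  # 16.69
```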

  3. Real time information is unsubstantiated and therefore less valuable

The term “real time” gives the impression of fleeting observations that are not verifiable. However, in most cases, real time analytics is not based simply on what has just occurred. Rather, such analysis involves correlations between real time and historical datasets, thereby placing a real time event in its proper context.

Take for example how a GPS now re-routes vehicles in order to avoid traffic congestion. The analysis required to do this is based not just on real time traffic information, but on its correlation with previous congestion events, in order to determine the best alternative routes that vehicles should take.

It is the ability to combine streamed real time data with historical data and data from other sources that gives real time analytics its greatest potency, as the short re-routing sketch below illustrates.
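The re-routing decision can be sketched as a comparison between a live reading and the historical norm for the same road and hour. The baseline figures and the 1.5x re-routing rule below are assumptions made purely for illustration.

```python
# Minimal sketch of combining a live reading with historical context.
# The historical averages and the 1.5x re-routing rule are illustrative assumptions.
HISTORICAL_MINUTES = {       # typical travel time by (road, hour of day)
    ("A1", 8): 22.0,
    ("A1", 14): 12.0,
    ("B7", 8): 30.0,
}


def should_reroute(road, hour, live_minutes, factor=1.5):
    """Re-route only when the live reading is well above the historical norm."""
    baseline = HISTORICAL_MINUTES.get((road, hour))
    if baseline is None:
        return False             # no historical context, so no confident decision
    return live_minutes > factor * baseline


print(should_reroute("A1", 8, live_minutes=40))   # True: 40 is well above the usual 22
print(should_reroute("A1", 14, live_minutes=15))  # False: within the normal range
```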

  4. Real time analytics is too expensive to implement

 Real time analytics is not without its complexities, but does not necessarily require a large business investment. This is because the evolution of open-source data technology has resulted in significantly more cost-efficient real time software. Spark, for example, is an open source project that is set to underpin much of the industry’s real time analytics capabilities.

In addition, rather counter-intuitively, the data architecture stack required to support the integration of real time feeds with existing batch based data processes can actually be a cheaper alternative to current storage technologies.

This is because the architecture needed to run real time analytics eliminates existing storage inefficiencies. These stacks, built entirely on open source platforms, enable analytics to be performed incrementally and in memory, while also avoiding the need for large hardware investments.

Today, there are few excuses left for businesses not to adopt real time analytical capabilities. While the insights generated by analysing historical data can help businesses arrive at strategic decisions, real time analytics produces a different kind of insight: the triggers that enable businesses to tweak processes and refine experiences.

Are You Open To Open Source?


Navin Suri, Percipient’s CEO, gives six compelling reasons why organisations should leverage open source software

Copy of a copy of a copy…

Already faced with crippling data storage bills, the last thing enterprises need is to waste money on storing multiple copies of the same data

A ground-breaking study by Veritas Technologies, published last month, found that on average 41% of an enterprise’s data is “stale”, that is, it has not been modified in the past three years. The study estimates that, per enterprise, this amounts to as much as USD 20.5 million in additional data management costs that could otherwise be saved.

Orphans on the rise

For some industries, for example banking, a portion of this data is stored to meet regulatory requirements. But according to Veritas, the majority is just the result of a passive approach to data storage. For example, “orphaned data”, ie data without an associated owner because of personnel changes, is a particular culprit. Not only is such data on the rise, but it comprises presentations and images that take up a disproportionate share of disk space, and it goes unattended for even longer than other stale data.

Veritas’s report highlights the fact that data growth, estimated to be as much as 39% p.a. in the US, is driven both by the growing number of files stored and by a doubling of average file sizes in the past decade. Veritas stresses the need for enterprises to prioritise how they manage their data, that is, what to store, archive or delete.

Needless duplication

What the report fails to mention however is how much of this stale data is actually the result of multiple data duplicates. In large organisations that house several functional departments, analytics teams and data warehouses, it is common for one team to copy already-copied data, which is then copied again for a different purpose. Copies are also made where back-ups are required, further exacerbating storage costs, error risks and the risk of contravening data security laws.

Ironically, many big data management solutions are actually contributing to the problem. For example, in order to combine data stored in a traditional EDW with that moved to a Hadoop data lake, the data is often re-copied and re-stored into expensive solution-specific servers. It is also not unusual for whole databases to be copied in order to query only a limited number of datasets.

Painful lessons

In fact, there could be as many as 30 copies of the same data within a single organisation, according to Iain Chidley, GM at software company Delphix. He warns that enterprises can end up storing ten times more data than they had originally anticipated. While cloud storage is now a cheaper alternative to traditional data warehouses, substantial amounts of money are still being wasted if all this duplicated data is moved wholesale into the cloud.

Clearly, major clean-ups are called for in many large companies. But avoiding the copying process wherever possible can help drastically limit further storage challenges. And for those not yet facing these challenges, prevention is of course better than cure.

RAMp it up

The solution lies in in-memory processing, which eliminates the need to write data to physical disks, track copies, and delete them when no longer required. Instead, data is processed in a computer’s memory or RAM, which has fallen sharply in price over the past three years. Meanwhile, newer 64-bit operating systems now offer one terabyte or more of addressable memory, potentially allowing an entire data warehouse to be cached.

Conducting analytics in-memory also reduces the need for some ETL processes, such as data indexing and storing pre-aggregated data in aggregate tables, thereby further reducing IT costs. The application of the revolutionary open source software, Apache Spark, which supports in-memory computing, is yet another step towards low cost big data processing and analytics.
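As a brief illustration of the in-memory approach, the sketch below assumes a PySpark environment: a single dataset is cached in RAM and several aggregations are then served from that cache, with no additional copies written to disk. The file path and column names are hypothetical.

```python
# Minimal sketch of in-memory analytics with PySpark: cache one dataset in RAM
# and run repeated queries against it, instead of materialising extra copies.
# The file path and column names are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("in-memory-sketch").getOrCreate()

transactions = spark.read.parquet("hdfs:///lake/transactions/")
transactions.cache()   # keep the working set in memory rather than on new disks

# Different teams can query the same cached dataset; nothing is re-copied or re-stored.
daily_totals = transactions.groupBy("trade_date").agg(F.sum("amount").alias("total"))
top_accounts = transactions.groupBy("account_id").count().orderBy(F.desc("count")).limit(10)

daily_totals.show()
top_accounts.show()
```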

These technological advances make it possible for enterprises to embrace a vast uptick in the amount of data that can be accessed and analysed, while at the same time making substantial storage cost savings.

Who says you can’t have your cake and eat it too?