
Thursday, 18 June 2015

Another "What is BIG DATA" blog

Every other technical blogger must have written something about Big Data.  Big data is not a new concept.  In Gartner's Hype Cycle this year, big data has already gone past its peak and is on the downward slope.  So why another Big Data blog, and why should you read it?  Because this one is about the Big Data that has existed since the beginning of the earth.

I love simplicity and strongly believe that if your basics are strong, everything else becomes simple once you put it in perspective.  This blog is inspired by my 9-year-old son.

Call it father time: I have been trying to reason out everything with my kids.  I believe in God, and as a traditional father I want my kid to do the same.  Here is the conversation I had with my kid last week.

Me :  Do not talk rudely to your mother or brother
Son :  Why?
Me :  It is bad to say rude things to others.  God is always watching what you do
Son :  Why is he always watching me?  Does he not have any other work?
Me :  God watches everyone and everything, son.  That is why he is God.  He looks at all the good and bad everyone does and rewards / punishes them accordingly
Son :  He has only 2 eyes, so how can he watch everyone at the same time, and all the things that everyone is doing?  How does he remember so many things?

Damn.  BIG DATA!!!!  The Almighty has been using BIG DATA since the first day of the earth, and it took us geniuses this long to find out.  Heck, I had watched Bruce Almighty and remember the scene where Jim Carrey is dealing with all those prayer requests, yet it never struck me that it was indeed Big Data.  Kids are awesome.  They teach you so many things.

In theory or in practice, depending on which side of the equation you are on, God looks at all the activities (the good, the bad and the ugly) and processes them instantaneously.  Real-time data - VELOCITY

He also looks at different types of activities: what you say, what you do, how you treat different sets of people ... VARIETY

Of course, he stores them all to deal with you during and after your time - VOLUME

Finally, when your time is done, he reviews your history to decide where you end up (Heaven / Hell) - VALUE

In addition, he is constantly analyzing the data to send you alerts, reward you or teach you a lesson, depending on the data you generate for him - DATA ANALYTICS (Descriptive and Prescriptive)

That is it, friends: this is BIG DATA and its 4 V's.
As usual, for more articles, please visit my blog site http://data-management-matters.blogspot.com/

Sunday, 14 June 2015

Legos and Transformers

Analytics is the buzzword in every industry.  The transformation in the analytics field over the past couple of years has been dramatic.  The focus has shifted from reports to diagnosis to predictions.  Remember the days of data warehousing, when overnight jobs created specific dimensions of data for reporting purposes?  Those days are slowly taking a back seat.  Before proceeding any further, let us review the types of analytics.  At the outset, analytics tries to address a few basic questions about a particular event.

What happened :  Description of the event.  Often referred to as Descriptive Analytics.  
Why it happened :  Reasoning.  Often referred to as Diagnostic Analytics.

This is exactly how kids react to anything.  Their basic questions at all times are What and Why.  In fact, kids focus on the why more than the what.  A normal conversation with my 4-year-old prompted me to write this.

Me :  Son, you have to start studying
Son :  Why?
Me :  When you grow up you can make Money
Son :  Why?
Me :  To buy things that you need
Son :  Like what
Me :  House, food, dress…..
Son :  Why?  I am already living in a house
Me :  This is my house.  You have to get one of your own
Son :  Why?  There are 4 rooms and I have my own bedroom.

You get the point.  Beyond these two questions, analytics has taken a deep dive into prediction and prescription.

When will it happen again :  Will it happen again, if so when.  Often referred to as Predictive analytics
What should I do :  If it happens or how to avoid it.  More of prescription.  Often referred to as Prescriptive Analytics.
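
To make the four questions concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the production figures, the pump fault flags, the naive forecast and the threshold are all made up, and a real analysis would use proper statistical models.

```python
from statistics import mean

# Hypothetical monthly production figures (barrels per day) for one well.
production = [120, 118, 115, 90, 88, 85]
# Hypothetical flag: months in which the pump was reported faulty.
pump_faulty = [False, False, False, True, True, True]


def descriptive(values):
    """What happened: summarize the observed values."""
    return {"mean": mean(values), "min": min(values), "max": max(values)}


def diagnostic(values, flags):
    """Why it happened: average drop in months where the suspected cause is present."""
    with_cause = mean(v for v, f in zip(values, flags) if f)
    without_cause = mean(v for v, f in zip(values, flags) if not f)
    return without_cause - with_cause


def predictive(values):
    """What will happen: naive forecast extending the average month-on-month change."""
    changes = [b - a for a, b in zip(values, values[1:])]
    return values[-1] + mean(changes)


def prescriptive(forecast, threshold):
    """What should I do: turn the forecast into a recommended action."""
    return "schedule pump repair" if forecast < threshold else "no action"


forecast = predictive(production)
print(descriptive(production))
print(diagnostic(production, pump_faulty))
print(forecast, prescriptive(forecast, threshold=80))
```

Each function answers one of the four questions, and each builds on the previous one: you cannot prescribe without a prediction, and you cannot predict sensibly without first describing and diagnosing.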

In the current environment of booming analytics, how can organizations truly succeed in leveraging analytics?  I read a good article on Gartner's take on Business Intelligence and Analytics.  Gartner talks about 2 sides of analytics - the Dull side and the Dark side.

Dull side :  Requirements-driven and IT-driven analytics, the traditional way of doing analytics.  The business may not get what it needs at the right time, and there is always a time lag due to requirements collection, development ....  Hence the dull.

Dark side :  With the rise of IT consumerization, the business can do its own analytics with its own data and tools, without the knowledge of others.  This is going over to the dark side.

Clearly these are 2 extremes, and neither can truly benefit the organization in today's dynamic environment.  These extreme ways of working are driven by volatile requirements and the need for personalization.

Volatile Requirements :  Business requirements are always changing.  It is even fair to say there is no requirement and everything is a requirement.  In the upstream environment, today the business might be interested in low-producing wells, tomorrow in wells with Category 1 integrity issues, and on day 3 in something else.  It keeps changing depending on what they want to address.  The business wants a solution as quickly as the requirements change.  There is no time for requirements documents and IT-driven processes.

Personalization :  Everyone has their own way of doing analysis and would like to personalize it for their needs and ease.  A common implementation with a standardized set of reports and analytics will drive out their efficiency and thought process.  Hence the business prefers the dark side.

So what is the solution?  Here is where kids help us understand the business better.  One day the kid wants a toy train, the next day a car, and the list goes on.  That is the volatility of requirements.  Once they have a toy, they want to change its shape, twist and turn it, and move a few pieces around.  That is personalization.

The toys that satisfy them are Legos and Transformers.  Kids can build what they want and personalize it the way they want.  So, in short, treat your business like kids.

From an architecture and implementation perspective:

1. Create a Data Management engine / platform.  Architect and design your Data Management platform to be flexible and scalable.
2. Give users the ability to pick and choose data through an open data model / services from the Data Management platform.
3. Keep the consuming applications open, ranging from spreadsheets to statistical / data science tools.
4. Have a sustainable organization to manage data (quality and availability) and the Data Management platform.
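
As a rough illustration of the "pick and choose" idea in point 2, the sketch below shows one small query function serving two different business questions without any new development. The well records, field names and the 50 bpd threshold are hypothetical, and a real platform would sit on a database or service layer rather than an in-memory list.

```python
# Hypothetical well records held by the data management platform.
WELLS = [
    {"well": "A-1", "oil_bpd": 12, "integrity_category": 1},
    {"well": "A-2", "oil_bpd": 450, "integrity_category": 3},
    {"well": "B-7", "oil_bpd": 8, "integrity_category": 2},
]


def query(records, fields, where=lambda record: True):
    """Return only the requested fields of the records that pass the filter."""
    return [{f: r[f] for f in fields} for r in records if where(r)]


# Today's question: low-producing wells (the threshold is an arbitrary example).
low_producers = query(WELLS, ["well", "oil_bpd"], lambda r: r["oil_bpd"] < 50)

# Tomorrow's question: wells with Category 1 integrity issues.
category_1 = query(WELLS, ["well"], lambda r: r["integrity_category"] == 1)

print(low_producers)
print(category_1)
```

The point of the design is that the platform owns the data and the generic building block, while the user supplies the fields and the filter. That is the Lego: the same pieces, rearranged for each day's requirement.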

Thursday, 4 June 2015

IT - Get busy living or get busy dying

“Well I believe in God, and the only thing that scares me is Information Technology (IT)” is slowly becoming the hidden truth within business organizations where IT is not the core business.  IT, once seen as a partner and innovator for the business, is slowly being reduced to a keeper of software and hardware.  IT conferences and CIO summits are now focussed on redefining the IT landscape and the role of the CIO in the current environment.  We have to revisit our ways of working and adapt to the real needs of the business.  Here is a brief look at what has caused this transformation.

Cost Center : The IT department is one of the cost centers of the organization.  As with every cost center, the organization is focussed on keeping the spend in check.  Budgets usually run very thin, and when the going gets tough IT budgets are the first to get slashed.  The usual target the CEO sets for the IT organization is to cut next year's IT spend by 10% - 20%.

Organization construct :  The IT Manager / CIO usually reports to the CFO (Chief Financial Officer).  Go figure.  CIOs seldom get a seat at the table of the Level 1 leadership of the organization.  This prevents the head of the IT organization from having a clear view of the company goals and of how he/she can reshape IT to support them.

Traditional IT approach :  The current environment is fast paced and agile.  Every day there are new things out in the market: Big Data, Cloud, Mobility, Internet of Things, predictive and prescriptive analytics, Machine learning …..  Unfortunately, the incumbent IT organizations are laggards: they look for detailed requirements to be captured, a long-drawn project (usually waterfall) and multiple processes with checks and balances.  While Cloud promises to stand up your environment in 2 weeks, the project to implement Cloud takes more than a year.

IT Consumerization :  With SaaS and PaaS models and Apple store / marketplace models, the business can do what it needs to do without IT.  To complicate things further, with PaaS models there is a heavy rise in Shadow IT.  (Check out Gartner’s article on BiModal IT to supplement Shadow IT.)  Honestly, today’s software vendors' selling point is, “You do NOT need your IT to run our software”.  This shift has transformed IT into an Asset Repository (or License Manager).

IT’s Way of working :  Typical IT organizations outsource 80-90% of their activities, predominantly to keep cost in check.  The key here is how the model is designed.  In an attempt to oversimplify, IT bundles similar technologies or platforms together.  From a business perspective, for the support of a single application, they are looking at multiple support teams with limited to no understanding of the business context or problem.  Ever heard this quote from Taken?  “I don't know who you are, where you are, but I will find you and …..”  That is exactly how they will feel.  (I will follow up with an article explaining this very subject.)

Expensive / Takes too long :  This reminds me of Dr Evil and his famous quote - “One Million Dollars”.  With the IT toolkits and the support model, any change essentially means a redesign and rearchitecture, leading to expensive and long projects.  The sad part is that the end product either misses the mark or arrives a little too late.


IT needs to go through a complete overhaul, right from the way the organization is structured (Leadership, Projects, Support, Architecture), to take advantage of and keep up with the fast-paced changes.  Hadoop and Big Data came into existence in 2005.  A decade later, only a handful of organizations are using them in upstream.  I have to end this with a closing remark from my favorite movie - “Get Busy Living or Get Busy Dying”

Tuesday, 2 June 2015

As a Service

"The greatest trick the Devil ever pulled was convincing the world he didn't exist" - one of the famous quotes from the movie "The Usual Suspects".  I am confident the entire software industry is familiar with IaaS, PaaS, SaaS, BaaS...  Is this concept of "As a Service" entirely new, or has it existed for quite a while?

After attempting to teach my kids about anything and everything, I have started to believe that things are simple.  If you think about it, they are simple.  A few years back, my older one, out of the blue, asked me, "Dad, what do you do in the office?"  Ha.  As the proud father of a curious son, I asked, "Why do you ask?"  "Well, every time I call you, you say you are in a meeting.  I was wondering, do they pay you for sitting in meetings?"  Oops.  A simple conclusion for him.  I was then forced to explain to him what I do.  "You see, I work in Data Management and we are implementing a cloud framework," I began, and wanting to prove that his dad is actually smart (?), I started talking about cloud, Infrastructure as a Service....  The only cloud he knew was the one that gives you rain.  He looked extremely puzzled and probably started thinking, "Why did I even bother asking what he does in the office?"  Then I heard my wife's voice from her home office: "Why don't you KISS?"  She meant, "Why don't you KEEP IT SIMPLE, STUPID?"  It was probably the nicest way she had addressed me the entire year.  Here is an attempt to simplify "As a Service".

A very long time ago, you could only use the things you owned.  Simply put, you had to buy.  This changed, I don't know how long ago, with the concept of rent or lease.  For instance, renting cars started way back in 1904.  The software industry went through a similar phase, like any other industry.  If you had to use infrastructure (servers), software.... you had to buy and own them.  Now, with "As a Service", we can rent them.  In reality, we are laggards in adopting a concept that has been prevalent for centuries in every other industry.

Let us dig a little deeper into the fundamental principles of "as a Service".  Say you go into a restaurant (another service); the things that happen, in chronological order, are as follows.

Transparency : Clear understanding on options, ingredients, cost ....
Personalization :  Like "Shaken, not stirred".  There is always a need to add another layer of cheese or adjust the amount of spice.
Decide :  Once comfortable, you decide and place your order

While the food is being prepared in the kitchen, you trust that it is
Governed :  A master chef manages the overall process
Quality :  The ingredients and the environment are quality controlled

When the food is served
Timely :  You did not have to wait long
Responsive :  You expect the waiter/waitress to check on you / answer questions
Scalable :  You would like to see more offerings in food variety, proximity of the place…..


These same principles apply to "as a Service" within our industry.  It is advisable to ensure they are taken into account for any new IaaS, PaaS or SaaS offering that your organization requires.

Sunday, 31 May 2015

Who am I?

The name is Bond, James Bond.  All one needs is to hear the name.  Everything about this person (entity) becomes evident: what he does, how he dresses, what he drinks, how he reacts and what he likes the most (I don't have to explain this).  Such is the power of the right identifier.

This is every Data Manager's dream: the ability to trace the history of an entity by its identifier.  In reality, this is a huge challenge.  For the sake of this discussion, let us assume there is a fair and consistent understanding of "What is a Well".  For decades, the industry has not succeeded in creating a true identifier for the Well and associating data with that identifier.  The challenge is that, over the course of its life, the well gets identified differently.  In my humble opinion, there is no better person to explain this than me, having lived with a similar challenge throughout my life.

My full name is Meenakshisundaram Thandavarayan.  My dad did not trust me to learn English, so he made sure my name had most of the alphabet in it.  He did not want to trouble the rest of the family, so he shortened it to Sundar just for them.  Then came my school and college friends, who addressed me as Meenakshi.  If you have the slightest knowledge of Indian names, you know Meenakshi is actually the name of a goddess.  Every year this confused the heck out of new students and teachers who came across the name.  Then came my office friends, who found Meenakshi difficult and shortened it further to Meena, another popular woman's name in India.  I am not sure how many times I have disappointed fellow employees on the phone when they expected to hear a female voice. :)


My name went through further modifications: Meena Sunderam (see the spelling change from Sundaram to Sunderam), Meenakshi S, Thandavarayan (see the picture for more variations).  A few of these were due to system limitations - name too long to handle, unable to fit on a card....  This is the MDM, Meena's Data Management, problem that I am dealing with.  Imagine one of my office colleagues meeting my school friend: they might both be talking about me but probably could not connect the dots, due to this man-made identity crisis.  So how do I fix this?  Is it logical for me to go back to everyone I know and reset expectations?  Not practical, is it?


This is very similar to the scenario within the Upstream industry, where the Well identifier goes through multiple transformations and loses its meaning and history.  As with my name, it is not practical to identify and change every reference to the identifier.  As we attempt to fix this identity crisis within the industry, we need a well-thought-through and calculated approach.  Follow these 3 principles:
Do not lose the Big Picture - what is the end goal
Follow an Opportunity based approach - Address the high business value ones
Make it Agile - be methodical but be agile to deliver value quickly
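
One common way to live with many names without renaming every reference is a cross-reference that maps each alias to a single canonical identifier. The sketch below is only an illustration under that assumption: it uses the name variants from this post, and the `AliasRegistry` class and `normalize` helper are hypothetical names, not part of any product. The same pattern would apply to well identifiers.

```python
def normalize(name):
    """Lower-case and drop spaces and punctuation so spelling variants compare equal."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


class AliasRegistry:
    """Maps every known alias to one canonical identifier."""

    def __init__(self):
        self._canonical_by_alias = {}

    def register(self, canonical_id, aliases):
        """Record the canonical identifier and all of its known aliases."""
        for alias in [canonical_id, *aliases]:
            self._canonical_by_alias[normalize(alias)] = canonical_id

    def resolve(self, name):
        """Return the canonical identifier, or None if the name is unknown."""
        return self._canonical_by_alias.get(normalize(name))


registry = AliasRegistry()
registry.register(
    "Meenakshisundaram Thandavarayan",
    ["Sundar", "Meenakshi", "Meena", "Meena Sunderam", "Meenakshi S"],
)

# The school friend and the office colleague now reach the same person.
print(registry.resolve("Meenakshi"))
print(registry.resolve("meena sunderam"))
```

Nobody has to change what they call me (or the well); the registry connects the dots. Aliases can be added opportunistically, starting with the high-value ones, which fits the principles above.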

Saturday, 30 May 2015

Respect the Holy Grail

Remember the final act of Indiana Jones and the Last Crusade?  With a handful of options on the table, the words of the Grail knight go like this - "But choose wisely, for while the true Grail will bring you life, the false Grail will take it from you".  I feel this is the message to all upstream Data Managers.  It is up to us to choose wisely or poorly.

So what is the Holy Grail?  Growing up, I was always told by my teachers, parents and mentors that there is no one size fits all, i.e. there is no one answer to all problems.  The upstream business proved this theory wrong.  It has one answer and one tool for solving every problem within the industry.  Just as in the movie, the Holy Grail is neither pretty nor expensive.  Their Holy Grail - Microsoft Excel & Access.

Downloading data?  Yes, there is Excel.  Data integration, moving data across different business processes?  Oh yes, Excel can do it.  Calculation, analysis, charting and visualization?  Excel can do it all.  Increases in data size are handled with either multiple Excel files or a fit-for-purpose Microsoft Access database.  Thank you Microsoft !!

For Data Managers, this is a consistent challenge that practically hinders the implementation of any kind of data management practice.  No wonder all the MDM, Data Governance and Data Quality implementations are thrown out of the window, and we have been debating the same topics at the conferences every year for a decade.

It is time to decide.  Do we choose poorly or do we choose wisely?  Yes, I said it: embrace the Holy Grail by choosing wisely and using it wisely.  We (Data Managers) cannot take Excel and Access away from the business, but we can design our Enterprise architecture so that data is provided as a Service to Excel.  With a Data as a Service model, Data Managers govern the data, and the business users, with their Holy Grail, consume the data as a service.
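
To give a feel for what "data as a Service to Excel" could look like, here is a minimal sketch. It assumes the governed data is exposed as CSV text, a format Excel opens directly. The dataset, the column names and the `serve_as_csv` function are hypothetical, and a real implementation would more likely be a web service or an OData feed in front of a governed database.

```python
import csv
import io

# Hypothetical governed dataset maintained by the data management team.
GOVERNED_WELLS = [
    {"well_id": "W-001", "field": "North", "status": "producing"},
    {"well_id": "W-002", "field": "North", "status": "shut-in"},
    {"well_id": "W-003", "field": "South", "status": "producing"},
]


def serve_as_csv(records, field_name=None):
    """Return the governed records as CSV text, optionally limited to one field."""
    rows = [r for r in records if field_name is None or r["field"] == field_name]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["well_id", "field", "status"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# A business user asks for the North field and opens the result in Excel.
north_csv = serve_as_csv(GOVERNED_WELLS, "North")
print(north_csv)
```

The data stays in one governed place and the spreadsheet becomes a consumer rather than the system of record, so the business keeps its Holy Grail while the Data Manager keeps control of quality.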

This is not a myth.  This has been proven in my professional life.  So embrace the Holy Grail, Data Managers.  


To be continued.... 

Are Data Managers Hypocrites?

Every other industry has now bought into, or is considering, Cloud, Big Data, advanced analytics, machine learning and the Internet of Things.  It is commonly cited that 80% of the information generated is unstructured.  Extracting information from unstructured content is now possible with the advent of NoSQL, text parsers, context search... as long as the information is digitized.  We as Data Managers have long been preaching the importance of going digital and avoiding hard copies.  Do we practice what we preach, or are we hypocrites?

At a recent Data Managers' conference, I witnessed 80% of data managers taking notes in a notebook (not the digital one).  Most data management trainings conducted by vendors distribute hard copies of the training material.  All notes taken in the training are also on paper.  We are still handing out flyers, business cards and hard copies of presentations.  One of my fellow Data Managers maintains 3 notebooks of data a year!  All of these contain data / information that we need to reference.  As data managers, are we truly digital?  Are we practicing what we preach?


It is time we change and start leading by example.   

Thursday, 28 May 2015

PNEC 2015 - Reflections

PNEC 2015 marks the 19th year of the conference, which continues to attract good attendance (~650 people) from across geographical boundaries.  This reflects the quality and value of the conference.  Kudos to its organizers for keeping up the momentum while the industry goes through difficult times.

PNEC recognized deserving candidates with Cornerstone awards: Pam Koscinski (Consultant, PPDM); Janet Hicks (Senior Manager, Halliburton-Landmark); and Matthias Martung (Director, Vice President Technical Data, Shell).

Yet again, 75% of the talks were focussed on 3 key areas (Master Data Management, Data Quality and Data Governance), an indication that the industry is still struggling to establish the Data Management backbone.  However, there were refreshing talks on Big Data, Data Analytics & Visualization, Machine learning and Data as a service.  It is good that we are finally catching up with the emerging technology trends.  Better late than never.

The exhibits / booths predominantly had the usual suspects, with the addition of Big Data / NoSQL vendors - Cloudera, Hortonworks and MarkLogic.  The common theme was that most of the industry is running dry and window shopping this year.  In my opinion, PNEC should consider workshops, which are an emerging trend at similar conferences.


Overall, a good conference; trimming it down to 2 days would be apt going forward.

Friday, 22 May 2015

Information Technology and Data Management


Working in Information Technology (IT) for the past 3 years has taught me many things about how IT is treated within a larger organization.  IT usually gets the stepchild treatment.  The paper below talks about how IT is perceived by the larger organization, what IT brings to the table and how to leverage IT to do Data and Information Management.  IT and Data Management is a tribute to all the IT folks out there.

Monday, 4 May 2015

Where is your Data Management Organization

DAMA defines Data management as development, execution and supervision of plans, policies, programs and practices that control, protect, deliver and enhance the value of data and information assets.  

On a colloquial note, getting the Right Data at the Right time to the Right person defines good Data Management practice.  But what is the Right place for the Data Management organization within the business?

From the definition, it is reasonable to assume that data management is a combination of activities that run across both IT and Business.  Within the E&P upstream industry, there has always been ambiguity around where the Data Management organization should sit: within IT, within Business, across both IT and Business, or as a siloed organization of its own.  Every approach has its pros and cons.
Business owns the data, so should the business have a team to manage its data?  The advantages of this approach are (a) Business is a profit center, which helps in funding and sustaining Data Management, and (b) a good understanding of the business helps in effective and efficient Data Management.
On the other hand, this promotes technology and solutions created within the business, which may not be healthy in the long run.  The emergence of Shadow IT and siloed / point solutions is a result of doing DM within Business.

Data Management within IT can provide an overarching, enterprise-wide Data Management platform and can help in building sustainable solutions.  However, IT is a cost center, and the IT mantra year on year is how to reduce cost.  Building and sustaining DM requires consistent funding, and the IT mantra will degrade the value of Data Management within the organization.  Furthermore, the translation of Business requirements into IT requirements adds gaps to the overall solution.

Having the data management organization split across IT and Business will ensure that activities sit where they belong.  However, how Business and IT work with each other is an open secret.  Processes, documentation, communication gaps, friction and preconceived notions will become the routine that hinders the progress of Data Management.  A typical light hearted example is shown.

Data management as its own organization is healthy in many ways.  It provides a clear focus on Information and Data Management.  The end-to-end accountability is very clear and is not hampered by the disadvantages of Business or IT rules of engagement.  There might be occasional rifts with IT, but with strong relations these can be smoothed over.  When the climate is good, this model provides the best of both worlds and becomes a value add to the enterprise.  However, it also comes with disadvantages.  Delivering an initial business case and getting the leader of the organization to buy into this model is an uphill task.  When the chips are down and the company / industry is not doing well, this organization will be an easy target to kill / outsource, which could be a demotivating factor for its employees.

In conclusion, there is no one size fits all.  With the world becoming digital and everything being driven by data, it is time for industries to consider a Chief Data Officer to run a Data Management Organization.