Sunday, April 15, 2012

Balanced Scorecard and BI

As we move from data analysis to strategy, there is an obvious need to align your analytics with the overall strategy of the organization. This is where the interface between BI and strategy lies. There are many tools for measuring performance, but it is the strategy of the organization that determines which performance measures are relevant, and this information can be used to customize the BI tools. Before doing that, it makes sense to know the components of a strategy in order to maximize the relevance of the BI tools. One widely used strategic management tool is the Balanced Scorecard.

The Balanced Scorecard is a performance management tool that helps executives at different levels gauge the performance of their respective units in such a way that it stays aligned with the performance of the overall organization. Strategy, by definition, gives a company a unique position while making some trade-offs to achieve that position (unless you adopt a blue ocean strategy). In order to achieve that unique position, the company has to look at four interrelated perspectives, as shown in the figure below.


In short, BI tools can be used to provide the different performance metrics on each of these perspectives. However, the process is seldom one-sided: it is almost never the case that the metrics are formulated first and the BI tools are then used as just another information system to provide those metrics. The information provided by these tools helps the organization change its strategy, which in turn can make some other metrics relevant to that strategy. So BI has an instrumental role in framing an organization's strategy. Here is a video on the value of BI.

                                 



Sunday, March 25, 2012

Dimensional Modeling

As I took a week's break from blogging - since this is the first week of classes after the spring break - I decided to adopt this new Q/A format for blogging for the rest of the semester. And so I begin blogging about dimensional modeling in this format.

What is Dimensional Modeling?

It is a set of techniques used to build a data warehouse. Each model has a set of dimensions, which are analogous to the operational tables in the database, and a fact table, which is analogous to the associative entity in an ER model.

How does Dimensional Modeling differ from ER Models?

Dimensional models are quite different from ER models in the sense that they do not necessarily involve objects from the relational database. The dimensions can even be flat files. Moreover, the fact tables in dimensional modeling are loaded at fixed intervals, when the operational tables are either under maintenance or under the least load from customer transactions, while the operational tables are loaded in real time as and when a transaction takes place. Also, the data in fact tables is seldom deleted, while the data in operational tables/associative entities in the ER model may be deleted once it is regarded as obsolete under the business rules.


What are some of the properties of facts?

Facts are designed to capture interesting patterns about your business that may not be evident in your transactional tables. Facts can typically be aggregated across dimensions and provide knowledge valuable to businesses.
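
To make "aggregated across dimensions" a little more concrete, here is a minimal sketch of a tiny star schema using Python's built-in sqlite3 module. The table names (dim_product, dim_date, fact_sales), the columns and the rows are all made up for illustration; they are assumptions, not something from the lecture.

```python
import sqlite3

# In-memory database holding a hypothetical star schema:
# two dimension tables and one fact table that references them.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, month TEXT);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    amount REAL
);
INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Music');
INSERT INTO dim_date VALUES (1, '2012-01'), (2, '2012-02');
INSERT INTO fact_sales VALUES (1, 1, 10.0), (1, 2, 15.0), (2, 1, 7.5), (2, 2, 2.5);
""")

# Aggregate the same fact (amount) across the product dimension...
by_category = dict(conn.execute("""
    SELECT p.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
""").fetchall())

# ...and across the date dimension.
by_month = dict(conn.execute("""
    SELECT d.month, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY d.month
""").fetchall())

print(by_category)
print(by_month)
```

The same fact column is rolled up twice, once per dimension, which is the sense in which facts can be "sliced" by whichever dimension the business cares about.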

 Is it possible to normalize/de-normalize dimensions as we do in normal transactional databases?

Normalization of dimensions is possible (the result is usually called a snowflake schema), but the extra joins make it an expensive choice at query time, and so it is not usually done.





Saturday, March 3, 2012

Properties of Networks

It's a very well known fact that we live in an era dominated by online social networks. However, in order to leverage these networks, it would be interesting to know the science behind their success. An understanding of this science is a preliminary step towards leveraging these networks for various purposes like advertising, promotions, campaigns etc. And so here I provide you with some interesting properties of networks.

Degree : - It is the number of nodes to which a particular node is connected. In the case of a directed graph it can be split into in-degree and out-degree, according to the number of incoming links to and outgoing links from a particular node.
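
As a quick illustration, here is a minimal sketch that counts in-degree and out-degree from a list of directed edges. The node names and the edges are hypothetical, invented only for this example.

```python
from collections import Counter

# Hypothetical directed "follows" edges as (source, target) pairs.
edges = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"), ("dan", "ann")]

# Out-degree: how many links leave a node.
out_degree = Counter(source for source, _ in edges)
# In-degree: how many links point at a node.
in_degree = Counter(target for _, target in edges)

for node in sorted({n for edge in edges for n in edge}):
    print(node, "in:", in_degree[node], "out:", out_degree[node])
```

A Counter returns 0 for a node it has never seen, so nodes with no incoming or no outgoing links need no special handling.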

Centrality
-----------------
In general, centrality is a measure of the importance of a node. However, to just state that a node is important is vague; there has to be a criterion for it. Hence we have the following types of centrality:

Betweenness Centrality : - It is the number of shortest paths - between every pair of nodes - on which a particular node lies. A high betweenness centrality implies that a node acts as a mediator or bridge between two components of a network. The node "Heather" in the graph below has a high betweenness centrality.
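
Here is a rough sketch of how this could be computed for a small unweighted, undirected graph. The graph is hypothetical (it is not the graph with "Heather" from the figure), and the function names are my own. One assumption to flag: when a pair of nodes has several equally short paths, each path counts as a fraction, which is the common convention but slightly different from a plain count.

```python
from collections import deque
from itertools import combinations

def shortest_paths(graph, source, target):
    """Return every shortest path from source to target (unweighted BFS)."""
    dist = {source: 0}
    parents = {source: []}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                parents[nb] = [node]
                queue.append(nb)
            elif dist[nb] == dist[node] + 1:
                # Another equally short way of reaching nb.
                parents[nb].append(node)
    if target not in dist:
        return []

    def build(node):
        if node == source:
            return [[source]]
        return [path + [node] for p in parents[node] for path in build(p)]

    return build(target)

def betweenness(graph):
    """Brute-force betweenness: fine for tiny graphs, slow for large ones."""
    score = {node: 0.0 for node in graph}
    for s, t in combinations(graph, 2):
        paths = shortest_paths(graph, s, t)
        for path in paths:
            for node in path[1:-1]:  # interior nodes only
                score[node] += 1 / len(paths)
    return score

# Hypothetical graph: two small clusters joined only through node "c".
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d", "e"],
    "d": ["c", "e"],
    "e": ["c", "d"],
}
score = betweenness(graph)
print(score)
```

Node "c" is the only bridge between the two clusters, so every shortest path that crosses from one side to the other passes through it.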



Closeness Centrality : - It is the inverse of the distance from a node to all other nodes. A higher value of this metric indicates that the node is closer to all other nodes in the network. It essentially indicates the importance of a node in terms of its ability to reach all other nodes easily. The largest node in the figure below has a high closeness centrality.
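
A minimal sketch of closeness, assuming an unweighted graph and the normalized form (number of other reachable nodes divided by the sum of the distances to them). The star-shaped graph and the function name are assumptions made for illustration.

```python
from collections import deque

def closeness(graph, node):
    """Normalized closeness of node: reachable others / total distance to them."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for nb in graph[current]:
            if nb not in dist:
                dist[nb] = dist[current] + 1
                queue.append(nb)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

# Hypothetical star graph: "hub" is one step away from every other node.
star = {"hub": ["x", "y", "z"], "x": ["hub"], "y": ["hub"], "z": ["hub"]}
print(closeness(star, "hub"), closeness(star, "x"))
```

The hub reaches everyone in one step, so it scores higher than any of the leaves, which need two steps to reach each other.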

Eigenvector Centrality : - It is a measure of the influence of a node in a network. It assigns relative scores to nodes on the assumption that nodes connected to important nodes are themselves more important. Google's PageRank is a variant of eigenvector centrality.
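
A rough sketch of the idea using power iteration: each node repeatedly takes its score from its neighbours until the scores settle. The graph is hypothetical, and two details are my own assumptions: each node keeps its own score in every update (which helps the iteration settle on some graphs) and scores are rescaled so the largest is 1.

```python
def eigenvector_centrality(graph, iterations=100):
    """Power iteration on an undirected graph given as an adjacency dict."""
    score = {node: 1.0 for node in graph}
    for _ in range(iterations):
        # New score = own score + scores of neighbours.
        updated = {
            node: score[node] + sum(score[nb] for nb in graph[node])
            for node in graph
        }
        largest = max(updated.values())
        score = {node: value / largest for node, value in updated.items()}
    return score

# Hypothetical graph: "c" is connected to everyone, the rest to two nodes each.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d", "e"],
    "d": ["c", "e"],
    "e": ["c", "d"],
}
score = eigenvector_centrality(graph)
print(score)
```

Node "c" ends up with the top score, and the other four nodes tie because they sit in symmetric positions.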

Sunday, February 26, 2012

Short Note on Long Tail

The title must have caught everyone's attention. Nevertheless, I feel it is very apt, and as you read through the post you will find the same.


Long Tail is a concept pioneered by e-commerce giants like Amazon and eBay to leverage the benefits of selling goods that are rarely found on the shelves of stores and supermarkets. The unique selling point of this concept is making anything and everything available to customers, exploiting the fact that esoteric items are hard to find on store shelves. In short, a large number of less frequently sold items can together account for as many sales as a small number of more frequently sold items.
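
A small arithmetic sketch of this idea. The numbers are entirely made up: I assume a catalog of 1000 items where the item at popularity rank r sells 1000 // r units, and I call the top 10 items the "head". None of this is real sales data.

```python
# Hypothetical catalog: the item at popularity rank r sells 1000 // r units.
catalog_size = 1000
sales = [1000 // rank for rank in range(1, catalog_size + 1)]

head_sales = sum(sales[:10])   # the 10 best sellers
tail_sales = sum(sales[10:])   # the remaining 990 items
tail_share = tail_sales / (head_sales + tail_sales)

print("head:", head_sales, "tail:", tail_sales, "tail share:", round(tail_share, 2))
```

Under these assumed numbers, the 990 slow sellers together outsell the 10 best sellers, even though each one individually sells very little.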


Long Tail Keywords, also known as narrow keywords, are those keyword phrases that together bring as much traffic to the website as the few generic keywords that account for a majority of the traffic. There are two unique selling points for this kind of strategy. Firstly, long tail keywords are less competitive, and hence it is easier to get ranked first on Google with them than with short tail keywords. Secondly, when customers are looking to purchase something specific, they tend to use specific search terms, and hence you are quite likely to have more conversions in this case than when you are using short tail generic keywords.


Here are some interesting links to learn more about Long Tail Keywords


Long Tail Vs Short Tail Keywords


Keyword Strategies - Long Tail vs Short Tail


Research and Find Out Long Tail Keyword for your ROI


Three Good Reasons to Target Long Tail Keywords

Sunday, February 19, 2012

Networks, analyses and predictions

We live in an age of social networks. Wherever we go, we are always connected to the people we want to be connected to. Added to that, we maintain different types of contacts depending upon the people we befriend. However, most of these connections are implicit and are not disclosed for reasons of privacy. But we might end up in a situation where investigating these connections becomes a necessity.



The figure shown above represents the pattern of e-mail communication inside Enron at the time of the scandal. The size of the nodes represents the relative importance of the personnel in the network and the thickness of the edges represents the number of messages communicated. At first sight it is clear that Tim Belden is in the middle of the communication during the crisis, with a few others like Amy FitzPatrick and Christopher F. Calger playing an important role in the crisis communication. Thus we can clearly identify the crisis leaders just from the visualization of the network.

The example in the previous paragraph is a rare case, and company e-mail networks are not investigated unless a scandal of the sort of Enron breaks out. However, in our daily life we come across many networks. Twitter is an example of a network which, if visualized, can unravel daily communication patterns. Of course, we have other networks in the form of Wikipedia - where authors collaboratively edit articles - and YouTube - where users post comments and replies. An analysis of all of these can reveal significant hidden information that can be exploited in various areas like advertising, investigation, prediction etc.

Sunday, February 5, 2012

Lecture 5: Network Analysis

In short, the theme of this lecture is "Networks in an Online World".

We live in a world dominated by online social networks. While people connect with each other in different ways, it would be interesting to know whether there are any connection patterns that reveal something about the way they connect and about them in general.

The figure below is an example of a network for a LinkedIn user.



While we can see many clusters in the network, it is clear that there is one big cluster that is densely connected. This reveals that the user has a strongly connected network in one area.

The network shown above doesn't have any directionality. However, there are some social networks that have directionality like the Twitter network. Visualization of these networks can reveal some interesting properties.

However, studying networks goes far beyond visualization and gets into the study of objective parameters like authority, hub, density etc. In short, network analysis is a comprehensive tool for objective analysis.
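
Of these parameters, density is the easiest to show in a few lines: it is the number of edges present divided by the number of edges that could exist. The sketch below assumes an undirected network without self-loops; the nodes, edges and function name are hypothetical.

```python
def density(nodes, edges):
    """Fraction of possible undirected edges that are actually present."""
    n = len(nodes)
    if n < 2:
        return 0.0
    return 2 * len(edges) / (n * (n - 1))

# Hypothetical network: 4 people, 3 connections out of a possible 6.
people = ["ann", "bob", "cat", "dan"]
links = [("ann", "bob"), ("bob", "cat"), ("cat", "dan")]
print(density(people, links))
```

A density close to 1 means almost everybody is connected to everybody else, like the big cluster in the LinkedIn figure; a density close to 0 means a sparse network.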


Sunday, January 29, 2012

Lecture 4: Data Driven Decision Making with Google Analytics

Google Analytics is one of the most comprehensive tools for tracking web presence. It is the gateway for somebody who wants to measure the pulse of their web presence.

The lecture was an interesting one considering the myriad suggestions that came up for the Alumni and Partnering opportunities sections. It captured the diverse views and the various ways of approaching the same problem. While one can understand that people interpret the same data in different ways, it must also be noted that there are some ground rules for making data driven decisions. It is certainly not possible to have a comprehensive list of these; nevertheless, here are a few:

(1) It is very important to state the metrics used for decision making and the time period for analysis. Also, there has to be a proper justification for choosing the same.

(2) Every suggestion you make should be supported by a relevant metric. Also, it is important to justify the relevance of the metric in the context. (Bounce rate may be relevant in one context and exit rate in another.)

(3) Percentages make sense in some contexts while absolute numbers make sense in others. It is important to understand the context to choose between the two.

(4) Lastly, it is important to understand that data driven facts should be supplemented by grounded non data driven facts. Some of them are obvious in the context we are analyzing, so they also play an important role in the suggestions we make. There is no such thing as a "Pure Data Driven Decision".
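
To see why rule (2) distinguishes bounce rate from exit rate, here is a minimal sketch that computes both for one page. It assumes the usual definitions: bounce rate is the share of sessions starting on the page that viewed nothing else, and exit rate is the share of the page's views that were the last of their session. The sessions, page paths and function names are invented for illustration and are not data from the lecture.

```python
# Hypothetical sessions, each an ordered list of pages viewed.
sessions = [
    ["/alumni"],
    ["/alumni", "/partners"],
    ["/home", "/alumni"],
    ["/home", "/partners", "/alumni"],
]

def bounce_rate(sessions, page):
    """Share of sessions that started on page and viewed only that page."""
    entrances = [s for s in sessions if s[0] == page]
    if not entrances:
        return 0.0
    bounces = sum(1 for s in entrances if len(s) == 1)
    return bounces / len(entrances)

def exit_rate(sessions, page):
    """Share of views of page that were the last view in their session."""
    pageviews = sum(s.count(page) for s in sessions)
    if not pageviews:
        return 0.0
    exits = sum(1 for s in sessions if s[-1] == page)
    return exits / pageviews

print(bounce_rate(sessions, "/alumni"), exit_rate(sessions, "/alumni"))
```

The same page gets two quite different numbers, which is why the metric has to be justified for the context before a suggestion is built on it.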

Sunday, January 22, 2012

Second Post but Fresh Thoughts

As the semester gains traction and we move into the third week of classes, I am trying hard to gather the required momentum for the rest of the four months.

Of course, the class this time starts with the standard and basic question - "What is BI?". I know there is no single hard and fast definition for this, but we will go with the "Wisdom of the Crowds" consensus in defining it:

"Business intelligence refers to the class of Tools, Technologies and Techniques required for Collection, measurement, understanding and analysis of data used by firms for either tracking or predicting performance."

This definition takes a very broad view of BI, but as the saying goes, the broader something is, the vaguer it tends to be. Hence it throws up a myriad of questions like "What are the Tools?", "What are the Techniques?", "How do you analyse data?" etc. So let us slice the different parts of the definition so that we have a preliminary understanding of these. This way, I guess, we will have the agenda for the rest of the semester.

What are the Tools used in BI?

There is no hard and fast list of tools, nor a comprehensive documented one. In fact, the choice of tool depends on the metrics the firm is interested in tracking and predicting. However, the most basic tool used by companies to track their website metrics is Google Analytics (http://www.google.com/analytics/). It has a very comprehensive list of web metrics for your website visits and slices and dices the data in different ways. Other commercial BI tools include Microsoft BI Suite, Microstrategy BI Platform, Inetsoft etc. For a pretty comprehensive list of tools, click here.

 What are BI Technologies and name a few technologies used?

Technologies here refer to data integration, data quality, data warehousing, text and content analytics etc. There is no hard and fast definition of a Technology.

What are Techniques used in BI?

There are actually a myriad of them. Predictive Modelling, Descriptive data mining, Association, Correlation and OLAP are a few. For a more comprehensive list refer to this.

I will certainly not go into collection, measurement, analysis and prediction here, as those will be the topics of my posts in the coming weeks, but one last thing that will conclude this post is the KPI.

What is KPI?

Key Performance Indicators are used by the industry to measure performance on one or more dimensions. They evaluate the success of a company on that particular dimension. Click here for a more comprehensive understanding of KPIs.

As always, I conclude this post with some interesting links for this week.

Saturday, January 14, 2012

First Post and First Thoughts

It's been quite some time since I started blogging, and so I was perplexed when I initially had to think about what to blog. And of course this is the first time that I am blogging about a class/lecture, and hence I am totally lost. Yet random thoughts continue to flow in my brain as I pen them down.

The first thing that pops into my mind about this course is the "8:00 A.M" lecture !!!!! Yes, it's been more than half a decade since I attended a lecture as early as this. But of course this course seems to involve more interactive sessions, so for sure I am not going to feel sleepy!!! All these things apart, the course structure seems very comprehensive and is set to engross me for the next 16 weeks.

Business intelligence is a buzzword in the industry today. In a virtual world dominated by FLAT (Facebook, LinkedIn, Amazon and Twitter), companies are continuously faced with the uphill task of making sense of the deluge of data that is publicly available on the internet. Every business knows that this data has a lot of implications, but they lack the intelligence to make the maximum sense of it. Hence I would say the intelligence required to make this data sensible to the business is what Business intelligence is all about. If so, the obvious question that comes into the picture is "When did this Business Intelligence start and how did it start?" I am sure people reading this blog are perplexed now, and hence I would say this is a question for discussion in the next lecture.

I would conclude this short post with a set of websites that I feel are the most important ones I came across this week. I would welcome suggestions from people to add to this body of knowledge.


Basics of Business Intelligence
(1) Wikipedia 
(6) Google+

Interesting News Articles