MIS 587 - Spring 2012

Monday, April 23, 2012

Sunday, April 15, 2012

As we move from data analysis to strategy, there is an obvious need to align your analytics with the overall strategy of the organization. This is where the interface between BI and strategy lies. There are a lot of tools to measure performance, but it is the strategy of the organization that tells us which performance measures are relevant, and this information can be used to customize the BI tools. Before that, it makes sense to know the components of a strategy in order to maximize the relevance of the BI tools. One widely used strategic management tool is the Balanced Scorecard.

The balanced scorecard is a performance management tool that helps executives at different levels gauge the performance of their respective units in such a way that it lines up with the performance of the overall organization. Strategy, by definition, gives a company a unique position while making some trade-offs to achieve that position (unless you adopt a blue ocean strategy). In order to achieve that unique position, the company has to look at four interrelated perspectives (financial, customer, internal business processes, and learning and growth), as shown in the figure below.

In short, BI tools can be used to provide the performance metrics for each of these perspectives. However, the process is seldom one-sided: it is almost never the case that the metrics are formulated first and the BI tools are then used as just another information system to report them. The information provided by these tools helps the organization change its strategy, which in turn can make other metrics relevant to that strategy. So BI has an instrumental role in framing an organization's strategy. Here is a video on the value of BI.

Sunday, March 25, 2012

Dimensional Modeling

As I took a week's break from blogging - since this is the first week of classes after spring break - I decided to adopt a new Q/A format for blogging for the rest of the semester. And so I begin blogging about dimensional modeling in this format.

What is Dimensional Modeling?

It is a set of techniques used to build a data warehouse. Each model has a set of dimensions, which are analogous to the operational tables in the database, and a fact table, which is analogous to the associative entity in an ER model.

How does Dimensional Modeling differ from ER Models?

Dimensional models are quite different from ER models in the sense that they do not necessarily involve objects from the relational database. The dimensions can even be flat files. Moreover, the fact tables in dimensional modeling are loaded at fixed intervals, when the operational tables are either under maintenance or under the least load from customer transactions, whereas the operational tables are updated in real time as and when a transaction takes place. Also, the data in a dimensional model is seldom deleted, while the data in operational tables/associative entities in the ER model may be deleted once it is regarded as obsolete under the business rules.

What are some of the properties of facts?

Facts are designed to capture interesting patterns about your business that may not be evident in your transactional tables. Facts can typically be aggregated across dimensions (for example, sales summed by product or by month) and provide knowledge valuable to the business.
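To make the idea of aggregating facts across dimensions a little more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (dim_product, dim_date, fact_sales), the columns and the rows are all hypothetical and only serve as an illustration; a real warehouse would be far larger and loaded by a scheduled process.

```python
import sqlite3

# Hypothetical star schema: two dimension tables and one fact table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, month TEXT);
CREATE TABLE fact_sales  (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    amount     REAL
);
INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Music');
INSERT INTO dim_date    VALUES (1, '2012-01'), (2, '2012-02');
INSERT INTO fact_sales  VALUES (1, 1, 10.0), (1, 2, 15.0), (2, 1, 7.5), (2, 2, 2.5);
""")

def sales_by(column, dim_table, key):
    """Sum the sales fact grouped by one attribute of one dimension."""
    query = (
        f"SELECT d.{column}, SUM(f.amount) FROM fact_sales f "
        f"JOIN {dim_table} d ON f.{key} = d.{key} "
        f"GROUP BY d.{column} ORDER BY d.{column}"
    )
    return conn.execute(query).fetchall()

# The same fact (amount) aggregated across two different dimensions.
by_category = sales_by("category", "dim_product", "product_id")
by_month = sales_by("month", "dim_date", "date_id")
print(by_category)
print(by_month)
```

The fact table only stores keys and a measure; the descriptive attributes live in the dimensions, which is why the same measure can be rolled up in more than one way.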

Is it possible to normalize/de-normalize dimensions as we do in normal transactional databases?

Normalization of dimensions is possible (the result is usually called a snowflake schema), but the extra joins make queries more expensive and thus it is usually not done.

Saturday, March 3, 2012

Properties of Networks

It's a very well known fact that we live in an era dominated by online social networks. However, in order to leverage these networks, it would be interesting to know the science behind their success. An understanding of this science is a preliminary step towards leveraging these networks for various purposes like advertising, promotions, campaigns etc. And so here I provide you with some interesting properties of networks.

Degree : - It is the number of nodes to which a particular node is connected. In the case of a directed graph it can be split into in-degree and out-degree, according to the number of incoming links to and outgoing links from a particular node.
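As a small illustration of degree, in-degree and out-degree, here is a sketch in plain Python. The edge list and the names in it are made up for the example.

```python
from collections import Counter

# Hypothetical directed links (source, target); the names are made up.
edges = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"), ("cat", "ann")]

# Out-degree counts outgoing links, in-degree counts incoming links.
out_degree = Counter(src for src, _ in edges)
in_degree = Counter(dst for _, dst in edges)

# Ignoring direction, degree is the number of distinct neighbours.
neighbours = {}
for src, dst in edges:
    neighbours.setdefault(src, set()).add(dst)
    neighbours.setdefault(dst, set()).add(src)
degree = {node: len(adj) for node, adj in neighbours.items()}

print(dict(out_degree), dict(in_degree), degree)
```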

Centrality
-----------------
In general, centrality is a measure of the importance of a node. However, to just state that a node is important is vague. There has to be a criterion for it. Hence we have the following types of centrality:

Betweenness Centrality : - It is the number of shortest paths - between every pair of nodes - on which a particular node lies. A high betweenness centrality implies that a node acts as a mediator or bridge between two components of a network. The node "Heather" in the graph below has a high betweenness centrality.
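For readers who want to see how this can be computed, here is a sketch of Brandes' algorithm for an unweighted, undirected graph. Two assumptions to note: when a pair of nodes has several shortest paths, this version credits a node with the fraction of those paths it lies on (the usual refinement of the simple count), and the small graph is a made-up one in which a node I have called "heather" joins two groups; it is not the graph from the figure.

```python
from collections import deque

def betweenness(graph):
    """Brandes' algorithm for an unweighted, undirected graph given as
    {node: set_of_neighbours}. Path endpoints are not counted."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        # BFS from s, counting shortest paths (sigma) and predecessors.
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        preds = {v: [] for v in graph}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every undirected pair was visited from both ends, so halve the totals.
    return {v: c / 2 for v, c in score.items()}

# Hypothetical graph: "heather" is the only link between two pairs of friends.
graph = {
    "a": {"b", "heather"},
    "b": {"a", "heather"},
    "heather": {"a", "b", "c", "d"},
    "c": {"d", "heather"},
    "d": {"c", "heather"},
}
scores = betweenness(graph)
print(scores)
```

In this toy graph every shortest path between the two pairs has to pass through the bridging node, so it is the only node with a non-zero score.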

Closeness Centrality : - It is the inverse of the total (or average) shortest-path distance from a node to all other nodes. A higher value of this metric indicates that the node is closer to all other nodes in the network. It essentially indicates the importance of a node in terms of its ability to reach all other nodes easily. The largest node in the figure below has a high closeness centrality.
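Here is a minimal sketch of closeness centrality using breadth-first search. It assumes an unweighted, connected graph and uses the common normalisation of (n - 1) divided by the sum of distances; the five-node chain is a made-up example.

```python
from collections import deque

def closeness(graph, source):
    """(n - 1) divided by the sum of BFS distances from source.
    Assumes an unweighted, connected graph {node: set_of_neighbours}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

# Hypothetical chain a - b - c - d - e.
chain = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
middle = closeness(chain, "c")
end = closeness(chain, "a")
print(middle, end)
```

The node in the middle of the chain gets a higher score than the node at the end, since it can reach everyone in fewer steps.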

Eigenvector Centrality : - It is a measure of the influence of a node in a network. It assigns relative scores to nodes on the assumption that nodes connected to important nodes are themselves more important. Google's PageRank is a variant of eigenvector centrality.
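One common way to compute eigenvector centrality is power iteration, sketched below for an undirected graph. As an assumption of this sketch, each node keeps its own score when adding up its neighbours' scores; this does not change the final ranking but helps the iteration settle down. The graph and names are made up.

```python
def eigenvector_centrality(graph, iterations=100):
    """Power iteration on an undirected graph {node: set_of_neighbours}."""
    score = {v: 1.0 for v in graph}
    for _ in range(iterations):
        # New score = own score + neighbours' scores (keeps the iteration stable).
        new = {v: score[v] + sum(score[w] for w in graph[v]) for v in graph}
        # Rescale so that the most central node has score 1.
        norm = max(new.values()) or 1.0
        score = {v: x / norm for v, x in new.items()}
    return score

# Hypothetical graph: "h" is connected to everyone, the others only to one peer and "h".
graph = {
    "a": {"b", "h"},
    "b": {"a", "h"},
    "h": {"a", "b", "c", "d"},
    "c": {"d", "h"},
    "d": {"c", "h"},
}
centrality = eigenvector_centrality(graph)
print(centrality)
```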

Sunday, February 26, 2012

Short Note on Long Tail

The title must have caught everyone's attention. Nevertheless I feel it is very apt, and as you read through the blog you will find the same.

Long Tail is a concept pioneered by e-commerce giants like Amazon and eBay to leverage the benefits of selling goods that are rarely found on the shelves of stores and supermarkets. The unique selling point of this concept is making anything and everything available to customers, leveraging the fact that esoteric items are hard to find on store shelves. In short, a large number of less frequently sold items can together account for as many sales as a small number of more frequently sold items.
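A quick back-of-the-envelope calculation shows how the tail can add up. The numbers below are a purely illustrative assumption (1,000 items whose sales fall off as 1/rank) and not real sales data from any retailer.

```python
# Purely illustrative assumption: 1,000 items whose sales fall off as 1/rank.
# These are not real sales figures.
catalogue_size = 1000
head_size = 10  # the handful of best sellers a physical store would stock

sales = [1.0 / rank for rank in range(1, catalogue_size + 1)]
total = sum(sales)

# Share of total sales from the best sellers versus everything else.
head_share = sum(sales[:head_size]) / total
tail_share = sum(sales[head_size:]) / total

print(f"top {head_size} items: {head_share:.0%} of sales")
print(f"remaining {catalogue_size - head_size} items: {tail_share:.0%} of sales")
```

Under this assumption the many slow-moving items together outsell the few best sellers, which is the whole point of the long tail.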

Long Tail Keywords, also known as narrow keywords, are the many specific keyword phrases that together bring as much traffic to a website as the few generic keywords that individually draw the most visits. There are two unique selling points for this kind of strategy. Firstly, long tail keywords are less competitive, and hence it is easier to get ranked first on Google ads with them than with short tail keywords. Secondly, when customers are looking to purchase something specific, they tend to use specific search terms, and hence you are quite likely to have more conversions in this case than when you are using short tail generic keywords.

Here are some interesting links to learn more about Long Tail Keywords:

Long Tail Vs Short Tail Keywords

Keyword Strategies - Long Tail vs Short Tail

Research and Find Out Long Tail Keyword for your ROI

Three Good Reasons to Target Long Tail Keywords

Sunday, February 19, 2012

Networks, analyses and predictions

We live in an age of social networks. Wherever we go, we are always connected to the people we want to be connected to. Added to that, we maintain different types of contacts depending upon the people we befriend. However, most of these connections are implicit and are not disclosed for reasons of privacy. But we might end up in a situation where investigating these connections becomes a necessity.

The figure shown above represents the pattern of e-mail communication inside Enron at the time of the scandal. The size of the nodes represents the relative importance of the personnel in the network and the thickness of the edges represents the number of messages communicated. At first sight it is clearly evident that Tim Belden is in the middle of the communication during the crisis, with a few others like Amy FitzPatrick and Christopher F. Calger playing an important role in the crisis communication. Thus we can clearly identify the crisis leaders just from the visualization of the network.

The example in the previous paragraph is a rare case; company e-mail networks are not investigated unless a scandal of the Enron sort breaks out. However, in our daily life we come across many networks. Twitter is an example of a network which, if visualized, can unravel daily communication patterns. Of course, we have other networks in the form of Wikipedia, where authors collaboratively edit articles, and YouTube, where users post comments and replies. An analysis of all of these can reveal significant hidden information that can be exploited in various areas like advertising, investigation, prediction etc.

Sunday, February 5, 2012

Lecture 5: Network Analysis

In short, the theme of this lecture is "Networks in an Online World".

We live in a world dominated by online social networks. While people connect with each other in different ways, it would be interesting to know whether there are any connection patterns that reveal something about the way they connect and about them in general.

The figure below is an example of a network for a LinkedIn user.

While we can see many clusters in the network, it is clearly visible that there is one big cluster that is densely connected. This reveals that the user has a strongly connected network in one area.

The network shown above doesn't have any directionality. However, some social networks, like the Twitter network, do have directionality. Visualization of these networks can reveal some interesting properties.

However, studying networks goes far beyond visualization and gets into the study of objective parameters like authority, hub, density etc. In short, network analysis is a comprehensive tool for objective analysis.
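Of the parameters mentioned, density is the simplest to compute: it is the fraction of all possible links that actually exist. Here is a minimal sketch for an undirected network; the four-person network is made up for the example, and hub and authority scores are left out because they need a longer iterative calculation.

```python
def density(nodes, edges):
    """Fraction of possible undirected links that actually exist."""
    n = len(nodes)
    if n < 2:
        return 0.0
    possible = n * (n - 1) / 2
    return len(edges) / possible

# Hypothetical four-person network with three connections.
nodes = {"ann", "bob", "cat", "dan"}
edges = {frozenset(pair) for pair in [("ann", "bob"), ("bob", "cat"), ("cat", "dan")]}

network_density = density(nodes, edges)
print(network_density)
```

A density close to 1 means almost everyone is connected to everyone else, while a value close to 0 means the network is sparse.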