Wednesday, February 18, 2015

What is Dispersion? An extract from the book "Practical Business Analytics using SAS"



What is Dispersion

Dispersion is the variation in data: the non-uniformity or inconsistency in the values of a variable. Measures of dispersion say nothing about the middle value of the data. Instead, they tell you how spread out the data is. Dispersion can be measured using the range, the variance, and the standard deviation.

Anderson Wants to Cross a River 

Mr. Anderson, who can’t swim, wants to cross a small river. He asks a neighbor how deep it is, and the neighbor says the depth is 4 feet on average. Mr. Anderson is happy and starts to cross. His happiness does not last long: although the average depth is 4 feet, the depth at some places might be 7 feet, which is more than Mr. Anderson’s height. Had he asked about the deviation from the average depth, the inconsistency of depth at various points, or at least the range of depth (minimum and maximum depth), it would have saved him from drowning.


Therefore, merely knowing the average or the center value may not be sufficient in all cases. The deviation from the center (the dispersion, or spread, of a variable) is also important. Given next are a few measures of dispersion.
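
As a minimal sketch of these measures, the Python snippet below computes the range, variance, and standard deviation for a set of river depth readings. The readings are hypothetical values, not taken from the book, chosen so that the average is 4 feet and the deepest point is 7 feet, as in the story. Python's standard statistics module is used purely for illustration; the book itself works in SAS.

```python
import statistics

# Hypothetical depth readings in feet (not from the book), chosen so that
# the mean is 4 and the deepest point is 7, matching the river story.
depths_ft = [1, 2, 3, 4, 5, 6, 7]

mean_depth = statistics.mean(depths_ft)          # center of the data
depth_range = max(depths_ft) - min(depths_ft)    # maximum minus minimum
variance = statistics.variance(depths_ft)        # sample variance (divides by n - 1)
std_dev = statistics.stdev(depths_ft)            # sample standard deviation

print(f"mean={mean_depth}, range={depth_range}, "
      f"variance={variance:.2f}, std dev={std_dev:.2f}")
```

The mean alone (4 feet) looks safe, while the range and standard deviation show that individual readings sit far from it.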

Wednesday, February 11, 2015

(TH101) Peer Comparison Case Study - Testing of Hypothesis


Business Problem


This is a peer comparison project. Suppose that you are working for Samsunge in the customer experience management team. The idea is to regularly monitor customer satisfaction levels and the moves of peer companies. The competitor company is Appleo. The objective is to test two main hypotheses.
1. The average Samsunge customer satisfaction score is at least 75%.
2. The overall average satisfaction score of Samsunge is the same as that of Appleo. There is no significant difference in the satisfaction scores.


It is possible that both hypotheses are correct, that only one of them is correct, or that both are wrong. Perform the relevant tests to verify these assumptions.
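
One way to write the two claims as formal hypotheses is sketched below. The symbols \mu_S and \mu_A (the population mean satisfaction scores for Samsunge and Appleo) are notation introduced here, not taken from the case study. Treating the first claim as a one-sided test is an assumption about how "at least 75%" should be read.

```latex
\begin{align*}
\text{Test 1:}\quad & H_0 : \mu_S \ge 75 \quad \text{vs.} \quad H_1 : \mu_S < 75 \\
\text{Test 2:}\quad & H_0 : \mu_S = \mu_A \quad \text{vs.} \quad H_1 : \mu_S \ne \mu_A
\end{align*}
```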

The Data


The data was collected for 100 Samsunge customers and 100 Appleo customers, and their satisfaction scores were recorded. The sample is representative of the customer population and is unbiased.




Approach



Download the data and import it into SAS.

Part-1
Take the Samsunge_Score column
Identify the right test (testing a sample mean)
Accept or reject the null hypothesis based on the P-value
Part-2
Calculate the mean scores of Samsunge and Appleo
Perform a mean comparison test (two-sample equal means test)
Accept or reject the null hypothesis based on the P-value
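
The two parts above can be sketched in Python as follows. This is an illustration under stated assumptions, not the book's SAS code. The scores are synthetic stand-ins generated with a fixed seed, the function names are hypothetical, and the p-values use a normal approximation to the t distribution, which is reasonable only because each sample has 100 observations. Part 2 uses the Welch form of the statistic, which does not assume equal variances. In SAS, both tests are typically run with PROC TTEST.

```python
import math
import random
import statistics
from statistics import NormalDist


def one_sample_t(sample, mu0):
    """t statistic for H0: mean >= mu0 against H1: mean < mu0.

    Returns (t, p), where p is a lower-tail p-value from a normal
    approximation (an assumption suited to large samples).
    """
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    t = (statistics.mean(sample) - mu0) / se
    return t, NormalDist().cdf(t)


def two_sample_t(a, b):
    """Welch t statistic for H0: mean(a) == mean(b), two-sided.

    Returns (t, p), where p again uses the normal approximation.
    """
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    t = (statistics.mean(a) - statistics.mean(b)) / se
    return t, 2 * (1 - NormalDist().cdf(abs(t)))


# Synthetic stand-in data (NOT the case study data): 100 scores per company.
random.seed(0)
samsunge_scores = [random.gauss(74, 8) for _ in range(100)]
appleo_scores = [random.gauss(78, 8) for _ in range(100)]

alpha = 0.05  # assumed significance level; the case study does not state one

# Part 1: is the Samsunge mean at least 75?
t1, p1 = one_sample_t(samsunge_scores, 75)
print(f"Part 1: t={t1:.3f}, p={p1:.4f}, reject H0: {p1 < alpha}")

# Part 2: are the Samsunge and Appleo means equal?
t2, p2 = two_sample_t(samsunge_scores, appleo_scores)
print(f"Part 2: t={t2:.3f}, p={p2:.4f}, reject H0: {p2 < alpha}")
```

With the real data, replace the two synthetic lists with the imported score columns. For small samples, use the exact t distribution rather than the normal approximation.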

References:

Chapter 8, Testing of Hypothesis, from the book Practical Business Analytics Using SAS: A Hands-on Guide http://www.amazon.com/Practical-Business-Analytics-Using-Hands/dp/1484200446
SAS code from Chapter 8 of the book Practical Business Analytics Using SAS: A Hands-on Guide http://www.amazon.com/Practical-Business-Analytics-Using-Hands/dp/1484200446
Case study id: TH 101 - Peer Comparison