Appendix A
ACR Questions
1. A standardized Z-test is proposed for purposes of determining compliance with parity. Explain why this standard textbook statistical test cannot serve as a measurement tool, at least for the duration of the six-month trial pilot test period. Keep in mind that the incentive phase of the model can calibrate for measurement outcomes through various incentive plan structures and amounts.
2. Benchmark measures without any statistical tests are proposed for purposes of determining a performance failure. Explain why this simple approach cannot serve as a measurement tool, at least for the duration of the six-month trial pilot test period. Keep in mind that the incentive phase of the model can incorporate information on underlying data values and distributions.
3. A minimum sample size of thirty, aggregated in up to three-month time periods, is proposed. Explain why this standard textbook statistical proposal cannot serve as a minimum sample size rule, at least for the duration of the six-month trial pilot test period. Keep in mind that the test would still be performed using whatever sample size is achieved at the end of three months.
4. A ten percent Type I error (alpha) level for parity tests is proposed. Explain why this standard textbook statistical proposal cannot serve as an alpha level/critical value rule, at least for the duration of the six-month trial pilot test period. Again, keep in mind that the penalty phase of the plan can calibrate the size of the payments as a function of the critical values.
Appendix B
References
Bartz, A. (1988). Basic statistical concepts. 3rd ed. New York: Macmillan.
Bickel, P. & Doksum, K. (1977). Mathematical statistics: Basic ideas and selected topics. San Francisco: Holden-Day.
Brownie, C., Boos, D., & Hughes-Oliver, J. (1990). Modifying the t and ANOVA F tests when treatment is expected to increase variability relative to controls. Biometrics, 46, 259-266.
Brubaker, K. & McCuen, R. (1990). Level of significance selection in engineering analysis. Journal of Professional Issues in Engineering, 116, 375-387.
Das, C. (1994). Decision making by classical test procedures using an optimal level of significance. European Journal of Operational Research, 73, 76-84.
Gold, D. (1969). Statistical tests and substantive significance. The American Sociologist, 4, 42-46.
Good, P. (2000). Permutation tests: A practical guide to resampling methods for testing hypotheses. 2nd ed. New York: Springer-Verlag.
Hays, W. (1994). Statistics. 5th ed. Fort Worth: Harcourt Brace.
Hubbard, R., Parsa, R., & Luthy, M. (1997). The spread of statistical significance testing in psychology: The case of the Journal of Applied Psychology, 1917-1994. Theory & Psychology, 7, 545-554.
Hunter, J. (1997). Needed: A ban on the significance test. Psychological Science, 8, 3-7.
Johnstone, D. & Lindley, D. (1995). Bayesian inference given data "significant at α": Tests of point hypothesis. Theory & Decision, 38, 51-60.
Khazanie, R. (1997). Statistics in a world of applications. 4th ed. Harper Collins.
McNemar, Q. (1962). Psychological statistics. New York: John Wiley & Sons.
Raiffa, H. (1970). Decision analysis. Reading, Mass.: Addison-Wesley.
Sheskin, D. (1997). Handbook of parametric and nonparametric statistical procedures. Boca Raton: CRC Press.
Skipper, J., Guenther, A., & Nass, G. (1967). The sacredness of .05: A note concerning the uses of statistical levels of significance in social science. The American Sociologist, 2, 16-18.
Verma, R. & Goodale, J. (1995). Statistical power in operations management research. Journal of Operations Management, 13, 139-152.
Welsh, A.H. (1996). Aspects of statistical inference. New York: Wiley & Sons.
Winer, B.J. (1971). Statistical principles in experimental design. New York: McGraw-Hill.