Conference on Measuring and Assessing Skills

Assembling Leading Economists, Psychologists, and Measurement Specialists to Examine and Evaluate Alternative Approaches

This conference assembled leading economists, psychologists, and measurement specialists to examine and evaluate alternative approaches to measuring and assessing skills. Sessions were organized around three distinct but potentially complementary approaches. The first relies on traditional interviews, self-reports, paper-and-pencil tests, and observer reports. The second is based on how people perform in response to various incentives and situations in controlled settings, such as laboratories. The third uses manifest behaviors in uncontrolled “real world” settings to measure skills. The conference was accordingly organized in three blocks, each dedicated to discussing the relative strengths and benefits of one of these elicitation strategies.


Program & Resources

Program, Videos, and Slides
Opening Remarks
James J. Heckman, The University of Chicago

The OECD’s Longitudinal Study of Social and Emotional Skills
Koji Miyamoto, OECD

The Big Five Personality Taxonomy: Conceptualization and Psychological Measurement
Oliver John, University of California, Berkeley



Anchoring Vignettes, Forced Choice, and Ranking Methods for Improving the Quality of Self-Reports
Patrick Kyllonen, Educational Testing Service



Working Lunch Discussion



Measuring and Assessing Non-Cognitive Skills
Tim Kautz, Mathematica



Discussion Session
John Barker, Chicago Public Schools



Emergence of Skills and The Importance of Social Context
Steven Durlauf, University of Wisconsin–Madison



Discussion Session
Burton Singer, University of Florida



The Non-Market Benefits of Abilities and Education
John Eric Humphries, University of Chicago



Discussion Session
Daniel Silverman, Arizona State University



How Developments in Technology and Learning Psychology Challenge Educational Assessment
Robert Mislevy, Educational Testing Service



Using Markov Decision Processes to Understand Student Thinking in Performance Tasks
Michelle LaMar, Educational Testing Service



Experimental Validation of Preference Surveys: Findings from a Cross-National Sample in 76 Countries
Armin Falk, University of Bonn



Experimental Measures of Decision-Making Quality and Their Correlates in Observational Data
Daniel Silverman, Arizona State University



Special Discussion: Games vs. Survey Based Measures
Teodora Boneva, University College London



A Theory of Learning and Attention for Evaluating Skill Elicitation Experiments
Andrew Caplin, New York University



Cooperation and Personality
Aldo Rustichini, University of Minnesota



Round Table Discussion
James J. Heckman, The University of Chicago
Michael McPherson, Spencer Foundation
Stephen W. Raudenbush, The University of Chicago

Reading Lists

Anderson, J., S. Burks, C. DeYoung, and A. Rustichini (2011). "Toward the integration of personality theory and decision theory in the explanation of economic behavior". Unpublished manuscript. Presented at the IZA Workshop: Cognitive and Non-Cognitive Skills, January 27, 2011.

Andreoni, J. (1995, September). "Cooperation in public goods experiments: Kindness or confusion?"  American Economic Review, 85(4): 891–904.

Andreoni, J. and J. Miller (2002, March). "Giving according to GARP: An experimental test of the consistency of preferences for altruism." Econometrica, 70(2): 737–753.

Baker, C., R. Saxe, and J. Tenenbaum (2011). "Bayesian Theory of Mind: Modeling Joint Belief-Desire Attribution." Proceedings of the Thirty-Second Annual Conference of the Cognitive Science Society, 2469–2474.

Bartling, B., E. Fehr, and D. Schunk (2012). "Health effects on children’s willingness to compete."  Experimental Economics, 15(1): 58–70.

Blanden, J., P. Gregg, and L. Macmillan (2007, March). "Accounting for intergenerational income persistence: Noncognitive skills, ability and education." Economic Journal, 117(519): C43–C60.

Caplin, A. (2015). "Measuring and modeling attention." Draft for Annual Review of Economics.

Caplin, A. and J. Leahy (2001). "Psychological expected utility theory and anticipatory feelings." Quarterly Journal of Economics, 116(1): 55–79.

Caplin, A. and D. Martin (2015). "The dual-process drift diffusion model: Evidence from response times." Working paper.

Caplin, A. and B. Nalebuff (1991, January). "Aggregation and imperfect competition: On the existence of equilibrium." Econometrica, 59(1): 25–59.

Carvalho, L., and D. Silverman (2015). “Complexity and Sophistication,” Working paper, Arizona State University.

Choi, S., S. Kariv, W. Müller, and D. Silverman (2014). "Who is (more) rational?" American Economic Review, 104(6): 1518–1550.

Dohmen, T., A. Falk, D. Huffman, U. Sunde, J. Schupp, and G. G. Wagner (2011). "Individual risk attitudes: Measurement, determinants and behavioral consequences." Journal of the European Economic Association, 9(3): 522–550.

Englmaier, F., S. Strasser, and J. Winter (2011). "Productivity, trust, and wages: The impact of cognitive and noncognitive skills on contracting in a gift exchange experiment." CESifo Working Paper No. 3637.

Fehr, E. and A. Falk (2002). "Psychological foundations of incentives." European Economic Review, 46(4): 687–724.

Fischhoff, B. (2013).  "The real world."  In Judgment and Decision Making, Chapter 15, pp. 272–299. New York, NY: Routledge Press.

Hsin, A. and Y. Xie (2012, February). "Hard skills, soft skills: The relative roles of cognitive and noncognitive skills in intergenerational social mobility." Research Report 12–755, Population  Studies Center.

John, O. P., L. P. Naumann, and C. J. Soto (2008). "Paradigm shift to the integrative big five trait taxonomy." In O. P. John, R. W. Robins, and L. A. Pervin (Eds.): Handbook of Personality: Theory and Research (3rd ed.), pp. 114–158. New York, NY: The Guilford Press.

John, O. P. and S. Srivastava (1999). "The Big Five trait taxonomy: History, measurement and theoretical perspectives." In L. A. Pervin and O. P. John (Eds.): Handbook of Personality: Theory and Research (2nd ed.), Chapter 4, pp. 102–138. New York: The Guilford Press.

Kariv, S., and D. Silverman (2013). “An Old Measure of Decision-Making Quality Sheds New Light on Paternalism,” Journal of Institutional and Theoretical Economics, 169(1): 29-44.

Kautz, T. and W. Zanoni (2015). "Measuring and fostering noncognitive skills in adolescents: Evidence from Chicago public schools and the One Goal program." Unpublished manuscript, University of Chicago, Department of  Economics.

Kyllonen, P. C. (1993, January). "Aptitude testing inspired by information processing: A test of the four sources model." Journal of General Psychology, 120(3): 375–405.

Kyllonen, P. C. (2015). "Designing Tests to Measure Personal Attributes and Noncognitive Skills," In S. Lane, M. R. Raymond, & T. M. Haladyna (Eds.), Handbook of Test Development, 2nd Edition. New York: Routledge.

Kyllonen, P. C. (2016). "Human Cognitive Abilities: Their Organization, Development, and Use," In L. Corno & E. M. Anderman (Eds.), Handbook of Educational Psychology, Third Edition (pp. 121-134). New York: Routledge.

Kyllonen, P. C., and Bertling, J. (2013). "Innovative Questionnaire Assessment Methods to Increase Cross-Country Comparability," In L. Rutkowski, M. von Davier, & D. Rutkowski (Eds.), A Handbook of International Large-Scale Assessment Data Analysis: Background, Technical Issues, and Methods of Data Analysis. London: Chapman Hall/CRC Press.

Kyllonen, P. C. and D. L. Stephens (1990). "Cognitive abilities as determinants of success in acquiring logic skill." Learning and Individual Differences, 2(2): 129–160.

Leight, J., P. Glewwe, and A. Park (2014). "The impact of early childhood shocks on the evolution of cognitive and noncognitive skills." Forthcoming, American Economic Review.

Mislevy, R. J. (1994). "Evidence and inference in educational assessment."  Psychometrika 59 (4): 439–483.

Mislevy, R. J. (2011). "Evidence-centered design for simulation-based assessment." Report 800, National Center for Research on Evaluation, Standards, and Student Testing (CRESST).

Mislevy, R. J. and N. Verhelst (1990). "Modeling item responses when different subjects employ different solution strategies."  Psychometrika 55 (2): 195–215.

Rafferty, A., M. LaMar, and T. Griffiths (2015). "Inferring Learners' Knowledge From Their Actions." Cognitive Science 39: 584–618.

West, M. R., M. A. Kraft, A. S. Finn, R. Martin, A. L. Duckworth, C. F. Gabrieli, and J. D. Gabrieli (2014). "Promise and paradox: Measuring students’ noncognitive skills and the impact of schooling." Presented at the CESifo Area Conference on Economics of Education, Munich, September 2014.

Hertzog, C., & Robinson, A. E. (2005). Metacognition and intelligence. In O. Wilhelm & R. W. Engle (Eds.), Handbook of understanding and measuring intelligence (pp. 101-124). Thousand Oaks, CA: Sage.

Nelson, T.O., & Narens, L. (1990).  Metamemory: A theoretical framework and new findings. In G. Bower (Ed.), The psychology of learning and motivation, vol. 26. Academic Press. 

Kleitman, S., & Stankov, L. (2007). Self-confidence and metacognitive processes. Learning and individual differences, 17, 161-173. 

Stankov, L. (1998). Calibration curves, scatterplots and the distinction between general knowledge and perceptual tasks. Learning and individual differences, 10(1), 29-50. 

Jackson, S. A., Kleitman, S., Howie, P., & Stankov, L. (2015). Cognitive abilities, monitoring and control explain individual differences in heuristics and biases. Unpublished manuscript. Sydney University Psychology Department.

Kruger, J., & Dunning, D. (1999). Unskilled and Unaware of It: How Difficulties in Recognizing One's Own Incompetence Lead to Inflated Self-Assessments. Journal of Personality and Social Psychology, 77(6), 1121–1134.

Dunning, D., Johnson, K., Ehrlinger, J., & Kruger, J. (2003). Why people fail to recognize their own incompetence. Current Directions in Psychological Science, 12(3), 83–87.

General Review Papers

Schnipke, D. L., & Scrams, D. J. (2002). Exploring issues of examinee behavior: Insights gained from response-time analyses. In C. N. Mills, M. Potenza, J. J. Fremer & W. Ward (Eds.), Computer-based testing: Building the foundation for future assessments (pp. 237–266). Hillsdale, NJ: Lawrence Erlbaum Associates.

Lee, Y.-H., & Chen, H. (2011). A review of recent response-time analyses in educational testing. Psychological test and assessment modeling, 53(3), 359-379.

Carroll, J. B. (1993). Human cognitive abilities: A survey of factor analytic studies. Chapter 11, Abilities in the domain of cognitive speed (pp. 440-509). New York: Cambridge University Press.

Modern Models of Response Time

Tuerlinckx, F., & De Boeck, P. (2005). Two interpretations of the discrimination parameter. Psychometrika, 70, 629–650. doi:10.1007/s11336-000-0810-3

Van der Linden, W. J. (2007). A hierarchical framework for Modeling Speed and Accuracy on Test Items. Psychometrika, 72(3), 287-308.

Van der Linden, W. J. (2009). Conceptual issues in response-time modeling. Journal of Educational Measurement, 46 (3), 247–272.

Klein Entink, R. H., Fox, J.-P., & van der Linden, W. J. (2009). A multivariate multilevel approach to the modeling of accuracy and speed of test takers. Psychometrika, 74, 21–48.

Klein Entink, R. H., Kuhn, J.-T., Hornke, L. F., & Fox, J.-P. (2009). Evaluating cognitive theory: A joint modeling approach using responses and response times. Psychological Methods, 14, 54–75.

van der Maas, H. L., Molenaar, D., Maris, G., Kievit, R. A., & Borsboom, D. (2011). Cognitive psychology meets psychometric theory: On the relation between process models for decision making and latent variable models for individual differences. Psychological Review, 118(2), 339–356. doi:10.1037/a002274

Molenaar, D., Tuerlinckx, F., & van der Maas, H. L. J. (2014). A generalized linear factor model approach to the hierarchical framework for responses and response times. British Journal of Mathematical and Statistical Psychology. doi:10.1111/bmsp.12042

Lee, Y.-H., & Ying, Z. (2015). A mixture cure-rate model for responses and response times in time-limit tests. Psychometrika, 80(3), 748-775. doi:10.1007/s11336-014-9419-8

Rouder, J. N., Province, J. M., Morey, R. M., Gomez, P. & Heathcote, A. (2015). The Lognormal Race: A Cognitive-Process Model of Choice and Latency with Desirable Psychometric Properties. Psychometrika, 80(2), 491-513.

Use of Response Times in Studying Test-taking Behaviors

Lee, Y.-H., & Jia, Y. (2014). Using response time to investigate students’ test-taking behaviors in a NAEP computer-based study. Large-scale Assessments in Education, 2:8. doi:10.1186/s40536-014-0008-1. 

Lee, Y.-H., & Haberman, S.J. (2015). Investigating test-taking behaviors using timing and process data. International Journal of Testing. doi: 10.1080/15305058.2015.1085385

Goldhammer, F., & Klein Entink, R. H. (2011). Speed of reasoning and its relation to reasoning ability. Intelligence, 39, 108–119.  

Partchev, I., & De Boeck, P. (2012). Can fast and slow intelligence be differentiated? Intelligence, 40(1), 23-32.

Dodonova, Y. A., & Dodonov, Y. S. (2013). Faster on easy items, more accurate on difficult ones: Cognitive ability and performance on a task of varying difficulty. Intelligence, 41, 1-10. 

Borsboom, D., & Molenaar, D. (2015). Psychometrics. International Encyclopedia of the Social and Behavioral Sciences (2nd edition). 418-422.

Experiments Manipulating Time Availability, Generally Showing Little Effect

Bridgeman, B., & Cline, F. (2000). Variations in mean response time for questions on the computer-adaptive GRE general test: Implications for fair assessment. ETS Research Report, GRE No. 96-20P. Princeton, NJ: ETS.

Wainer, H., Bridgeman, B., & Najarian, M. (2012). How Much Does Extra Time on the SAT Help? Chance, 17(2), 19-24.

Mandinach, E.B., Bridgeman, B., Laitusis, C., & Trapani, C. (2005). The impact of extended time on SAT test performance. ETS Research Report 2005-8. Princeton, NJ: ETS.

Bridgeman, B., Trapani, C., & Curley, E. (2004). Impact of fewer questions per section on SAT I scores. Journal of Educational Measurement, 41, 291–310.

Van der Linden, W. J., Scrams, D. J., Schnipke, D. L. (1999). Using Response-Time Constraints to Control for Differential Speededness in Computerized Adaptive Testing. Applied Psychological Measurement, 23(3), 195-210.

Distraction and Persistence as Separate Parameters

Pieters, L. P. M., & van der Ven, A. H. G. S. (1982). Precision, speed, and distraction in time-limit tests. Applied Psychological Measurement, 6, 93–109.

Furneaux, W.D. (1961). Intellectual abilities and problem solving behavior. In H.J. Eysenck (Ed.), Handbook of Abnormal Psychology (pp. 167-192). New York: Basic Books.

Eysenck, H.J. (1981). A model for intelligence. New York: Springer-Verlag.

Speed-accuracy Tradeoff

Wickelgren, W. A. (1977). Speed-accuracy tradeoff and information processing dynamics. Acta Psychologica, 41, 67-85.

Dennis, I., & Evans, J. S. B. T. (1996). The speed-error trade-off problem in psychometric testing. British Journal of Psychology, 87(1), 105-129.