### Import Test Data

#### Import Settings

MC Test Analysis can import data in CSV or TSV format. Choose the appropriate settings for your data files below, then upload the answer key and the students' test results data in the columns to the right.

The Answer Key file should contain four columns in the following order:

1. Question number, name or identifier
   - E.g. Q1, 1, etc.
2. The correct answer for the question
   - E.g. 1, A, etc.
3. A descriptive title for the question
   - E.g. Taylor Function, Tensor Flow, etc.
4. An identifier for the concept group to which the question belongs
   - E.g. Taylor Series, A, Concept 1, etc.
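As an illustration of the four-column layout above, here is a minimal Python sketch that reads an answer key from CSV text. It is not part of the tool itself: the header names, the sample rows, and the helper `read_answer_key` are hypothetical, and only the column order follows the description above.

```python
import csv
import io

# Hypothetical answer key in the four-column order described above.
# The header names are illustrative; only the column order matters here.
ANSWER_KEY_CSV = """Question,Answer,Title,Concept
Q1,A,Taylor Function,Taylor Series
Q2,C,Tensor Flow,Concept 1
Q3,B,Series Remainder,Taylor Series
"""


def read_answer_key(text, delimiter=",", has_header=True):
    """Return a list of dicts, one per question, in file order."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if has_header:
        rows = rows[1:]
    fields = ("question", "answer", "title", "concept")
    return [
        dict(zip(fields, (cell.strip() for cell in row)))
        for row in rows
        if row  # skip blank lines
    ]


answer_key = read_answer_key(ANSWER_KEY_CSV)
```

For a TSV file, the same helper could be called with `delimiter="\t"`.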

#### Test Data

The test results data should contain one row per student and one column per question. If the test results data contains student identifiers, these identifiers should be in the first column, before the test answers. The MC Test Analysis Tool assumes that the question columns are in the same order as the questions in the answer key. Download an example test data file.
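The layout described above can be sketched in Python as follows. The sample data, the `CORRECT` key, and the helper `score_students` are hypothetical and only illustrate the assumed structure: an optional identifier column first, then one column per question in answer-key order.

```python
import csv
import io

# Hypothetical responses: the first column is a student identifier and the
# remaining columns follow the same question order as the answer key.
TEST_DATA_CSV = """id,Q1,Q2,Q3
s01,A,C,B
s02,A,B,B
s03,D,C,A
"""
CORRECT = ["A", "C", "B"]  # assumed answer key, in question order


def score_students(text, correct, has_ids=True, delimiter=","):
    """Return {student: [1 or 0 per question]} from delimited test results."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))[1:]  # skip header
    scored = {}
    for n, row in enumerate(rows, start=1):
        student = row[0] if has_ids else f"student_{n}"
        answers = row[1:] if has_ids else row
        if len(answers) != len(correct):
            raise ValueError(f"row {n}: expected {len(correct)} answers")
        scored[student] = [int(a.strip() == k) for a, k in zip(answers, correct)]
    return scored


scores = score_students(TEST_DATA_CSV, CORRECT)
totals = {student: sum(marks) for student, marks in scores.items()}
```

Because the columns are matched to the key by position only, a reordered question column would silently be scored against the wrong answer, which is why the column order matters.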

### View Test Results

This table shows the number of students who chose a given option for each question.

### Classic Test Theory Results

The plot below shows the histogram of overall test scores for all students, with a (scaled) normal curve overlay for reference.

The scatter plot below compares the selected measure of item discrimination with the item difficulty. Dotted guidelines indicate the recommended ranges for each index. A discrimination index (or PBCC or Modified PBCC) of less than 0.2 is not recommended, while item difficulty should generally be between 0.2 and 0.8.

This plot compares the overall test scores against correct selection of individual items. Generally, it is best for the boxplot of the correct group to be mostly above the boxplot of the incorrect group. Questions that have complete overlap between the two boxplots should be reviewed.
This table applies a number of heuristics and guidelines to help the user review individual items and decide whether to keep, modify or discard each one.
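To make the indices above concrete, here is a minimal Python sketch. It rests on common textbook definitions that are assumptions here, not taken from the tool: item difficulty as the proportion of students answering correctly, PBCC as the Pearson correlation between the 0/1 item score and the total score, and Modified PBCC as the same correlation with the item removed from the total. The function names and sample data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns NaN when either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def item_statistics(scored):
    """scored: one list of 0/1 marks per student, one entry per question."""
    totals = [sum(row) for row in scored]
    stats = []
    for j in range(len(scored[0])):
        item = [row[j] for row in scored]
        rest = [t - i for t, i in zip(totals, item)]  # total without this item
        stats.append({
            "difficulty": sum(item) / len(item),
            "pbcc": pearson(item, totals),
            "modified_pbcc": pearson(item, rest),
        })
    return stats


def flag_for_review(stat, measure="pbcc", min_disc=0.2, diff_range=(0.2, 0.8)):
    """Apply the 0.2 and 0.2-0.8 guidelines quoted above to one item."""
    low, high = diff_range
    disc_ok = stat[measure] >= min_disc  # NaN compares False, so it is flagged
    diff_ok = low <= stat["difficulty"] <= high
    return not (disc_ok and diff_ok)


scored = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]  # hypothetical marks
stats = item_statistics(scored)
```

The thresholds are defaults taken from the guideline text; they are parameters so that a reviewer can tighten or relax them.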

### Item Response Theory Results

Plot the Item Characteristic Curves for 1-, 2- or 3-PL IRT models.
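For reference, an item characteristic curve under the logistic IRT models can be sketched as below. This uses the common textbook parameterization with discrimination `a`, difficulty `b` and guessing parameter `c`; these names and the function `icc` are assumptions for illustration, and the parameterization used by the fitting package may differ.

```python
from math import exp


def icc(theta, b, a=1.0, c=0.0):
    """Probability of a correct response at ability theta.

    1-PL: a is fixed and shared by all items, c = 0
    2-PL: a varies by item, c = 0
    3-PL: a varies by item, c > 0 is the lower asymptote (guessing)
    """
    return c + (1.0 - c) / (1.0 + exp(-a * (theta - b)))


# Curve values over a small ability grid for one hypothetical 3-PL item.
grid = [-3, -2, -1, 0, 1, 2, 3]
curve = [icc(theta, b=0.0, a=1.5, c=0.2) for theta in grid]
```

When `theta` equals `b`, the probability sits halfway between `c` and 1, which is how the difficulty parameter locates the curve along the ability axis.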

### Factor Analysis Results

The following plot shows the item-by-item tetrachoric correlation for all questions in the test. The tetrachoric correlation estimates the correlation between two variables whose measurement is artificially dichotomized but whose underlying joint distribution is a bivariate normal distribution. Structural features of the tetrachoric matrix directly correspond to the structure of the underlying latent variables measured by the test. For more information and resources, visit the Personality Project webpage.
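To give a feel for the quantity, the sketch below uses the classical cosine-pi approximation to the tetrachoric correlation, computed from the 2x2 table of two 0/1 item scores. This is a swapped-in rough approximation chosen for brevity, not the estimation method used by the tool, and the function name `tetrachoric_approx` is hypothetical.

```python
from math import cos, pi, sqrt


def tetrachoric_approx(x, y):
    """Cosine-pi approximation for two equal-length 0/1 vectors."""
    a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)  # both correct
    b = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)
    c = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)
    d = sum(1 for i, j in zip(x, y) if i == 0 and j == 0)  # both incorrect
    if b * c == 0:
        # No discordant pairs in at least one cell: the approximation
        # degenerates, so return the limit, or NaN if it is undefined.
        return 1.0 if a * d > 0 else float("nan")
    return cos(pi / (1.0 + sqrt((a * d) / (b * c))))


# Hypothetical marks for two items across eight students.
item_1 = [1, 1, 1, 1, 0, 0, 0, 0]
item_2 = [1, 1, 1, 0, 1, 0, 0, 0]
r = tetrachoric_approx(item_1, item_2)
```

With empty cells in the 2x2 table the approximation is unreliable, which is one reason a proper estimator is preferable for real test data.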

#### Options

### Distractor Analysis Results

The following plot and table compare the percentage of all respondents who select a given option for each item. They allow the test administrator to analyze the performance of item options and to determine whether the choice of distractors reveals information about misconceptions in students' knowledge.
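The percentages described above can be sketched in a few lines of Python. The helper `option_percentages` and the sample responses are hypothetical; the sketch only counts how often each option was chosen per item and does not reproduce the tool's own output.

```python
from collections import Counter


def option_percentages(responses):
    """responses: one list of chosen options per student, one entry per item.

    Returns one {option: percent of respondents} dict per item.
    """
    n = len(responses)
    table = []
    for j in range(len(responses[0])):
        counts = Counter(row[j] for row in responses)
        table.append({opt: 100.0 * k / n for opt, k in sorted(counts.items())})
    return table


# Hypothetical responses from four students to two items.
responses = [["A", "C"], ["A", "B"], ["D", "C"], ["A", "C"]]
percentages = option_percentages(responses)
```

A distractor that attracts a large share of respondents, or one that nobody selects, is a natural starting point when reviewing an item.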

#### Test Information

Download a report containing a summary of the analysis demonstrated throughout this interface. The report will use the test data you chose in the "Import" tab. Fill in the test details below and choose your desired output options from the selections on the right.
Generating the report may take a little while once the button is clicked.

#### Report Settings

##### Exploratory Factor Analysis
See the psych package documentation for more information about these options.

Many educators design multiple-choice question examinations. How do we know that these tests are valid and reliable? How can we improve a test by modifying, revising and deleting items based on student responses?

In a paper in the highly regarded Journal of Engineering Education, Jorion et al. (2015) developed "an analytic framework for evaluating the validity of concept inventory claims". We believe that this framework can also help educators design their multiple-choice tests, especially when a test serves as the final mastery examination in a course. Open-source software for analyzing a multiple-choice question examination would be encouraging to educators who have minimal programming experience and promising to contributors who would enhance the program.

### Authors

Garrick Aden-Buie is a doctoral candidate in Industrial and Management Systems Engineering at the University of South Florida. He is an avid R enthusiast and programmer. His research focuses on collecting, storing, processing, visualizing and learning from passive sensor networks in smart homes. He is also passionate about bringing together education, data science and interactive R tools to improve outcomes in higher education.

Autar Kaw is a professor of mechanical engineering and Jerome Krivanek Distinguished Teacher at the University of South Florida. He is a recipient of the 2012 U.S. Professor of the Year Award from the Council for Advancement and Support of Education (CASE) and the Carnegie Foundation for the Advancement of Teaching. Professor Kaw's main scholarly interests are in engineering education research, open courseware development, and the state and future of higher education. His education research has been funded by the National Science Foundation since 2002.

### References

#### Test Theory References

Baker, F. B. (2001). The basics of item response theory (2nd ed.). ERIC Clearinghouse on Assessment and Evaluation. Retrieved from http://echo.edres.org:8080/irt/baker/

Bond, T. G., & Fox, C. M. (2007). Applying the Rasch model: Fundamental measurement in the human sciences (1st ed.). Mahwah, N.J.: Lawrence Erlbaum Associates Publishers.

DiBello, L. V., Henson, R. A., & Stout, W. F. (2015). A family of generalized diagnostic classification models for multiple choice option-based scoring. Applied Psychological Measurement, 39(1), 62–79. https://doi.org/10.1177/0146621614561315

Haertel, E. H., & Lorie, W. A. (2004). Validating standards-based test score interpretations. Measurement: Interdisciplinary Research and Perspectives, 2(2), 61–103. https://doi.org/10.1207/s15366359mea0202_1

Jorion, N., Gane, B. D., James, K., Schroeder, L., DiBello, L. V., & Pellegrino, J. W. (2015). An analytic framework for evaluating the validity of concept inventory claims. Journal of Engineering Education, 104(4), 454–496. https://doi.org/10.1002/jee.20104

Revelle, W. (2017). Northwestern University; http://personality-project.org/r/book/.

Sleeper, R. (2011). Keep, toss or revise? Tips for post-exam item analysis. http://www.ttuhsc.edu/sop/administration/enhancement/documents/Sleeper_Handout.ppt (URL no longer valid).

#### Packages Used

Revelle, W. (2016). psych: Procedures for psychological, psychometric, and personality research. Evanston, Illinois: Northwestern University. Retrieved from https://CRAN.R-project.org/package=psych

Rizopoulos, D. (2006). ltm: An R package for latent variable modelling and item response theory analyses. Journal of Statistical Software, 17(5), 1–25. Retrieved from http://www.jstatsoft.org/v17/i05/