Tutorial 7:  Differential Item Functioning

      (2013-8-1)

Summary of Tutorial

This tutorial shows how to carry out differential item functioning analysis.

R script file

The R script file for this tutorial can be downloaded from the link Tutorial7.R.

Data files

The data file is from FIMS (The First International Mathematics Study, IEA), the same file as for the Exercises in Tutorial 4. The test questions and data file can be downloaded from the following links.

   FIMS Test A (part) Questions
   FIMS Test A Australian data

Load R libraries

Before TAM functions can be called, the TAM library has to be loaded into the R workspace. Line 2 loads the TAM library.
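
The script line referred to above is not reproduced on this page. A minimal sketch of loading the package is given below; the `requireNamespace` guard and the name `tam_available` are additions for illustration and are not taken from Tutorial7.R.

```r
# Check whether TAM is installed before attaching it
# (tam_available is a hypothetical helper name, not from Tutorial7.R).
tam_available <- requireNamespace("TAM", quietly = TRUE)

if (tam_available) {
  library(TAM)   # attach TAM so that tam() and tam.mml.mfr() can be called
} else {
  message("TAM is not installed; run install.packages('TAM') first.")
}
```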

Read data file

Line 5 sets the working directory and line 6 reads the data file "FIMS_AUS_TestA.csv" into a variable called "FIMSdata". Line 7 extracts the item responses (columns 2 to 15) into a variable called "raw_resp"; column 1 is left out because it contains the gender variable.
          setwd("C:/G_MWU/TAM/Tutorials/data")
          FIMSdata <- read.csv("FIMS_AUS_TestA.csv")
          raw_resp <- FIMSdata[,2:15]

Score the data file

Lines 10-11 specify the keys and score the item responses.
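          # answer key: the correct option for each of the 14 items
          key <- c(1, 3, 1, 4, 1, 3, 3, 1, 5, 1, 1, 4, 1, 1)
          # score each item: 1 if the response matches the key, 0 otherwise
          scored <- sapply( seq_along(key), FUN = function(ii){ 1*(raw_resp[,ii] == key[ii]) } )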
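
To see what the scoring expression produces, the sketch below applies it to a small made-up data set. The names `toy_resp`, `toy_key`, `toy_scored` and `toy_scored0` and all values are invented for illustration; they are not part of the FIMS data or of Tutorial7.R.

```r
# Toy data: 3 students x 3 items (made-up values, not the FIMS data).
toy_resp <- data.frame(Q1 = c(1, 2, 1), Q2 = c(3, 3, NA), Q3 = c(4, 1, 1))
toy_key  <- c(1, 3, 1)

# Same scoring expression as in the tutorial:
# 1 if the response matches the key, 0 otherwise.
toy_scored <- sapply(seq_along(toy_key),
                     FUN = function(ii) { 1 * (toy_resp[, ii] == toy_key[ii]) })
colnames(toy_scored) <- colnames(toy_resp)
print(toy_scored)

# A missing response stays NA rather than being scored 0.
# One option for treating missing responses as incorrect instead:
toy_scored0 <- toy_scored
toy_scored0[is.na(toy_scored0)] <- 0
```

Whether a missing response should count as incorrect or remain missing is an analysis decision; the comparison with the key leaves it as NA by default.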

Run IRT analysis

Line 14 runs an IRT analysis (a Rasch model, the default in tam) with marginal maximum likelihood (MML) estimation.
          mod1 <- tam(scored)

Results of the IRT analysis can be shown using the R command "summary(mod1)".

Line 17 extracts values of the gender variable into a variable called "gender".
          gender <- FIMSdata[,1]

Line 20 computes the test score for each student by calculating the row sum of each student's scored responses.
          raw_score <- rowSums(scored)

Line 23 computes the mean test score for each gender group (1 = male, 2 = female).
          aggregate(raw_score,by=list(gender),FUN=mean)

The mean test score is 6.12 for group 1 (males) and 6.27 for group 2 (females). That is, the two groups performed similarly, with females having a slightly higher mean test score. Computing raw test scores is not necessary for the IRT analyses, but it is good practice to explore the data a little before moving on to more complex analyses.
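
The same kind of exploration can be taken one step further by looking at the proportion correct on each item within each group. The sketch below uses made-up data; the names `demo_scored`, `demo_gender`, `group_means` and `item_pcorrect` are hypothetical. Raw proportions mix any DIF with overall group differences in ability, so this is a descriptive look only and not a DIF test.

```r
# Made-up scored responses for 6 students on 2 items (not the FIMS data).
demo_scored <- cbind(item1 = c(1, 0, 1, 1, 0, 1),
                     item2 = c(0, 0, 1, 1, 1, 1))
demo_gender <- c(1, 1, 1, 2, 2, 2)

# Mean raw score per group, as in the tutorial.
demo_raw <- rowSums(demo_scored)
group_means <- aggregate(demo_raw, by = list(gender = demo_gender), FUN = mean)
print(group_means)

# Proportion correct on each item within each group.
item_pcorrect <- aggregate(demo_scored, by = list(gender = demo_gender), FUN = mean)
print(item_pcorrect)
```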

Facets analysis

To conduct a DIF analysis, we set up the variable "gender" as a facet and re-run the IRT analysis. Lines 26-28 show the R code to do this.
          formulaA <- ~item+gender+item*gender
          facets <- as.data.frame(gender)
          mod2 <- tam.mml.mfr( resp= scored, facets= facets , formulaA = formulaA )
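
The terms contained in the facet formula can be inspected with base R alone. The sketch below shows how R's formula syntax expands item*gender; the names `labels_A` and `labels_B` are hypothetical. It illustrates base R's formula expansion only, and it is an assumption (not checked here) that tam.mml.mfr treats the shorter formula in the same way.

```r
# The formula used in the tutorial.
formulaA <- ~ item + gender + item*gender

# In R's formula syntax, item*gender already expands to
# item + gender + item:gender, so the term labels are:
labels_A <- attr(terms(formulaA), "term.labels")
print(labels_A)

# The shorter formula ~ item*gender yields the same terms.
labels_B <- attr(terms(~ item*gender), "term.labels")
print(identical(labels_A, labels_B))
```

The term labelled item:gender is the interaction, which is the part of the model that captures DIF.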

Line 31 shows the results.
          summary(mod2)

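The interaction term in the facet formula, item*gender (labelled item:gender in the output), provides the magnitude of DIF for each item. An excerpt of the output is shown below; the last two columns are the parameter estimate and its standard error.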
         15 gender1 gender -0.003 0.010
         16 gender2 gender 0.003 NA
         17 item1:gender1 item:gender 0.222 0.025
         18 item2:gender1 item:gender -0.161 0.025
         19 item3:gender1 item:gender 0.182 0.026
         20 item4:gender1 item:gender 0.180 0.024
         21 item5:gender1 item:gender -0.332 0.030

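A quick significance test can be conducted by dividing the interaction estimate by its standard error. For example, for item 5, -0.332/0.030 = -11.1. Treat this ratio as a z-statistic: a value between about -2 and 2 is consistent with no DIF, while a value outside that range indicates an interaction that is statistically significant at roughly the 5% level. Item 5 is far outside the range.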
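
The ratio can be computed for all items at once. The sketch below copies the interaction estimates and standard errors from the output excerpt into a data frame by hand; the names `dif`, `z`, `p` and `flag` are hypothetical, and the 1.96 cut-off is the usual two-sided 5% value for a standard normal statistic.

```r
# Interaction estimates and standard errors copied from the output excerpt.
dif <- data.frame(
  parameter = c("item1:gender1", "item2:gender1", "item3:gender1",
                "item4:gender1", "item5:gender1"),
  estimate  = c(0.222, -0.161, 0.182, 0.180, -0.332),
  se        = c(0.025, 0.025, 0.026, 0.024, 0.030)
)

# z statistic: estimate divided by its standard error.
dif$z <- dif$estimate / dif$se

# Two-sided p-value from the standard normal distribution.
dif$p <- 2 * pnorm(-abs(dif$z))

# Flag items whose |z| exceeds 1.96 (5% level, two-sided).
dif$flag <- abs(dif$z) > 1.96
print(dif, digits = 3)
```

With a large sample the standard errors are small, so even modest interaction estimates can produce large z values.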

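The second way is to consider the effect size of DIF. For item 5, the difference in item difficulty between the two gender groups is 2 x 0.332 = 0.66 logits. The factor of 2 arises because the interaction estimates for the two gender groups are constrained to sum to zero for each item, so the estimate for gender2 is +0.332. How significant in real terms is this magnitude?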
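
The doubling can be applied to every item in the excerpt. The sketch below assumes that the gender1 and gender2 interaction terms sum to zero for each item, as the gender main effects do in the output excerpt (-0.003 and 0.003); the names `est_gender1`, `est_gender2` and `dif_size` are hypothetical.

```r
# Interaction estimates for gender1, copied from the output excerpt.
est_gender1 <- c(item1 = 0.222, item2 = -0.161, item3 = 0.182,
                 item4 = 0.180, item5 = -0.332)

# Assumption: the two gender terms sum to zero for each item,
# so the gender2 term is the negative of the gender1 term.
est_gender2 <- -est_gender1

# Difference in item difficulty between the groups (in logits).
dif_size <- est_gender1 - est_gender2   # equals 2 * est_gender1
print(round(dif_size, 3))

# Items ordered by absolute DIF magnitude, largest first.
print(names(sort(abs(dif_size), decreasing = TRUE)))
```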

Exercises

The following data set contains FIMS data for both Australia and Japan. The country code is in column 16. The gender code is in column 1. Carry out a DIF analysis with respect to "country" (i.e., difference between Australia and Japan).
   FIMS Test A - data for Australia and Japan