Chapter 7 Refining R Markdown and R code
In this session, we will learn
- how to read in a text data file rather than a csv file
- how to make nice tables in R Markdown
7.1 Reading a text data file
Sometimes the item response data file is a plain text file rather than a csv file. To read such a file, we can use R's read.fwf() function ("read fixed-width format"). The data file and test questions can be downloaded here. The following shows an excerpt of the data file.
1 123 114 2 13131213421322
2 113 214 2 13132144342132
3 123 314 2 13131213421322
4 113 414 2 13131213421321
5 123 514 2 13131213421322
6 113 614 2 13131211213229
In this data file, there are 13 item responses per student. The item responses are in columns 12 to 24.
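To see how read.fwf() carves up fixed-width lines, here is a minimal, self-contained sketch using a made-up two-line file (the values below are hypothetical, not from Maths13.dat):

```r
# read.fwf() splits each line at fixed character positions.
# widths = c(3, 1, rep(1, 4)) means: a 3-character ID, a 1-character
# group code, then four 1-character item responses.
txt <- textConnection(c("001A1234",
                        "002B5678"))
d <- read.fwf(txt, widths = c(3, 1, rep(1, 4)), stringsAsFactors = FALSE)
# d now has 6 columns: d$V1 is the ID (converted to numeric 1, 2),
# d$V2 is the group code ("A", "B"), and V3..V6 are the responses.
```

For Maths13.dat, widths = c(11, rep(1, 13)) works the same way: one 11-character block of student information, followed by the 13 single-character item responses in columns 12 to 24.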
The following R code will read this data file, and score the raw responses using the CTT score function. We will also read in the key as a single string of characters. We will get R to break up the key into individual characters.
library(CTT) #provides the score() function
setwd("C:/G_MWU/ARC/Philippines/files") #set working directory to your folder
d <- read.fwf("Maths13.dat", widths=c(11,rep(1,13)), stringsAsFactors = FALSE)
resp_raw <- d[,c(2:14)] #keep the 13 item-response columns, drop the 11-character ID block
colnames(resp_raw) <- paste0("Item",seq(1,13))
key <- "3131213223324" #Easier to type the key as one string, than with quote marks around each one.
key <- unlist(strsplit(key,"")) #Let R split the key into individual characters
s <- score(resp_raw,key,output.scored = TRUE) #score the raw responses using CTT score function
resp <- s$scored
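The key-splitting step can be checked on its own in the R console:

```r
key <- "3131213223324"             # the 13 answers typed as one string
key <- unlist(strsplit(key, ""))   # split into individual characters
length(key)                        # 13
key[1:4]                           # "3" "1" "3" "1"
```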
Insert the above R code into the R Markdown document you created in the previous session (6.5). Produce an R Markdown item analysis for Maths13. Are there items that can be improved?
7.2 Further refining the R Markdown document
The R Markdown document can be revised for better readability. The following is an R Markdown file with improved presentation and more detailed results.
---
title: "Test R Markdown"
author: "MWU"
date: "21/11/2021"
output: html_document
---
```{r echo=FALSE, results='hide', warning=FALSE, message=FALSE}
rm(list=ls())
library(TAM)
library(CTT)
library(WrightMap)
library(knitr)
testname <- "Maths13"
setwd("C:/G_MWU/ARC/Philippines/files") #set working directory to your folder
d <- read.fwf("Maths13.dat", widths=c(11,rep(1,13)), stringsAsFactors = FALSE)
resp_raw <- d[,c(2:14)]
colnames(resp_raw) <- paste0("Item",seq(1,13))
key <- "3131213223324" #Easier to type the key as one string, than with quote marks around each one.
key <- unlist(strsplit(key,"")) #Let R split the key into individual characters
s <- score(resp_raw,key,output.scored = TRUE)
resp <- s$scored
IA <- itemAnalysis(resp)
mod1 <- tam.jml(resp)
tamDistr <- tam.ctt(resp_raw,mod1$WLE)
ItemStats <- data.frame(IA$itemReport$itemName, IA$itemReport[,c(2,3,5)], mod1$xsi)
colnames(ItemStats) <- c("Item Name", "Item Mean", "pBis", "alphaIfDeleted", "IRT difficulty")
lowdisc <- "No item"
d <- which(IA$itemReport$pBis<0.2)
if(length(d)>0) lowdisc <- paste("Item", d)
target <- "on target"
if(mean(mod1$xsi)-mean(mod1$theta)> 0.2) target <- "difficult"
if(mean(mod1$xsi)-mean(mod1$theta)< -0.2) target <- "easy"
```
# Introduction
This report provides a summary of an item analysis for a numeracy test called "`r testname`". Results were generated in the open-source statistical software program R (R Core Team, 2017). The aim of the analysis is to identify poor-quality test items and to check whether the difficulty of the test is well targeted to the ability of the students.
# The data
A total of `r nrow(resp)` students completed the assessment. The instrument included `r ncol(resp)` items. The CTT test reliability (Cronbach's alpha) is `r round(IA$alpha,2)`.
# Analysis results
The following is a table of item statistics.
```{r echo=FALSE, comment=NA}
kable(ItemStats, align="ccccc", caption="Table 1: Summary of Item Statistics",row.names = FALSE, digits = 3)
```
# Item discrimination index
Item discrimination indices for all items ranged from `r round(min(IA$itemReport$pBis),2)` to `r round(max(IA$itemReport$pBis),2)`.
Items with discrimination < 0.2: `r lowdisc`
# Test targeting
The mean of the student abilities is `r round(mean(mod1$theta),5)`, while the mean of the item difficulties is `r round(mean(mod1$xsi),5)`.
The test is `r target` for the students.
# The Wright Map
```{r echo=FALSE, fig.align='center'}
p <- wrightMap(mod1$WLE,mod1$xsi,item.side=itemClassic)
```
# Appendix A: Item distractor analysis
```{r, echo=FALSE, comment=NA, message=FALSE, warning=FALSE,results="asis"}
ctt_raw <- cbind(tamDistr[,3:7],round(tamDistr[,8:10],2))
colnames(ctt_raw) <- c("Seq","Item", "Total", "Category", "Count", "Percent", "Pbs", "MeanAbility")
for (i in 1:ncol(resp_raw)){
itemi <- ctt_raw[ctt_raw$Item==colnames(resp_raw)[i],]
print(kable(itemi,row.names=FALSE,align='ccccrrrr',caption="===================================================================================================="))
}
```
# Appendix B: Item characteristic curves
The IRT item characteristic curves are plotted below.
```{r echo=FALSE, results='hide', comment=NA, message=FALSE, warning=FALSE, fig.height = 3.5, fig.width = 6, fig.align='center'}
plot(mod1)
```
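The targeting classification in the setup chunk above compares the mean item difficulty with the mean student ability, using a band of 0.2 logits. Wrapped as a stand-alone function (the function name and the tolerance argument here are just for illustration), the rule looks like this:

```r
# Classify test targeting from mean item difficulty and mean ability.
#   gap >  0.2  -> the test is harder than the group ("difficult")
#   gap < -0.2  -> the test is easier than the group ("easy")
#   otherwise   -> "on target"
classify_targeting <- function(mean_difficulty, mean_ability, tol = 0.2) {
  gap <- mean_difficulty - mean_ability
  if (gap > tol) "difficult"
  else if (gap < -tol) "easy"
  else "on target"
}

classify_targeting( 0.60, 0.05)  # "difficult"
classify_targeting(-0.40, 0.10)  # "easy"
classify_targeting( 0.10, 0.00)  # "on target"
```

In the Rmd file, mean_difficulty and mean_ability correspond to mean(mod1$xsi) and mean(mod1$theta) from the tam.jml() fit.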
7.3 Exercise
Use the data set GrammarB2raw (see Homework 5.4) with the above Rmd code to produce a report.
7.4 Homework
Download the data set for a general ability test here. This file is in text format, and the item responses are not scored. Use the R Markdown file in this session to analyse this data file.