Title: | Variational Autoencoder Models for IRT Parameter Estimation |
---|---|
Description: | Based on the work of Curi, Converse, Hajewski, and Oliveira (2019) <doi:10.1109/IJCNN.2019.8852333>. This package provides easy-to-use functions which create a variational autoencoder (VAE) to be used for parameter estimation in Item Response Theory (IRT) - namely the Multidimensional Logistic 2-Parameter (ML2P) model. To use a neural network as such, nontrivial modifications to the architecture must be made, such as restricting the nonzero weights in the decoder according to some binary matrix Q. The functions in this package allow for straightforward construction, training, and evaluation so that minimal knowledge of 'tensorflow' or 'keras' is required. |
Authors: | Geoffrey Converse [aut, cre, cph], Suely Oliveira [ctb, ths], Mariana Curi [ctb] |
Maintainer: | Geoffrey Converse <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.0.0.1 |
Built: | 2024-10-27 03:34:38 UTC |
Source: | https://github.com/cran/ML2Pvae |
Display a message upon loading package
.onLoad(libnam, pkgname)
libnam |
the library name |
pkgname |
the package name |
Build a VAE that fits to a standard N(0,I) latent distribution with independent latent traits
build_vae_independent( num_items, num_skills, Q_matrix, model_type = 2, enc_hid_arch = c(ceiling((num_items + num_skills)/2)), hid_enc_activations = rep("sigmoid", length(enc_hid_arch)), output_activation = "sigmoid", kl_weight = 1, learning_rate = 0.001 )
num_items |
an integer giving the number of items on the assessment; also the number of nodes in the input/output layers of the VAE |
num_skills |
an integer giving the number of skills being evaluated; also the dimensionality of the distribution learned by the VAE |
Q_matrix |
a binary, num_skills by num_items matrix relating items to skills |
model_type |
either 1 or 2, specifying a 1 parameter (1PL) or 2 parameter (2PL) model; if 1PL, then all decoder weights are fixed to be equal to one |
enc_hid_arch |
a vector detailing the size of hidden layers in the encoder; the number of hidden layers is determined by the length of this vector |
hid_enc_activations |
a vector specifying the activation function in each hidden layer in the encoder; must be the same length as enc_hid_arch |
output_activation |
a string specifying the activation function in the output of the decoder; the ML2P model always uses 'sigmoid' |
kl_weight |
an optional weight for the KL divergence term in the loss function |
learning_rate |
an optional parameter for the adam optimizer |
returns three keras models: the encoder, decoder, and vae.
Q <- matrix(c(1,0,1,1,0,1,1,0), nrow = 2, ncol = 4)
models <- build_vae_independent(4, 2, Q, enc_hid_arch = c(6, 3), hid_enc_activations = c('sigmoid', 'relu'), output_activation = 'tanh', kl_weight = 0.1)
models <- build_vae_independent(4, 2, Q)
vae <- models[[3]]
A symmetric positive definite matrix detailing the correlations among three latent traits.
correlation_matrix
A data frame with 3 rows and 3 columns
Generated using the Python package SciPy.
Difficulty parameters for an exam with 30 items.
diff_true
A data frame with 30 rows and one column. Each entry corresponds to the true value of a particular difficulty parameter.
Each entry is sampled uniformly from [-3,3].
Discrimination parameters for an exam of 30 items assessing 3 latent abilities.
disc_true
A data frame with 3 rows and 30 columns. Entry [k,i] represents the discrimination parameter between item i and ability k.
Each entry is sampled uniformly from [0.25,1.75].
If an entry in q_matrix.rda is 0, then so is the corresponding entry in disc_true.rda.
Feed forward response sets through the encoder, which outputs student ability estimates
get_ability_parameter_estimates(encoder, responses)
encoder |
a trained keras model; should be the encoder returned from a model-building function such as build_vae_independent() |
responses |
a matrix of binary responses with one row per student and one column per item |
a list where the first entry contains student ability estimates and the second entry holds the variance (or covariance matrix) of those estimates
data <- matrix(c(1,1,0,0,1,0,1,1,0,1,1,0), nrow = 3, ncol = 4)
Q <- matrix(c(1,0,1,1,0,1,1,0), nrow = 2, ncol = 4)
models <- build_vae_independent(4, 2, Q, model_type = 2)
encoder <- models[[1]]
ability_parameter_estimates_variances <- get_ability_parameter_estimates(encoder, data)
student_ability_est <- ability_parameter_estimates_variances[[1]]
Get trainable variables from the decoder, which serve as item parameter estimates.
get_item_parameter_estimates(decoder, model_type = 2)
decoder |
a trained keras model; can be either the decoder or the vae returned from a model-building function such as build_vae_independent() |
model_type |
either 1 or 2, specifying a 1 parameter (1PL) or 2 parameter (2PL) model; if 1PL, then only the difficulty parameter estimates (output layer bias) will be returned; if 2PL, then the discrimination parameter estimates (output layer weights) will also be returned |
a list which contains item parameter estimates; the length of this list is equal to model_type - the first entry in the list holds the difficulty parameter estimates, and the second entry (if 2PL) contains discrimination parameter estimates
Q <- matrix(c(1,0,1,1,0,1,1,0), nrow = 2, ncol = 4)
models <- build_vae_independent(4, 2, Q, model_type = 2)
decoder <- models[[2]]
item_parameter_estimates <- get_item_parameter_estimates(decoder, model_type = 2)
difficulty_est <- item_parameter_estimates[[1]]
discrimination_est <- item_parameter_estimates[[2]]
The ML2Pvae package includes functions which build a VAE with the desired architecture and fit the latent skills to either a standard normal (independent) distribution or a multivariate normal distribution with a full covariance matrix. Based on the work "Interpretable Variational Autoencoders for Cognitive Models" by Curi, M., Converse, G., Hajewski, J., and Oliveira, S., published in the International Joint Conference on Neural Networks, 2019.
A custom kernel constraint function that forces nonzero weights to be equal to one, so the VAE will estimate the 1-parameter logistic model. Nonzero weights are determined by the Q matrix.
q_1pl_constraint(Q)
Q |
a binary matrix of size num_skills by num_items |
returns a function whose parameters match keras kernel constraint format
A custom kernel constraint function that restricts weights between the learned distribution and output. Nonzero weights are determined by the Q matrix.
q_constraint(Q)
Q |
a binary matrix of size num_skills by num_items |
returns a function whose parameters match keras kernel constraint format
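The effect of the two Q-based constraints can be illustrated on plain R matrices. This is only a sketch: the package's constraints operate on keras weight tensors, the helper names `mask_with_q` and `ones_with_q` are hypothetical, and the weight matrix is assumed to be laid out as skills by items, like Q.

```r
# Sketch on plain R matrices; helper names are hypothetical and the
# package's own constraints act on keras tensors instead.
mask_with_q <- function(Q) {
  function(w) w * Q   # 2PL-style: zero every weight where Q is zero
}
ones_with_q <- function(Q) {
  function(w) Q * 1   # 1PL-style: every permitted weight is fixed to one
}

Q <- matrix(c(1,0,1,1,0,1,1,0), nrow = 2, ncol = 4)      # skills x items
w <- matrix(seq(0.5, 4, by = 0.5), nrow = 2, ncol = 4)   # illustrative weights
masked <- mask_with_q(Q)(w)
fixed <- ones_with_q(Q)(w)
print(masked)
print(fixed)
```

Returning a closure mirrors the documented behaviour of returning a function, so the same Q can be captured once and applied on every weight update.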
The Q-matrix determines the relation between items and abilities.
q_matrix
A data frame with 3 rows and 30 columns. If entry [k,i] = 1, then item i requires skill k.
Generated by sampling each entry from Bernoulli(0.35), while ensuring that each item assesses at least one latent ability.
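A Q-matrix with these properties could be generated along the following lines. This is a sketch, not the script used to build the packaged data: the helper `make_q_matrix` is hypothetical, and assigning one random skill to any item that ends up with none is an assumption, since the text does not say how that condition was enforced.

```r
# Hypothetical sketch: Bernoulli(0.35) entries, then repair empty columns.
make_q_matrix <- function(num_skills, num_items, prob = 0.35) {
  Q <- matrix(rbinom(num_skills * num_items, size = 1, prob = prob),
              nrow = num_skills, ncol = num_items)
  empty <- which(colSums(Q) == 0)        # items that assess no skill yet
  for (i in empty) {
    Q[sample.int(num_skills, 1), i] <- 1 # assumption: pick one skill at random
  }
  Q
}

set.seed(123)
Q <- make_q_matrix(3, 30)
print(colSums(Q))  # every item should load on at least one skill
```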
Simulated response sets for 5000 students on an exam with 30 items.
responses
A data frame with 30 columns and 5000 rows. Entry [j,i] is 1 if student j answers item i correctly, and 0 otherwise.
Generated by sampling from the probability of student success on a given item according to the ML2P model. Model parameters can be found in diff_true.rda, disc_true.rda, and theta_true.rda.
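The generating process described here can be sketched in base R. The sketch assumes the ML2P success probability has the form plogis(sum_k a_ki * theta_jk - b_i); the sign convention for the difficulty b_i is an assumption, and `simulate_ml2p` is a hypothetical helper rather than a package function.

```r
# Hypothetical helper: simulate binary responses under an assumed ML2P
# parameterization, P(correct) = plogis(theta %*% disc - diff).
simulate_ml2p <- function(theta, disc, diff) {
  # theta: students x skills, disc: skills x items, diff: one value per item
  logits <- sweep(theta %*% disc, 2, diff, FUN = "-")
  probs <- plogis(logits)                        # logistic sigmoid
  draws <- matrix(runif(length(probs)), nrow = nrow(probs))
  list(probs = probs, responses = (draws < probs) * 1L)
}

set.seed(1)
theta <- matrix(rnorm(5 * 2), nrow = 5, ncol = 2)    # 5 students, 2 skills
Q <- matrix(c(1,0,1,1,0,1,1,0), nrow = 2, ncol = 4)  # 2 skills, 4 items
disc <- matrix(runif(8, 0.25, 1.75), nrow = 2) * Q   # zero wherever Q is zero
diff <- runif(4, -3, 3)
sim <- simulate_ml2p(theta, disc, diff)
print(dim(sim$responses))
```

Multiplying the discriminations by Q keeps an item from depending on a skill it does not require, matching the relationship between q_matrix.rda and disc_true.rda.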
A reparameterization in order to sample from the learned standard normal distribution of the VAE
sampling_independent(arg)
arg |
a layer of tensors representing the mean and variance |
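The reparameterization idea can be shown without keras. The sketch below assumes the encoder outputs a mean and a log-variance for each latent trait (the names `z_mean` and `z_log_var` follow the loss documentation) and uses a hypothetical helper `sample_latent` on ordinary matrices rather than tensors.

```r
# Hypothetical base-R illustration of the reparameterization trick:
# z = mean + sd * eps with eps ~ N(0, I), so the randomness sits in eps.
sample_latent <- function(z_mean, z_log_var) {
  eps <- matrix(rnorm(length(z_mean)), nrow = nrow(z_mean))
  z_mean + exp(0.5 * z_log_var) * eps   # exp(0.5 * log_var) is the sd
}

set.seed(42)
z_mean <- matrix(c(0.2, -1.0, 0.5, 0.0), nrow = 2)  # 2 students, 2 traits
z_log_var <- matrix(log(0.25), nrow = 2, ncol = 2)  # variance 0.25, sd 0.5
z <- sample_latent(z_mean, z_log_var)
print(dim(z))
```

Writing the sample as a deterministic function of the mean, the variance, and external noise is what allows gradients to flow back to the encoder during training.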
Three correlated ability parameters for 5000 students.
theta_true
A data frame with 5000 rows and 3 columns. Each row represents a particular student's three latent abilities.
Generated by sampling from a 3-dimensional multivariate Gaussian distribution with mean 0 and covariance matrix correlation_matrix.rda.
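Sampling correlated abilities of this kind can be sketched with a Cholesky factor in base R. The matrix below is an illustrative stand-in, not the packaged correlation_matrix, and `sample_abilities` is a hypothetical helper.

```r
# Hypothetical sketch: draw zero-mean Gaussian abilities with covariance sigma.
sample_abilities <- function(n, sigma) {
  k <- ncol(sigma)
  z <- matrix(rnorm(n * k), nrow = n, ncol = k)  # independent N(0, 1) draws
  z %*% chol(sigma)   # chol() gives upper-triangular R with t(R) %*% R = sigma
}

set.seed(7)
sigma <- matrix(c(1.0, 0.3, 0.5,
                  0.3, 1.0, 0.2,
                  0.5, 0.2, 1.0), nrow = 3, byrow = TRUE)  # illustrative only
theta <- sample_abilities(5000, sigma)
print(round(cor(theta), 2))  # should be close to sigma for large n
```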
Trains a VAE or autoencoder model. This acts as a wrapper for keras::fit().
train_model( model, train_data, num_epochs = 10, batch_size = 1, validation_split = 0.15, shuffle = FALSE, verbose = 1 )
model |
the keras model to be trained; this should be the vae returned from a model-building function such as build_vae_independent() |
train_data |
training data; this should be a binary matrix with one row per student and one column per item |
num_epochs |
number of epochs to train for |
batch_size |
batch size for mini-batch stochastic gradient descent; the default of 1 corresponds to pure SGD; if a larger batch size is used (e.g. 32), then a larger number of epochs should be set (e.g. 50) |
validation_split |
fraction of the training data to hold out as validation data |
shuffle |
whether or not to shuffle data |
verbose |
verbosity levels; 0 = silent; 1 = progress bar and epoch message; 2 = epoch message |
a list containing training history; this holds the loss from each epoch which can be plotted
data <- matrix(c(1,1,0,0,1,0,1,1,0,1,1,0), nrow = 3, ncol = 4)
Q <- matrix(c(1,0,1,1,0,1,1,0), nrow = 2, ncol = 4)
models <- build_vae_independent(4, 2, Q)
vae <- models[[3]]
history <- train_model(vae, data, num_epochs = 3, validation_split = 0, verbose = 0)
plot(history)
A custom loss function for a VAE learning a standard normal distribution
vae_loss_independent(encoder, kl_weight, rec_dim)
encoder |
the encoder model of the VAE, used to obtain z_mean and z_log_var from inputs |
kl_weight |
weight for the KL divergence term |
rec_dim |
the number of nodes in the input/output of the VAE |
returns a function whose parameters match keras loss format
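To make the roles of kl_weight and rec_dim concrete, here is a base-R sketch of a VAE loss with a standard normal prior. It assumes the reconstruction term is rec_dim times the mean binary cross-entropy and that the KL term is the closed form for a diagonal Gaussian against N(0, I). Whether the package combines the terms in exactly this way is an assumption, and `vae_loss_sketch` is a hypothetical helper working on matrices, not keras tensors.

```r
# Hypothetical base-R version of a VAE loss with a standard normal prior.
vae_loss_sketch <- function(y_true, y_pred, z_mean, z_log_var,
                            kl_weight = 1, eps = 1e-7) {
  rec_dim <- ncol(y_true)                    # nodes in the input/output layer
  p <- pmin(pmax(y_pred, eps), 1 - eps)      # clip to avoid log(0)
  bce <- -rowMeans(y_true * log(p) + (1 - y_true) * log(1 - p))
  # KL( N(mean, diag(exp(log_var))) || N(0, I) ), one value per student
  kl <- -0.5 * rowSums(1 + z_log_var - z_mean^2 - exp(z_log_var))
  rec_dim * bce + kl_weight * kl
}

y_true <- matrix(c(1, 0, 1, 1), nrow = 1)
y_pred <- matrix(c(0.9, 0.2, 0.7, 0.6), nrow = 1)
loss_prior <- vae_loss_sketch(y_true, y_pred,
                              z_mean = matrix(0, 1, 2),
                              z_log_var = matrix(0, 1, 2))
loss_shifted <- vae_loss_sketch(y_true, y_pred,
                                z_mean = matrix(1, 1, 2),
                                z_log_var = matrix(0, 1, 2))
print(c(loss_prior, loss_shifted))
```

When the encoder output equals the prior (mean 0, log-variance 0), the KL term vanishes and only the reconstruction error remains; moving the mean away from zero adds a penalty scaled by kl_weight.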
Give error messages for invalid inputs in exported functions.
validate_inputs( num_items, num_skills, Q_matrix, model_type = 2, mean_vector = rep(0, num_skills), covariance_matrix = diag(num_skills), enc_hid_arch = c(ceiling((num_items + num_skills)/2)), hid_enc_activations = rep("sigmoid", length(enc_hid_arch)), output_activation = "sigmoid", kl_weight = 1, learning_rate = 0.001 )
num_items |
the number of items on the assessment; also the number of nodes in the input/output layers of the VAE |
num_skills |
the number of skills being evaluated; also the dimensionality of the distribution learned by the VAE |
Q_matrix |
a binary, num_skills by num_items matrix relating items to skills |
model_type |
either 1 or 2, specifying a 1 parameter (1PL) or 2 parameter (2PL) model |
mean_vector |
a vector of length num_skills |
covariance_matrix |
a symmetric, positive definite, num_skills by num_skills matrix |
enc_hid_arch |
a vector detailing the number and size of hidden layers in the encoder |
hid_enc_activations |
a vector specifying the activation function in each hidden layer in the encoder; must be the same length as enc_hid_arch |
output_activation |
a string specifying the activation function in the output of the decoder; the ML2P model always uses 'sigmoid' |
kl_weight |
an optional weight for the KL divergence term in the loss function |
learning_rate |
an optional parameter for the adam optimizer |
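The kind of checks implied by these argument descriptions can be sketched in base R. This is a hypothetical subset written for illustration: `check_inputs_sketch` is not the package's validate_inputs(), and the specific checks and error messages are assumptions.

```r
# Hypothetical subset of input checks; not the package's validate_inputs().
check_inputs_sketch <- function(num_items, num_skills, Q_matrix,
                                model_type = 2,
                                covariance_matrix = diag(num_skills)) {
  if (!identical(dim(Q_matrix), as.integer(c(num_skills, num_items)))) {
    stop("Q_matrix must be a num_skills by num_items matrix")
  }
  if (!all(Q_matrix %in% c(0, 1))) {
    stop("Q_matrix must be binary")
  }
  if (!(model_type %in% c(1, 2))) {
    stop("model_type must be 1 or 2")
  }
  if (!isSymmetric(covariance_matrix)) {
    stop("covariance_matrix must be symmetric")
  }
  eig <- eigen(covariance_matrix, symmetric = TRUE, only.values = TRUE)$values
  if (any(eig <= 0)) {
    stop("covariance_matrix must be positive definite")
  }
  invisible(TRUE)
}

Q <- matrix(c(1,0,1,1,0,1,1,0), nrow = 2, ncol = 4)
ok <- check_inputs_sketch(num_items = 4, num_skills = 2, Q_matrix = Q)
bad <- try(check_inputs_sketch(num_items = 4, num_skills = 3, Q_matrix = Q),
           silent = TRUE)
print(ok)
print(inherits(bad, "try-error"))
```

Failing early with a descriptive message is usually easier to debug than a shape error raised later inside tensorflow.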