Package 'ML2Pvae' reference manual

Title:	Variational Autoencoder Models for IRT Parameter Estimation
Description:	Based on the work of Curi, Converse, Hajewski, and Oliveira (2019) <doi:10.1109/IJCNN.2019.8852333>. This package provides easy-to-use functions which create a variational autoencoder (VAE) to be used for parameter estimation in Item Response Theory (IRT), namely the Multidimensional Logistic 2-Parameter (ML2P) model. To use a neural network in this way, nontrivial modifications to the architecture must be made, such as restricting the nonzero weights in the decoder according to some binary matrix Q. The functions in this package allow for straightforward construction, training, and evaluation so that minimal knowledge of 'tensorflow' or 'keras' is required.
Authors:	Geoffrey Converse [aut, cre, cph], Suely Oliveira [ctb, ths], Mariana Curi [ctb]
Maintainer:	Geoffrey Converse <[email protected]>
License:	MIT + file LICENSE
Version:	1.0.0.1
Built:	2025-01-25 03:18:52 UTC
Source:	https://github.com/cran/ML2Pvae

Display a message upon loading package

Description

Display a message upon loading package

Usage

.onLoad(libnam, pkgname)

Arguments

`libnam`	the library name
`pkgname`	the package name

Build the encoder for a VAE

Description

Build the encoder for a VAE

Usage

build_hidden_encoder(
  input_size,
  layers,
  activations = rep("sigmoid", length(layers))
)

Arguments

`input_size`	an integer representing the number of items
`layers`	a list of integers giving the size of each hidden layer
`activations`	a list of strings, the same length as layers

Value

two tensors: the input layer to the VAE and the last hidden layer of the encoder

Build a VAE that fits to a normal, full covariance N(m,S) latent distribution

Description

Build a VAE that fits to a normal, full covariance N(m,S) latent distribution

Usage

build_vae_correlated(
  num_items,
  num_skills,
  Q_matrix,
  mean_vector = rep(0, num_skills),
  covariance_matrix = diag(num_skills),
  model_type = 2,
  enc_hid_arch = c(ceiling((num_items + num_skills)/2)),
  hid_enc_activations = rep("sigmoid", length(enc_hid_arch)),
  output_activation = "sigmoid",
  kl_weight = 1,
  learning_rate = 0.001
)

Arguments

`num_items`	an integer giving the number of items on the assessment; also the number of nodes in the input/output layers of the VAE
`num_skills`	an integer giving the number of skills being evaluated; also the dimensionality of the distribution learned by the VAE
`Q_matrix`	a binary, `num_skills` by `num_items` matrix relating the assessment items with skills
`mean_vector`	a vector of length `num_skills` specifying the mean of each latent trait; the default of `rep(0, num_skills)` should almost always be used
`covariance_matrix`	a symmetric, positive definite, `num_skills` by `num_skills` matrix giving the covariance of the latent traits
`model_type`	either 1 or 2, specifying a 1 parameter (1PL) or 2 parameter (2PL) model; if 1PL, then all nonzero decoder weights (as determined by `Q_matrix`) are fixed to be equal to one
`enc_hid_arch`	a vector detailing the size of hidden layers in the encoder; the number of hidden layers is determined by the length of this vector
`hid_enc_activations`	a vector specifying the activation function in each hidden layer in the encoder; must be the same length as `enc_hid_arch`
`output_activation`	a string specifying the activation function in the output of the decoder; the ML2P model always uses 'sigmoid'
`kl_weight`	an optional weight for the KL divergence term in the loss function
`learning_rate`	an optional parameter for the adam optimizer

Value

returns three keras models: the encoder, decoder, and vae

Examples


Q <- matrix(c(1,0,1,1,0,1,1,0), nrow = 2, ncol = 4)
cov <- matrix(c(.7,.3,.3,1), nrow = 2, ncol = 2)
models <- build_vae_correlated(4, 2, Q,
          mean_vector = c(-0.5, 0), covariance_matrix = cov,
          enc_hid_arch = c(6, 3), hid_enc_activations = c('sigmoid', 'relu'),
          output_activation = 'tanh',
          kl_weight = 0.1)
vae <- models[[3]]

Build a VAE that fits to a standard N(0,I) latent distribution with independent latent traits

Description

Build a VAE that fits to a standard N(0,I) latent distribution with independent latent traits

Usage

build_vae_independent(
  num_items,
  num_skills,
  Q_matrix,
  model_type = 2,
  enc_hid_arch = c(ceiling((num_items + num_skills)/2)),
  hid_enc_activations = rep("sigmoid", length(enc_hid_arch)),
  output_activation = "sigmoid",
  kl_weight = 1,
  learning_rate = 0.001
)

Arguments

`num_items`	an integer giving the number of items on the assessment; also the number of nodes in the input/output layers of the VAE
`num_skills`	an integer giving the number of skills being evaluated; also the dimensionality of the distribution learned by the VAE
`Q_matrix`	a binary, `num_skills` by `num_items` matrix relating the assessment items with skills
`model_type`	either 1 or 2, specifying a 1 parameter (1PL) or 2 parameter (2PL) model; if 1PL, then all nonzero decoder weights (as determined by `Q_matrix`) are fixed to be equal to one
`enc_hid_arch`	a vector detailing the size of hidden layers in the encoder; the number of hidden layers is determined by the length of this vector
`hid_enc_activations`	a vector specifying the activation function in each hidden layer in the encoder; must be the same length as `enc_hid_arch`
`output_activation`	a string specifying the activation function in the output of the decoder; the ML2P model always uses 'sigmoid'
`kl_weight`	an optional weight for the KL divergence term in the loss function
`learning_rate`	an optional parameter for the adam optimizer

Value

returns three keras models: the encoder, decoder, and vae.

Examples


Q <- matrix(c(1,0,1,1,0,1,1,0), nrow = 2, ncol = 4)
models <- build_vae_independent(4, 2, Q,
          enc_hid_arch = c(6, 3), hid_enc_activations = c('sigmoid', 'relu'),
          output_activation = 'tanh', kl_weight = 0.1)
models <- build_vae_independent(4, 2, Q)
vae <- models[[3]]

Simulated latent abilities correlation matrix

Description

A symmetric positive definite matrix detailing the correlations among three latent traits.

Usage

correlation_matrix

Format

A data frame with 3 rows and 3 columns

Source

Generated using the python package SciPy

Simulated difficulty parameters

Description

Difficulty parameters for an exam with 30 items.

Usage

diff_true

Format

A data frame with 30 rows and one column. Each entry corresponds to the true value of a particular difficulty parameter.

Source

Each entry is sampled uniformly from [-3,3].

Simulated discrimination parameters

Description

Discrimination parameters for an exam of 30 items assessing 3 latent abilities.

Usage

disc_true

Format

A data frame with 3 rows and 30 columns. Entry [k,i] represents the discrimination parameter between item i and ability k.

Source

Each entry is sampled uniformly from [0.25,1.75]. If an entry in q_matrix.rda is 0, then so is the corresponding entry in disc_true.rda.
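The sampling rule above can be sketched in base R. This is only an illustration on the toy Q-matrix used in the Examples sections; the names `Q_toy` and `disc_toy` are hypothetical, and the seed is arbitrary rather than the one used to create the packaged data.

```r
set.seed(7)  # arbitrary seed, for reproducibility of this sketch only

# Toy 2 x 4 Q-matrix (skills in rows, items in columns)
Q_toy <- matrix(c(1,0,1,1,0,1,1,0), nrow = 2, ncol = 4)

# Draw every entry uniformly from [0.25, 1.75] ...
disc_toy <- matrix(runif(length(Q_toy), min = 0.25, max = 1.75),
                   nrow = nrow(Q_toy), ncol = ncol(Q_toy))

# ... then zero out entries where the item does not require the skill
disc_toy <- disc_toy * Q_toy
```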

Feed forward response sets through the encoder, which outputs student ability estimates

Description

Feed forward response sets through the encoder, which outputs student ability estimates

Usage

get_ability_parameter_estimates(encoder, responses)

Arguments

`encoder`	a trained keras model; should be the encoder returned from either `build_vae_independent()` or `build_vae_correlated()`
`responses`	a `num_students` by `num_items` matrix of binary responses, as used in training

Value

a list where the first entry contains student ability estimates and the second entry holds the variance (or covariance matrix) of those estimates

Examples


data <- matrix(c(1,1,0,0,1,0,1,1,0,1,1,0), nrow = 3, ncol = 4)
Q <- matrix(c(1,0,1,1,0,1,1,0), nrow = 2, ncol = 4)
models <- build_vae_independent(4, 2, Q, model_type = 2)
encoder <- models[[1]]
ability_parameter_estimates_variances <- get_ability_parameter_estimates(encoder, data)
student_ability_est <- ability_parameter_estimates_variances[[1]]

Get trainable variables from the decoder, which serve as item parameter estimates.

Description

Get trainable variables from the decoder, which serve as item parameter estimates.

Usage

get_item_parameter_estimates(decoder, model_type = 2)

Arguments

`decoder`	a trained keras model; can either be the decoder or vae returned from `build_vae_independent()` or `build_vae_correlated()`
`model_type`	either 1 or 2, specifying a 1 parameter (1PL) or 2 parameter (2PL) model; if 1PL, then only the difficulty parameter estimates (output layer bias) will be returned; if 2PL, then the discrimination parameter estimates (output layer weights) will also be returned

Value

a list of item parameter estimates whose length equals `model_type`; the first entry holds the difficulty parameter estimates, and the second entry (2PL only) holds the discrimination parameter estimates

Examples


Q <- matrix(c(1,0,1,1,0,1,1,0), nrow = 2, ncol = 4)
models <- build_vae_independent(4, 2, Q, model_type = 2)
decoder <- models[[2]]
item_parameter_estimates <- get_item_parameter_estimates(decoder, model_type = 2)
difficulty_est <- item_parameter_estimates[[1]]
discrimination_est <- item_parameter_estimates[[2]]

ML2Pvae: A package for creating a VAE whose decoder recovers the parameters of the ML2P model. The encoder can be used to predict the latent skills based on assessment scores.

Description

The ML2Pvae package includes functions which build a VAE with the desired architecture and fit the latent skills to either a standard normal (independent) distribution or a multivariate normal distribution with a full covariance matrix. Based on the work "Interpretable Variational Autoencoders for Cognitive Models" by Curi, M., Converse, G., Hajewski, J., and Oliveira, S., published in the International Joint Conference on Neural Networks, 2019.

A custom kernel constraint function that forces nonzero weights to be equal to one, so the VAE will estimate the 1-parameter logistic model. Nonzero weights are determined by the Q matrix.

Description

A custom kernel constraint function that forces nonzero weights to be equal to one, so the VAE will estimate the 1-parameter logistic model. Nonzero weights are determined by the Q matrix.

Usage

q_1pl_constraint(Q)

Arguments

`Q`	a binary matrix of size `num_skills` by `num_items`

Value

returns a function whose parameters match keras kernel constraint format
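The effect of such a constraint can be illustrated on an ordinary R matrix. The sketch below is not the package's implementation, which operates on keras tensors; `make_1pl_constraint`, `Q_toy`, `W_raw`, and `W_1pl` are hypothetical names introduced only for illustration.

```r
# Hypothetical stand-in: returns a function that maps a weight matrix W
# to a matrix with ones where Q is 1 and zeros elsewhere.
make_1pl_constraint <- function(Q) {
  function(W) {
    stopifnot(all(dim(W) == dim(Q)))
    W[Q == 1] <- 1   # weights linking an item to a required skill are pinned to one
    W[Q == 0] <- 0   # all other weights are removed
    W
  }
}

Q_toy <- matrix(c(1,0,1,1,0,1,1,0), nrow = 2, ncol = 4)
W_raw <- matrix(seq(-0.7, 0.7, length.out = 8), nrow = 2, ncol = 4)
W_1pl <- make_1pl_constraint(Q_toy)(W_raw)
```

Under this reading, the constrained weights coincide with the Q matrix itself, which is why only the difficulty parameters remain to be estimated in the 1PL case.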

A custom kernel constraint function that restricts weights between the learned distribution and output. Nonzero weights are determined by the Q matrix.

Description

A custom kernel constraint function that restricts weights between the learned distribution and output. Nonzero weights are determined by the Q matrix.

Usage

q_constraint(Q)

Arguments

`Q`	a binary matrix of size `num_skills` by `num_items`

Value

returns a function whose parameters match keras kernel constraint format
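As a rough illustration of the masking idea, the following base R sketch zeroes out every weight that Q does not allow. It works on plain matrices rather than keras tensors, the package's own function may apply further restrictions, and `make_q_constraint`, `Q_toy`, `W_raw`, and `W_masked` are hypothetical names.

```r
# Hypothetical stand-in: returns a function that keeps a weight only
# where the corresponding entry of Q is 1.
make_q_constraint <- function(Q) {
  function(W) {
    stopifnot(all(dim(W) == dim(Q)))
    W * Q   # elementwise product zeroes the disallowed weights
  }
}

Q_toy <- matrix(c(1,0,1,1,0,1,1,0), nrow = 2, ncol = 4)
W_raw <- matrix(c(0.9, 0.4, 1.1, 0.7, 0.2, 1.3, 0.8, 0.5), nrow = 2, ncol = 4)
W_masked <- make_q_constraint(Q_toy)(W_raw)
```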

Simulated Q-matrix

Description

The Q-matrix determines the relation between items and abilities.

Usage

q_matrix

Format

A data frame with 3 rows and 30 columns. If entry [k,i] = 1, then item i requires skill k.

Source

Generated by sampling each entry from Bernoulli(0.35), while ensuring that each item assesses at least one latent ability.
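A Q-matrix of this kind could be simulated along the following lines. The source does not say how items without any skill were repaired, so the rule used here (assign one skill at random) is an assumption, and `simulate_q` and `Q_sim` are hypothetical names.

```r
# Hypothetical helper: draw a num_skills x num_items binary Q-matrix.
simulate_q <- function(num_skills, num_items, p = 0.35) {
  Q <- matrix(rbinom(num_skills * num_items, size = 1, prob = p),
              nrow = num_skills, ncol = num_items)
  # Assumed repair rule: an item with no skills gets one randomly chosen skill
  empty_items <- which(colSums(Q) == 0)
  for (i in empty_items) {
    Q[sample.int(num_skills, 1), i] <- 1
  }
  Q
}

set.seed(42)  # arbitrary seed
Q_sim <- simulate_q(num_skills = 3, num_items = 30)
```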

Response data

Description

Simulated response sets for 5000 students on an exam with 30 items.

Usage

responses

Format

A data frame with 30 columns and 5000 rows. Entry [j,i] is 1 if student j answers item i correctly, and 0 otherwise.

Source

Generated by sampling from the probability of student success on a given item according to the ML2P model. Model parameters can be found in diff_true.rda, disc_true.rda, and theta_true.rda.
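A minimal base R sketch of this kind of simulation follows. It assumes the sign convention P(correct) = logistic(sum_k a_ik * theta_jk - b_i); the convention used to create the packaged data may differ. The helper `ml2p_prob` and the toy parameter values are hypothetical.

```r
# Hypothetical helper: success probabilities under an ML2P-style model.
# theta: num_students x num_skills matrix of abilities
# disc:  num_skills x num_items matrix of discriminations (zero where Q is zero)
# diff:  vector of num_items difficulties
ml2p_prob <- function(theta, disc, diff) {
  logits <- theta %*% disc                # num_students x num_items
  logits <- sweep(logits, 2, diff, "-")   # subtract each item's difficulty
  1 / (1 + exp(-logits))                  # logistic link
}

set.seed(1)  # arbitrary seed
theta <- matrix(rnorm(5 * 2), nrow = 5, ncol = 2)
disc <- matrix(c(1, 0, 0.5, 1.2, 0, 0.8), nrow = 2, ncol = 3)
diff <- c(-1, 0, 1)
probs <- ml2p_prob(theta, disc, diff)

# Entry [j, i] is 1 with probability probs[j, i]
sim_responses <- matrix(rbinom(length(probs), size = 1, prob = probs),
                        nrow = nrow(probs), ncol = ncol(probs))
```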

A reparameterization in order to sample from the learned multivariate normal distribution of the VAE

Description

A reparameterization in order to sample from the learned multivariate normal distribution of the VAE

Usage

sampling_correlated(arg)

Arguments

`arg`	a layer of tensors representing the mean and log cholesky transform of the covariance matrix
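The reparameterization idea can be sketched in base R as z = mu + L eps, where L is a Cholesky factor of the covariance and eps is standard normal noise. The exact transform the package uses is not described here, so this sketch assumes one common parameterization in which the diagonal of the factor is stored on the log scale; `sample_correlated`, `mu`, `log_chol`, and `z_corr` are hypothetical names.

```r
# Hypothetical helper: draw z ~ N(mu, L %*% t(L)) via z = mu + L %*% eps.
# log_chol is assumed to hold the lower triangle of L, with log(diag(L)) on its diagonal.
sample_correlated <- function(mu, log_chol) {
  L <- log_chol
  L[upper.tri(L)] <- 0             # keep only the lower triangle
  diag(L) <- exp(diag(log_chol))   # exponentiate so the diagonal is positive
  eps <- rnorm(length(mu))         # standard normal noise
  as.vector(mu + L %*% eps)
}

set.seed(1)  # arbitrary seed
mu <- c(0.5, -0.2)
# Column-major fill, so after the transform L = [[1, 0], [0.3, 0.5]]
log_chol <- matrix(c(0, 0.3, 0, log(0.5)), nrow = 2)
z_corr <- sample_correlated(mu, log_chol)
```

Because the randomness enters only through `eps`, the draw is a deterministic, differentiable function of the mean and the factor, which is what allows gradients to flow through the sampling step during training.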

A reparameterization in order to sample from the learned standard normal distribution of the VAE

Description

A reparameterization in order to sample from the learned standard normal distribution of the VAE

Usage

sampling_independent(arg)

Arguments

`arg`	a layer of tensors representing the mean and variance
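For independent traits the reparameterization reduces to an elementwise operation. The sketch below assumes the variance is supplied on the log scale, as the `z_log_var` name used by `vae_loss_independent()` suggests; `sample_independent` and `z_ind` are hypothetical names.

```r
# Hypothetical helper: z = mean + sd * eps, with eps ~ N(0, 1) drawn per dimension.
sample_independent <- function(z_mean, z_log_var) {
  eps <- rnorm(length(z_mean))
  z_mean + exp(0.5 * z_log_var) * eps   # exp(0.5 * log variance) is the standard deviation
}

set.seed(2)  # arbitrary seed
z_ind <- sample_independent(z_mean = c(0, 1), z_log_var = log(c(1, 0.25)))
```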

Simulated ability parameters

Description

Three correlated ability parameters for 5000 students.

Usage

theta_true

Format

A data frame with 5000 rows and 3 columns. Each row represents a particular student's three latent abilities.

Source

Generated by sampling from a 3-dimensional multivariate Gaussian distribution with mean 0 and covariance matrix correlation_matrix.rda.
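Sampling of this kind can be sketched in base R with a Cholesky factor. The covariance matrix `Sigma` below is a made-up example, not the packaged `correlation_matrix`, and the seed is arbitrary.

```r
# Hypothetical 3 x 3 correlation matrix (symmetric, positive definite)
Sigma <- matrix(c(1.0, 0.3, 0.2,
                  0.3, 1.0, 0.4,
                  0.2, 0.4, 1.0), nrow = 3, byrow = TRUE)

set.seed(123)
n <- 5000
R <- chol(Sigma)                      # upper triangular, t(R) %*% R equals Sigma
Z <- matrix(rnorm(n * 3), nrow = n)   # independent standard normal draws
theta_sim <- Z %*% R                  # each row is a draw from N(0, Sigma)
```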

Trains a VAE or autoencoder model. This acts as a wrapper for keras::fit().

Description

Trains a VAE or autoencoder model. This acts as a wrapper for keras::fit().

Usage

train_model(
  model,
  train_data,
  num_epochs = 10,
  batch_size = 1,
  validation_split = 0.15,
  shuffle = FALSE,
  verbose = 1
)

Arguments

`model`	the keras model to be trained; this should be the vae returned from `build_vae_independent()` or `build_vae_correlated()`
`train_data`	training data; this should be a binary `num_students` by `num_items` matrix of student responses to an assessment
`num_epochs`	number of epochs to train for
`batch_size`	batch size for mini-batch stochastic gradient descent; the default of 1 corresponds to pure SGD; if a larger batch size is used (e.g. 32), then a larger number of epochs should be set (e.g. 50)
`validation_split`	fraction of the training data (between 0 and 1) to hold out as validation data
`shuffle`	whether or not to shuffle data
`verbose`	verbosity levels; 0 = silent; 1 = progress bar and epoch message; 2 = epoch message

Value

a list containing training history; this holds the loss from each epoch which can be plotted

Examples


data <- matrix(c(1,1,0,0,1,0,1,1,0,1,1,0), nrow = 3, ncol = 4)
Q <- matrix(c(1,0,1,1,0,1,1,0), nrow = 2, ncol = 4)
models <- build_vae_independent(4, 2, Q)
vae <- models[[3]]
history <- train_model(vae, data, num_epochs = 3, validation_split = 0, verbose = 0)
plot(history)

A custom loss function for a VAE learning a multivariate normal distribution with a full covariance matrix

Description

A custom loss function for a VAE learning a multivariate normal distribution with a full covariance matrix

Usage

vae_loss_correlated(
  encoder,
  inv_skill_cov,
  det_skill_cov,
  skill_mean,
  kl_weight,
  rec_dim
)

Arguments

`encoder`	the encoder model of the VAE, used to obtain z_mean and z_log_cholesky from inputs
`inv_skill_cov`	a constant tensor matrix of the inverse of the covariance matrix being learned
`det_skill_cov`	a constant tensor scalar representing the determinant of the covariance matrix being learned
`skill_mean`	a constant tensor vector representing the means of the latent skills being learned
`kl_weight`	weight for the KL divergence term
`rec_dim`	the number of nodes in the input/output of the VAE

Value

returns a function whose parameters match keras loss format
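The arguments `inv_skill_cov`, `det_skill_cov`, and `skill_mean` are the quantities that appear in the closed-form KL divergence between two multivariate normal distributions. A base R sketch of that formula is given below; it is not the package's tensor implementation, and `kl_mvn` and the toy values are hypothetical.

```r
# Hypothetical helper: KL( N(z_mean, z_cov) || N(skill_mean, skill_cov) ), using
# 0.5 * [ tr(S0^-1 S1) + (m0 - m1)' S0^-1 (m0 - m1) - k + log(det S0 / det S1) ]
kl_mvn <- function(z_mean, z_cov, skill_mean, inv_skill_cov, det_skill_cov) {
  k <- length(z_mean)
  d <- skill_mean - z_mean
  0.5 * (sum(diag(inv_skill_cov %*% z_cov)) +
         as.numeric(t(d) %*% inv_skill_cov %*% d) -
         k + log(det_skill_cov / det(z_cov)))
}

# Target distribution taken from the build_vae_correlated() example values
skill_cov <- matrix(c(.7, .3, .3, 1), nrow = 2, ncol = 2)
skill_mean <- c(-0.5, 0)

# KL divergence of a standard normal "encoder output" from that target
kl_example <- kl_mvn(c(0, 0), diag(2), skill_mean, solve(skill_cov), det(skill_cov))
```

Precomputing the inverse and determinant of the target covariance once, as the function signature suggests, avoids repeating those operations for every batch.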

A custom loss function for a VAE learning a standard normal distribution

Description

A custom loss function for a VAE learning a standard normal distribution

Usage

vae_loss_independent(encoder, kl_weight, rec_dim)

Arguments

`encoder`	the encoder model of the VAE, used to obtain z_mean and z_log_var from inputs
`kl_weight`	weight for the KL divergence term
`rec_dim`	the number of nodes in the input/output of the VAE

Value

returns a function whose parameters match keras loss format
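A VAE loss of this type is commonly the sum of a reconstruction term and a weighted KL term. The base R sketch below assumes a summed binary cross-entropy for reconstruction and the closed-form KL divergence from N(0, I); the package's exact scaling (for example by `rec_dim`) may differ, and `vae_loss_sketch` is a hypothetical name.

```r
# Hypothetical helper: reconstruction loss + kl_weight * KL( N(z_mean, diag(exp(z_log_var))) || N(0, I) )
vae_loss_sketch <- function(y_true, y_pred, z_mean, z_log_var, kl_weight = 1) {
  eps <- 1e-7
  y_pred <- pmin(pmax(y_pred, eps), 1 - eps)   # clip to avoid log(0)
  rec <- -sum(y_true * log(y_pred) + (1 - y_true) * log(1 - y_pred))
  kl <- -0.5 * sum(1 + z_log_var - z_mean^2 - exp(z_log_var))
  rec + kl_weight * kl
}

# One student's responses to 4 items, with made-up predictions and encoder outputs
loss_example <- vae_loss_sketch(y_true = c(1, 0, 1, 1),
                                y_pred = c(0.8, 0.3, 0.6, 0.9),
                                z_mean = c(0.2, -0.1),
                                z_log_var = c(-0.5, 0.1),
                                kl_weight = 0.1)
```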

Give error messages for invalid inputs in exported functions.

Description

Give error messages for invalid inputs in exported functions.

Usage

validate_inputs(
  num_items,
  num_skills,
  Q_matrix,
  model_type = 2,
  mean_vector = rep(0, num_skills),
  covariance_matrix = diag(num_skills),
  enc_hid_arch = c(ceiling((num_items + num_skills)/2)),
  hid_enc_activations = rep("sigmoid", length(enc_hid_arch)),
  output_activation = "sigmoid",
  kl_weight = 1,
  learning_rate = 0.001
)

Arguments

`num_items`	the number of items on the assessment; also the number of nodes in the input/output layers of the VAE
`num_skills`	the number of skills being evaluated; also the dimensionality of the distribution learned by the VAE
`Q_matrix`	a binary, `num_skills` by `num_items` matrix relating the assessment items with skills
`model_type`	either 1 or 2, specifying a 1 parameter (1PL) or 2 parameter (2PL) model
`mean_vector`	a vector of length `num_skills` specifying the mean of each latent trait
`covariance_matrix`	a symmetric, positive definite, `num_skills` by `num_skills` matrix giving the covariance of the latent traits
`enc_hid_arch`	a vector detailing the number and size of hidden layers in the encoder
`hid_enc_activations`	a vector specifying the activation function in each hidden layer in the encoder; must be the same length as `enc_hid_arch`
`output_activation`	a string specifying the activation function in the output of the decoder; the ML2P model always uses 'sigmoid'
`kl_weight`	an optional weight for the KL divergence term in the loss function
`learning_rate`	an optional parameter for the adam optimizer
