Title: | Bayesian Psychometric Measurement Using 'Stan' |
---|---|
Description: | Estimate diagnostic classification models (also called cognitive diagnostic models) with 'Stan'. Diagnostic classification models are confirmatory latent class models, as described by Rupp et al. (2010, ISBN: 978-1-60623-527-0). Automatically generate 'Stan' code for the general loglinear cognitive diagnostic model proposed by Henson et al. (2009) <doi:10.1007/s11336-008-9089-5> and other subtypes that introduce additional model constraints. Using the generated 'Stan' code, estimate the model and evaluate the model's performance using model fit indices, information criteria, and reliability metrics. |
Authors: | W. Jake Thompson [aut, cre] , Nathan Jones [ctb] , Matthew Johnson [cph] (Provided code adapted for reliability.measrdcm()), Paul-Christian Bürkner [cph] (Author of eval_silent()), University of Kansas [cph], Institute of Education Sciences [fnd] |
Maintainer: | W. Jake Thompson <[email protected]> |
License: | GPL (>= 3) |
Version: | 1.0.0.9000 |
Built: | 2024-11-10 03:18:39 UTC |
Source: | https://github.com/wjakethompson/measr |
Coerce objects to a measrfit
as_measrfit(x, class = character())

## Default S3 method:
as_measrfit(x, class = character())
x |
An object to be coerced to a |
class |
Additional classes to be added (e.g., |
An object of class measrfit.
measrfit, measrfit(), is_measrfit()
rstn_mdm_lcdm <- measr_dcm(
  data = mdm_data, missing = NA, qmatrix = mdm_qmatrix,
  resp_id = "respondent", item_id = "item", type = "lcdm",
  method = "optim", seed = 63277, backend = "rstan"
)

new_obj <- as_measrfit(rstn_mdm_lcdm, class = "measrdcm")
Combine multiple measrprior objects into one measrprior
## S3 method for class 'measrprior'
c(x, ..., replace = FALSE)
x |
A |
... |
Additional |
replace |
Should only unique priors be kept? If |
A measrprior object.
Given the number of attributes, generate all possible patterns of attribute mastery.
create_profiles(attributes)
attributes |
Positive integer. The number of attributes being measured. |
A tibble with all possible attribute mastery profiles. Each row is a profile, and each column indicates whether the attribute in that column was mastered (1) or not mastered (0). Thus, the tibble will have 2^attributes rows and attributes columns.
create_profiles(3L)

create_profiles(5)
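To make the 2^attributes row count concrete, the following base R sketch enumerates every mastery pattern with expand.grid(). The helper enumerate_profiles() and the column names att1, att2, ... are hypothetical names chosen for illustration. This is not measr's implementation, and the row order or column names returned by create_profiles() may differ.

```r
# Hypothetical helper (an illustration, not measr's own code): enumerate
# every 0/1 mastery pattern for a given number of attributes.
enumerate_profiles <- function(attributes) {
  stopifnot(
    length(attributes) == 1,
    attributes >= 1,
    attributes == round(attributes)
  )
  # One 0/1 factor per attribute; expand.grid() crosses them all
  grid <- expand.grid(rep(list(0:1), attributes))
  names(grid) <- paste0("att", seq_len(attributes))
  grid
}

profiles <- enumerate_profiles(3)

# Rows are profiles and columns are attributes
dim(profiles)
```

With three attributes, the result should have 2^3 = 8 rows and 3 columns, matching the dimensions described above.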
Default priors for diagnostic classification models
default_dcm_priors(type = "lcdm", attribute_structure = "unconstrained")
type |
Type of DCM to estimate. Must be one of lcdm, dina, dino, or crum. |
attribute_structure |
Structural model specification. Must be one of unconstrained or independent. |
A measrprior object.
default_dcm_priors(type = "lcdm")
This is data from the grammar section of the Examination for the Certificate of Proficiency in English (ECPE), administered annually by the English Language Institute at the University of Michigan. The data contain responses from 2,922 respondents to 28 questions, each of which asks the respondent to complete a sentence with the correct word. This data set has been used by Templin & Hoffman (2013) and Templin & Bradshaw (2014) to demonstrate the log-linear cognitive diagnosis model (LCDM) and the hierarchical diagnostic classification model (HDCM), respectively.
ecpe_data

ecpe_qmatrix
ecpe_data is a tibble containing ECPE response data with 2,922 rows and 29 variables.
resp_id: Respondent identifier
E1-E28: Dichotomous item responses to the 28 ECPE items
ecpe_qmatrix is a tibble that identifies which skills are measured by each ECPE item. This section of the ECPE contains 28 items measuring 3 skills. The ecpe_qmatrix correspondingly is made up of 28 rows and 4 variables.
item_id: Item identifier, corresponds to E1-E28 in ecpe_data
morphosyntactic, cohesive, and lexical: Dichotomous indicator for whether or not the skill is measured by each item. A value of 1 indicates the skill is measured by the item and a value of 0 indicates the skill is not measured by the item.
The skills correspond to knowledge of:
Morphosyntactic rules
Cohesive rules
Lexical rules
For more details, see Buck & Tatsuoka (1998) and Henson & Templin (2007).
Buck, G., & Tatsuoka, K. K. (1998). Application of the rule-space procedure to language testing: Examining attributes of a free response listening test. Language Testing, 15(2), 119-157. doi:10.1177/026553229801500201
Henson, R., & Templin, J. (2007, April). Large-scale language assessment using cognitive diagnosis models. Paper presented at the Annual meeting of the National Council on Measurement in Education, Chicago, IL.
Templin, J., & Hoffman, L. (2013). Obtaining diagnostic classification model estimates using Mplus. Educational Measurement: Issues and Practice, 32(2), 37-50. doi:10.1111/emip.12010
Templin, J., & Bradshaw, L. (2014). Hierarchical diagnostic classification models: A family of models for estimating and testing attribute hierarchies. Psychometrika, 79(2), 317-339. doi:10.1007/s11336-013-9362-0
For diagnostic classification models, the M2 statistic is calculated as described by Hansen et al. (2016) and Liu et al. (2016).
## S3 method for class 'measrdcm'
fit_m2(model, ..., ci = 0.9, force = FALSE)
model |
An estimated diagnostic classification model. |
... |
Unused, for extensibility. |
ci |
The confidence interval for the RMSEA. |
force |
If the M2 has already
been saved to the model object with |
A data frame created by dcm2::fit_m2().
fit_m2(measrdcm): M2 for diagnostic classification models.
Hansen, M., Cai, L., Monroe, S., & Li, Z. (2016). Limited-information goodness-of-fit testing of diagnostic classification item response models. British Journal of Mathematical and Statistical Psychology, 69(3), 225-252. doi:10.1111/bmsp.12074
Liu, Y., Tian, W., & Xin, T. (2016). An application of M2 statistic to evaluate the fit of cognitive diagnostic models. Journal of Educational and Behavioral Statistics, 41(1), 3-26. doi:10.3102/1076998615621293
rstn_mdm_lcdm <- measr_dcm(
  data = mdm_data, missing = NA, qmatrix = mdm_qmatrix,
  resp_id = "respondent", item_id = "item", type = "lcdm",
  method = "optim", seed = 63277, backend = "rstan"
)

fit_m2(rstn_mdm_lcdm)
For models estimated with method = "mcmc", use the posterior distributions to compute expected distributions for fit statistics and compare to values in the observed data.
fit_ppmc(
  model,
  ndraws = NULL,
  probs = c(0.025, 0.975),
  return_draws = 0,
  model_fit = c("raw_score"),
  item_fit = c("conditional_prob", "odds_ratio", "pvalue"),
  force = FALSE
)
model |
A measrfit object. |
ndraws |
The number of posterior draws to base the checks on. Must be
less than or equal to the total number of posterior draws retained in the
estimated model. If |
probs |
The percentiles to be computed by the |
return_draws |
Proportion of posterior draws for each specified fit
statistic to be returned. This does not affect the calculation of the
posterior predictive checks, but can be useful for visualizing the fit
statistics. For example, if |
model_fit |
The posterior predictive model checks to compute for an
evaluation of model-level fit. If |
item_fit |
The posterior predictive model checks to compute for an
evaluation of item-level fit. If |
force |
If all requested PPMCs have already been added to the model
object using |
Posterior predictive model checks (PPMCs) use the posterior distribution of an estimated model to compute different statistics. This creates an expected distribution of the given statistic, if our estimated parameters are correct. We then compute the statistic in our observed data and compare the observed value to the expected distribution. Observed values that fall outside of the expected distributions indicate incompatibility between the estimated model and the observed data.
We currently support PPMCs at the model and item level. At the model level, we calculate the expected raw score distribution (model_fit = "raw_score"), as described by Thompson (2019) and Park et al. (2015).
At the item level, we can calculate the conditional probability that a respondent in each class provides a correct response (item_fit = "conditional_prob") as described by Thompson (2019) and Sinharay & Almond (2007) or the overall proportion correct for an item (item_fit = "pvalue"), as described by Thompson (2019). We can also calculate the odds ratio for each pair of items (item_fit = "odds_ratio") as described by Park et al. (2015) and Sinharay et al. (2006).
A list with two elements, "model_fit" and "item_fit". If either model_fit = NULL or item_fit = NULL in the function call, this will be a one-element list, with the null criteria excluded. Each list element is itself a list with one element for each specified PPMC containing a tibble. For example, if item_fit = c("conditional_prob", "odds_ratio"), the "item_fit" element will be a list of length two, where each element is a tibble containing the results of the PPMC. All tibbles follow the same general structure:
obs_{ppmc}: The value of the relevant statistic in the observed data.
ppmc_mean: The mean of the ndraws posterior samples calculated for the given statistic.
Quantile columns: 1 column for each value of probs, providing the corresponding quantiles of the ndraws posterior samples calculated for the given statistic.
samples: A list column, where each element contains a vector of length (ndraws * return_draws), representing samples from the posterior distribution of the calculated statistic. This column is excluded if return_draws = 0.
ppp: The posterior predictive p-value. This is the proportion of posterior samples for the calculated statistic that are greater than the observed value. Values very close to 0 or 1 indicate incompatibility between the fitted model and the observed data.
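As a rough illustration of how the summary columns relate to one another, the base R sketch below computes a mean, quantiles, and a posterior predictive p-value from a vector of simulated values. The simulated draws, the observed value, and the helper summarize_ppmc() are all hypothetical stand-ins. In practice the draws come from the fitted model, and fit_ppmc() may organize the calculation differently.

```r
set.seed(1)

# Hypothetical stand-in for a fit statistic computed in each of 1,000
# posterior draws; real values would come from the estimated model.
ppmc_samples <- rnorm(1000, mean = 10, sd = 2)

# Hypothetical value of the same statistic in the observed data
obs_value <- 12.5

# Summarize the expected distribution and compare it to the observed value
summarize_ppmc <- function(samples, observed, probs = c(0.025, 0.975)) {
  list(
    ppmc_mean = mean(samples),
    quantiles = quantile(samples, probs = probs),
    # Proportion of posterior samples greater than the observed value
    ppp = mean(samples > observed)
  )
}

res <- summarize_ppmc(ppmc_samples, obs_value)
res$ppp
```

A ppp near 0.5 would suggest the observed value sits in the middle of the expected distribution, whereas a value near 0 or 1 places it in a tail.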
Park, J. Y., Johnson, M. S., & Lee, Y.-S. (2015). Posterior predictive model checks for cognitive diagnostic models. International Journal of Quantitative Research in Education, 2(3-4), 244-264. doi:10.1504/IJQRE.2015.071738
Sinharay, S., & Almond, R. G. (2007). Assessing fit of cognitive diagnostic models. Educational and Psychological Measurement, 67(2), 239-257. doi:10.1177/0013164406292025
Sinharay, S., Johnson, M. S., & Stern, H. S. (2006). Posterior predictive assessment of item response theory models. Applied Psychological Measurement, 30(4), 298-321. doi:10.1177/0146621605285517
Thompson, W. J. (2019). Bayesian psychometrics for diagnostic assessments: A proof of concept (Research Report No. 19-01). University of Kansas; Accessible Teaching, Learning, and Assessment Systems. doi:10.35542/osf.io/jzqs8
mdm_dina <- measr_dcm(
  data = mdm_data, missing = NA, qmatrix = mdm_qmatrix,
  resp_id = "respondent", item_id = "item", type = "dina",
  method = "mcmc", seed = 63277, backend = "rstan",
  iter = 700, warmup = 500, chains = 2, refresh = 0
)

fit_ppmc(mdm_dina, model_fit = "raw_score", item_fit = NULL)
When specifying prior distributions, it is often useful to see which parameters are included in a given model. Using the Q-matrix and type of diagnostic model to be estimated, we can create a list of all included parameters for which a prior can be specified.
get_parameters(
  qmatrix,
  item_id = NULL,
  rename_att = FALSE,
  rename_item = FALSE,
  type = c("lcdm", "dina", "dino", "crum"),
  attribute_structure = c("unconstrained", "independent")
)
qmatrix |
The Q-matrix. A data frame with 1 row per item and 1 column per attribute. All cells should be either 0 (item does not measure the attribute) or 1 (item does measure the attribute). |
item_id |
Optional. Variable name of a column in |
rename_att |
Should attribute names from the |
rename_item |
Should item names from the |
type |
Type of DCM to estimate. Must be one of lcdm, dina, dino, or crum. |
attribute_structure |
Structural model specification. Must be one of unconstrained or independent. |
A tibble with one row per parameter.
get_parameters(ecpe_qmatrix, item_id = "item_id", type = "lcdm")

get_parameters(ecpe_qmatrix, item_id = "item_id", type = "lcdm",
               rename_att = TRUE)
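To give a sense of how the Q-matrix determines the item parameters, the base R sketch below counts parameters per item for a fully saturated LCDM (all interactions retained). Under that assumption, an item measuring k attributes has one intercept, k main effects, and 2^k - k - 1 interactions. The toy Q-matrix and its column names are hypothetical, and this is not how get_parameters() is implemented; structural parameters are not counted here.

```r
# Hypothetical toy Q-matrix: 4 items measuring up to 3 attributes
toy_qmatrix <- data.frame(
  item = paste0("item", 1:4),
  att1 = c(1, 0, 1, 1),
  att2 = c(0, 1, 1, 1),
  att3 = c(0, 0, 0, 1)
)

# Number of attributes measured by each item (drop the identifier column)
n_measured <- unname(rowSums(toy_qmatrix[, -1]))

# Parameter counts for a saturated LCDM (assumes no limit on interactions)
item_params <- data.frame(
  item = toy_qmatrix$item,
  intercept = 1,
  maineffect = n_measured,
  interaction = 2^n_measured - n_measured - 1
)
item_params$total <- with(item_params, intercept + maineffect + interaction)

item_params
```

Each item's total equals 2^k, so items measuring more attributes contribute many more parameters that may need priors.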
Check if argument is a measrfit object
is_measrfit(x)
x |
An object to be checked |
A logical indicating if x is a measrfit object.
measrfit, measrfit(), as_measrfit()
rstn_mdm_lcdm <- measr_dcm(
  data = mdm_data, missing = NA, qmatrix = mdm_qmatrix,
  resp_id = "respondent", item_id = "item", type = "lcdm",
  method = "optim", seed = 63277, backend = "rstan"
)

is_measrfit(rstn_mdm_lcdm)
Check if argument is a measrprior object
is_measrprior(x)
x |
An object to be checked |
A logical indicating if x is a measrprior object.
prior1 <- prior(lognormal(0, 1), class = maineffect)
is_measrprior(prior1)

prior2 <- 3
is_measrprior(prior2)
The loglik_array() method for measrfit objects calculates the log-likelihood for an estimated model via the generated quantities functionality in Stan and returns the draws of the log_lik parameter.
loglik_array(model)

## S3 method for class 'measrdcm'
loglik_array(model)
model |
A measrfit object. |
A "draws_array" object containing the log-likelihood estimates for the model.
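For intuition about how pointwise log-likelihood draws feed an information criterion, the base R sketch below computes WAIC from a draws-by-observations matrix using the standard definition (log pointwise predictive density minus the summed posterior variance of the log-likelihood). The simulated matrix and the helper log_mean_exp() are hypothetical. In practice, the loo package performs these calculations, along with standard errors and diagnostics that are omitted here.

```r
set.seed(42)

# Hypothetical log-likelihood draws: 200 posterior draws (rows) for
# 5 observations (columns). Real values would come from the fitted model.
n_draws <- 200
n_obs <- 5
log_lik <- matrix(rnorm(n_draws * n_obs, mean = -1, sd = 0.1), nrow = n_draws)

# Numerically stable log of the mean of exp(x)
log_mean_exp <- function(x) {
  m <- max(x)
  m + log(mean(exp(x - m)))
}

# Pointwise log predictive density and effective number of parameters
lppd <- apply(log_lik, 2, log_mean_exp)
p_waic <- apply(log_lik, 2, var)

# Expected log pointwise predictive density and WAIC on the deviance scale
elpd_waic <- sum(lppd - p_waic)
waic <- -2 * elpd_waic
waic
```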
A loo::loo_compare() method that is customized for measrfit objects. See the loo package vignettes for details.
## S3 method for class 'measrfit'
loo_compare(x, ..., criterion = c("loo", "waic"), model_names = NULL)
x |
A measrfit object. |
... |
Additional objects of class measrfit. |
criterion |
The name of the criterion to be extracted from the measrfit object for comparison. |
model_names |
Names given to each provided model in the comparison
output. If |
The object returned by loo::loo_compare().
A loo::loo() method that is customized for measrfit objects. This is a simple wrapper around loo::loo.array(). See the loo package vignettes for details.
## S3 method for class 'measrfit'
loo(x, ..., r_eff = NA, force = FALSE)
x |
A measrfit object. |
... |
Additional arguments passed to |
r_eff |
Vector of relative effective sample size estimates for the
likelihood ( |
force |
If the LOO criterion has already been added to the model object
with |
The object returned by loo::loo.array().
This is a small data set of multiplication item responses. The data contain responses from 142 respondents to 4 items, each of which asks the respondent to complete an integer multiplication problem.
mdm_data

mdm_qmatrix
mdm_data is a tibble containing responses to multiplication items, as described in MacReady & Dayton (1977). There are 142 rows and 5 variables.
respondent: Respondent identifier
mdm1-mdm4: Dichotomous item responses to the 4 multiplication items
mdm_qmatrix is a tibble that identifies which skills are measured by each MDM item. This MDM data contains 4 items, all of which measure the skill of multiplication. The mdm_qmatrix correspondingly is made up of 4 rows and 2 variables.
item: Item identifier, corresponds to mdm1-mdm4 in mdm_data
multiplication: Dichotomous indicator for whether or not the multiplication skill is measured by each item. A value of 1 indicates the skill is measured by the item and a value of 0 indicates the skill is not measured by the item.
MacReady, G. B., & Dayton, C. M. (1977). The use of probabilistic models in the assessment of mastery. Journal of Educational Statistics, 2(2), 99-120. doi:10.2307/1164802
Estimate diagnostic classification models (DCMs; also known as cognitive diagnostic models) using 'Stan'. Models can be estimated using either Stan's optimizer or full Markov chain Monte Carlo (MCMC).
measr_dcm(
  data,
  missing = NA,
  qmatrix,
  resp_id = NULL,
  item_id = NULL,
  type = c("lcdm", "dina", "dino", "crum"),
  max_interaction = Inf,
  attribute_structure = c("unconstrained", "independent"),
  method = c("mcmc", "optim"),
  prior = NULL,
  backend = getOption("measr.backend", "rstan"),
  file = NULL,
  file_refit = getOption("measr.file_refit", "never"),
  ...
)
data |
Response data. A data frame with 1 row per respondent and 1 column per item. |
missing |
An R expression specifying how missing data in |
qmatrix |
The Q-matrix. A data frame with 1 row per item and 1 column per attribute. All cells should be either 0 (item does not measure the attribute) or 1 (item does measure the attribute). |
resp_id |
Optional. Variable name of a column in |
item_id |
Optional. Variable name of a column in |
type |
Type of DCM to estimate. Must be one of lcdm, dina, dino, or crum. |
max_interaction |
If |
attribute_structure |
Structural model specification. Must be one of unconstrained or independent. |
method |
Estimation method. Options are |
prior |
A measrprior object. If |
backend |
Character string naming the package to use as the backend for
fitting the Stan model. Options are |
file |
Either |
file_refit |
Controls when a saved model is refit. Options are
|
... |
Additional arguments passed to Stan.
|
A measrfit object.
rstn_mdm_lcdm <- measr_dcm(
  data = mdm_data, missing = NA, qmatrix = mdm_qmatrix,
  resp_id = "respondent", item_id = "item", type = "lcdm",
  method = "optim", seed = 63277, backend = "rstan"
)
Used for determining examples that shouldn't be run on CRAN, but can be run for the pkgdown website.
measr_examples()
A logical value indicating whether or not the examples should be run.
measr_examples()
Extract components of a measrfit object.
Extract components of an estimated diagnostic classification model
measr_extract(model, ...)

## S3 method for class 'measrdcm'
measr_extract(model, what, ...)
model |
The estimated model to extract information from. |
... |
Additional arguments passed to each extract method.
|
what |
Character string. The information to be extracted. See details for available options. |
For diagnostic classification models, we can extract the following information:
item_param: The estimated item parameters. This shows the name of the parameter, the class of the parameter, and the estimated value.
strc_param: The estimated structural parameters. This is the base rate of membership in each class. This shows the class pattern and the estimated proportion of respondents in each class.
prior: The priors used when estimating the model.
classes: The possible classes or profile patterns. This will show the class label (i.e., the pattern of proficiency) and the attributes included in each class.
class_prob: The probability that each respondent belongs to each class (i.e., has the given pattern of proficiency).
attribute_prob: The proficiency probability for each respondent and attribute.
m2: The M2 fit statistic. See fit_m2() for details. Model fit information must first be added to the model using add_fit().
rmsea: The root mean square error of approximation (RMSEA) fit statistic and associated confidence interval. See fit_m2() for details. Model fit information must first be added to the model using add_fit().
srmsr: The standardized root mean square residual (SRMSR) fit statistic. See fit_m2() for details. Model fit information must first be added to the model using add_fit().
ppmc_raw_score: The observed and posterior predicted chi-square statistic for the raw score distribution. See fit_ppmc() for details. Model fit information must first be added to the model using add_fit().
ppmc_conditional_prob: The observed and posterior predicted conditional probabilities of each class providing a correct response to each item. See fit_ppmc() for details. Model fit information must first be added to the model using add_fit().
ppmc_conditional_prob_flags: A subset of the PPMC conditional probabilities where the ppp is outside the specified ppmc_interval.
ppmc_odds_ratio: The observed and posterior predicted odds ratios of each item pair. See fit_ppmc() for details. Model fit information must first be added to the model using add_fit().
ppmc_odds_ratio_flags: A subset of the PPMC odds ratios where the ppp is outside the specified ppmc_interval.
ppmc_pvalue: The observed and posterior predicted proportion of correct responses to each item. See fit_ppmc() for details.
ppmc_pvalue_flags: A subset of the PPMC proportion correct values where the ppp is outside the specified ppmc_interval.
loo: The leave-one-out cross validation results. See loo::loo() for details. The information criterion must first be added to the model using add_criterion().
waic: The widely applicable information criterion results. See loo::waic() for details. The information criterion must first be added to the model using add_criterion().
pattern_reliability: The accuracy and consistency of the overall attribute profile classification, as described by Cui et al. (2012). Reliability information must first be added to the model using add_reliability().
classification_reliability: The classification accuracy and consistency for each attribute, using the metrics described by Johnson & Sinharay (2018). Reliability information must first be added to the model using add_reliability().
probability_reliability: Reliability estimates for the probability of proficiency on each attribute, as described by Johnson & Sinharay (2020). Reliability information must first be added to the model using add_reliability().
The extracted information. The specific structure will vary depending on what is being extracted, but usually the returned object is a tibble with the requested information.
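The relationship between class-level and attribute-level probabilities can be sketched in base R. Assuming the usual marginalization, a respondent's probability of proficiency on an attribute is the sum of their class probabilities over every class in which that attribute is mastered. The profile matrix, respondent labels, and probabilities below are hypothetical, and the tibbles returned by measr_extract() are structured differently.

```r
# Hypothetical profile matrix for 2 attributes: one row per class,
# in the order (0,0), (1,0), (0,1), (1,1)
profiles <- as.matrix(expand.grid(att1 = 0:1, att2 = 0:1))

# Hypothetical class membership probabilities for two respondents;
# columns follow the row order of `profiles`
class_prob <- rbind(
  resp1 = c(0.70, 0.10, 0.15, 0.05),
  resp2 = c(0.05, 0.25, 0.10, 0.60)
)

# Each respondent's class probabilities should sum to 1
stopifnot(all(abs(rowSums(class_prob) - 1) < 1e-8))

# Marginalize over classes: sum the probabilities of the classes in which
# each attribute is mastered
attribute_prob <- class_prob %*% profiles
attribute_prob
```

For resp1, the att1 probability is 0.10 + 0.05 = 0.15, the total probability of the two classes in which att1 is mastered.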
measr_extract(measrdcm): Extract components of an estimated diagnostic classification model.
Cui, Y., Gierl, M. J., & Chang, H.-H. (2012). Estimating classification consistency and accuracy for cognitive diagnostic assessment. Journal of Educational Measurement, 49(1), 19-38. doi:10.1111/j.1745-3984.2011.00158.x
Johnson, M. S., & Sinharay, S. (2018). Measures of agreement to assess attribute-level classification accuracy and consistency for cognitive diagnostic assessments. Journal of Educational Measurement, 55(4), 635-664. doi:10.1111/jedm.12196
Johnson, M. S., & Sinharay, S. (2020). The reliability of the posterior probability of skill attainment in diagnostic classification models. Journal of Educational and Behavioral Statistics, 45(1), 5-31. doi:10.3102/1076998619864550
Templin, J., & Bradshaw, L. (2013). Measuring the reliability of diagnostic classification model examinee estimates. Journal of Classification, 30(2), 251-275. doi:10.1007/s00357-013-9129-4
rstn_mdm_lcdm <- measr_dcm(
  data = mdm_data, missing = NA, qmatrix = mdm_qmatrix,
  resp_id = "respondent", item_id = "item", type = "lcdm",
  method = "optim", seed = 63277, backend = "rstan"
)

measr_extract(rstn_mdm_lcdm, "strc_param")
measrfit object
Models fitted with measr are represented as a measrfit object. If a model is estimated with Stan, but not measr, a measrfit object can be created in order to access other functionality in measr (e.g., model fit, reliability).
measrfit(
  data = list(),
  type = character(),
  prior = default_dcm_priors(type = type),
  stancode = character(),
  method = character(),
  algorithm = character(),
  backend = character(),
  model = NULL,
  respondent_estimates = list(),
  fit = list(),
  criteria = list(),
  reliability = list(),
  file = NULL,
  version = list(),
  class = character()
)
data |
The data and Q-matrix used to estimate the model. |
type |
The type of DCM that was estimated. |
prior |
A measrprior object containing information on the priors used in the model. |
stancode |
The model code in Stan language. |
method |
The method used to fit the model. |
algorithm |
The name of the algorithm used to fit the model. |
backend |
The name of the backend used to fit the model. |
model |
The fitted Stan model. This will be an object of class rstan::stanfit if |
respondent_estimates |
An empty list for adding estimated person parameters after fitting the model. |
fit |
An empty list for adding model fit information after fitting the model. |
criteria |
An empty list for adding information criteria after fitting the model. |
reliability |
An empty list for adding reliability information after fitting the model. |
file |
Optional name of the file that the model object was saved to or loaded from. |
version |
The versions of measr, Stan, rstan and/or cmdstanr that were used to fit the model. |
class |
Additional classes to be added (e.g., |
A measrfit object.
measrfit, as_measrfit(), is_measrfit()
rstn_mdm_lcdm <- measr_dcm(
  data = mdm_data, missing = NA, qmatrix = mdm_qmatrix,
  resp_id = "respondent", item_id = "item", type = "lcdm",
  method = "optim", seed = 63277, backend = "rstan"
)

new_obj <- measrfit(
  data = rstn_mdm_lcdm$data,
  type = rstn_mdm_lcdm$type,
  prior = rstn_mdm_lcdm$prior,
  stancode = rstn_mdm_lcdm$stancode,
  method = rstn_mdm_lcdm$method,
  algorithm = rstn_mdm_lcdm$algorithm,
  backend = rstn_mdm_lcdm$backend,
  model = rstn_mdm_lcdm$model,
  respondent_estimates = rstn_mdm_lcdm$respondent_estimates,
  fit = rstn_mdm_lcdm$fit,
  criteria = rstn_mdm_lcdm$criteria,
  reliability = rstn_mdm_lcdm$reliability,
  file = rstn_mdm_lcdm$file,
  version = rstn_mdm_lcdm$version,
  class = "measrdcm"
)
Class measrfit of models fitted with the measr package
Models fitted with the measr package are represented as a measrfit object, which contains the posterior draws, Stan code, priors, and other relevant information.
data
The data and Q-matrix used to estimate the model.
type
The type of DCM that was estimated.
prior
A measrprior object containing information on the priors used in the model.
stancode
The model code in Stan language.
method
The method used to fit the model.
algorithm
The name of the algorithm used to fit the model.
backend
The name of the backend used to fit the model.
model
The fitted Stan model. This will be an object of class rstan::stanfit if backend = "rstan" and CmdStanMCMC if backend = "cmdstanr" was specified when fitting the model.
respondent_estimates
An empty list for adding estimated person parameters after fitting the model.
fit
An empty list for adding model fit information after fitting the model.
criteria
An empty list for adding information criteria after fitting the model.
reliability
An empty list for adding reliability information after fitting the model.
file
Optional name of the file that the model object was saved to or loaded from.
version
The versions of measr, Stan, rstan and/or cmdstanr that were used to fit the model.
measrfit(), as_measrfit(), is_measrfit()
Create prior definitions for classes of parameters, or specific parameters.
measrprior(
  prior,
  class = c("structural", "intercept", "maineffect", "interaction", "slip", "guess"),
  coef = NA,
  lb = NA,
  ub = NA
)

prior(prior, ...)

prior_(prior, ...)

prior_string(prior, ...)
prior |
A character string defining a distribution in Stan language. A list of all distributions supported by Stan can be found in Stan Language Functions Reference at https://mc-stan.org/users/documentation/. |
class |
The parameter class. Defaults to |
coef |
Name of a specific parameter within the defined class. If not defined, the prior is applied to all parameters within the class. |
lb |
Lower bound for parameter restriction. Defaults to no restriction. |
ub |
Upper bound for parameter restriction. Defaults to no restriction. |
... |
Additional arguments passed to |
A tibble of class measrprior.
prior(): Alias of measrprior() which allows arguments to be specified as expressions without quotation marks.
prior_(): Alias of measrprior() which allows arguments to be specified as one-sided formulas or wrapped in base::quote().
prior_string(): Alias of measrprior() which allows arguments to be specified as character strings.
# Use alias functions to define priors without quotes, as formulas,
# or as character strings.
(prior1 <- prior(lognormal(0, 1), class = maineffect))

(prior2 <- prior_(~lognormal(0, 1), class = ~maineffect))

(prior3 <- prior_string("lognormal(0, 1)", class = "maineffect"))

identical(prior1, prior2)
identical(prior1, prior3)
identical(prior2, prior3)

# Define a prior for an entire class of parameters
prior(beta(5, 25), class = "slip")

# Or for a specific item (e.g., just the slipping parameter for item 7)
prior(beta(5, 25), class = "slip", coef = "slip[7]")
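When choosing a prior such as beta(5, 25) for a slipping parameter, it can help to check what the distribution implies before estimating the model. The following is a minimal base R sketch, not part of measr: it only uses the standard Beta mean formula and qbeta(), and the variable names are illustrative.

```r
# Illustrative only: summarise what a beta(5, 25) prior implies for a
# slipping probability. Uses base R; no measr functions are called.
shape1 <- 5
shape2 <- 25

# Mean of a Beta(a, b) distribution is a / (a + b).
prior_mean <- shape1 / (shape1 + shape2)

# Central 95% prior interval from the Beta quantile function.
prior_interval <- qbeta(c(0.025, 0.975), shape1, shape2)

round(prior_mean, 3)
round(prior_interval, 3)
```

A prior mean of 5 / 30 places most of the prior mass on small slipping probabilities, which is one way to judge whether the shape parameters match expectations for an item.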
Add model evaluation metrics to fitted model objects. These functions are wrappers around other functions that compute the metrics. The benefit of using these wrappers is that the model evaluation metrics are saved as part of the model object so that time-intensive calculations do not need to be repeated. See Details for specifics.
add_criterion(
  x,
  criterion = c("loo", "waic"),
  overwrite = FALSE,
  save = TRUE,
  ...,
  r_eff = NA
)

add_reliability(x, overwrite = FALSE, save = TRUE)

add_fit(
  x,
  method = c("m2", "ppmc"),
  overwrite = FALSE,
  save = TRUE,
  ...,
  ci = 0.9
)

add_respondent_estimates(
  x,
  probs = c(0.025, 0.975),
  overwrite = FALSE,
  save = TRUE
)
x |
A measrfit object. |
criterion |
A vector of criteria to calculate and add to the model object. |
overwrite |
Logical. Indicates whether specified elements that have
already been added to the estimated model should be overwritten. Default is
|
save |
Logical. Only relevant if a file was specified in the
measrfit object passed to |
... |
Additional arguments passed to relevant methods. See Details. |
r_eff |
Vector of relative effective sample size estimates for the
likelihood ( |
method |
A vector of model fit methods to evaluate and add to the model object. |
ci |
The confidence interval for the RMSEA, computed from the M2 |
probs |
The percentiles to be computed by the |
For add_respondent_estimates(), estimated person parameters are added to the $respondent_estimates element of the fitted model.
For add_fit(), model and item fit information are added to the $fit element of the fitted model. This function wraps fit_m2() to calculate the M2 statistic (Hansen et al., 2016; Liu et al., 2016) and/or fit_ppmc() to calculate posterior predictive model checks (Park et al., 2015; Sinharay & Almond, 2007; Sinharay et al., 2006; Thompson, 2019), depending on which methods are specified. Additional arguments supplied to ... are passed to fit_ppmc().
For add_criterion(), relative fit criteria are added to the $criteria element of the fitted model. This function wraps loo() and/or waic(), depending on which criteria are specified, to calculate the leave-one-out (LOO; Vehtari et al., 2017) and/or widely applicable information criteria (WAIC; Watanabe, 2010) for fitted model objects. Additional arguments supplied to ... are passed to loo::loo.array() or loo::waic.array().
For add_reliability(), reliability information is added to the $reliability element of the fitted model. Pattern-level reliability is described by Cui et al. (2012). Classification reliability and posterior probability reliability are described by Johnson & Sinharay (2018, 2020), respectively. This function wraps reliability().
A modified measrfit object with the corresponding slot populated with the specified information.
Cui, Y., Gierl, M. J., & Chang, H.-H. (2012). Estimating classification consistency and accuracy for cognitive diagnostic assessment. Journal of Educational Measurement, 49(1), 19-38. doi:10.1111/j.1745-3984.2011.00158.x
Hansen, M., Cai, L., Monroe, S., & Li, Z. (2016). Limited-information goodness-of-fit testing of diagnostic classification item response models. British Journal of Mathematical and Statistical Psychology, 69(3), 225-252. doi:10.1111/bmsp.12074
Johnson, M. S., & Sinharay, S. (2018). Measures of agreement to assess attribute-level classification accuracy and consistency for cognitive diagnostic assessments. Journal of Educational Measurement, 55(4), 635-664. doi:10.1111/jedm.12196
Johnson, M. S., & Sinharay, S. (2020). The reliability of the posterior probability of skill attainment in diagnostic classification models. Journal of Educational and Behavioral Statistics, 45(1), 5-31. doi:10.3102/1076998619864550
Liu, Y., Tian, W., & Xin, T. (2016). An application of M2 statistic to evaluate the fit of cognitive diagnostic models. Journal of Educational and Behavioral Statistics, 41(1), 3-26. doi:10.3102/1076998615621293
Park, J. Y., Johnson, M. S., & Lee, Y.-S. (2015). Posterior predictive model checks for cognitive diagnostic models. International Journal of Quantitative Research in Education, 2(3-4), 244-264. doi:10.1504/IJQRE.2015.071738
Sinharay, S., & Almond, R. G. (2007). Assessing fit of cognitive diagnostic models. Educational and Psychological Measurement, 67(2), 239-257. doi:10.1177/0013164406292025
Sinharay, S., Johnson, M. S., & Stern, H. S. (2006). Posterior predictive assessment of item response theory models. Applied Psychological Measurement, 30(4), 298-321. doi:10.1177/0146621605285517
Thompson, W. J. (2019). Bayesian psychometrics for diagnostic assessments: A proof of concept (Research Report No. 19-01). University of Kansas; Accessible Teaching, Learning, and Assessment Systems. doi:10.35542/osf.io/jzqs8
Vehtari, A., Gelman, A., & Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing, 27(5), 1413-1432. doi:10.1007/s11222-016-9696-4
Watanabe, S. (2010). Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. Journal of Machine Learning Research, 11(116), 3571-3594. https://jmlr.org/papers/v11/watanabe10a.html
cmds_mdm_dina <- measr_dcm(
  data = mdm_data, missing = NA, qmatrix = mdm_qmatrix,
  resp_id = "respondent", item_id = "item", type = "dina",
  method = "optim", seed = 63277, backend = "rstan",
  prior = c(prior(beta(5, 17), class = "slip"),
            prior(beta(5, 17), class = "guess"))
)

cmds_mdm_dina <- add_reliability(cmds_mdm_dina)
cmds_mdm_dina <- add_fit(cmds_mdm_dina, method = "m2")
cmds_mdm_dina <- add_respondent_estimates(cmds_mdm_dina)
Calculate posterior draws of respondent proficiency. Optionally retain all posterior draws or return only summaries of the distribution for each respondent.
## S3 method for class 'measrdcm'
predict(
  object,
  newdata = NULL,
  resp_id = NULL,
  missing = NA,
  summary = TRUE,
  probs = c(0.025, 0.975),
  force = FALSE,
  ...
)
object |
An object of class |
newdata |
Optional new data. If not provided, the data used to estimate
the model is scored. If provided, |
resp_id |
Optional. Variable name of a column in |
missing |
An R expression specifying how missing data in |
summary |
Should summary statistics be returned instead of the raw
posterior draws? Only relevant if the model was estimated with
|
probs |
The percentiles to be computed by the |
force |
If respondent estimates have already been added to the model
object with |
... |
Unused. |
A list with two elements: class_probabilities and attribute_probabilities.
If summary is FALSE, each element is a tibble with the number of rows equal to the number of draws in object with columns: .chain, .iteration, .draw, the respondent identifier, and one column of probabilities for each of the possible classes.
If summary is TRUE, each element is a tibble with one row per respondent and class or attribute, and columns of the respondent identifier, class or attribute, mean, and one column for every value specified in probs.
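To make the summary = TRUE layout concrete, the sketch below reduces a small matrix of made-up posterior draws to one row per respondent with a mean and one column per value in probs. It uses base R only and is not the measr implementation; the helper summarise_draws() and the data are hypothetical.

```r
# Hypothetical posterior draws of one attribute's mastery probability for
# three respondents (rows = draws, columns = respondents). Values are made up.
draws <- cbind(
  resp_1 = c(0.10, 0.15, 0.20, 0.25, 0.30),
  resp_2 = c(0.80, 0.85, 0.90, 0.95, 0.90),
  resp_3 = c(0.40, 0.50, 0.60, 0.50, 0.50)
)
probs <- c(0.025, 0.975)

# One row per respondent: posterior mean plus one column per requested quantile.
summarise_draws <- function(draws, probs) {
  q <- apply(draws, 2, stats::quantile, probs = probs, names = FALSE)
  q <- matrix(q, nrow = length(probs))  # keep a matrix even for one quantile
  out <- data.frame(
    resp_id = colnames(draws),
    mean = unname(colMeans(draws))
  )
  for (i in seq_along(probs)) {
    out[[paste0(probs[i] * 100, "%")]] <- q[i, ]
  }
  out
}

summary_tbl <- summarise_draws(draws, probs)
summary_tbl
```

With summary = FALSE the analogous output would keep every row of draws instead of collapsing them, which is why that form has as many rows as there are posterior draws.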
For diagnostic classification models, reliability can be estimated at the pattern or attribute level. Pattern-level reliability represents the classification consistency and accuracy of placing students into an overall mastery profile. Rather than an overall profile, attributes can also be scored individually. In this case, classification consistency and accuracy should be evaluated for each individual attribute, rather than the overall profile. This is referred to as the maximum a posteriori (MAP) reliability. Finally, it may be desirable to report results as the probability of proficiency or mastery on each attribute instead of a proficient/not proficient classification. In this case, the reliability of the posterior probability should be reported. This is the expected a posteriori (EAP) reliability.
reliability(model, ...)

## S3 method for class 'measrdcm'
reliability(model, ..., threshold = 0.5, force = FALSE)
model |
The estimated model to be evaluated. |
... |
Unused. For future extensions. |
threshold |
For |
force |
If reliability information has already been added to the model
object with |
The pattern-level reliability (pattern_reliability) statistics are described in Cui et al. (2012). Attribute-level classification reliability statistics (map_reliability) are described in Johnson & Sinharay (2018). Reliability statistics for the posterior mean of the skill indicators (i.e., the mastery or proficiency probabilities; eap_reliability) are described in Johnson & Sinharay (2020).
For class measrdcm, a list with 3 elements:
pattern_reliability: The pattern-level accuracy (p_a) and consistency (p_c) described by Cui et al. (2012).
map_reliability: A list with 2 elements: accuracy and consistency, which include the attribute-level classification reliability statistics described by Johnson & Sinharay (2018).
eap_reliability: The attribute-level posterior probability reliability statistics described by Johnson & Sinharay (2020).
reliability(measrdcm): Reliability measures for diagnostic classification models.
Cui, Y., Gierl, M. J., & Chang, H.-H. (2012). Estimating classification consistency and accuracy for cognitive diagnostic assessment. Journal of Educational Measurement, 49(1), 19-38. doi:10.1111/j.1745-3984.2011.00158.x
Johnson, M. S., & Sinharay, S. (2018). Measures of agreement to assess attribute-level classification accuracy and consistency for cognitive diagnostic assessments. Journal of Educational Measurement, 55(4), 635-664. doi:10.1111/jedm.12196
Johnson, M. S., & Sinharay, S. (2020). The reliability of the posterior probability of skill attainment in diagnostic classification models. Journal of Educational and Behavioral Statistics, 45(1), 5-31. doi:10.3102/1076998619864550
rstn_mdm_lcdm <- measr_dcm(
  data = mdm_data, missing = NA, qmatrix = mdm_qmatrix,
  resp_id = "respondent", item_id = "item", type = "lcdm",
  method = "optim", seed = 63277, backend = "rstan"
)

reliability(rstn_mdm_lcdm)
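The difference between reporting attribute classifications and reporting attribute probabilities can be sketched with a few made-up values. The example below assumes that threshold acts as a cutoff on attribute-level probabilities and that a probability at or above the cutoff counts as proficient; both are assumptions for illustration, and classify() is a hypothetical helper rather than a measr function.

```r
# Hypothetical attribute-level mastery probabilities (made-up values).
attr_probs <- rbind(
  resp_1 = c(att1 = 0.92, att2 = 0.48, att3 = 0.10),
  resp_2 = c(att1 = 0.50, att2 = 0.75, att3 = 0.33)
)

# Assumed rule: classify as proficient (1) when probability >= threshold.
classify <- function(probs, threshold = 0.5) {
  (probs >= threshold) * 1L
}

# Binary classifications at the default cutoff of 0.5.
classify(attr_probs)

# A stricter cutoff reclassifies borderline respondents as not proficient.
classify(attr_probs, threshold = 0.8)
```

Classification reliability concerns the 0/1 matrix, whereas posterior probability reliability concerns the probabilities in attr_probs themselves.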
A loo::waic() method that is customized for measrfit objects. This is a simple wrapper around loo::waic.array(). See the loo package vignettes for details.
## S3 method for class 'measrfit'
waic(x, ..., force = FALSE)
x |
A measrfit object. |
... |
Additional arguments passed to |
force |
If the WAIC criterion has already been added to the model object
with |
The object returned by loo::waic.array().
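For readers who want to see what WAIC measures, the following base R sketch computes it from a pointwise log-likelihood matrix using the formula in Vehtari et al. (2017). It is an illustration under simplifying assumptions, not what loo::waic.array() or measr runs internally: the data are simulated, waic_sketch() is a hypothetical helper, and it returns no standard errors or diagnostics.

```r
# Hypothetical pointwise log-likelihood matrix:
# rows = posterior draws, columns = observations. Values are simulated.
set.seed(63277)
log_lik <- matrix(rnorm(200 * 10, mean = -1, sd = 0.1), nrow = 200, ncol = 10)

waic_sketch <- function(log_lik) {
  # Log pointwise predictive density: log of the mean likelihood per observation.
  lppd <- log(colMeans(exp(log_lik)))
  # Effective number of parameters: posterior variance of the log-likelihood.
  p_waic <- apply(log_lik, 2, stats::var)
  elpd_waic <- sum(lppd - p_waic)
  c(elpd_waic = elpd_waic, p_waic = sum(p_waic), waic = -2 * elpd_waic)
}

waic_sketch(log_lik)
```

Calling exp() directly on log-likelihoods can underflow for very negative values, so this sketch is only suitable for small, well-scaled examples like the one above.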