Continuous Norming with R

Home Page: https://www.psychometrica.de/cNorm_en.html

License: GNU Affero General Public License v3.0

cNORM

The cNORM package provides robust methods for generating continuous standard scores, for example in psychometric test development, biometrics (e.g., biological and physiological growth curves), and medical screenings. Based on the approach by A. Lenhard et al. (2016, 2019), it addresses common issues in conventional norming methods. For an in-depth tutorial, visit the project homepage at https://www.psychometrica.de/cNorm_en.html, or try the online demonstration at https://cnorm.shinyapps.io/cNORM/.

Approach

Conventional methods for producing test norms often suffer from "jumps" or "gaps" (i.e., discontinuities) in norm tables and from low confidence when assessing extreme scores. cNORM addresses these problems without requiring assumptions about the distribution of the raw data: the standard scores are established from the raw data by modeling the raw scores as a function of both percentile scores and an explanatory variable (e.g., age) through Taylor polynomials. The method minimizes bias arising from sampling and measurement error while handling marked deviations from normality, such as are commonplace in clinical samples. It includes procedures for post-stratification of norm samples to overcome bias in data collection and to mitigate violations of representativeness. In contrast to parametric approaches, it does not rely on distributional assumptions about the initial norm data and is therefore a robust approach for generating norm tables.

The rationale of the approach is to model the relationship between location (norm score), age, and raw score via multiple regression and to fit a three-dimensional hyperplane. This hyperplane is used to close all gaps and to compute continuous norm scores.
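The idea can be illustrated with a minimal base-R sketch. It is not cNORM's internal code: the simulated sample, the ranking formula, and the small set of polynomial terms are assumptions chosen for illustration only.

```r
# Base-R sketch of the modelling idea; not cNORM's internal code.
set.seed(1)

# Hypothetical norm sample: 4 age groups, raw scores rising with age
n_per_group <- 200
age <- rep(c(2, 3, 4, 5), each = n_per_group)
raw <- round(10 * age + rnorm(length(age), sd = 5))

# Step 1: rank raw scores within each age group and convert the
# percentiles to a location L on the T-score scale (M = 50, SD = 10)
pct <- ave(raw, age, FUN = function(x) (rank(x) - 0.5) / length(x))
L <- 50 + 10 * qnorm(pct)

# Step 2: regress the raw score on powers of L and age A and
# their interaction (only a few low-order terms are used here)
A <- age
fit <- lm(raw ~ L + A + I(L * A) + I(L^2) + I(A^2))

# Step 3: the fitted surface yields an expected raw score for any
# combination of norm score and age, including ages between groups
expected_raw <- predict(fit, newdata = data.frame(L = 50, A = 3.5))
expected_raw
```

Because the surface is continuous in both L and A, a prediction is available for age 3.5 although no participant of that age was sampled.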

Installation

cNORM can be installed via

install.packages("cNORM", dependencies = TRUE)

Alternatively, you can download a precompiled version or install the development version from GitHub via

install.packages("devtools")
library(devtools)

devtools::install_github("WLenhard/cNORM")
library(cNORM)

Please report errors. Suggestions for improvement are always welcome!

Example

Conducting the analysis consists of the following steps:

  1. Data preparation and establishing the regression model
  2. Validating the model
  3. Generating norm tables and plotting the results

cNORM offers functions for selecting the best-fitting model and for generating the norm tables.

## In a nutshell:
## Basic example code for modeling the sample dataset
library(cNORM)

# Start the graphical user interface (needs shiny installed)
# The GUI includes the most important functions. For specific cases,
# please use cNORM on the console.
cNORM.GUI()

# Easy start: Conventional norming for one group without continuum over age
# with the inbuilt elfe dataset.
cnorm(raw = elfe$raw)

# Rank data within group and compute powers and interactions for the internal 
# dataset 'elfe' and compute model. The resulting object includes the ranked 
# data via object$data and model via object$model.
cnorm.elfe <- cnorm(raw = elfe$raw, group = elfe$group)

# Plot R2 of different model solutions as a function of the number of predictors
plot(cnorm.elfe, "subset", type=0)        # plot R2
plot(cnorm.elfe, "subset", type=3)        # plot MSE

# NOTE: At this point, you usually select a well-fitting model and rerun the process
# with a fixed number of terms, e.g. four. Avoid models with a high number of terms:
cnorm.elfe <- cnorm(raw = elfe$raw, group = elfe$group, terms = 4)

# By default, the power parameters are set to k = 5 and t = 3. You can choose values up
# to 6, but higher values can lead to overfitting. In case of overfitting, please reduce
# these values. If only k is specified, cNORM uses this value for both k and t.
# In the following example, the distribution per age is modeled with power parameter
# k = 3 (cubic), while the trajectory over age is only quadratic (t = 2).
cnorm.elfe <- cnorm(raw = elfe$raw, group = elfe$group, k = 3, t = 2)

#  Visual inspection of the percentile curves of the fitted model
plot(cnorm.elfe, "percentiles")

# Visual inspection of the observed and fitted raw and norm scores
plot(cnorm.elfe, "norm")
plot(cnorm.elfe, "raw")
plot(cnorm.elfe, "raw", group = "group") # show fit per grouping variable

# To check how other models perform, plot a series of percentile plots with an ascending
# number of predictors, in this example up to 14 predictors.
plot(cnorm.elfe, "series", end=14)

# Cross-validation of the number of terms with 20% of the data for validation and 80% for
# training. Because this is time-consuming, the maximum number of terms is restricted to 10
# in this example, with 3 repetitions.
cnorm.cv(cnorm.elfe$data, max=10, repetitions=3)

# Cross validation with pre-specified terms, e. g. of an already existing model
cnorm.cv(cnorm.elfe, repetitions=3)

# Print norm table (for grade 3, 3.2, 3.4, 3.6)
normTable(c(3, 3.2, 3.4, 3.6), cnorm.elfe)

# The other way round: Print raw table (for grade 3) together with 90% confidence intervals
# for a test with a reliability of .94
rawTable(3, cnorm.elfe, CI = .9, reliability = .94)

# Get the predicted norm scores for a vector of raw scores and explanatory variable, e. g. age
predicted <- predictNorm(elfe$raw, elfe$group, cnorm.elfe)

# In case of unbalanced datasets deviating from the census, the norm data
# can be weighted by means of raking / post-stratification. Please generate
# the weights with the computeWeights() function and pass them as the weights
# parameter. For computing the weights, please specify a data.frame with the
# population margins (further information is available in the documentation of
# the computeWeights function). A demonstration based on sex and migration status
# in vocabulary development (ppvt dataset; Gary et al., 2023a, 2023b):
margins <- data.frame(variables = c("sex", "sex",
                                    "migration", "migration"),
                      levels = c(1, 2, 0, 1),
                      share = c(.52, .48, .7, .3))
weights <- computeWeights(ppvt, margins)
model <- cnorm(raw = ppvt$raw, group=ppvt$group, weights = weights)


# Open the vignettes for a complete walkthrough
vignette("cNORM-Demo", package = "cNORM")
vignette("WeightedRegression", package = "cNORM")
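The raking step mentioned in the example above can be illustrated without cNORM. The following base-R sketch uses iterative proportional fitting on a made-up sample; the data, the variable `sample_df`, and the helper `weighted_share()` are hypothetical, and the code is not the implementation behind computeWeights().

```r
# Base-R sketch of raking (iterative proportional fitting).
# Sample and margins are made up; this is not computeWeights().
set.seed(2)
sample_df <- data.frame(
  sex       = sample(c(1, 2), 500, replace = TRUE, prob = c(.40, .60)),
  migration = sample(c(0, 1), 500, replace = TRUE, prob = c(.85, .15))
)

# Target population shares per variable (same margins as in the example)
targets <- list(
  sex       = c("1" = .52, "2" = .48),
  migration = c("0" = .70, "1" = .30)
)

# Hypothetical helper: weighted share of each level of a grouping variable
weighted_share <- function(w, g) sapply(split(w, g), sum) / sum(w)

w <- rep(1, nrow(sample_df))
for (iteration in 1:50) {
  for (v in names(targets)) {
    current <- weighted_share(w, sample_df[[v]])
    # Scale the weights so the weighted shares match the target margin
    ratio <- targets[[v]][names(current)] / current
    w <- w * unname(ratio[as.character(sample_df[[v]])])
  }
}
w <- w / mean(w)   # normalise to a mean weight of 1

# Weighted shares after raking
sex_share <- weighted_share(w, sample_df$sex)
migration_share <- weighted_share(w, sample_df$migration)
sex_share
migration_share
```

Each pass adjusts one margin at a time, so the margins are matched jointly only after several iterations.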

cNORM offers functions to choose the optimal model, both by visual inspection of the percentiles and by information criteria and model tests.

In this example, a Taylor polynomial with power k = 4 was computed in order to model a sample of the ELFE 1-6 reading comprehension test (sentence completion task; W. Lenhard & Schneider, 2006). In the plot, you can see the share of variance explained by the different models (with an increasing number of predictors). Adjusted R2, Mallows's Cp (an AIC-like measure) and BIC are used (BIC is available through the option type = 2). The predefined adjusted R2 value of .99 is already reached with the third model, and afterwards there are only minor improvements in adjusted R2. On the other hand, Cp declines rapidly afterwards, so model 3 seems to be a good candidate in terms of the relative information content per predictor and the captured information (adjusted R2). It is advisable to choose a model at the "elbow" in order to avoid overfitting, but the solution should be tested for violations of model assumptions, and the progression of the percentiles should be inspected visually as well.
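How such criteria behave across models of increasing size can be tried out in base R. The sketch below is an assumption-laden illustration rather than cNORM's procedure: the data are simulated, the candidate models form one fixed nested sequence, and only adjusted R2 and BIC are computed.

```r
# Base-R sketch: compare nested polynomial models by adjusted R2 and BIC.
# The data are simulated and the model sequence is fixed by hand.
set.seed(3)
A <- rep(2:5, each = 150)                      # hypothetical age groups
L <- rnorm(length(A), mean = 50, sd = 10)      # hypothetical norm scores
raw <- 0.4 * L + 8 * A + 0.05 * L * A + rnorm(length(A), sd = 1)

candidates <- list(
  raw ~ L,
  raw ~ L + A,
  raw ~ L + A + I(L * A),
  raw ~ L + A + I(L * A) + I(L^2),
  raw ~ L + A + I(L * A) + I(L^2) + I(A^2)
)

comparison <- data.frame(
  terms  = seq_along(candidates),
  adj_r2 = sapply(candidates, function(f) summary(lm(f))$adj.r.squared),
  bic    = sapply(candidates, function(f) BIC(lm(f)))
)
print(comparison)

# The model with the lowest BIC balances fit against the number of terms
best <- which.min(comparison$bic)
best
```

In a table like this, adjusted R2 typically flattens once the relevant terms are included, while BIC penalises further terms that add little.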

The predicted progression over age is displayed as lines and the manifest data as dots. Only three predictors were necessary to model the norm sample data almost perfectly in terms of adjusted R2.

Sample Data

The package includes data from two large test norming projects, namely ELFE 1-6 (Lenhard & Schneider, 2006) and the German adaptation of the PPVT-4 (A. Lenhard, Lenhard, Segerer & Suggate, 2015), which can be used to run the analysis. Furthermore, it contains large samples from the Centers for Disease Control and Prevention (CDC) on growth curves in childhood and adolescence (for computing Body Mass Index 'BMI' curves), as well as data on life expectancy at birth and mortality per country from 1960 to 2017 (available from The World Bank). Type ?elfe, ?ppvt, ?CDC, ?epm, ?mortality or ?life to display information on the data sets.

Terms of use, license and declaration of interest

cNORM is licensed under the GNU Affero General Public License v3 (AGPL-3.0). This means that copyrighted parts of cNORM can be used free of charge for commercial and non-commercial purposes that run under this same license, retain the copyright notice, provide their source code and correctly cite cNORM. Copyright protection includes, for example, the reproduction and distribution of source code or parts of the source code of cNORM or of graphics created with cNORM. The integration of the package into a server environment in order to access the functionality of the software (e.g. for online delivery of norm scores) is also subject to this license. However, a regression function determined with cNORM is not subject to copyright protection and may be used freely without preconditions. If you want to apply cNORM in a way that is not compatible with the terms of the AGPL 3.0 license, please do not hesitate to contact us to negotiate individual conditions. If you want to use cNORM for scientific publications, we would also ask you to cite the source.

The authors would like to thank WPS (https://www.wpspublish.com/) for providing funding for developing, integrating and evaluating weighting and post stratification in the cNORM package. The research project was conducted in 2022.

References

  • Gary, S., Lenhard, W., Lenhard, A. et al. (2023a). A tutorial on automatic post-stratification and weighting in conventional and regression-based norming of psychometric tests. Behavior Research Methods. https://doi.org/10.3758/s13428-023-02207-0
  • Gary, S., Lenhard, A., Lenhard, W., & Herzberg, D. S. (2023b). Reducing the bias of norm scores in non-representative samples: Weighting as an adjunct to continuous norming methods. Assessment, 10731911231153832. https://doi.org/10.1177/10731911231153832
  • Lenhard, A., Lenhard, W., Segerer, R., & Suggate, S. (2015). Peabody Picture Vocabulary Test - Revision IV (Deutsche Adaption). Frankfurt a. M./Germany: Pearson Assessment.
  • Lenhard, A., Lenhard, W., Suggate, S., & Segerer, R. (2016). A continuous solution to the norming problem. Assessment, Online first, 1-14. https://doi.org/10.1177/1073191116656437
  • Lenhard, A., Lenhard, W., & Gary, S. (2018). Continuous Norming (cNORM). The Comprehensive R Archive Network, Package cNORM, available: https://CRAN.R-project.org/package=cNORM
  • Lenhard, A., Lenhard, W., & Gary, S. (2019). Continuous norming of psychometric tests: A simulation study of parametric and semi-parametric approaches. PLoS ONE, 14(9), e0222279. https://doi.org/10.1371/journal.pone.0222279
  • Lenhard, W., & Lenhard, A. (2020). Improvement of Norm Score Quality via Regression-Based Continuous Norming. Educational and Psychological Measurement (Online First), 1-33. https://doi.org/10.1177/0013164420928457

Contributors

alexlenhard, andreyakinshin, wlenhard

cnorm's Issues

argument is of length zero

I am trying to use the shiny app for cNORM but when I open my dataset, I cannot get past preparation as I get the error: Error:argument is of length zero

Incorrect leaps version

Should be 3.0 instead of 3.0.0. This doesn't matter on CRAN, because in R package_version("3.0") == package_version("3.0.0"), but I found it causes issues in other systems (e.g., RPM packaging).

Confidence interval in the normTable function: bug and suggestion

I probably found a bug in the normTable function. If confidence intervals are requested, the interval for the percentile works, but it is definitely wrong for the standard score, as it shows the upper bound in both columns. An example (using N(10,3)) was attached as an image.

In addition, I really like that you use the regression method. However, I would like to know more about the exact method that is used. Is it the SE for the observed score, SE = SD*sqrt(1-r), or the SE for the true-score estimate, SE_t = SD*sqrt(r)*sqrt(1-r)? And was the regression estimated as E(t) = r*X + (1-r)*M? It would be helpful to provide the formulas or to reference a source for them.
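For orientation, the three formulas named in the question can be evaluated side by side. This is only a worked example with hypothetical values (T-score scale, reliability .94, observed score 65); it does not state which variant normTable() or rawTable() applies.

```r
# Worked example of the formulas named in the question above.
# All values are hypothetical; this does not document cNORM's behaviour.
SD <- 10      # standard deviation of the norm scale (T-scores)
M  <- 50      # mean of the norm scale
r  <- 0.94    # test reliability
X  <- 65      # observed norm score

# Standard error of measurement for the observed score
se_observed <- SD * sqrt(1 - r)

# Standard error of the estimated true score
se_true <- SD * sqrt(r) * sqrt(1 - r)

# Estimated true score with regression to the mean
true_estimate <- r * X + (1 - r) * M

# 90% confidence interval centred on the estimated true score
z <- qnorm(0.95)
ci <- true_estimate + c(-1, 1) * z * se_true

round(c(se_observed = se_observed, se_true = se_true,
        estimate = true_estimate, lower = ci[1], upper = ci[2]), 2)
```

The interval based on the true-score estimate is slightly narrower and shifted towards the scale mean compared with an interval of X plus or minus z times se_observed.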

Error when selecting specific predictors

When using the function bestModel, the predictors I specified started causing an error of this type:

Error in (!is.null(predictors)) && (!inherits(predictors, "formula")) && :
'length = 5' in coercion to 'logical(1)'

This also happens in scripts that were fully functional between 1 and 2 years ago. I am not sure whether I am missing something or whether there is some more general error causing this issue.

This is an example of my code leading to this issue:
norms_bvmt_model <- bestModel(norms_powers, raw = "score_total", predictors = c("L1", "L3", "L1A3", "A2", "A3"), plot = TRUE, terms = 3)
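The error message is consistent with a change in R itself: since R 4.3.0, `&&` and `||` raise an error when an operand has a length greater than one, whereas older versions used only the first element. The sketch below illustrates that behaviour with the predictor vector from the report; the conditions are assumptions for illustration and are not the code of bestModel().

```r
# Sketch of why a length-5 vector breaks an `&&` chain in R >= 4.3.0.
# This is an illustration, not the code of bestModel().
predictors <- c("L1", "L3", "L1A3", "A2", "A3")

# A length-5 logical as an operand of `&&` is an error since R 4.3.0
# (R 4.2 warns, earlier versions silently use only the first element)
result <- tryCatch(
  (predictors != "") && TRUE,
  error = function(e) conditionMessage(e)
)
print(result)

# Reducing every condition to a single TRUE/FALSE avoids the problem
is_term_vector <- !is.null(predictors) &&
  !inherits(predictors, "formula") &&
  is.character(predictors) &&
  all(nzchar(predictors))
is_term_vector
```

Wrapping vector-valued conditions in all() or any() keeps such checks valid across R versions.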

compatibility issues with R 4.3.0

cnorm.cv is broken in R 4.3.0, other functions are evaluated
library(cNORM)
model <- cnorm(raw=elfe$raw, group=elfe$group)
cnorm.cv(model$data)

... fails in cycle 2.

Bug in pretty formatting for rawTable()

In the attached screen shot, you can see that "pretty = TRUE" returned three rows for the 1-74 range of raw scores, whereas it should have returned only a single row for the 1-74 range.

object ‘askYesNo’ is not exported by 'namespace:utils'

OS: Linux Mint 19.2; R version 3.4.4

** preparing package for lazy loading
Error : object ‘askYesNo’ is not exported by 'namespace:utils'
ERROR: lazy loading failed for package ‘cNORM’
Warning in install.packages :
installation of package ‘cNORM’ had non-zero exit status

leaps, RColorBrewer and latticeExtra are installed.
