# Standard Operating Procedure: Psychometric Validation of Clinical Outcome Measures
| Document ID | SOP-VAL-001 |
|-------------|---------|
| Title | Psychometric Validation of Clinical Outcome Measures |
| Revision | 1.0 |
| Effective Date | [DATE] |
| Author | [AUTHOR] |
| Approved By | [APPROVER] |
| Department | Outcomes Research |
---
## 1. Purpose
This procedure establishes requirements for conducting psychometric validation studies of clinical outcome measures to ensure they demonstrate appropriate measurement properties for their intended use.
## 2. Scope
This procedure applies to:
- New outcome measure development
- Validation of existing measures in new populations
- Adaptation of measures for new contexts or modes of administration
- All outcome measure types (PRO, ClinRO, ObsRO, PerfO)
## 3. Responsibilities
### 3.1 Principal Investigator/Measure Developer
- Design validation study protocol
- Ensure appropriate statistical expertise
- Review and interpret validation results
- Document validation evidence
### 3.2 Biostatistician
- Develop statistical analysis plan
- Conduct psychometric analyses
- Generate validation reports
- Advise on sample size and methodology
### 3.3 Quality Manager
- Review validation protocols for regulatory compliance
- Maintain validation documentation
- Track validation status of all measures
## 4. Definitions
| Term | Definition |
|------|------------|
| Reliability | The degree to which a measure is free from measurement error |
| Internal Consistency | The extent to which items within a scale measure the same construct (commonly estimated with Cronbach's alpha) |
| Test-Retest Reliability | Consistency of scores when the measure is administered to the same individuals at different times |
| Inter-Rater Reliability | Agreement between different raters/observers (for ClinRO, ObsRO) |
| Validity | The degree to which a measure assesses what it purports to measure |
| Content Validity | Evidence that the measure's items represent all relevant aspects of the construct |
| Construct Validity | Evidence that the measure relates to other measures as theoretically expected |
| Criterion Validity | Agreement between the measure and a gold standard |
| Responsiveness | Ability to detect meaningful change over time |
| MCID | Minimal Clinically Important Difference: the smallest change in score that patients or clinicians consider important |
| Floor/Ceiling Effects | Clustering of scores at bottom or top of scale, limiting ability to detect change |
## 5. Procedure
### 5.1 Validation Study Planning
5.1.1. Define validation objectives:
- Target population
- Intended use and context
- Mode of administration (paper, electronic, interview)
- Key measurement properties to evaluate
5.1.2. Develop validation protocol including:
- Background and rationale
- Study design and timeline
- Participant eligibility criteria
- Sample size justification
- Data collection procedures
- Statistical analysis plan
- Success criteria for validation
5.1.3. Select comparison measures:
- Established measures of same construct (convergent validity)
- Measures of different constructs (discriminant validity)
- Clinical indicators or gold standards (criterion validity)
5.1.4. Obtain necessary regulatory approvals (IRB, informed consent)
5.1.5. Document validation plan in Form FRM-VAL-001
### 5.2 Reliability Assessment
#### 5.2.1 Internal Consistency Reliability
5.2.1.1. Analyze baseline data from main study sample
5.2.1.2. Calculate Cronbach's alpha for each scale/subscale
5.2.1.3. Acceptance criteria:
- Alpha ≥ 0.70 for group comparisons
- Alpha ≥ 0.90 for individual decision-making
- Alpha < 0.95 (values of 0.95 or higher may indicate item redundancy)
5.2.1.4. Examine item-total correlations (values ≥ 0.30 are typically considered acceptable)
5.2.1.5. Assess scale dimensionality using factor analysis
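
The internal consistency checks in 5.2.1.2 through 5.2.1.4 can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the required analysis software: the function names and the sample data are hypothetical, and the item-total correlation shown is the corrected variant (each item against the sum of the remaining items), which is an assumption since the procedure does not name a variant.

```python
import math
from statistics import mean, variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of items, each a list of respondent scores."""
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Total score per respondent (all items share the same respondent order).
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def corrected_item_total(items):
    """Correlation of each item with the sum of the remaining items."""
    result = []
    for i, item in enumerate(items):
        others = [it for j, it in enumerate(items) if j != i]
        rest = [sum(scores) for scores in zip(*others)]
        result.append(pearson_r(item, rest))
    return result


def alpha_flags(alpha):
    """Apply the acceptance criteria from 5.2.1.3."""
    return {
        "group_comparisons": alpha >= 0.70,
        "individual_decisions": alpha >= 0.90,
        "possible_redundancy": alpha >= 0.95,
    }


# Hypothetical 3-item scale answered by 6 respondents.
sample_items = [
    [3, 4, 2, 5, 1, 4],
    [2, 4, 3, 5, 1, 3],
    [3, 5, 2, 4, 2, 4],
]
sample_alpha = cronbach_alpha(sample_items)
print(round(sample_alpha, 3), alpha_flags(sample_alpha))
print([round(r, 3) for r in corrected_item_total(sample_items)])
```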
#### 5.2.2 Test-Retest Reliability
5.2.2.1. Administer the measure twice to a subsample whose condition is expected to remain stable
5.2.2.2. Time interval: typically 2-14 days
- Short enough that true change is unlikely
- Long enough to prevent memory effects
5.2.2.3. Calculate intraclass correlation coefficient (ICC)
5.2.2.4. Acceptance criteria:
- ICC ≥ 0.70 for group comparisons
- ICC ≥ 0.90 for individual decision-making
5.2.2.5. Calculate standard error of measurement (SEM)
5.2.2.6. Generate Bland-Altman plots to assess agreement
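
As a rough sketch of steps 5.2.2.3 through 5.2.2.6, the following computes an ICC, the SEM, and the Bland-Altman bias and 95% limits of agreement (the numbers a Bland-Altman plot displays). The procedure does not specify an ICC form, so the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), is an assumption here, as is using the baseline standard deviation for the SEM. All function names are hypothetical.

```python
import math
from statistics import mean, stdev


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `scores` is a list of rows, one per participant, each holding k administrations.
    """
    n, k = len(scores), len(scores[0])
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(col) for col in zip(*scores)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def standard_error_of_measurement(baseline, reliability):
    """SEM = SD * sqrt(1 - reliability), using the baseline SD (an assumption)."""
    return stdev(baseline) * math.sqrt(max(0.0, 1.0 - reliability))


def bland_altman_limits(time1, time2):
    """Return (bias, lower limit, upper limit) for the retest-minus-test differences."""
    diffs = [b - a for a, b in zip(time1, time2)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical test-retest scores for 6 stable participants.
test = [12, 18, 25, 31, 22, 15]
retest = [13, 17, 27, 30, 21, 16]
icc = icc_2_1([list(pair) for pair in zip(test, retest)])
print(round(icc, 3), icc >= 0.70, icc >= 0.90)
print(round(standard_error_of_measurement(test, icc), 3))
print(tuple(round(v, 3) for v in bland_altman_limits(test, retest)))
```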
#### 5.2.3 Inter-Rater Reliability (for ClinRO, ObsRO)
5.2.3.1. Have multiple raters assess same participants
5.2.3.2. Calculate ICC or weighted kappa as appropriate
5.2.3.3. Acceptance criteria:
- ICC or kappa ≥ 0.70
5.2.3.4. Identify sources of disagreement for training improvement
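
For ordinal ratings from two raters, the weighted kappa in 5.2.3.2 can be computed as sketched below. The quadratic weighting scheme, the function name, and the sample ratings are assumptions for illustration; the procedure only says to use ICC or weighted kappa "as appropriate".

```python
def weighted_kappa(rater_a, rater_b, categories, weighting="quadratic"):
    """Cohen's weighted kappa for two raters scoring the same participants.

    `categories` lists the ordinal rating levels in order, lowest first.
    """
    index = {category: i for i, category in enumerate(categories)}
    k, n = len(categories), len(rater_a)
    if k < 2 or n == 0 or n != len(rater_b):
        raise ValueError("need two or more categories and paired ratings")

    # Observed contingency table: rows are rater A, columns are rater B.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            distance = abs(i - j) / (k - 1)
            weight = distance ** 2 if weighting == "quadratic" else distance
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row_totals[i] * col_totals[j] / n
    if weighted_expected == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical severity ratings (0 = none ... 3 = severe) from two clinicians.
clinician_1 = [0, 1, 2, 3, 2, 1, 0, 3]
clinician_2 = [0, 1, 2, 2, 2, 1, 1, 3]
kappa = weighted_kappa(clinician_1, clinician_2, categories=[0, 1, 2, 3])
print(round(kappa, 3), kappa >= 0.70)
```

Listing the disagreeing pairs alongside the coefficient is one simple way to support 5.2.3.4, since it shows which categories raters confuse.
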
### 5.3 Validity Assessment
#### 5.3.1 Content Validity
5.3.1.1. Conduct qualitative research with target population:
- Concept elicitation interviews
- Cognitive debriefing of items
- Assessment of comprehensibility and relevance
5.3.1.2. Obtain expert panel review:
- Clinical experts
- Psychometricians
- Patient representatives
5.3.1.3. Document evidence in content validity report
5.3.1.4. For FDA submissions, follow FDA PRO Guidance requirements
#### 5.3.2 Construct Validity
5.3.2.1. Convergent validity:
- Correlate with established measures of same construct
- Expected correlation: typically r ≥ 0.50 (up to r ≥ 0.70 when the comparator measures a closely matching construct)
5.3.2.2. Discriminant validity:
- Correlate with measures of different constructs
- Expected correlation: typically r < 0.30
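
The convergent and discriminant checks above amount to comparing observed correlations against pre-specified thresholds. A minimal sketch follows; the function names, the comparator labels, and the data are hypothetical, the 0.50 convergent threshold is taken as the lower bound of the range above, and applying the discriminant threshold to the absolute value of r is an assumption.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def check_construct_correlations(target, comparators,
                                 convergent_min=0.50, discriminant_max=0.30):
    """Return {name: (r, meets_expectation)} for each comparison measure.

    `comparators` maps a measure name to (scores, kind), where kind is
    "convergent" or "discriminant".
    """
    results = {}
    for name, (scores, kind) in comparators.items():
        r = pearson_r(target, scores)
        if kind == "convergent":
            meets = r >= convergent_min
        else:
            meets = abs(r) < discriminant_max
        results[name] = (r, meets)
    return results


# Hypothetical scores for 6 participants.
new_measure = [10, 14, 9, 20, 17, 12]
comparisons = {
    "established_same_construct": ([22, 30, 20, 41, 33, 27], "convergent"),
    "unrelated_construct": ([5, 3, 4, 4, 5, 3], "discriminant"),
}
for name, (r, meets) in check_construct_correlations(new_measure, comparisons).items():
    print(name, round(r, 3), meets)
```
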
5.3.2.3. Known-groups validity:
- Compare scores across groups expected to differ
- Use appropriate statistical tests (t-test, ANOVA)
- Calculate effect sizes (Cohen's d, eta-squared)
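
The effect sizes named for known-groups validity can be computed directly from group scores. The sketch below shows Cohen's d with a pooled standard deviation and eta-squared from a one-way layout; the function names and the severity groups are hypothetical, and significance testing (t-test, ANOVA) is left to the statistical package named in the analysis plan.

```python
import math
from statistics import mean, variance


def cohens_d(group_1, group_2):
    """Cohen's d using the pooled standard deviation of two independent groups."""
    n1, n2 = len(group_1), len(group_2)
    pooled_variance = (
        (n1 - 1) * variance(group_1) + (n2 - 1) * variance(group_2)
    ) / (n1 + n2 - 2)
    return (mean(group_1) - mean(group_2)) / math.sqrt(pooled_variance)


def eta_squared(groups):
    """Eta-squared: between-group sum of squares over total sum of squares."""
    all_scores = [score for group in groups for score in group]
    grand = mean(all_scores)
    ss_between = sum(len(group) * (mean(group) - grand) ** 2 for group in groups)
    ss_total = sum((score - grand) ** 2 for score in all_scores)
    return ss_between / ss_total


# Hypothetical symptom scores for groups expected to differ in severity.
mild = [8, 10, 9, 12, 11]
moderate = [14, 15, 13, 17, 16]
severe = [20, 22, 19, 24, 21]
print(round(cohens_d(severe, mild), 3))
print(round(eta_squared([mild, moderate, severe]), 3))
```
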
5.3.2.4. Factorial validity:
- Conduct confirmatory factor analysis (CFA)
- Assess model fit (CFI > 0.90, RMSEA < 0.08, SRMR < 0.08)
#### 5.3.3 Criterion Validity
5.3.3.1. If gold standard exists, calculate:
- Sensitivity and specificity
- Positive and negative predictive values
- ROC curves and AUC
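
When the gold standard is a binary classification, the quantities above follow from a 2x2 table at a chosen cutoff plus a rank-based AUC. The sketch below is illustrative only: the function names, the cutoff, and the data are hypothetical, and it assumes higher scores indicate the condition.

```python
def _ratio(numerator, denominator):
    """Safe division; returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_accuracy(scores, has_condition, cutoff):
    """Sensitivity, specificity, PPV, and NPV when score >= cutoff counts as positive."""
    tp = fp = tn = fn = 0
    for score, truth in zip(scores, has_condition):
        positive = score >= cutoff
        if positive and truth:
            tp += 1
        elif positive and not truth:
            fp += 1
        elif not positive and truth:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


def auc(scores, has_condition):
    """Area under the ROC curve as the probability that a randomly chosen case
    scores higher than a randomly chosen non-case (ties count as one half)."""
    cases = [s for s, truth in zip(scores, has_condition) if truth]
    non_cases = [s for s, truth in zip(scores, has_condition) if not truth]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for nc in non_cases:
            if c > nc:
                wins += 1.0
            elif c == nc:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


# Hypothetical screening scores against a clinician diagnosis (True = condition present).
screen = [4, 7, 9, 12, 15, 6, 11, 3]
diagnosis = [False, False, True, True, True, False, True, False]
print(diagnostic_accuracy(screen, diagnosis, cutoff=9))
print(auc(screen, diagnosis))
```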
### 5.4 Responsiveness Assessment
5.4.1. Collect data at baseline and follow-up from participants expected to change
5.4.2. Calculate change scores
5.4.3. Assess responsiveness using:
- Effect sizes (Cohen's d, standardized response mean)
- Correlation with external indicators of change
- Receiver operating characteristic (ROC) analysis
5.4.4. Determine Minimal Clinically Important Difference (MCID):
- Anchor-based methods (e.g., mean change among participants reporting minimal change on a patient global rating)
- Distribution-based methods (0.5 SD, 1 SEM)
- Multiple methods recommended
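
The responsiveness and MCID calculations in 5.4.3 and 5.4.4 reduce to a few formulas on change scores. The sketch below assumes an anchor-based estimate defined as the mean change among participants in a "minimally improved" anchor category; that category label, the function names, and the data are hypothetical, and the reliability value passed to the SEM formula would come from the reliability assessment in 5.2.

```python
import math
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """SRM: mean change divided by the standard deviation of change."""
    change = [f - b for b, f in zip(baseline, follow_up)]
    return mean(change) / stdev(change)


def distribution_based_mcid(baseline, reliability):
    """Distribution-based estimates: 0.5 baseline SD and 1 SEM."""
    sd = stdev(baseline)
    return {
        "half_sd": 0.5 * sd,
        "one_sem": sd * math.sqrt(max(0.0, 1.0 - reliability)),
    }


def anchor_based_mcid(baseline, follow_up, anchor, minimal_label="minimally improved"):
    """Mean change among participants whose global rating equals `minimal_label`."""
    changes = [
        f - b for b, f, rating in zip(baseline, follow_up, anchor)
        if rating == minimal_label
    ]
    if not changes:
        raise ValueError("no participants in the minimal-change anchor category")
    return mean(changes)


# Hypothetical baseline and follow-up scores with patient global ratings of change.
baseline = [30, 42, 35, 50, 28, 44]
follow_up = [34, 49, 36, 62, 33, 45]
ratings = [
    "minimally improved", "minimally improved", "no change",
    "much improved", "minimally improved", "no change",
]
print(round(standardized_response_mean(baseline, follow_up), 3))
print(distribution_based_mcid(baseline, reliability=0.85))
print(round(anchor_based_mcid(baseline, follow_up, ratings), 3))
```

Reporting the anchor-based and distribution-based values side by side makes it easier to judge whether the methods agree before a single MCID is documented.
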
### 5.5 Interpretability Assessment
5.5.1. Assess score distribution:
- Floor effects: flagged when >15% of participants score at the scale minimum
- Ceiling effects: flagged when >15% of participants score at the scale maximum
- Skewness and kurtosis
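
The score distribution checks in 5.5.1 can be sketched as follows, using the 15% criterion for floor and ceiling effects. The function names and data are hypothetical, and the skewness shown is the simple moment-based coefficient (no small-sample correction), which is an assumption since the procedure does not name an estimator.

```python
from statistics import mean


def floor_ceiling(scores, scale_min, scale_max, threshold_pct=15.0):
    """Percentage of participants at the scale minimum and maximum, with flags."""
    n = len(scores)
    floor_pct = 100.0 * sum(1 for s in scores if s == scale_min) / n
    ceiling_pct = 100.0 * sum(1 for s in scores if s == scale_max) / n
    return {
        "floor_pct": floor_pct,
        "ceiling_pct": ceiling_pct,
        "floor_effect": floor_pct > threshold_pct,
        "ceiling_effect": ceiling_pct > threshold_pct,
    }


def moment_skewness(scores):
    """Moment-based skewness: third central moment over variance to the power 1.5."""
    m = mean(scores)
    n = len(scores)
    second = sum((s - m) ** 2 for s in scores) / n
    third = sum((s - m) ** 3 for s in scores) / n
    return third / second ** 1.5


# Hypothetical scores on a 0-10 scale.
observed = [0, 0, 5, 10, 3, 4, 5, 6, 7, 8]
print(floor_ceiling(observed, scale_min=0, scale_max=10))
print(round(moment_skewness(observed), 3))
```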
5.5.2. Develop score interpretation guidelines:
- Clinical cutoff scores
- Severity categories
- Normative data (if appropriate)
5.5.3. Document MCID and other interpretability anchors
### 5.6 Validation Report
5.6.1. Prepare comprehensive validation report including:
- Study objectives and methods
- Participant characteristics
- Results of all psychometric analyses
- Tables and figures
- Discussion of strengths and limitations
- Conclusions and recommendations for use
5.6.2. File the validation report using Form FRM-VAL-002
5.6.3. Update the measure's status in the Validation Tracking Database (FRM-VAL-003)
5.6.4. For regulatory submissions, prepare the report according to the applicable FDA guidance (see Section 7)
### 5.7 Ongoing Validation Activities
5.7.1. Plan for continued evidence generation:
- Validation in additional populations
- Assessment in different contexts or settings
- Cross-cultural validation
- Longitudinal measurement invariance
5.7.2. Monitor published validation evidence for measures in use
5.7.3. Review and update validation status annually
## 6. Related Documents
- FRM-VAL-001: Validation Study Protocol Template
- FRM-VAL-002: Psychometric Validation Report Template
- FRM-VAL-003: Validation Tracking Database
- SOP-DM-001: Data Management for Validation Studies
- SOP-LIC-001: License Management
## 7. References
- FDA (2009). Guidance for Industry: Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims
- Mokkink LB, et al. (2010). The COSMIN checklist for assessing the methodological quality of studies on measurement properties of health status measurement instruments. Quality of Life Research, 19(4), 539-549
- Reeve BB, et al. (2013). ISOQOL recommends minimum standards for patient-reported outcome measures used in patient-centered outcomes and comparative effectiveness research. Quality of Life Research, 22(8), 1889-1905
- Streiner DL, Norman GR, Cairney J (2015). Health Measurement Scales: A Practical Guide to Their Development and Use (5th ed.). Oxford University Press
- DeVellis RF (2017). Scale Development: Theory and Applications (4th ed.). SAGE Publications
---
## Revision History
| Rev | Date | Description | Author |
|-----|------|-------------|--------|
| 1.0 | [DATE] | Initial release | [AUTHOR] |