The internal and external validity of the Major Depression Inventory in measuring severity of depressive states
Psychological Medicine, 2003, 33, 351–356.
L. R. OLSEN,1 D. V. JENSEN, V. NOERHOLM, K. MARTINY AND P. BECH
From the Psychiatric Research Unit, Frederiksborg General Hospital, Hillerød; and Department of Rheumatology, Hoersholm General Hospital, Hoersholm, Denmark
Background. We have developed the Major Depression Inventory (MDI), consisting of 10 items, covering the DSM-IV as well as the ICD-10 symptoms of depressive illness. We aimed to evaluate this as a scale measuring severity of depressive states with reference to both internal and external validity.
Method. Patients representing the score range from no depression to marked depression on the Hamilton Depression Scale (HAM-D) completed the MDI. Both classical and modern psychometric methods were applied for the evaluation of validity, including the Rasch analysis.
Results. In total, 91 patients were included. The results showed that the MDI had an adequate internal validity in being a unidimensional scale (the total score an appropriate or sufficient statistic). The external validity of the MDI was also confirmed as the total score of the MDI correlated significantly with the HAM-D (Pearson’s coefficient 0.86, P ≤ 0.01, Spearman 0.80, P ≤ 0.01).
Conclusion. When used in a sample of patients with different states of depression the MDI has an adequate internal and external validity.
The most frequently used self-rating scales for depression are the Beck Depression Inventory (BDI) (Beck et al. 1961), the Zung Self-Rating Depression Scale (SDS) (Zung, 1965) and the Center for Epidemiological Studies Depression Scale (CES-D) (Radloff, 1977). These questionnaires have been psychometrically evaluated as scales to measure the severity of depressive states and also as screening instruments for the diag- […]
[…] 1980) with symptom-based diagnostic criteria for mental disorders, the diagnosis of major depression is reached using an algorithm covering only nine symptoms. The three depression questionnaires (BDI, SDS and CES-D) all contain around 20 symptoms which, however, have a limited coverage of the nine DSM-III symptoms of major depression. On this background, we developed the Major Depression Inventory (MDI) (Bech, 1998; Bech et al. 2001), which covers the whole spectrum of symptoms in both the DSM-III/DSM-IV (APA, 1994) ‘major depression’ and the ICD-10 (WHO, 1993) ‘mod- […]
On the basis of the algorithms for diagnosing depression in accordance with DSM-IV or ICD-10 the MDI showed a high sensitivity and specificity in a previous study (Bech et al. 2001). In the present study we have investigated the MDI as a scale for measuring severity of depressive states. The analysis of the MDI has focused on both the internal validity (i.e. tests for unidimensionality) and the external validity (i.e. the correspondence with a clinician rated scale).
1 Address for correspondence: Dr Lis Raabaek Olsen, Psychiatric Research Unit, Frederiksborg General Hospital, DK-3400 Hillerød, Denmark.
[…] have been to evaluate the internal validity of the scale (the total score being an appropriate or sufficient statistic) as well as the external validity of the scale (the correlation of the MDI with the Hamilton Depression Scale (HAM-D (Hamilton, 1967; Bech et al. 1986)), which includes a standardization of the MDI with cut-off scores in terms of the HAM-D definitions of mild […]
The patients were selected from ongoing studies within the following range of depressive states.
These were out-patients from our Department of Rheumatology; we consecutively included patients who had suffered from low back pain for more than 3 months without psychiatric […]
These out-patients were from a private psychiatric practice in Copenhagen and they had been screened for inclusion in a study on social adaptation.
These out-patients from our Psychiatric Research Unit were participating in an ongoing study on light therapy in major depression without SAD (seasonal affective disorder).
These in-patients from our Psychiatric Hospital Department were participating in a study on the sensitivity and specificity of the MDI, using the Present State Examination (PSE) as the index of diagnostic validity (Bech et al. 2001).
The items of the scale cover the ten ICD-10 symptoms of depression. These symptoms are identical with the DSM-IV major depression symptoms apart from one symptom, low self-esteem, which in DSM-IV is incorporated in the item of guilt. Thus, the MDI contains 10 items; however, items 8 and 10 are divided into two sub-items, a and b (Appendix 1). Only the highest scores of items 8 and 10 (either a or b) are included in the statistical analysis. On a 6-point Likert scale, the individual items measure how […] present during the past 14 days. The scale goes from 0 (the symptom has not been present at all) to 5 (the symptom has been present all of the time). The various steps refer to the frequency of the symptoms during the last 2 weeks and are defined by adverbs or adjectives (Appendix 1) with only indirect definitions. In a previous study (Bent-Hansen et al. 1995) it had been found that depressed patients prefer such indirect stipulations to a direct or definite item manual for the individual items.
[…] diagnostic instrument with the algorithms leading to the DSM-IV or ICD-10 categories ‘major’ or ‘moderate to severe’ depression (Bech et al. 2001), and as a measuring instrument in which the total score is a sufficient statistic. When used as a measuring instrument, the 10 items are added up, with a theoretical score range from 0 to 50.
We used the 17-item (HAM-D17) version endorsed by Max Hamilton and published by Bech et al. (1986). This version has been used in the studies performed by the Danish University Antidepressant Group (e.g. DUAG, 1990). The HAM-D raters who participated in the present study had been trained as investigators in the DUAG trials. The intraclass coefficients of reliability in the DUAG trials are 0.75 or higher (Stage et al. 2001).
A factor analysis in terms of a principal component analysis was performed (Nunnally & Bernstein, 1994). A scree plot was used to determine the number of factors to be taken into consideration. A ‘general factor’ was defined as a factor explaining at least 50% of the variance. Cronbach’s coefficient alpha was used to evaluate internal consistency. A coefficient of 0.80 or higher was considered adequate (Nunnally & […]
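As an illustration of the internal-consistency criterion described above, Cronbach's coefficient alpha can be computed from a matrix of item scores. The following is a minimal Python sketch under stated assumptions, not the software used in the study: the function name `cronbach_alpha`, the use of sample variances (n - 1 denominator) and the toy data are all assumptions made for illustration. Only the 0.80 adequacy threshold is taken from the text.

```python
from statistics import variance

ADEQUATE_ALPHA = 0.80  # threshold the paper treats as adequate


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-respondent item-score lists.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    Sample variances (n - 1 denominator) are used throughout (an assumption).
    """
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2 or any(len(row) != k for row in rows):
        raise ValueError("every respondent needs the same k >= 2 items")
    # Variance of each item (column) across respondents.
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Variance of the total score (row sum) across respondents.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical toy data (not from the study): 4 respondents, 3 items scored 0-5.
toy_scores = [[0, 1, 0], [2, 2, 1], [3, 4, 3], [5, 5, 4]]
alpha = cronbach_alpha(toy_scores)
print(f"alpha = {alpha:.3f}, adequate: {alpha >= ADEQUATE_ALPHA}")
```

With real data, each row would hold one patient's 10 MDI item scores after items 8 and 10 have been collapsed to the higher of their a/b sub-items.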
The Rasch analysis (Rasch, 1960; Bech et al. 1981; Allerup, 1997) was used to test for unidimensionality of the scale. The test of fit of the Rasch model for the total scale score being a sufficient statistic was performed by use of the one parameter logistic programme in which the criteria of males versus females and patients with low scores versus patients with high scores were tested (Verhelst & Glass, 1995). The non-parametric evaluation of the data structure in accordance with the Rasch model was performed using the Mokken analysis (Mokken, 1971; De Jong & Molenaar, 1987; Molenaar et al. 1994). The Mokken analysis of homogeneity or unidimensionality is a measure of the extent to which an extra item fits into the structure provided by the other items of the scale. Within the Rasch analysis, the test of fit of the individual items, analogous to the Mokken analysis, was performed as described by Allerup (1997). Each item was first dichotomized by rescoring grades 0, 1 and 2 as 0, and grades 3, 4 and 5 as 1. The level of rejection of unidimensionality in the Rasch analysis was P ≤ 0.01. As external criterion the level of acceptance according to the Mokken analysis was a coefficient of homogeneity of ≥ 0.40, while a coefficient of 0.30 to 0.39 was considered only just acceptable (Mokken, […]
The Hamilton Depression Scale (HAM-D) was used as the index of external validity in the 17-item version (HAM-D17). As to the measure of correlation, the Pearson coefficient is reported in some studies while the Spearman coefficient is reported in other studies. Therefore, the strength of the association between the MDI and the HAM-D17 was expressed in terms of the Pearson coefficient (Altman, 1991) as well as the Spearman coefficient (Siegel, 1956). The association between the MDI score and the HAM-D score to standardize the MDI was estimated by linear regression analysis in which the MDI score was considered the dependent variable. The standardization included prediction intervals of the […]
Traditionally, the number of patients needed psychometrically when using principal component analysis is approximately 10 times the number of items in the scale under examination (Aiken, 1995). As the MDI contains ten items, the number of patients should be approximately 100.
In total, 91 patients (24 males, 67 females; mean age 45.5 years, S.D. 15.2) were included in the psychometric analysis of the MDI. Of those, 18 patients were recruited from the Department of Rheumatology (5 males, 13 females; mean age 43.0 years, S.D. 15.2, HAM-D17 mean score 6.1, S.D. 5.9), 11 patients were recruited from our Out-patient Research Unit (4 males, 7 females; mean age 48.5 years, S.D. 11.1, HAM-D17 mean score 20.6, S.D. 4.6), 40 patients were recruited from the private psychiatric practice (13 males, 27 females; mean age 40.6 years, S.D. 15.5, HAM-D17 mean score 18.9, S.D. 7.5), and 22 in-patients were recruited from our Psychiatric Department (2 males, 20 females; mean age 55.0 years, S.D. 15.1). In the latter sample of in-patients, all patients had a mood disorder, 15 patients had a current diagnosis of major depression (HAM-D17 mean score 21.5, S.D. 5.5) and the remaining 7 patients had major depression in remission (HAM-D17 mean score 11.1, S.D. 6.3). In the different groups, the percentage of patients with a HAM-D17 score of ≥ 18 ranged from 5.6% to 66.7%.
[…] identified only one factor when the scree plot was analysed. This factor explained 56% of the variance while the second factor explained 10%, the third factor 8% and the fourth factor 5% of the variance. Table 1 shows the factor loadings for the individual items according to the principal component analysis, indicating a higher loading in the top-listed items compared to the bottom-listed. Cronbach’s coefficient alpha was 0.90. Table 1 also shows the results of the Mokken analysis with the Loevinger coefficient of homogeneity for the total scores and for the individual items. Although two of the items had Loevinger coefficients of <0.40 (item 9 and 10)
the coefficient for testing to what extent all the […] dimensionality of the scale was acceptable (0.52).
Table 1 shows the rank-order of the MDI items when using the mean score value for the individual items as index of inclusiveness. Thus, at the top is placed item 3 (lack of energy) and at the […]
The Rasch analysis confirmed that the 10 items of MDI constitute one dimension. According to the Rasch analysis the same rank-order of the individual items was found both when males were compared with females and when patients with low total MDI scores were compared with patients with high total MDI scores. Where discrepancies emerged in rank-order between the Mokken analysis and the Rasch analysis the difference was only of the order of one rank. The item with the lowest coefficient in the Mokken analysis was item 9 (sleep), which also was the […]
When the MDI scores were correlated to the […]ficient was 0.86 (P ≤ 0.01), (the corresponding non-parametric Spearman coefficient was 0.80 […]
By linear regression in which the MDI score was considered as the dependent variable the following equation was estimated (confidence […]
For this estimation the value of R² is 0.73, i.e. the proportion of the total variation of the dependent variable explained by this model is 73%.
Table 2 shows the standardization of the MDI using the conventional cut-off scores on the HAM-D17 as index of validity (Bech et al. 1975).
[…] the 10 items of the MDI with the corresponding factor loadings from the principal component analysis. The items are listed in terms of inclusiveness (rank-ordered), i.e. highest mean score for ‘lack of energy’ and lowest mean score for ‘sui- […]
Standardization of the Major Depression Inventory (MDI) using the HAM-D17 as index of […]
Probable major depression/mild depression
The range of scores obtained on the Hamilton […] population in the present study had a distribution which was adequate for an analysis of a self-rating scale such as the MDI, i.e. a scale for patients with mild to marked degrees of depressive states. All patients in the present study were able to complete the MDI, indicating a high […]
The 10 items of the MDI obviously have a high content validity when compared to the diagnostic systems (DSM-IV or ICD-10) as the scale is based on the universe of symptoms within these systems. Although symptoms with a high diagnostic validity do not necessarily have a high validity for measuring severity (Frances et al. 1990; Kessler & Mroczek, 1995), the present study showed that the MDI is a unidimensional scale. This was supported both with classical psychometric tests (e.g. principal component analysis and Cronbach’s coefficient alpha) and with modern psychometric tests (e.g. the Mokken analysis and the Rasch analysis). The rank-order of inclusiveness showed almost the same pattern when applying the two different types of modern psychometric tests. The structure of inclusiveness shows that the core symptoms of depression according to DSM-IV and ICD-10 (depressed mood, lack of energy and lack of interests) are among the most inclusive items of the MDI (Table 1) indicating a ‘ceiling effect’, while the items of guilt feelings and suicidal thoughts were most exclusive indicating a ‘floor effect’. The somatic items (sleep and appetite) showed suboptimal fitting in the Mokken analysis (Loevinger’s coefficients <0.40) as well as in the
Rasch analysis. Furthermore, in the principal […] showed the lowest factor loadings. However, the somatic items had no impact on the overall validity of the MDI, indicating that the total score is a sufficient statistic. The three self-rating scales developed before the release of the DSM-III (BDI, SDS and CES-D) have all previously been correlated with the HAM-D17 and coefficients between 0.6 and 0.8 have been reported (e.g. Brown & Zung, 1972; Bech et al. 1975; Biggs et al. 1978; Radloff, 1977). The correlation coefficient of 0.86 found in the present study is, […]
The standardization of the MDI indicated that a cut-off score of 27 corresponds to a score of 18 on HAM-D17 (or major depression), which is in agreement with our analysis of the MDI when compared to the diagnosis of major depression based on a psychiatric interview (Bech et al. 2001). As shown by Paykel (1990) a HAM-D17 score of 18 equals major depression while a score of 13 equals probable major depression.
Because the MDI scale is a brief scale, consisting of only 10 items that are presented to the patient on a single page (Appendix 1), the MDI can easily be used in the setting of general practice or in somatic hospital departments both as a screening instrument for detecting depression (Bech & Wermuth, 1998; Bech et al. 2001) and to monitor the effect of antidepressive therapy analogous to the use of the Hamilton Depression Scale as outcome measure.
This study has some limitations. The diagnoses were not made by structured research interview; instead the Hamilton Depression Scale, conducted by trained psychiatrists, was used as reference. Comparison to SCAN interview has previously been made (Bech et al. 2001). In the present study we used a sample covering the spectrum from no depression to severe depression to fulfil the objective of the study. The scale might perform differently in a more homogeneous sample of depressed people; studies to inspect this are now in progress. Additional items could have been added, e.g. an item about hypersomnia, which is included in the DSM-IV but not in the ICD-10. Nevertheless, the purpose with this scale was to make it as short as possible while still covering enough information to make diagnoses as well as to rate severity of depression.
In conclusion, this study has shown that the total score of the MDI is a sufficient statistic to measure severity of depressive states. Moreover, a linear correlation to the Hamilton Depression Scale has been found, resulting in a standardization of the MDI by using the HAM-D17 as index […]
Aiken, L. R. (1995). Personality Assessment. Methods and Practices (2nd revised edn.). Hogrefe and Huber: Toronto.
Allerup, P. (1997). Statistical analyses of data from the IEA reading literacy study. In Applications of Latent Trait and Latent Class Models in the Social Sciences (ed. J. Rost and R. Langeheine), […]
Altman, D. (1991). Practical Statistics for Medical Research. Chap- […]
American Psychiatric Association (1980). Diagnostic and Statistical Manual of Mental Disorders, 3rd edn. (DSM-III). APA: Washing- […]
American Psychiatric Association (1994). Diagnostic and Statistical Manual of Mental Disorders, 4th edn. (DSM-IV). APA: Washing- […]
Bech, P. (1998). Quality of Life in the Psychiatric Patient. Mosby- […]
Bech, P. & Wermuth, L. (1998). Applicability of the Major Depression Inventory in patients with Parkinson’s disease. Nordic Journal of Psychiatry 52, 305–309.
Bech, P., Gram, L. F., Dein, E., Jacobsen, O., Vitger, J. & Bolwig, T. G. (1975). Quantitative rating of depressive states. Acta Psychiatrica Scandinavica 51, 161–170.
Bech, P., Allerup, A., Gram, L. F., Reisby, N., Rosenberg, R., Jacobsen, O. & Nagy, A. (1981). The Hamilton Depression Scale. Evaluation of objectivity using logistic models. Acta Psychiatrica […]
Bech, P., Kastrup, M. & Rafaelsen, O. J. (1986). Mini-compendium of rating scales for states of anxiety, depression, mania, schizophrenia with corresponding DSM-III syndromes. Acta Psychiatrica Scan- […]
Bech, P., Rasmussen, N. A., Raabaek Olsen, L., Noerholm, V. & Abildgaard, W. (2001). The sensitivity and specificity of the Major Depression Inventory, using the Present State Examination as the index of diagnostic validity. Journal of Affective Disorders 66, […]
Beck, A. T., Ward, C. H., Mendelson, M., Mock, J. & Erbaugh, J. (1961). An inventory for measuring depression. Archives of General […]
Bent-Hansen, J., Lauritzen, L., Clemmesen, L., Lunde, M. & Korner, A. (1995). A definite and a semidefinite questionnaire version of the Hamilton/Melancholia (HDS/MES) scale. Journal of Affective […]
Biggs, J. T., Wylie, L. T. & Ziegler, V. E. (1978). Validity of the Zung self-rating depression scale. British Journal of Psychiatry 132, […]
Brown, G. L. & Zung, W. W. K. (1972). Depression scales: self- or physician-rating? A validation of certain clinically observable phenomena. Comprehensive Psychiatry 13, 361–367.
Danish University Antidepressant Group (1990). Paroxetine: a selective serotonin reuptake inhibitor showing better tolerance, but weaker antidepressant effect than clomipramine in a controlled multicenter study. Journal of Affective Disorders 18, 59–66.
De Jong, A. & Molenaar, I. N. (1987). An application of Mokken’s model for stochastic, cumulative scaling in psychiatric research. Journal of Psychiatric Research 21, 137–149.
Frances, A., Pincus, H. A., Widiger, T. A., Davis, W. W. & First, M. B. (1990). DSM-IV: work in progress. American Journal of […]
Hamilton, M. A. (1967). Development of a rating scale for primary depressive illness. British Journal of Social and Clinical Psychology […]
Kessler, R. C. & Mroczek, M. (1995). Measuring the effect of medical intervention. Medical Care 33 (suppl.), 109–119.
Mokken, R. J. (1971). A Theory and Procedure of Scale Analysis. […]
Molenaar, I. W., Debets, P., Sytsma, K. & Hemker, B. T. (1994). User’s Manual MSP: A Program for Mokken Scale Analysis for Polytomous Items (version 3.0). iec ProGAMMA: […]
Nunnally, J. & Bernstein, H. (1994). Psychometric Theory, 3rd edn. […]
Paykel, E. S. (1990). Use of the Hamilton Depression Scale in general practice. In The Hamilton Scales (ed. P. Bech and A. Coppen), […]
Radloff, L. S. (1977). The CES-D scale: a self-report depression scale for research in the general population. Applied Psychological […]
Rasch, G. (1960). Probabilistic Models for Some Intelligence and Attainment Tests. Danish Institute for Educational Research: Copenhagen (reprinted 1980 by University of Chicago Press: […]
Siegel, S. (1956). Non-Parametric Statistics. McGraw-Hill: New […]
Stage, K. B., Bech, P., Kragh-Sorensen, P., Nair, N. P. & Katona, C. (2001). Differences in symptomatology and diagnostic profile in younger and elderly depressed inpatients. Journal of Affective […]
Verhelst, N. & Glass, C. (1995). The One Parameter Logistic Model: OPLM. Arnheim: Cito, The Netherlands.
World Health Organization (1993). The ICD-10 Classification of Mental and Behavioural Disorders. Diagnostic Criteria for Research. […]
Zung, W. W. K. (1965). A self-rating depression scale. Archives of […]
The following questions ask about how you have been feeling over the last two weeks. Please put a tick in the box which is closest to how you have been feeling.
Have you lost interest in your daily activities?
Have you felt lacking in energy and strength?
Have you had a bad conscience or feelings of guilt?
Have you felt that life wasn’t worth living?
Have you had difficulty in concentrating, e.g. when reading the newspaper or watching television?
10a Have you suffered from reduced appetite?
10b Have you suffered from increased appetite?
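The scoring rule stated in the Method section (each item scored 0 to 5, only the higher of the a/b sub-items of items 8 and 10 counted, and the 10 item scores added up to a total of 0 to 50) can be sketched in a few lines of Python. This is an illustrative sketch only: the dictionary keys ("1" to "7", "8a", "8b", "9", "10a", "10b"), the function names and the example answers are hypothetical and are not part of the published inventory. The `dichotomize` helper mirrors the rescoring the paper applies before the Rasch analysis (grades 0 to 2 become 0, grades 3 to 5 become 1).

```python
# Hypothetical key names for the 12 answers on the form (an assumption).
SINGLE_ITEMS = ("1", "2", "3", "4", "5", "6", "7", "9")
SPLIT_ITEMS = {"8": ("8a", "8b"), "10": ("10a", "10b")}


def _check(score):
    """Accept only integer grades on the 0-5 scale."""
    if not isinstance(score, int) or not 0 <= score <= 5:
        raise ValueError(f"item scores must be integers from 0 to 5, got {score!r}")
    return score


def mdi_item_scores(responses):
    """Collapse the 12 raw answers into the 10 item scores used for analysis."""
    scores = {key: _check(responses[key]) for key in SINGLE_ITEMS}
    for item, (a, b) in SPLIT_ITEMS.items():
        # Only the higher of the two sub-item scores is counted.
        scores[item] = max(_check(responses[a]), _check(responses[b]))
    return scores


def mdi_total(responses):
    """Sum of the 10 item scores; theoretical range 0 to 50."""
    return sum(mdi_item_scores(responses).values())


def dichotomize(score):
    """Rescoring applied before the Rasch analysis: 0-2 -> 0, 3-5 -> 1."""
    return 0 if _check(score) <= 2 else 1


# Hypothetical answers, not patient data.
all_keys = SINGLE_ITEMS + SPLIT_ITEMS["8"] + SPLIT_ITEMS["10"]
example = {key: 2 for key in all_keys}
example["8b"] = 4
example["10a"] = 3
print(mdi_total(example))
```

Keeping the sub-item collapse in its own function means the same 10 item scores feed both the total score and any item-level analysis.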