ABSTRACT. This study assessed the test-retest reliability of the Medilog SS-90-III Sleep Stager by comparing the stager's output when the same records were scored more than once. Nineteen normal volunteers, ranging in age from 19.3 to 63.5 years (mean = 31.1 years), served as subjects. All sleep tapes were scored five consecutive times (runs 1–5). The sleep measures studied were: total time in bed; total sleep time; wake after sleep onset; movement time; time in stages 1, 2, 3, 4, and REM; percent of stages 1, 2, 3, 4, and REM; sleep latency; and latencies to stages REM, 2, 3, and 4. The primary findings were: 1) Pearson correlation coefficients between consecutive scoring runs reached the level of reliability considered acceptable for an automated scoring system (.990–1.00) in 51% of 100 correlations, whereas 49% were below .990, and 30% of the latter were below .970; 2) 13 of 20 alpha reliability coefficients for the same sleep measures reached the acceptable range of .990–1.00, and 7 (35%) were below .990. The greatest problems occurred with the latency measures (sleep onset to stages REM, 2, 3, and 4), as well as with a second group of measures: wake after sleep onset, movement time, and sleep latency. The only measures to show consistent reliability over .990 were total time in bed (based on clock time) and two measures of stages 3 and 4. No measures of stages REM or 2 showed consistent reliability over .900. Implications are discussed.
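
The two reliability statistics named in the abstract can be illustrated with a minimal sketch. It assumes that the "alpha reliability coefficient" is Cronbach's alpha computed with each scoring run treated as an item and each subject as a case; the abstract does not state the exact procedure. The function names and the example values are hypothetical and are not data from the study.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5


def cronbach_alpha(runs):
    """Cronbach's alpha, treating each scoring run as an 'item'.

    runs: list of k lists, each holding one value per subject.
    Assumption: this is the alpha coefficient the abstract refers to.
    """
    k = len(runs)
    item_var = sum(variance(run) for run in runs)      # sum of per-run variances
    totals = [sum(vals) for vals in zip(*runs)]        # per-subject sum over runs
    return (k / (k - 1)) * (1 - item_var / variance(totals))


def consecutive_correlations(runs):
    """Pearson r for each pair of consecutive runs (1-2, 2-3, ...)."""
    return [pearson_r(runs[i], runs[i + 1]) for i in range(len(runs) - 1)]


# Hypothetical values for one sleep measure (e.g. minutes of sleep latency):
# 4 invented subjects scored on 3 invented runs. Not data from the study.
example_runs = [
    [10.0, 22.5, 5.0, 31.0],
    [10.5, 22.0, 5.0, 30.0],
    [10.0, 25.0, 6.5, 31.5],
]

if __name__ == "__main__":
    for i, r in enumerate(consecutive_correlations(example_runs), start=1):
        # .990 is the lower bound of the acceptable range quoted in the abstract
        print(f"runs {i}-{i + 1}: r = {r:.3f}, acceptable = {r >= 0.990}")
    print(f"alpha = {cronbach_alpha(example_runs):.3f}")
```

Under these assumptions, a measure would count as acceptably reliable only when its coefficients fall in the .990–1.00 range quoted above; identical scoring on every run yields a value of 1.00 for both statistics.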