RIR Accuracy in Older Adults: What the Research Actually Shows

Introduction

Repetitions in Reserve has become one of the most widely used autoregulation tools in resistance training. The premise is straightforward: estimate how many repetitions you have remaining before failure, and use that number to govern effort, set termination, and volume accumulation across a training week. For men in their 30s and 40s managing full training schedules around work, family, and recovery demands, the appeal is clear. RIR offers a flexible, internally regulated way to manage training intensity without requiring external equipment or rigid prescription.

What the research has consistently struggled to establish is whether the subjective perception underlying RIR is accurate enough to be trusted as a dosage mechanism. A 2025 study published in Experimental Gerontology by Gómez-Redondo and colleagues contributes a specific and useful data point to that question, examined through an older adult population and supported by objective velocity measurement. The findings carry structural implications for how autoregulation tools should be used in any serious training framework, regardless of the age of the trainee.

Study Breakdown

Study Design

This was a controlled laboratory trial with a pre-registered protocol (ClinicalTrials.gov: NCT05619250). It was not a longitudinal training study. It was a single-session, cross-sectional validity assessment designed to measure how accurately participants could self-regulate proximity to failure using predicted RIR, with objective velocity data used as the external comparator.

Participants first completed a one-repetition maximum test on the chest press. A repetitions-to-failure test at 65% of 1RM was then conducted to establish each participant’s actual RIR baseline. On a subsequent visit, participants performed three sets at 65% of 1RM with maximum intended concentric velocity, stopping at three different predicted RIR targets — RIR-2, RIR-4, and RIR-6 — in randomized order. Bar velocity was tracked throughout using a linear position transducer. After each set, participants reported their RPE.

Population

Twenty-five community-dwelling older adults participated, with a mean age of 68 ± 4 years and a mean BMI of 28.1 ± 4.6 kg/m². Sex composition and formal resistance training classification were not reported in available data. All participants completed a familiarization phase with both resistance training and the RIR and RPE rating systems prior to data collection.

This population is exclusively adults aged 60 and above. No younger comparison group was included. Any inference drawn toward men in their 30s and 40s must be treated as contextual extrapolation, not direct application of the findings.

Intervention Characteristics

All testing was conducted on the chest press exercise at a single relative load of 65% of 1RM. Three sets were performed in total, one per RIR condition, with condition order randomized within subjects. Concentric velocity was recorded on every repetition. The intervention was acute, single-session, and restricted to one upper-body pushing movement at one load. No lower-body exercises or other movement patterns were tested.
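Because velocity was recorded on every repetition, fatigue within a set can be expressed as velocity loss. The sketch below is illustrative only. It assumes velocity loss is the percent drop from the fastest repetition to the last one, which is a common convention in velocity-based training but not necessarily the exact computation used in the study. The function name and all velocity values are hypothetical.

```python
def velocity_loss_percent(rep_velocities):
    """Percent drop from the fastest rep to the last rep in a set.

    Assumed definition for illustration; the study's exact method may differ.
    """
    if not rep_velocities:
        raise ValueError("need at least one repetition")
    fastest = max(rep_velocities)
    last = rep_velocities[-1]
    return (fastest - last) / fastest * 100.0


# Hypothetical mean concentric velocities in m/s. Not data from the study.
failure_test = [0.60, 0.58, 0.55, 0.51, 0.46, 0.40, 0.33, 0.25]
rir_set = [0.60, 0.57, 0.53, 0.48, 0.42, 0.36]

# Compare the set's stopping point with the same rep number in the failure test.
stop_rep = len(rir_set)
loss_in_set = velocity_loss_percent(rir_set)
loss_in_failure_test = velocity_loss_percent(failure_test[:stop_rep])

# Positive value: more fatigue at the stopping point than the failure test
# showed at that same rep (expressed here in percentage points).
extra_loss = loss_in_set - loss_in_failure_test
```

A positive extra_loss in this sketch corresponds to the kind of gap the study reports: the lifter is further into fatigue at the stopping point than the reference test would predict for that repetition.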

Main Findings

The core finding is that RIR accuracy deteriorated as sets approached failure. At predicted RIR-2, participants stopped with 16% greater velocity loss relative to the corresponding repetition in the repetitions-to-failure test, a statistically significant difference (p < 0.001). At predicted RIR-4, velocity loss was 10% greater than the corresponding point in the failure test (p = 0.009). At predicted RIR-6, there was no significant velocity difference (0%; p = 0.989).

Translating this into repetition counts: at predicted RIR-2, participants underestimated by 2.1 ± 0.3 repetitions (p < 0.001). At predicted RIR-4, underestimation was 1.6 ± 0.6 repetitions (p = 0.003). At predicted RIR-6, no significant error was found (0.1 ± 0.8 repetitions; p = 0.823).

Underestimation here refers to proximity to failure, not to the RIR number itself. Participants stopped the set closer to failure than intended because they believed they had more repetitions remaining than they actually did. The consequence is that sets prescribed at RIR-2 and RIR-4 were landing closer to true failure than the participants recognized.
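The arithmetic behind a repetition error can be sketched in a few lines. This is a minimal illustration, assuming the error is defined as predicted RIR minus actual RIR, with actual RIR taken from a separate repetitions-to-failure test at the same load. The function names and the example numbers are hypothetical and are not taken from the study.

```python
def actual_rir(reps_to_failure, reps_completed):
    """Repetitions truly left in reserve when the set was stopped."""
    return reps_to_failure - reps_completed


def rir_error(predicted_rir, reps_to_failure, reps_completed):
    """Predicted RIR minus actual RIR (assumed sign convention).

    Positive: the set ended closer to failure than intended.
    Negative: the set ended further from failure than intended.
    """
    return predicted_rir - actual_rir(reps_to_failure, reps_completed)


# Hypothetical lifter: 14 reps in the failure test, stops after 12 reps
# while believing 4 reps remain.
error = rir_error(predicted_rir=4, reps_to_failure=14, reps_completed=12)
# error == 2: the lifter had 2 reps left, not 4.
```

Under this convention, the errors reported for RIR-2 and RIR-4 are positive values, and the near-zero error at RIR-6 means predicted and actual RIR matched.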

RPE values across the three conditions were 8.0 ± 0.2 for RIR-2, 7.6 ± 0.2 for RIR-4, and 7.3 ± 0.3 for RIR-6. These differences were not statistically significant (p > 0.05), meaning RPE did not meaningfully differentiate between the three proximity-to-failure conditions. The authors conclude that predicted RIR may lack prescriptive precision in older adults but may retain utility for volume monitoring when interpreted alongside RPE.

Limitations

The study’s limitations are substantial and should govern how far its conclusions travel. Testing was confined to a single exercise, a single relative load, and a single session. There is no longitudinal data on whether RIR accuracy improves with training experience in this population. No younger adult comparison group was included within the study itself; between-age differences are inferred from external literature. Sex and training status were not formally characterized. Effect sizes were not reported. Fatigue, hormonal, and injury markers were not measured. RPE’s failure to differentiate across conditions limits its utility as a concurrent validity marker in this context.

What This Means

The finding that is most structurally significant is not that RIR failed entirely. It is that RIR accuracy failed in a specific and predictable zone: close to failure. At RIR-6, the system held. At RIR-2, it broke down in a direction that produced more fatigue than intended, without the participants being aware of it.

This is not a finding about RIR being a useless tool. It is a finding about where the tool loses resolution. When a measurement instrument drifts in a specific and consistent direction — in this case, underestimating proximity to failure as sets get harder — the appropriate response is not to discard the instrument. It is to understand the drift, account for it structurally, and build in cross-checks where the instrument is known to be unreliable.

The RPE finding compounds this. If RPE cannot differentiate between RIR-2, RIR-4, and RIR-6 in this population, then using RPE as a corrective check on RIR perception does not solve the problem. Two tools that fail in the same window do not add precision. They confirm the same blind spot.

The authors note that this pattern differs from what has been reported in younger adults, where RIR accuracy tends to improve as sets approach failure. In this older cohort, accuracy got worse near failure instead. This distinction has practical relevance for how programming tools are selected and weighted across different populations.

Application Within The DadStrength Method

Three pillars of The DadStrength Method are directly implicated by this research.

Recovery Governance carries the most immediate implication. If a lifter is consistently stopping sets closer to failure than intended, fatigue is being systematically underestimated. Within a full training week involving multiple exercises, multiple sessions, and real-world stress layered on top, that error does not stay contained to a single set. It compounds. Deload timing, inter-session recovery windows, and weekly volume ceilings are all calibrated against perceived fatigue, and if perceived fatigue is running behind actual fatigue, the margin for error collapses without visible warning.

Structural Programming must account for the possibility that RIR-2 prescriptions are landing at or near true failure in individuals whose perceptual accuracy near failure cannot be confirmed. The response within a structured framework is not to eliminate near-failure work. It is to treat RIR-2 and below as a range requiring additional external accountability rather than sole reliance on self-report. The fact that RIR-6 accuracy held in this study is a structurally useful finding. It suggests that anchoring moderate-to-hard work in the RIR-4 to RIR-6 range keeps prescriptions closer to the zone where the autoregulation system proved reliable while still producing sufficient stimulus for adaptation. The RIR-4 error (1.6 repetitions) was smaller than the RIR-2 error but still significant, so the RIR-6 end of that band is the better-supported anchor.

Capacity Over Intensity is the organizing principle this finding protects. Volume as dosage only functions when the dosage is measurable. If perceptual tools are drifting at the high-effort end of the range, dosage control is lost exactly where the consequences of miscalculation are highest. Programming that chases proximity to failure as a stimulus strategy, while relying solely on perception to locate that proximity, is not a controlled approach. It is unmanaged stress delivered with the appearance of structure.

Practical Implementation

Treat RIR as most trustworthy within the RIR-4 to RIR-6 range, and most of all at RIR-6. Accuracy was confirmed at RIR-6 in the study population; at RIR-4 the error was smaller than at RIR-2 but still significant. Prescriptions in this band carry more structural reliability than those at RIR-2 and below.
When programming calls for RIR-2 work, introduce external accountability mechanisms. Bar speed tracking, video review of concentric tempo, or observation by a training partner each provide information that perception alone cannot guarantee at that proximity to failure.
Do not use RPE as a compensating check on RIR accuracy near failure. In this study, RPE failed to differentiate across the tested RIR conditions. Using one imprecise internal signal to validate another in the same failure zone does not add resolution.
Audit cumulative fatigue through session-to-session performance trends, not through end-of-session perceived effort alone. If outputs are declining across sessions at equivalent loads, that is a fatigue signal that internal perception may not be surfacing in real time.
Do not wait for perceived fatigue to dictate deload integration. If perceptual accuracy near failure is drifting, subjective tiredness is not a reliable early warning mechanism. Scheduled deloads, built into the program structure, are not optional buffer weeks. They are the mechanism that prevents invisible fatigue accumulation from becoming compounding injury risk.
Maintain appropriate scope when applying these findings. This study examined adults aged 60 and above, one exercise, one load, one session. The structural principle it illustrates has relevance across training contexts. The specific numerical findings do not transfer directly to men in their 30s and 40s without further research to support that transfer.
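The session-to-session audit described above can be sketched as a simple trend check. This is one possible approach, not a method from the study. It assumes you log the reps achieved on a reference set at the same load and the same RIR target each session. The function name and the two-consecutive-drops threshold are hypothetical choices made for illustration.

```python
def flag_declining_output(session_reps, consecutive_drops=2):
    """Return True if reps fell in each of the last N session-to-session
    comparisons, at a fixed load and fixed RIR target.

    The default of two consecutive drops is an arbitrary illustrative
    threshold, not a validated cutoff.
    """
    if consecutive_drops < 1:
        raise ValueError("consecutive_drops must be at least 1")
    if len(session_reps) < consecutive_drops + 1:
        return False  # not enough history to judge a trend
    recent = session_reps[-(consecutive_drops + 1):]
    return all(later < earlier for earlier, later in zip(recent, recent[1:]))


# Hypothetical log: reps on the first working set at the same load, oldest first.
needs_review = flag_declining_output([10, 10, 9, 8])
stable = flag_declining_output([10, 9, 10, 9])
```

In this sketch the first log is flagged because output dropped in two consecutive comparisons, while the second is not because the trend reversed. A flag is a prompt to review recovery and volume, not a diagnosis.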

Conclusion

This study adds a specific, well-measured data point to a known limitation in internal autoregulation tools. Perceptual accuracy of proximity to failure is not stable across all training conditions or populations. In this older adult cohort, it degraded meaningfully as sets approached failure, and it did so in a direction that produced more accumulated fatigue than participants recognized or intended. RPE did not compensate for that degradation.

The lesson is not that RIR-based training is invalid. The lesson is that any system built on autoregulation depends on perceptual accuracy, and that accuracy should be periodically verified rather than assumed. Keeping intensity prescriptions within the range where perceptual tools demonstrably hold — and adding external cross-checks where they do not — is the approach that preserves dosage control over the long term.

Training designed to last does not chase unverifiable proximity to failure. It builds systems where effort, fatigue, and adaptation can each be accounted for across weeks and months, not just within a single session. That is the structural standard this research reinforces.

Robban
Founder of The DadStrength
Creator of The DadStrength Method
47 years old. Lifelong lifter. Father. Educator.
Evidence first. Experience applied. Strength built to last.

