Sunday, September 20, 2020

The Race of a Lifetime : Tadej POGAČAR's Stage 20 Time Trial Analysis


Great comebacks always fascinate sports observers, from both an entertainment and a statistics perspective. Don't we all long for that moment when fortunes are reversed and the underdog wins? In social psychology, the pleasure of watching a favorite stumble even has a special name - schadenfreude.

Such a reversal of fortune happened during Stage 20 of the Tour de France, a 36.2 km individual time trial, when the 21 year old Tadej Pogačar took nearly 2 minutes out of his nearest rival Primož Roglič, all but securing the coveted yellow jersey and 500,000 Euros in hard won prize money.

This was a rare feat to witness 20 days into the 3500 km Tour de France. Many had made up their minds that 57 seconds was too large a chunk of time to win back from a highly motivated Primož, who had been sitting in yellow for 11 days in a row. In the aftermath, the sport's pundits are going to be looking closely at how this was accomplished by the youngster, who beat just about every veteran of the time trial format available to contest that day.

Allow me to devote a brief section below to the analysis of the actual time trial performance and the corresponding power demands without going too much into the mathematics of it all. Please note this analysis remains to be validated since the official performance data from Team UAE Emirates is unavailable to the public as of today. Sources of my information are highlighted below and where required, educated guesses are employed. I also discuss my results towards the end of the article.

Assumptions & Considerations

I've used the following assumptions & considerations in this first order analysis :

  • Weight/Height : 66 kg/176 cm (Source)
  • Assumed Drag Area, CdA : T1/T2/T3/Finish = 0.22/0.24/0.3/0.3 sq.m (arbitrary but educated)
  • Assumed Rolling Resistance Co-efficient, Crr : 0.002-0.0023, 25mm width (Vittoria Corsa tubeless)
  • Assumed drivetrain efficiency : 98%
  • Bike T1-T2 : TT bike w/ rim profile 60mm/Full Disc at 8.3 kg
  • Bike T2-Finish : Road bike w/ rim profile 30mm/30mm at 6.8 kg (current UCI limit, source)
  • Gear : Aerodynamic skin-suit and streamlined TT helmet
  • Weather : Historical weather for 3-5pm local French time w/ winds 8.5-12 kph at 93-105 degrees range.
  • Roads : Good (smooth asphalt) w/ mountainous terrain
  • Course GPX source : Richie Porte's Strava data
  • Performance time data : Pro Cycling Stats 
  • Model used : A widely cited & validated general purpose model of human power requirements in cycling
  • Secondary power data for comparison : Thomas de Gendt's Strava data for Stage 20


The race course was broken up into 4 segments corresponding to the official time checkpoints for the stage. A 1st order physics model was used in combination with the official timings at those checkpoints to back-calculate a matching power output for each segment. I call these numbers "suitable" rather than exact because they could move up or down depending on the actual conditions. From the potential locus of power outputs, this is a workable number for the rider, as I validate below.
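To make the back-calculation concrete, here is a minimal Python sketch of a steady-state power balance (aerodynamic drag, rolling resistance and gravity, divided by drivetrain efficiency). It follows the general structure of the kind of model referred to above but is a simplification, not the exact model used for the results in this post: wheel-bearing losses and changes in kinetic energy are ignored, the air density of 1.2 kg/m^3 is an assumption, and the 16 minute split time is a hypothetical input chosen only to show the mechanics. The function and parameter names are my own.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def required_power(distance_m, time_s, grade, system_mass_kg, cda, crr,
                   rho=1.2, headwind_ms=0.0, efficiency=0.98):
    """Average rider power (W) needed to cover a segment at constant speed.

    Simplified steady-state balance: aerodynamic drag + rolling resistance
    + gravity, divided by drivetrain efficiency. Wheel-bearing losses and
    changes in kinetic energy are ignored.
    """
    v = distance_m / time_s                  # ground speed, m/s
    theta = math.atan(grade)                 # road angle from rise/run
    v_air = v + headwind_ms                  # air speed seen by the rider
    f_aero = 0.5 * rho * cda * v_air * abs(v_air)
    f_roll = crr * system_mass_kg * G * math.cos(theta)
    f_grav = system_mass_kg * G * math.sin(theta)
    return (f_aero + f_roll + f_grav) * v / efficiency


# Final climb, using the assumptions listed in this post where available.
climb_power = required_power(
    distance_m=5900,            # last 5.9 km of the course (from the post)
    time_s=16 * 60,             # HYPOTHETICAL split time, for illustration only
    grade=0.08,                 # average gradient quoted in the post
    system_mass_kg=66 + 6.8,    # rider + road bike; kit weight not included
    cda=0.30,
    crr=0.0023,
)
print(f"Estimated average power: {climb_power:.0f} W")
```

Swapping in the official checkpoint distance and time for each segment, along with a wind estimate, gives one power figure per segment. The answer moves noticeably with the assumed CdA on the flat sections and with the assumed system mass on the climb.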

Stage 20 ITT course profile


The modeling indicates that for the first two sections totaling 30.3 km, the use of a special purpose TT bike weighing in at an assumed 8.3 kg and a body shape of CdA 0.22 sq.m required an average power output of 427 Watts. The results indicate a positive split with an average power of 451 Watts for the 1st segment until T1 and 402 Watts for the 2nd segment T1-T2. 

In the vicinity of T2 at 30.3 km, a bike change happened where the TT bike was exchanged for a lighter road bike due to requirements necessitated by the gradient. This climb is at an average 8% gradient, kicking up to 20% in places. The bike change cost anywhere from 6-8 seconds in total, depending on how you start and stop the watch. This time cost is factored into the overall performance time. 

Thus, in the last 5.9 km of this climb, the use of the assumed 6.8 kg road bike required an approximate average of 412 Watts at an estimated 6.2 W/kg (power to rider weight). The power demands for T2-T3 and T3-Finish, approximately 3.3 and 2.6 km long, were 432 and 392 W (6.5 and 5.9 W/kg) respectively.

The results are plotted in the image below :

Actual performance times along with corresponding modeled average power outputs for Tadej Pogačar in the final individual time trial of Stage 20 of the 2020 Tour de France.


This is an unverified analysis based on checkpoint timings obtained from Pro Cycling Stats and other publicly available information. As per the modeling, an average power output of 419 Watts was required for this performance. What is definitely in question is the pacing profile over the duration of the effort, which needs to be validated with real data.

Such a power output is not unrealistic for Tadej. On the 140 km mountain stage, Stage 8 of the Tour, he displayed a power output of 428 Watts over the Col de Peyresourde, climbing it in one of the fastest times recorded in recent history at an estimated power to weight ratio of over 6.5 W/kg. This was after 2 massive climbs before it and 120 km in the legs.

The modeled power output of 412W on the final 5.9 km climb equates to a power to weight ratio of 6.2 Watts/kg. Compare this to Thomas de Gendt's data from the same stage where he rode with an average of 405W at a power to weight of 5.9 Watts/kg. This is consistent with Thomas' performance data that shows he climbed 1:51 minutes slower than Tadej. 
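The power to weight figures quoted above are simple ratios of modeled power to the assumed 66 kg rider weight. A quick Python check is below. The body mass implied for Thomas de Gendt is derived only from the two rounded numbers quoted in this post, so treat it as a rough back-calculation rather than his actual weight.

```python
def watts_per_kg(power_w, mass_kg):
    """Power to weight ratio in W/kg."""
    return power_w / mass_kg


RIDER_MASS_KG = 66.0  # assumed rider weight from the list of assumptions

# Modeled average powers on the final climb, as quoted in the post.
modeled_power_w = {"T2-Finish": 412, "T2-T3": 432, "T3-Finish": 392}
ratios = {seg: watts_per_kg(p, RIDER_MASS_KG) for seg, p in modeled_power_w.items()}
for seg, ratio in ratios.items():
    print(f"{seg}: {ratio:.1f} W/kg")

# Body mass implied by de Gendt's quoted 405 W at 5.9 W/kg (rounded inputs).
implied_mass_kg = 405 / 5.9
print(f"Implied mass for the comparison rider: {implied_mass_kg:.1f} kg")
```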

The overall data indicates a positive split of power across elapsed time duration. I justify this with two potentially valid points : 

1. High motivation at the start, giving the rider the urge to ride hard in the first half. Tadej was in fact chasing what looked like an improbable target, a 57 second deficit to win the Tour de France. He might have purposely fired on all cylinders to build a buffer against the potential loss of valuable seconds later during the bike change and any other unforeseen events on the climb.

2. The decrease in power output in the second half might be attributed to a combination of accumulated fatigue and/or a change in the power demand and the impact on feelings from a sudden change to a lighter bike on a steep climb. The "sudden" change to a new bike and the lack of objective power data from an absent head unit meant that Tadej had to gauge his effort carefully. It could be that despite a drop in power and cadence, Tadej maintained the "same" or even "greater" level of perceived effort compared to previous flat sections of the course. However, this is just my speculation.

The choices of tire rolling resistance and drag area, although arbitrary, are not wild guesses. We know that Team UAE Emirates is sponsored by Vittoria in 2020, the tubeless varieties of which have reportedly exhibited some of the lowest rolling resistances at race speeds. Therefore, I have started off with an ideal case of 0.002, increasing this to 0.0023 on the climb. I figured the weaving on the climb at slow speeds combined with the quality of road on the gradient pose less than ideal conditions, justifying the small increase to Crr.

Reported co-efficients of rolling resistance for some bicycle racing tires at race speeds. Source : Aerocoach

Professional TT riders are known to be slippery, exhibiting well under 0.25 sq.m of drag area in ideal conditions (smaller riders reportedly presenting less than 0.2 sq.m!). I have started off with an ideal scenario of 0.22 sq.m in the TT position due to Tadej's height and weight, increasing this to 0.3 sq.m on the climb, which corresponds to a climbing position with the hands on the hoods. Again, these numbers are arbitrarily chosen and there is no way at present to verify what the real numbers in open terrain might be. I do have some references from a Twitter conversation that lead me to believe my choices are conservative for a top professional rider.

CFD simulation results showing the individual contributions of wheels, bicycle and rider to CdA as well as the net CdA. Source : Fabio Malizia, Katholieke Universiteit, Leuven

The total system weight with rider and all accessories is an unknown. A premium TT bike setup of 8.3 kg and a lightweight road bike setup of 6.8 kg are not unexpected and match recorded observations on the internet. However, the weight of his kit, shoes, helmet, bottle etc. are unknowns. I have reasons to believe this will be under 1 kg in total. The uncertainty in the analysis of the final climb will therefore stem mainly from the uncertainty in system weight and rolling resistance. Regardless, the modeled power outputs are likely not very far off from the actual numbers.


I titled this race as the "race of a lifetime". Indeed, performances like these are hard to come by simply due to the immense difficulty of turning around such time advantages over a pile of fatigue and mental exhaustion 20 days into the Tour de France.

In some respects, Tadej's race performance has been likened to a pivotal moment in 1989 when the American Greg LeMond, bursting with energy and ready to try new technologies, beat the yellow jersey holder Laurent Fignon with the use of aerodynamic gear and, in turn, won the Tour de France.

Whether Tadej's victory was a matter of such marginal gains at the end of the day is debatable. Yes, two purpose made bikes were used in the time trial in an unusual manner, but this is becoming increasingly common in the top races these days. Moreover, unlike 1989, Primož and Tadej were arguably evenly matched in terms of technology and the funding and competent attention required to apply it. In fact, on race-day, they both undertook bike changes before the 6 km climb, so any small variations in equipment really came down to supply differences from the equipment sponsors.

Did Tadej just ride his usual top race, as he does every time, and was it Primož who slowed and fizzled out? Well, I think that is clear to see. A race is indeed won by someone who slows the least. And what prompted this spectacular fall when the day demanded the best? Whether it was the massive pressure upon Primož's shoulders, or the failure of his power pacing model, or the fatigue, or ALL of the above, we will not know for sure.

What speaks to me from this performance is that marginal gains did not win, and something else contributed. Certainly Tadej rode the time trial of his life, and converted the opportunity of a lifetime to a magnificent victory. And I think in that moment, the individual qualities of what makes one rider better than another in the heat of the moment won. It really is a victory for the human element.

Years after his crushing defeat in the 1989 Tour, Laurent Fignon would write that despite getting over it, "you never stop grieving over an event like that; the best you can manage is to contain the effect it has on your mind." I hope that Primož, as amazing a rider as he has been to reach this level, is able to contain the effect of this race outcome on his mind and move on. He has more than a few good years of fighting at the very top left in him. But an able and worthy opponent stands beside him to check that, in the form of Tadej Pogačar.

Thanks for reading. Comments and observations welcome below.

Sunday, August 30, 2020

Fulgaz App : Validating Model Prediction & Performance Results


After my self-inflicted Poor Man's Tour de France that ended on 22nd August, I took a week of recovery and plunged into the Fulgaz app's 3 week fundraising campaign called French Tour. This "curated" Tour campaign features most of the celebrated climbs of the Tour de France in the high Alps along with other famous circuits in and around France. With a real-time leaderboard and 381 virtual kilometres with 14000m of climbing over 21 stages, it's a challenging event to keep my mind occupied while the actual Tour de France plays out.

When riding the Tour in Fulgaz, you see a beautiful HD or 4K video of the course taken by a volunteer rider on a high resolution cam. The volunteer would obviously ride the course at their own capacity, so when you use the app, the speed of the footage can be set to "reactive", so effectively it'd speed up or slow down based on how closely you match the recording (for example, 1x, 0.9x, 0.8x or >1x). 

Being new to Fulgaz, I was quite impressed with the app's in-built features and sliders to "tune" nearly everything that would have an appreciable impact on the ride - for example system weight, rolling resistance, drag area, even wind speed and direction. The app loads extremely fast, in about a second on my Windows 10 PC with 32 GB of RAM. What's more, you can download all the high resolution videos to stave off any buffering troubles. A download takes about 15 minutes for a full HD video on my modest internet connection.

All this was fascinating, given that a) I'm quite new to indoor cycling apps and b) Zwift, another leading indoor cycling app of which I'm a paying customer, keeps a lot of these variables under tight secrecy so effectively you have little clue what is driving the model. 


Some time ago, I built myself a cycling performance model for personal use. The model is based on Martin's power model for cycling, which I like to use for personal and coaching related estimation purposes. I can build as many segments of a course into the model as I like and make tweaks to inspect how they change my performance. This is handy for climbing and TT predictions and even drafting simulations.
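For readers who want to try something similar, below is a minimal Python sketch of a segment-based prediction: given a power, it solves the steady-state power balance for speed on each segment and sums the segment times. It is not my actual model, only an illustration of the idea. It assumes still air and constant power per segment, ignores acceleration and bearing losses, and every input value in the usage lines (segment list, 200 W, 80 kg, CdA 0.35) is hypothetical.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def power_at_speed(v, grade, mass_kg, cda, crr, rho=1.2, efficiency=0.98):
    """Rider power (W) needed to hold ground speed v (m/s) in still air."""
    theta = math.atan(grade)
    force = (0.5 * rho * cda * v ** 2
             + mass_kg * G * (crr * math.cos(theta) + math.sin(theta)))
    return force * v / efficiency


def speed_for_power(power_w, grade, mass_kg, cda, crr):
    """Solve power_at_speed(v) == power_w for v by bisection (power_w > 0)."""
    lo, hi = 0.0, 40.0  # search bracket in m/s, well above speeds of interest
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if power_at_speed(mid, grade, mass_kg, cda, crr) < power_w:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def course_time_s(segments, power_w, mass_kg, cda, crr):
    """Total time over (length_m, grade) segments ridden at constant power."""
    return sum(length / speed_for_power(power_w, grade, mass_kg, cda, crr)
               for length, grade in segments)


# HYPOTHETICAL inputs, chosen only to show the mechanics.
segments = [(1000, 0.05), (1000, 0.07), (1000, 0.09)]
total = course_time_s(segments, power_w=200, mass_kg=80, cda=0.35, crr=0.003)
print(f"Predicted time: {total / 60:.1f} min")
```

Bisection is used here instead of a closed-form cubic solution because it is short, robust, and easy to extend if a wind term is added to the force balance.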

For Stage 3 of the French Tour, registrants would have to climb the 1200m vertical Col du Galibier. I was interested in knowing how my performance results in Fulgaz would compare with the model predictions for the same power input. So I did the Galibier this morning, staying completely aerobic, sweating buckets, powers tuned to steady perfection with a Computrainer erg controller and a second Powertap pedal (those curious how I used the Computrainer with Fulgaz can send me an email or comment). Once I had my performance, I fed the same powers into the model along with the same driving variables I'd entered into the app. Results are below.


Given the assumptions I used (stated in the graphic below), the app performance results and the model predictions converged very well, which I'm pleased with. This gives me further trust in the app. Note how the positive and negative errors cancel each other over time. Also note that the errors are generally within 5% and the average error across the 17.9 km of segments I manually built is -0.26%.

"Virtual" performance in the Fulgaz app compared to model predictions for given power. The ride was done on a Computrainer using the Fulgaz app.

I believe the errors are partly due to :

1) The chosen granularity of the course, which is a km at a "constant elevation". In reality, the road might step up or step down several times within a kilometre. However, for my purposes, this would suffice.

2) I did not ride km segments at "constant power". In fact, I modulated it based on how I felt.

3) I've assumed a constant rolling resistance per segment of 0.003. If the Fulgaz app changes rolling resistance in real time based on the segment you're on, that could affect the speed slightly. 
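The error figures discussed above are signed percentage differences per segment, which is why positive and negative errors can cancel in the average. The Python sketch below shows the calculation. The segment times in it are placeholder numbers made up for illustration, not the data from this ride, and the sign convention (model minus app, relative to app) is my assumption for the example.

```python
def percent_errors(model_times_s, app_times_s):
    """Signed error of the model relative to the app, in percent, per segment."""
    return [100.0 * (m - a) / a for m, a in zip(model_times_s, app_times_s)]


# PLACEHOLDER segment times (seconds), not the data from this ride.
app_times = [300.0, 320.0, 310.0, 335.0]
model_times = [306.0, 314.0, 313.0, 330.0]

errors = percent_errors(model_times, app_times)
mean_signed = sum(errors) / len(errors)
mean_absolute = sum(abs(e) for e in errors) / len(errors)

print([f"{e:+.2f}%" for e in errors])
print(f"mean signed error:   {mean_signed:+.2f}%")
print(f"mean absolute error: {mean_absolute:.2f}%")
```

Reporting the mean absolute error alongside the mean signed error is a useful habit, since a near-zero signed average can hide larger per-segment deviations.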

Also note that I have not applied altitude power-attenuation ("de-rate") in the model, which is because this is a virtual environment. However, we know for a fact, from both tested runners and cyclists, that aerobic capacity drops at moderate to high altitudes, how much depending on acclimatization levels and individual attributes. So in reality, actual times are very likely going to be slower. How much slower is another conversation. I hope to tackle that in an upcoming post. 


The close agreement between my model and the actual performance on Fulgaz makes the Fulgaz app a reliable training tool, insofar as it is used for constant, steady speed climbing (I have yet to test it for solo TT efforts against the wind). It also validates the Martin model (which has probably been done several times before by several people). When I shared this article with Mike Clucas of Fulgaz, he essentially confirmed that my reverse engineering closely matches the inputs that drive their model, at least in curated events such as the French Tour.

This post generally speaks to the need for indoor cycling training apps to make transparent to a customer what is driving their in-game physics. If in-game physics is non transparent, predictions based on widely used open source models will be vastly different to in-app performance. 

If the variables that impact the in-game performance are not transparent, you can't effectively do "what-if" predictions as almost all cyclists do in real life ("if I ride with x equipment and/or shed a few pounds, how would that affect my performance?"). This does not help those who take their training and racing very seriously and like to pre-plan for the event.

One might argue that indoor cycling apps are built like "games", hence the physics can deviate to an extent simply because it is a game. But there can be impacts. Depending on the magnitude of the deviation, a host of things can be affected, ranging from perceived exertion, fatigue, and CP and W' dynamics to, most importantly, the nutritional needs that an app based performance requires. Either way, if you can't predict something with physics, it's unverifiable, unpredictable and, might I add, possibly unstable.

If on the other hand, all indoor cycling apps used a verifiable model, one could effectively standardize/minimize/account for a source of variability while the rest of the differentiation can be in the graphics, software performance and other perks unique to each app. 

I look forward to actually climbing the beautiful Galibier in reality, if I'm lucky enough to put together my coin collection and go to France. Huge thanks to everyone at Fulgaz for keeping me entertained during this tumultuous time.


Sunday, August 23, 2020

The Poor Man's Tour de France : Virtual Stage Racing in GT Mimicry

Readers might recall that last year, I attempted a Poor Man's Giro d'Italia, a tongue in cheek name for a stage racing simulation in which the objective was to follow the Giro while riding "short" stages pretty much every day by myself on local roads. The main motivation behind the exercise was to collect data and compare it to research studies of Grand Tours and Grand Tour racers.

I'd wanted to replicate something like that this year but with some additional realism to racing. Obviously for this to happen, the intensities would have to be high and I'd have to race with other people. With the whole Covid-19 situation demolishing the race calendar throughout the world, I turned to Zwift for the obvious solution. 

And thereby, I began another self-inflicted stage racing attempt called Poor Man's Tour de France in July. 

I have a few points to make on this mini-adventure before I share the data :

1. The races began on 17th July and lasted up to 21st August. All race results are recorded in my Zwift Power user profile. I started Zwift as a beginner in the E/D category and moved up to C by Stage 13. (To download my data in Excel .csv format, you can click on the plot below in Figure 2 where it links to the tabulated data).

2. All races were done with a single sided pedal based power meter and a heart rate monitor on a non-smart trainer. 

3. The trainer used was the Feedback Sports Omnium Over-drive unit. This is a roller unit which is extremely portable, perhaps the most portable of all trainers. Owing to direct contact between tire and roller, rolling friction and the dynamics of tire pressure become a bit more important than on direct-drive units. The unit has minimal inertia; therefore, there is little to no way to coast during racing. If you stop pedaling, you lose power and stop very quickly. On the plus side, riding with this trainer has considerably improved my pedaling conditioning. Due to the direct wheel-on-roller experience, I was also able to get instant audible feedback on stomping vs smooth pedaling patterns.

4. Due to the lack of a direct drive setup, I was constrained by the inherent power curve of the above trainer. With the gearing available to me and that power curve, I rarely pushed cruise power past 200 Watts, for fear of damaging the rollers (I'd already damaged one earlier this year and lost nearly 3 weeks waiting for a replacement to be shipped out to me under warranty from Hong Kong!). This also limited my short maximal sprint power outputs to under 300 Watts (I was generally not interested in sprinting).

5. Daily distances varied. Terrain type was a mix of rolling hill races, mountain stages, a few crits and uphill time trials. I skewed the race stages more towards rolling hilly races.

6. Racing every day on Zwift while maneuvering around time constraints as a parent to a 2 year old wasn't easy. Therefore, I had a few more recovery days in between stages than would be standard for a Tour. In general, I didn't go more than 3 days without a race, but the norm was racing every second day as fatigue started accumulating.

Figure 1 : The author's "pain cave", cobbled together during Covid-19 shelter in place restrictions in UAE. Materials used : book shelf, baby high chair, normal chair, ironing stand, yoga block, a weighing machine, Feedback Omnium Overdrive trainer, road bike, laptop and a 19 inch wide screen monitor. 

Below is the interactive data for all 21 stages of the Poor Man's Tour. Note that data is presented against a logarithmic y-axis to make the plot more readable. Scrolling over the data lines should show data points.


Table 1 : List of races done during 21 day Poor Man's Tour de France on Zwift

Figure 2 : Data for 21 days from a self-inflicted stage racing simulation called The Poor Man's Tour de France (top to bottom) - Total heartbeats, calculated calories, Zwift reported calories, work done, elevation, Trimp points, average heart rate, bike stress, normalized power, average power, average cadence, distance, RPE, TSS/km, Trimp/km, & duration. Note that BikeStress is a training metric native to Golden Cheetah which establishes race intensities as a function of duration and intensity. Click on the line to view the data.


1. Total Distance, Elevation & Calories : Over the total of 21 stages, I completed 600km of racing with a net ascent of 8666m burning an estimated 12000-13000 kcals. This equates to 17% of the actual Tour de France distance with an elevation gain nearly the height of Mt. Everest. These are modest numbers.

2. Heart rate : Racing heart rates ranged between 151-194 bpm across the 21 stages. The highest heart rates were featured in stages with high stochasticity in pacing effort. For example, the two crits I attempted on Stage 9 and Stage 14, both of which had rolling terrain, showed the highest heart rates. However, there were other crits I attempted which did not feature high heart rates (for example Stage 18). Although the normalized power for those stages was also high, this does not explain the higher heart rates. Perhaps cadence is another factor that might offer a clue, meaning the stages with higher cadence could feature high heart rate. There may also be a hidden confounder somewhere that is outside of this data (diet, sleep, fatigue, other activities in life...).

3. Power : In terms of normalized power, the range was from 117W-193W over 21 days of racing. In the early stages, I was just getting used to riding at high intensities on a trainer and not wholly happy with the cooling air flow available to me. So the early stages featured low powers at high heart rates. As the stages evolved, I got fitter in terms of being able to deliver higher power to the pedals at similar heart rates. I also got myself a bigger industrial size fan which could push out more air volume! There was a plateauing phenomenon in powers as racing progressed, which I attribute to day-to-day fatigue and the inherent power curve limitations of the non-smart trainer.

4. Aggregate stress : Over 21 days of riding, the aggregate TRIMP based stress was 3008 for a daily stress of 143 AU/day. The aggregate BikeStress (a correlate for TSS) was around 2124, giving a daily figure of 101 AU/day. These were all calculated in Golden Cheetah. Total work done was 12036 kJ, an average of 573 kJ/day. On a per day basis, these numbers are higher than the same data from Poor Man's Giro d'Italia.

5. Distance specific intensity : In terms of TSS/km and Trimp/km, two metrics that may be indicative of ride intensity as a function of unit distance, the highest values were incurred in stages that featured either a mountain climb time trial, a mountain race or a high intensity crit race. For example, of all the stages, the ones I rode on Stage 3 (L'Etape du Tour Stage 3) and Stage 7 (Alpe du Zwift TT) posted the highest values of intensity per distance. This again agrees with my previous findings from Poor Man's Giro d'Italia and the research data I posted there from the Sanders investigation of Grand Tour racing. With distance and per day stress metrics stated as above, one area of inquiry is whether there are differences in the numbers between indoor and outdoor racing. With constraints of air flow and cooling indoors, one might expect to see higher race intensities indoors. Comparing the Zwift racing data with last year's Poor Man's Giro, the distance specific intensity metric Trimp/km is definitely higher this year on Zwift. This is not exactly an apples-to-apples comparison because I did not do a true "stage racing simulation" last year. Still, the argument that indoor intensities should be higher is a rational one and something to discuss and explore further.

6. RPE : Across 21 days of racing, RPE varied from a low of 6 to a high of 10! The hardest day was Stage 2 (L'Etape du Tour), which featured a mountain ascent of 1538m. Because this was one of the earlier stages, I was in no shape to climb continuously for 3 hours with poor air flow. In the final 20 minutes, I did hop off the bike once to take a break, thinking I was going to have a heart attack. Part of the challenge that day in my pain cave was the lack of air flow to cool myself for that long! The standard deviation in RPE across the stages was quite low, however, indicating that the intensity on all days was more or less similar.

7. Cadence : My average cadence across all stages was 83 and the highest cadence was during Stage 19, a rolling hills ITT of 28 kilometers in length, where I staved off fatigue by riding at 90+ cadence. I reckon the stages with high cadence were an excellent stimulus to the VO2max region of training intensities. One of the things I'm pleased with from this challenge is that I got quite experienced at regulating my cadence to tune my perceived effort within different racing situations. It may have been that this factor also affected my heart rates over the course of each day's race.

8. Nutrition : In general, being able to race every day on Zwift means having to rely heavily on carbohydrates; success seems to depend on how well the stores of glycogen are topped up between races. My fuel system is one that is biased towards carbohydrates, which may also be partly explained by the fact that I'm a habitual carbohydrate consumer. There were some days where I didn't have the luxury of managing the diet well enough to feel fully topped up before the next race. On days where I felt I needed an extra "boost", I used the top ergogenic aid known to man, that's right - Coca Cola! Caffeine works.

9. Race Competition : In general, I have only good things to say about Zwift as it is a tremendous motivational tool. During the 21 stages, I enjoyed many days sitting in the peloton and sharing the effort that got us all across the line with good timings. But I was in no way a match for those who could utilize some of the "gaming" aspects of virtual racing.

I think Zwift has to figure out some way to weed out sandbaggers. Although the final listings on the Zwift Power website exclude those cheating below their actual categories, the race dynamics are affected by the presence of these individuals. For example, it is often the first 2-3 minutes of an e-race where your placement is made or broken due to massive power surges to find position. The presence of more able riders who are cheating below their category could compel others to ride just as hard in order to get on their wheel; as a result, many gaps are formed, disadvantaging the lower order riders who have "missed the draft". This point may be moot.

Figure 3 : The author in a "break" of a select group of riders from the Namibian Race League during Stage 17.
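The per-day averages and percentage comparisons quoted in points 1 and 4 above are simple ratios. A short Python check is below, using only totals stated in this post plus two reference values that are my assumptions for the comparison: a Tour de France length of roughly 3500 km and a height of 8849 m for Mt. Everest.

```python
STAGES = 21

# Aggregate totals as quoted in the post.
totals = {
    "TRIMP (AU)": 3008,
    "BikeStress (AU)": 2124,
    "Work (kJ)": 12036,
}
per_day = {name: value / STAGES for name, value in totals.items()}
for name, value in per_day.items():
    print(f"{name}: {value:.0f} per day")

TOUR_DISTANCE_KM = 3500   # assumed approximate Tour de France length
EVEREST_M = 8849          # assumed reference height of Mt. Everest

distance_share = 100 * 600 / TOUR_DISTANCE_KM   # km raced vs. the Tour
everest_share = 100 * 8666 / EVEREST_M          # metres climbed vs. Everest
print(f"Share of Tour distance: {distance_share:.0f}%")
print(f"Share of Everest height: {everest_share:.0f}%")
```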


The results discussed in the section above generally agree with the data from Grand Tours showing that the mountain stages are where the action really is in terms of stress and intensity. Although the race intensities were high and the day-to-day monotony was also high, Zwift provided a great way to beat that monotony with the ability to select from numerous races spread across different maps with different competitors. For example, I found South Africans and Japanese race subtly differently when compared to Brits! Maybe that is an imaginative observation, but it still is an observation.

Overall, while I found I was making improvements in the duration specific power outputs as the races progressed, I found myself hitting a plateau due to a combination of fatigue and power curve limitations on the trainer. In other words, there were diminishing returns after a point. 

From the Poor Man's Tour de France racing challenge, I was quickly able to learn which e-races suit me and which races wouldn't. Therefore, the choice of many rolling hilly races was intentional. I also included mountain stages. Flat, all-out races were few. 

If I did the Poor Man's Tour de France again, I'd figure out a way to balance the percentage of race distance spread between mountains, flats and rollers. But I can't say if the actual Tour de France, traditionally or even this year, has been balanced either! Often we hear that the Tour stages are deliberately designed to suit some of the top French stars. That doesn't seem to be any different this year.

In conclusion, with the constraints that were upon my time, I think this was sufficient racing stimulus. Due to the plateauing effect of power and the accumulating fatigue as the stages progressed, I had to draw the line somewhere to minimize the losses. 

With a few days left until the actual Tour de France, I will be able to smugly soak in the racing footage and maybe even pretend to relate to it with my own experience, ha!

Wednesday, July 15, 2020

Tour de France : Key Statistics

The following is an easily accessible graph showing key statistics for the Tour de France from the years 1903-2019.

I've highlighted some epochs in the data, namely the two World Wars and the 1990's doping era culminating in the Armstrong doping saga. For the wars, the plot makes it seem as if racing took place but this is just a visual effect. No racing took place during those years.

To go along with this data, you might also like my previous post on modern bicycles and cycling speeds. There, I explored whether bicycles themselves have made any appreciable impact to speeds.

Hopefully this datasheet can be used in future years as a live plot.

All data obtained from and compiled with Datawrapper.

Wednesday, April 29, 2020

Functional Threshold Power : A Scientific Scrutiny

Certain entities in the world have originated competing claims about cycling performance concepts, test protocols, and training zones that the rest of the world must adhere to. The astute athlete cum observer would want to find out which ones stand up to scientific scrutiny and separate myth from fact.

In that spirit, this post is an appraisal of the definition and estimation techniques of Functional Threshold Power (FTP) which are at the core of this power-based training concept. It follows from the last post, where I explored a scientifically vetted threshold concept called critical power (CP) and most of its nuances, including application related issues.

This post attempts to explain, with simple arguments and scientific references, why FTP, however "useful" a performance metric it may be to some people, is a pseudo-scientific concept at best.

I. Introduction to FTP : Definition and Estimation

FTP was conceptualized as a practical, field-based method of estimating a threshold phenomenon using cycling power meter technology. Like Critical Power, it is used as an endurance index to design training prescription as well as to classify cycling talent. A threshold, as a reminder, is an intensity marker just above which physiological responses change sharply, whereas below it they attain a steady state within a tolerance band.

The training concept was formalized by Coggan in the book Training and Racing with a Power Meter (TARWAPM) in the early 2000s and an ecosystem was built around FTP consisting of sister metrics (NP, IF, TSS, etc) and software marketed by Training Peaks group. Some of the history behind how this all came to be is documented on TARWAPM's blog page.

Andrew Coggan PhD signing copies of TARWAPM books. Source : TARWAPM's blog page. 

Let me cut to the chase and quote the 3rd edition of TARWAPM's definition of FTP, marking in red some key terms that I will explore further : 

"FTP is the highest power that a rider can sustain in a quasi-steady state without fatiguing. When power exceeds FTP, fatigue will occur much sooner (generally after approximately one hour in well-trained cyclists), whereas power just below FTP can be maintained considerably longer. [1]"  --- (1)

The text lists seven different methods to estimate FTP:

1) From a power & time-frequency distribution chart from cycling training and racing data.
2) From routine steady power intervals, repeats or longer climbs.
3) From normalized power (NP) during hard mass-start races of approximately one hour.
4) From a one-hour time trial by inspecting a smoothed time-series plot of power.
5) From a power duration model obtained by testing for CP, where the resulting model-derived value of CP is suggested to be interchangeable with FTP.
6) From the proprietary mFTP model in WKO4.
7) From the FTP testing protocol consisting of a 28-minute warmup, a main set of 20 minutes, and a cooldown of 10-15 minutes. 

The 7th method is probably the most notoriously proliferated in the cycling lexicon. The premise is that subtracting 5% from the average power of the 20-minute main set, ridden after a hard warm-up, will estimate FTP (hereafter called FTP20 for simplicity). Only recently has A. Coggan distanced himself from this estimation technique, saying that it was Hunter Allen's contribution and not his.
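To make the arithmetic of the FTP20 method concrete, here is a minimal Python sketch. The function name `ftp_from_20min` and the 300 W example value are illustrative assumptions and do not come from TARWAPM; only the 95% factor is taken from the protocol described above.

```python
def ftp_from_20min(power_20min_w, factor=0.95):
    """Estimate FTP as a fixed fraction of 20-minute mean power (FTP20 method)."""
    if not 0.0 < factor <= 1.0:
        raise ValueError("factor must be in (0, 1]")
    return power_20min_w * factor


# Hypothetical rider averaging 300 W in the 20-minute main set.
p20 = 300.0
ftp20 = ftp_from_20min(p20)  # the standard "subtract 5%" rule
print(f"FTP20 estimate: {ftp20:.0f} W")
```

The `factor` argument is exposed deliberately: the whole estimate hinges on that single number, so a different correction factor shifts the result in direct proportion.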

There are other methods of estimating FTP, which are coded into popular programs like Zwift and Trainer Road, and yet another confusing bunch of "new" test protocols on Training Peaks' website. The validity of these techniques is in question, besides the obvious danger of under- or overestimating FTP in some individuals. Therefore, this post is focused purely on the original concept of FTP and its test protocols as codified in TARWAPM.

II. A Deeper Inspection of FTP

Let me inspect in slightly more detail the terms highlighted in red in the definition of FTP in (1). 

A) The Issue of "Quasi-steady State" 

A quasi-steady state is meant to describe a transient situation where physiological variables such as blood lactate and VO2 are rising but remain within the zone of uncertainty. Using field power estimates, a quasi-steady state can be attributed to the variation of power values over the time duration. 

Scientific studies demonstrate the ability to work at a "quasi-steady state"  at critical power, where workloads are on the order of 10-15 Watts higher than what can be sustained for one hour. Critical power, from my previous post, corresponds to a workload of approximately mid-way between lactate threshold (or gas exchange threshold) and VO2max.

Moreover, the time to exhaustion at that workload is lower, around 24 minutes or so. This has been demonstrated in both physically active subjects and competitive cyclists (see Figures 1,2 below). Besides, even a shorter maximal time trial lasting around 30 minutes was demonstrated to show quasi-steady-state behavior in both power outputs and physiological variables (see Figures 2B, 2C). 

Therefore, the FTP concept systematically underestimates the wattage, or the workload that can be achieved at a quasi-steady state. From a lab testing standpoint, this same observation was noted with MLSS suggesting that a steady state in VO2 can be achieved even beyond the power at MLSS [15].

Figure 1 : The Poole study showed that in physically active subjects, the group mean of metabolic demand when working at constant-load exercise at critical power, much higher than that of lactate threshold, resulted in steady state VO2. Time to exhaustion in these subjects was 17.7 +/- 1.2 minutes. From research reference [6].
Figure 2 : The de Lucas study in competitive cyclists showed that quasi steady state VO2 was achieved at workloads at CP. Here again, the group mean for time to exhaustion was 22.9 +/- 7.5 minutes. From research reference [7].
Figure 2B : Quasi-steady state power outputs were shown in intense 30 min TTs conducted on well-trained triathletes. See reference [12]. 

Figure 2C : Quasi steady states in metabolic demand and blood lactate values were shown in a much more intense, shorter TT lasting 30 minutes on well trained triathletes. Moreover, the study demonstrates that subjects can sustain very high values of blood lactate for an extended time, some > 10 mM. See reference [12].

B) The Issue of "Absence of Fatigue"

The highest workload one can sustain for about an hour does not or cannot occur in the absence of fatigue.

A well-documented study in well trained cyclists who completed 4, 20 and 40km time trials demonstrated that central and peripheral fatigue occurred at all distances, including the 40km TT which took approximately 65 minutes to complete. The pattern of fatigue shifts from peripheral dominant over 4km towards central dominant over 20km. In other words, the decline in the ability to produce force residing within the central nervous system was higher in the longer time trials. Given this data, exercising at the highest workload corresponding to quasi steady-state in the "absence of fatigue" is notionally incorrect.

Figure 3 : Exercise induced impairment in the ability to produce muscular force measured in well trained cyclists who executed 4km, 20km and 40km time trials. Fatigue, both central and peripheral, is prominent in time trials of all durations, with central fatigue being highest during longer time trials. Adapted from reference [8].

C) The Issue of "Highest Power" at Quasi-steady State 

Following the arguments from A), the power output that can be sustained for "approximately" an hour does not correspond to the "highest power" for which a quasi steady-state can be achieved. The study referenced in Figure 3 shows that lactate values for the 20km TT also stabilized around the 8km mark and barely increased until the end spurt, despite the power being 15-20W higher than the 40km TT wattage.

FTP arbitrarily pegs a duration of "approximately" one hour to "highest power at quasi-steady state", which is not actually the case. What is obvious, though, is that the workload at FTP is unambiguously a steady state and is therefore not the highest "quasi-steady state" intensity.

Figure 4 : Lactate values in a 20km TT stabilize at the 8km mark and barely increase until the end spurt, despite the power being 15-20W higher than in a 40km TT. The latter took 65min to complete, which going by the FTP definition would correspond to "approximately" one hour. Adapted from reference [8].

D) The Issue of "Functional Threshold" As a Surrogate for Laboratory-Based Testing

FTP originated as a practical field-based alternative to lactate threshold testing. "Threshold" in concept refers to sharp distinctions in physiological responses associated with exercise slightly below and above a specific intensity value.  

However, as we have seen previously, the maximum intensity of "quasi-steady" state exercise has been shown to have sustainable durations much shorter than approximately an hour. So it is questionable why FTP should lay claim to accurately representing threshold in a wide group of people.

Having stated that, how do comparisons between FTP and laboratory indicators of threshold match up in scientific studies? 

Let's take a look : 

1) FTP compared against individual anaerobic threshold (IAT) power:  Although empirical demonstrations have shown FTP20 and IAT are close, authors of one recent study stated: "…it is difficult to accept FTP as a thoroughly valid concept. We found large limits of agreement between most variables, suggesting a high level of inter-individual variability in the relationship between FTP20 vs. FTP60 and between both measurements vs. IAT (me: stepwise lactate profile test)."  [2]  

In other words, wide limits of agreement in a Bland-Altman plot show that any agreement between a surrogate method (FTP) and a laboratory-based marker, here IAT, is ambiguous at the level of the individual.

Figure 5 : Bland-Altman plot of FTP20 compared to individual anaerobic threshold (IAT) in 23 well trained cyclists. See reference [2].

2) Against maximum lactate steady state (MLSS) power: The same authors from the study above compared FTP with another threshold concept called MLSS and found generally good agreement. However, even in this study, wide limits of agreement were observed between FTP20 and MLSS among different groups of cyclists with different training statuses, implying ambiguous agreement between the two when we look at heterogeneous samples [3].

Figure 6: Comparison of FTP20 with MLSS in 15 cyclists - 7 trained and 8 well trained. See reference [3].

Another study that examined validity against MLSS concluded: "The results indicate that the PO at FTP95% is different to MLSS, and that changes in the PO at MLSS after training were not reflected by FTP95%.  Even when using an adjusted percentage (ie, 88% rather than 95% of FTP20), the large variability in the data is such that it would not be advisable to use this as a representation of MLSS." [14]

3) Against Lactate Threshold (LT) power: Foster came to a similar conclusion as the previous two studies when comparing LT and FTP20. They wrote: ".....caution should be taken when using the FTP interchangeably with the LT as the bias between markers seems to depend on the athletes’ fitness status. Whereas the FTP provides a good estimate of the LT in trained cyclists, in recreational cyclists, FTP may underestimate LT." [4]

Figure 7 : Limits of agreement between FTP and lactate threshold studied in 20 healthy cyclists. See reference [4]. 

4) FTP Compared Against A Range of Blood Lactate Threshold Markers: One study compared FTP20 with a range of laboratory-based blood lactate measurements, such as LT, LT at 4mmol blood lactate, Dmax derived LT, and IAT (LT = lactate threshold).  The main objective was to find the best correlate of FTP in a single study. 

They demonstrated that all computations resulted in numbers that differed significantly from FTP20. Despite the strongest correlation being between FTP and LT4.0, a large dispersion of approximately 100 Watts was found in the inter-individual data, calling their equivalence into question. The study concluded: "...we suggest that FTP does not have an equivalent physiological basis to any of the tests used herein and, therefore, cannot be used interchangeably." [9]

Figure 8 : FTP compared to a host of lactate parameters in 20 competitive cyclists. See reference [9].

The overall picture from the previous studies shows that claiming FTP can be used as an accurate surrogate for laboratory-based measures of threshold is, at best, unfounded.
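For readers unfamiliar with the "limits of agreement" quoted in the studies above, the following is a minimal Python sketch of the standard Bland-Altman calculation: the bias is the mean of the paired differences, and the 95% limits are the bias plus or minus 1.96 standard deviations of those differences. The wattages are hypothetical values made up for illustration and are NOT data from any of the cited studies.

```python
import statistics


def limits_of_agreement(method_a, method_b):
    """Return (bias, lower, upper) 95% limits of agreement for paired data."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two paired samples of equal length (n >= 2)")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)   # mean difference between the two methods
    sd = statistics.stdev(diffs)    # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired values in watts (NOT data from any cited study).
ftp20_w = [250.0, 260.0, 270.0, 300.0, 310.0]
lab_w = [240.0, 265.0, 260.0, 320.0, 295.0]

bias, lower, upper = limits_of_agreement(ftp20_w, lab_w)
print(f"bias = {bias:+.1f} W, limits of agreement = {lower:+.1f} to {upper:+.1f} W")
```

A small bias with wide limits is exactly the situation the studies describe: the two methods agree on average across the group, yet for any one rider the surrogate can be off by tens of watts in either direction.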

E) The Issue of FTP20 method and "False Sense of Precision" 

As a matter of practical convenience, the second and third editions of TARWAPM suggested the FTP20 method as a way to estimate 60 minute FTP.

The issue with this technique is that the 95% computation is probably an average for a large group of cyclists but not exactly applicable to you or me, mainly due to inter-individual variability [5]. Some people will be at 93%, some at 90%, some at 85%. This was also shown by A. Coggan himself (see Figure 9).

One prominent exercise physiologist told me: "A value of 92-93% is probably closer on average, whereas a value of 95% would, therefore bring the estimated threshold back towards 30MMP (30 minute mean maximal power)!"

Figure 9 : 95% of 20min power is not necessarily one hour FTP. Source : Facebook fan-page of TARWAPM.

As published by A. Coggan in a whitepaper in March 2003, the real effect of employing an arbitrary correction factor to 20 min power may simply be to convey a false sense of precision [10].

While it's understood that he would like to distance himself from the FTP20 method, I would add that continuing to perpetuate the false sense of precision in the TARWAPM book does not make the false sense of precision go away. Besides, the entire discussion of whether the correction factor should be 95% or 90% does not take away from the fact that FTP is arbitrarily linked to "approximately one hour" with an unfounded claim to being the "highest workload" at quasi-steady state. Will two wrongs make a right?

F) The Issue of FTP derived as CP from W-time plot

In my previous post, we looked into several research studies showing how critical power defines the boundary between heavy and severe intensity. In numerous research studies, work-rates at markers of thresholds such as LT and MLSS were found to be lower than work-rates at CP. In fact, CP falls somewhere midway between LT and VO2max, depending on which study you look at.

With that information available, CP is a high-intensity workload that may be sustained for only approximately 30 minutes or less. Therefore, approximately one hour of power (FTP) and CP should not be considered interchangeable in principle without data. As one research team noted, a fresh study involving a wider cohort of subjects is worthwhile to continue to test this idea of interchangeability [11].

In a study conducted by Morgan, FTP20 and CP correlated with each other but the limits of agreement were found to be relatively large (+ 10.9 to -13.1%) such that the authors argued: "...limits of agreement between CP and FTP in this study may be too large to be practically meaningful for athletes and coaches, and that the agreement between the two variables may be coincidental." [5]

The idea advanced in TARWAPM Ed.3 that FTP can be estimated from a linearized Work-Time plot and considered interchangeable with Critical Power is unfounded.

G) The Issue of Secret Sauce In Modeled FTP

In TARWAPM, one of the methods to estimate FTP is to model it from a collection of mean maximal power (MMP) values gathered within a specific time window. The value of FTP is the resulting parameter solved from the fit, called modeled FTP or mFTP.

However, mFTP modeling is only available in the proprietary software WKO4 (now WKO5). On the Wattage forums, A.Coggan has claimed that data from over 200 MMP values show mFTP to be 60 +/- 13 min, and he's used this as an argument to claim that FTP is sustained for "approximately" an hour.

TARWAPM calls the modeling technique the "secret sauce", implying that the proprietary fitting method is not available for open scrutiny; only its outputs are. This roadblock might explain why most studies have used the FTP20 estimation technique to explore their research questions. Compare this to the CP concept, which is pretty much open-source and amenable to research that advances our understanding in wide groups of people and wide groups of sporting activities.

H) The Issue of FTP Based Stress Metrics and One Hour 

While TARWAPM's definition that FTP is based "approximately around an hour" continues to be proliferated, other metrics in the FTP ecosystem such as the Banister-style "Training Stress Score" (TSS) are still arbitrarily pegged to an hour. The math in the formula for TSS has been designed in such a fashion as to result in 1 hour at FTP = 100 TSS. This indicates that the formula was designed with an arbitrarily fixed value in mind for convenience, rather than being based on physiological reality. Since training prescription and fitness performance charts in the FTP ecosystem are based on TSS, flaws are propagated throughout the chain of calculations.
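The one-hour anchoring is easy to see in the commonly published form of the TSS formula, TSS = (t * NP * IF) / (FTP * 3600) * 100, where t is the duration in seconds and IF = NP / FTP. The sketch below implements that form in Python; the function and variable names are illustrative, and the 250 W FTP is a hypothetical value.

```python
def training_stress_score(duration_s, normalized_power_w, ftp_w):
    """TSS in its commonly published form: (t * NP * IF) / (FTP * 3600) * 100."""
    if ftp_w <= 0:
        raise ValueError("FTP must be positive")
    intensity_factor = normalized_power_w / ftp_w  # IF = NP / FTP
    return (duration_s * normalized_power_w * intensity_factor) / (ftp_w * 3600.0) * 100.0


ftp = 250.0  # hypothetical rider
one_hour_at_ftp = training_stress_score(3600, ftp, ftp)
two_hours_at_70pct = training_stress_score(7200, 0.7 * ftp, ftp)
print(round(one_hour_at_ftp), round(two_hours_at_70pct))
```

Because the 3600 in the denominator is a fixed constant, every score is expressed relative to one hour at FTP, whatever duration a given rider can actually hold that power for.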

III. Conclusion

FTP was born out of a perceived need for field testing convenience and, one might add, an entrepreneurial excitement to build a quantification ecosystem when power meters hit the market beginning in the late 1990s.

As a purely performance-based metric, FTP is "useful", just as the critical power concept and modeling for CP are useful. However, in comparison to CP, the number of papers scrutinizing FTP has been remarkably small. Many of them demonstrate that the validity of FTP is in question.

I conclude with a summary of reasons why FTP must be approached with caution by whoever is using it or plans to adopt it :

1) FTP's definition that it is the "highest" workload at which one can sustain a quasi-steady state is not demonstrated in studies. FTP might systematically under-estimate the intensity at which quasi-steady states can be achieved. This also implies that FTP is an intensity area where one is unambiguously at steady state.

2) FTP's claim to be a valid and accurate surrogate for lab-based testing for a range of thresholds is unfounded. Besides, any claim that concepts like critical power and FTP can be interchanged through modeling work is unfounded and probably a serious error. There have been recent calls by scientists to consider CP alone as the gold standard when the goal is to define the maximal metabolic steady state [13].

3) FTP's claim that it is approximately one hour of power that can be sustained without fatigue is most definitely incorrect. 

4) Despite acknowledgement of variability, accompanying metrics in the FTP ecosystem like Training Stress Scores continue to be arbitrarily pegged to an hour (1 hour at FTP = 100 TSS). This continues to spread the already widespread confusion that FTP is 1-hour power, which it is not.

5) Widely proliferated estimation techniques for FTP, such as the FTP20 method, are incorrect. As the originator of the FTP concept describes, the method simply yields a false sense of precision. However, the proliferation of this false sense in the TARWAPM book does not make the false sense of precision go away.

Regardless of its conceptual flaws, I acknowledge that FTP has found favor with coaches and athletes who use it simply for its training value. However, testimonials and anecdotal evidence are separate from science. Claims made about FTP and its accompanying ecosystem warrant additional scientific scrutiny. The collection of knowledge we currently have from research suggests that those claims are weak and not based on scientific fact.


1. Allen, Hunter. Training and Racing with a Power Meter. VeloPress. Kindle Edition.

2. Borszcz, Fernando & Tramontin, Artur & Bossi, Arthur & Carminatti, Lorival & Costa, Vitor. (2018). Functional Threshold Power in Cyclists: Validity of the Concept and Physiological Responses. International Journal of Sports Medicine. 39. 10.1055/s-0044-101546. 

3. Borszcz, Fernando & Tramontin, Artur & Costa, Vitor. (2019). Is the Functional Threshold Power Interchangeable With the Maximal Lactate Steady State in Trained Cyclists?. International Journal of Sports Physiology and Performance. 14. 1029-1035. 10.1123/ijspp.2018-0572. 

4. Valenzuela, Pedro L. & Morales, Javier S. & Foster, Carl & Lucia, Alejandro & de la Villa, Pedro. (2018). Is the Functional Threshold Power (FTP) a Valid Surrogate of the Lactate Threshold?. International Journal of Sports Physiology and Performance. 13. 10.1123/ijspp.2018-0008. 

5. Morgan, Paul & Black, Matthew & Bailey, Stephen & Jones, Andrew & Vanhatalo, Anni. (2018). Road cycle TT performance: Relationship to the power-duration model and association with FTP. Journal of Sports Sciences. 10.1080/02640414.2018.1535772. 

6. Poole, David & Ward, Susan & Gardner, Gerald & Whipp, Brian. (1988). Metabolic and respiratory profile of the upper limit for prolonged exercise in man. Ergonomics. 31. 1265-79. 10.1080/00140138808966766. 

7. de Lucas, Ricardo & Mendes de Souza, Kristopher & Costa, Vitor & Grossl, Talita & Guglielmo, Luiz Guilherme. (2013). Time to exhaustion at and above critical power in trained cyclists: The relationship between heavy and severe intensity domains. Science & Sports. 28. e9- e14. 10.1016/j.scispo.2012.04.004. 

8. Thomas, Kevin & Goodall, Stuart & Stone, Mark & Howatson, Glyn & Gibson, Alan & Ansley, Les. (2014). Central and Peripheral Fatigue in Male Cyclists after 4-, 20-, and 40-km Time Trials. Medicine and science in sports and exercise. 47. 10.1249/MSS.0000000000000448. 

9. Jeffries, Owen & Simmons, Richard & Patterson, Stephen & Waldron, Mark. (2019). Functional Threshold Power Is Not Equivalent to Lactate Parameters in Trained Cyclists. Journal of Strength and Conditioning Research. 1. 10.1519/JSC.0000000000003203. 

10. Coggan, Andrew. (2003). Training and racing using a power meter: an introduction. 

11. McGrath, Eanna & Mahony, Nick & Fleming, Neil & Donne, Bernard. (2019). Is the FTP Test a Reliable, Reproducible and Functional Assessment Tool in Highly-Trained Athletes?. International Journal of Exercise Science. 12. 1334-1345.

12. Perrey, Stephane & Grappe, Fred & Girard, A & Bringard, Aurélien & Groslambert, Alain & Bertucci, William & Rouillon, J. (2003). Physiological and Metabolic Responses of Triathletes to a Simulated 30-min Time-Trial in Cycling at Self-Selected Intensity. International Journal of Sports Medicine. 24. 138-43. 10.1055/s-2003-38200.

13. Jones, Andrew & Burnley, Mark & Black, Matthew & Poole, David & Vanhatalo, Anni. (2019). The maximal metabolic steady state: redefining the ‘gold standard’. Physiological Reports. 7. 10.14814/phy2.14098.

14. Inglis, Erin Calaine & Iannetta, Danilo & Passfield, Louis & Murias, Juan. (2019). Maximal Lactate Steady State Versus the 20-Minute Functional Threshold Power Test in Well-Trained Individuals: “Watts” the Big Deal?. International Journal of Sports Physiology and Performance. 1-7. 10.1123/ijspp.2019-0214. 

15. Bräuer, Elisabeth & Smekal, Gerhard. (2020). VO2 Steady State at and Just Above Maximum Lactate Steady State Intensity. International Journal of Sports Medicine. 41. 10.1055/a-1100-7253. 

Friday, April 3, 2020

Critical Power Concept in Exercise : Critique And Applications

This referenced article serves as a broad exploration into the power duration relationship and the parameters that result from the hyperbolic power-Tlim characteristic.


It is well established that work requiring high speed and power output is short lived, whereas work at low speed and power can be prolonged. This relationship has been shown in a number of living species, including humans, horses, mice and salamanders. In human activities, its validity has been shown for running, cycling, swimming and rowing. It is valid for any activity where the limits of sustainable oxygen consumption are sufficiently challenged.

Within this power-duration curve, there is a maximum level of speed or power that can be tolerated beyond which exercise tolerance until termination can be predicted.

That particular threshold value of speed or power is called "critical velocity" or "critical power". The literature provides an "expanded" definition: the highest steady state metabolic rate (i.e. intensity) that can be sustained solely by oxidative energy provision, beyond which homeostasis is lost and exercise tolerance is limited.

Although in lay-speak we tend to associate "thresholds" with points beyond which things "blow up" in the body, the transitions between intensity regions, as far as fatigue related variables go, are much more "gradual", as one recent study showed [33].

Regardless, any exercise crossing critical power happens on borrowed time as the organism shifts away from sustaining muscle activity exclusively through aerobic pathways and starts concurrently relying on finite anaerobic stores. A slow rise in VO2 kicks in accelerating the drive towards VO2max and eventual exercise termination.

For sports, critical power is an index of aerobic endurance. It was found to have strong positive correlations with skeletal muscle capillarity, particularly around type I fibers, and type I fiber composition [14].

The association of Critical Power and capillarity in two athletes with different CPs. Source [14].

In a recent review, Jones called CP the "gold standard" when the goal was to determine the maximal metabolic steady state [11]. This appears to be one of the few resolutions to some long standing criticisms of the CP paradigm and lack of validity in relation to metabolic steady state.


Exercise concepts must have good descriptions that link back to what actually takes place in the body. A good model would have a bio-energetic basis. In this respect, critical power (CP) has well established scientific underpinnings, unlike "other" training concepts in commercial circulation today. (There are of course models that are simply empirical, and do not help us understand how model parameters relate to something within our own bodies)

CP is thought to represent the highest rate of aerobic energy supply available for exercise. On an intensity spectrum, it forms the lower limit for the severe exercise intensity regime and an upper limit for the heavy exercise intensity regime. 

The breakdown of metabolic control variables when exercising above CP. Black dots = baseline values. Gray = new values at work > CP. Source [2].

In this severe intensity regime, intramuscular metabolic control breaks down, and such exhaustive exercise results in the attainment of low end-exercise pH, [bicarbonate] and [PCr] values irrespective of the chosen work rate and a continuous increase in blood [lactate], pulmonary VO2 rate and ventilation relative to baseline values.

CP becomes the "threshold" beyond which metabolic control is lost by the individual. 

Beyond CP, a slow component of VO2 that was previously under control rises steeply enough to drive the body to VO2max within the span of a few minutes. The slow component of VO2 is thought to arise from the incremental recruitment of fast twitch muscle fibers. Considering this, exercise above CP always happens on 'borrowed time'.

Some 85% of the slow VO2 rise is linked to the recruitment of energetically costly fast-twitch (FT) muscle fibres as work intensity increases. The energy cost per unit force output is higher for FT fibers than for slow twitch (ST) fibers. The slow component of VO2 is not unique to humans; the same has been demonstrated in horses when they are exercised above their lactate threshold. [3]

The steep rise of slow component of VO2 at work > CP. Source [1]

In the hyperbolic critical power model, the term W' (pronounced "W prime") represents a constant amount of work that can be performed above CP and is notionally equivalent to an energy store consisting of O2 reserves, high energy phosphates and a source related to anaerobic glycolysis. The higher the sustained power output above CP, the more rapidly W' will be expended, and the greater will be the rate at which metabolites associated with the fatigue process accumulate.

The average time to exhaustion in work done above CP may be on the order of 10-15 minutes at most, depending on the size of the athlete's anaerobic reserves and motivation. In some laboratory tests, the average time to exhaustion in test subjects at work above CP was 13 minutes [1].

Even at CP, physiological steady state is not necessarily achieved. The time to failure at CP ranged from 25 minutes 1 second to 40 minutes 3 seconds [2]. This inter-individual variability hints at the obvious possibility that better trained athletes can sustain exercise at CP longer than less aerobically trained individuals. Some of this variation may also be linked to unfamiliarity with exercising at the estimated CP ("learning effect").

One definition of CP is that it is the "highest, non-steady-state intensity that can be maintained for a period in excess of 20 minutes, but generally no longer than 40 minutes." [2]

CP has been found to be influenced by carbohydrate availability. Researchers found that 2 hours of high intensity activity can decrease CP over time and that carbohydrate feeding negated some of the decrease [12]. The rate of fall in muscle glycogen also exhibits inter-individual differences, so the time course of the decrease in CP also varies from person to person depending on their physiology.


Traditionally, CP is determined from multi-duration tests conducted over several days in the laboratory. The resulting work rate vs duration (power-time, or p-t) relation can be mathematically modeled with various fits :

1) The exponential CP model (Hopkins - Nonlinear)
2) The 3-parameter CP model (Morton - Nonlinear)
3) The 2-parameter CP model (Hill - Nonlinear)
4) The linear model (Moritani - Linear)
5) The inverse time CP model (Whipp - Linear)

The mathematics behind these models is shown below :

CP models and their mathematical representation. Source [9].

Although there doesn't seem to be a consensus on which model is best, there has been relatively more attention and research on the hyperbolic forms [7]. The focus of this writeup is primarily on the use of the 2-parameter hyperbolic model, which may not be the best model but is the simplest to apply.

Note : This year, a new paper was published detailing an "omni duration" power duration model. Basically, the authors describe an adopted discontinuous mathematical function that helps some of the traditional CP models achieve a better fit at very long durations (more on protocol and duration dependencies below). Details of this model are in the paper in reference [10].


The 2-parameter hyperbolic form of the p-t relation is shown below from a paper on the topic, clearly demarcating boundaries of moderate, heavy and severe intensity domains [1].

Two parameters are of interest in this model :

1) Critical power, CP : This is the horizontal asymptote of the hyperbola, which when read off the y-axis, yields a value of power that could "theoretically" be sustained forever but in reality corresponds to a maximal duration of 60 minutes or less. Its units are Watts.

2) W prime, W' : This is the curvature constant of the model, signifying a constant amount of "work" that can be done above critical power. Its units are kilojoules.

Below CP, physiological balance is attained. This corresponds to the heavy and moderate areas in the plot. Above CP, VO2 is driven towards maximum and eventual exercise failure. That area is shown as the severe intensity region.
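The role of the two parameters can be illustrated with a short Python sketch. In the 2-parameter hyperbolic model, the predicted time to exhaustion at a constant power P above CP is t = W' / (P - CP). The CP and W' values below are hypothetical, chosen only for illustration, and W' is entered in joules so that the result comes out in seconds.

```python
import math


def time_to_exhaustion_s(power_w, cp_w, w_prime_j):
    """Predicted time limit (s) from the 2-parameter model: t = W' / (P - CP)."""
    if power_w <= cp_w:
        return math.inf  # the model itself predicts no limit at or below CP
    return w_prime_j / (power_w - cp_w)


cp, w_prime = 300.0, 20_000.0  # hypothetical: CP = 300 W, W' = 20 kJ
for p in (320.0, 350.0, 400.0):
    t = time_to_exhaustion_s(p, cp, w_prime)
    print(f"{p:.0f} W -> {t / 60:.1f} min")
```

The infinite value returned at or below CP is the "theoretical" asymptote mentioned above, an artifact of the model rather than a claim about what a rider can really do.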

The geometrical descriptions of CP. Source [1]

In terms of power output and oxygen consumption, the second plot shows the values represented on the exercise intensity regime.

Range of attainable power output in a young male along with the oxygen consumption attained. Shown in the intensity range are lactate threshold (LT) and critical Power (CP) along with VO2max, the point which results in termination of exercise. Source [8]

The hyperbola may also be linearized, in which case the linear relationship becomes one between work done and time duration. The y-intercept would then correspond to W' while the slope of the line would be critical power or velocity. The linear Moritani model is not discussed further here.
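As a rough illustration of the two forms described above, here is a minimal Python sketch. The function names and the example values (CP = 250 W, W' = 20 kJ) are my own assumptions for illustration and do not come from any of the referenced papers.

```python
def power_for_duration(cp_watts, w_prime_joules, t_seconds):
    """Hyperbolic form: P = CP + W'/t (power sustainable to exhaustion for t seconds)."""
    if t_seconds <= 0:
        raise ValueError("duration must be positive")
    return cp_watts + w_prime_joules / t_seconds


def work_for_duration(cp_watts, w_prime_joules, t_seconds):
    """Linearized form: W = W' + CP*t (slope = CP, y-intercept = W')."""
    return w_prime_joules + cp_watts * t_seconds


# Hypothetical athlete: CP = 250 W, W' = 20 kJ, effort lasting 400 s
p = power_for_duration(250, 20_000, 400)   # 300.0 W
w = work_for_duration(250, 20_000, 400)    # 120000 J, the same as 300 W held for 400 s
print(p, w)
```

Both functions describe the same relation; the second simply multiplies the first through by time, which is why the slope of the work-time line is CP and its intercept is W'.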


Any model is a mathematical simplification of a real world phenomenon and by nature, is never fully correct. As far as the whole body CP concept is concerned, four major assumptions in the simple 2 parameter CP model have been documented :

1. There are only two components to the energy supply system, termed aerobic and anaerobic.
2. Aerobic supply is unlimited in capacity but rate limited, the limiting parameter being CP.
3. The anaerobic capacity is not rate limited but capacity limited.
4. Exhaustion, by implication, termination of exercise, occurs when all of the anaerobic work capacity is exhausted.

These assumptions have been treated beautifully by Morton, and the reader interested in understanding the details of each assumption should read reference [5] below.  My conclusions from Morton's paper are as follows :

Assumption 1 : There are only two components to the energy supply system, termed aerobic and anaerobic.  Yes, this is largely true but only to an extent. The body has more than two energy systems.

Assumption 2 : Aerobic supply is unlimited in capacity but rate limited, the limiting parameter being CP.  This is not true, the aerobic capacity clearly has a limit in all humans. However, the statement that it is rate limited is correct. There is clearly a limit and you might define it by CP.

Assumption 3 : The anaerobic capacity is not rate limited but capacity limited.  True in that the explosive power generated from anaerobic capacity draws on a limited store. It is not true that it is free of a rate limit, since there is a ceiling on the power a human can produce.

Assumption 4 : Exhaustion, by implication, termination of exercise, occurs when all of the anaerobic work capacity is exhausted.  The human engine does not necessarily terminate exercise when all the glycogen stores, and consequently the anaerobic work capacity, are exhausted. Research shows that at the point of exercise termination, there is still glycogen left in the body. The evidence is that when nearing exhaustion, if the power output is lowered just slightly, exercising subjects are able to continue despite still working at supra-maximal power outputs.

All models have assumptions, and validating a model also means checking that its assumptions are correct. If they deviate from reality, the model is wrong, sometimes dead wrong. As with CP, similar assumptions can be listed for the concept of FTP, and the astute athlete and coach can examine each assumption to understand at what point the model fails and becomes inapplicable to the athlete.

Note : Around 9 total assumptions about the 2 parameter CP model have been treated in the paper by Morton [5].


As with any mathematical model, the GIGO (garbage in, garbage out) principle applies. All models are wrong, being a simplistic representation of reality.  The CP models are not immune from this deficiency. Other concepts such as FTP also suffer from model related errors.

Some of the weaknesses in CP modeling are listed as follows :

1) Estimated CP and real CP may be different : Critical power is a parameter estimated from the hyperbolic relationship between power and time or the linear relationship between work and time. There is no guarantee that an estimate from model fits built on a limited number of test points actually points to the "real CP", i.e. the real physiological boundary demarcating heavy and severe intensities, unless, of course, the parameter is experimentally validated in the lab against the real procedure to determine CP (multiple lab visits at different test durations).

2) Model and protocol dependency : In a very practical research study, scientists compared several models for estimating CP using different combinations of time-to-exhaustion exercise sessions in 13 young recreational cyclists. They not only found that the 3 parameter CP model fit the data best, but when they compared model fits from time duration combinations having more of the short durations, CP was over-estimated and W' under-estimated [9].

Absolute difference (watt) between CP modeling techniques and the criterion model (3-P CP) are presented along with 90% confidence interval around each difference and effect size calculation. Source [9].

Of particular interest to us, the 2-parameter CP model was closest to the criterion measure only when mean duration combinations such as 7, 12 and 19 minutes were chosen, whereas when durations were consistently < 10 minutes, the model values were far from accurate [9].

There have been reports of large variations in the calculated value of W' arising from different models, particularly in sub-classes of athletes such as elite athletes [6].

Sub-discussions that arise from model time duration dependencies are as follows :

2.1) Effect of exclusively very short durations in the model : When critical power is calculated from the slope of the work-duration relationship using short supra-maximal exercises, the resulting power from the models is higher than the power output which corresponds to a lab measured lactate "steady state" work intensity. The critical power also tends to be lower than maximal aerobic power [6].

2.2) Effect of exclusively long durations in the model : When critical power is calculated from very long sub-maximal exercise durations, the resulting power from the models tends to be lower than the power output which corresponds to a lab measured lactate steady state work intensity such as OBLA (onset of blood lactate accumulation) [6].

2.3) Weakness of some non-linear models : Due to the hyperbolic form of the CP model, small errors in CP translate to large errors in sustainable time duration. This reduces the predictive validity of CP when the model is misapplied by practitioners. The non-linear 2 parameter CP model suffers from a distinct weakness : As time approaches 0 seconds, power becomes infinite and at exhaustion, all of the muscular energy reserves associated with W' are exhausted. This is, of course, not necessarily true and the 3 parameter model was formulated to address this weakness by bringing in an additional Pmax term.

2.4) Weakness of linear models : Linear models are often linearized from non-linear observations and as such introduce statistical errors simply from the linearization process. For example, it is possible that the fit parameters computed from linearized models yield higher values compared to their original non-linear forms.
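To put a number on the sensitivity noted in 2.3, here is a small Python sketch using the 2 parameter relation Tlim = W'/(P - CP). The athlete values (W' = 20 kJ, target power 300 W, CP around 250 W) are hypothetical and chosen only for illustration.

```python
def time_to_exhaustion(w_prime_joules, cp_watts, power_watts):
    """2 parameter model: Tlim = W' / (P - CP), only defined for P above CP."""
    if power_watts <= cp_watts:
        raise ValueError("the model only predicts exhaustion for power above CP")
    return w_prime_joules / (power_watts - cp_watts)


W_PRIME = 20_000   # joules (hypothetical)
TARGET = 300       # watts (hypothetical)
baseline = time_to_exhaustion(W_PRIME, 250, TARGET)

# Shift the CP estimate by 5 W (2 %) either way and compare predicted durations
for cp in (245, 250, 255):
    t = time_to_exhaustion(W_PRIME, cp, TARGET)
    change = 100 * (t - baseline) / baseline
    print(f"CP = {cp} W -> Tlim = {t:.0f} s ({change:+.1f} % vs baseline)")
```

In this example a 2 % error in CP moves the predicted time to exhaustion by roughly 10 %, and the effect grows as the target power gets closer to CP.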


To get around some of the weaknesses of CP models, careful application is necessary.  I can suggest a few things :

1) Getting the right "intensity" : Critical power is a "heavy" work output above which a slow rise in VO2 can speed the approach to VO2max and eventual exhaustion. As such, it has been suggested that critical power should only be calculated from exhaustion times corresponding to "heavy sub-maximal exercises".  The recommended exhaustion time range is suggested as 6 - 30 minutes [6].  Below and above this range, the validity of the classic CP models is questionable.

This idea that CP is a heavy intensity also raises a concern about long tests such as 20 minute tests. The sub-maximal nature of a 20 minute test brings its reliability into question.

The data that is fed into the model matters. Garbage in, garbage out. While conducting tests, the effort must feel "strong" and motivation needs to be very high. Deflated and/or inflated values of power or speed will skew the results one way or the other when modeling.

2) Which CP model to use : Since the nature of the power-duration curve is a non-linear hyperbola, statistically speaking the best model fit would be a non-linear fit without transformation of any of the variables for linearization.

The non-linear 2 and 3 parameter CP models should be preferred over a linear model. However, the 3 parameter CP model was proposed by Morton as a way to get around flaws of the 2 parameter CP model (see section IV). Therefore, of all the 5 models, the 3 parameter CP model would be the sound choice. But this also means 4 or more trials need to be conducted.

Linear models systematically inflate CP relative to non-linear models, and there is ample evidence in the literature that time to fatigue is drastically shortened when testing at work intensities estimated from linear models. This alone would support the move away from linear models.

3) Choice of test durations : Based on the research in [9], it is best to include a mix of test durations in order to balance the short supra-maximal with the long sub-maximal. Tests with 2 or 3 durations can be analyzed with the linear CP models and the 2 parameter model. Tests with 4 or more durations can be analyzed with the 3 parameter CP model and the linear CP models.

- For 2 durations : Pick from a range of 10-20 minutes.  Avoid very short and very long durations like 3 and 20 minutes.

- For 3 durations : 7, 12 and 15 minutes. If glycolytic capacity needs to be tested, 3, 10 and 15 minutes is a good spread.

- Pacing : All short duration trials should be done in time trial mode to exhaustion but not "ALL-OUT". For example, a 3 minute test is an aerobic time trial to exhaustion, not a maximal sprint mixed with an aerobic effort.

-  The 3-parameter hyperbolic CP model (Morton model) is deemed protocol independent and works with 4 or more test durations. But I've also been told that when you have a trial that is too close to 20 min, you might get unrealistically high values for the Pmax parameter.

-  Ideally, testing within specific durations would be conducted on different days. For a single day test, a maximum of 2 or 3 tests is recommended, spaced by ample breaks, but done this way, the impact of prior testing on subsequent test performance has to be assessed.

4) Validate the model : Experimental validation is the only way to check whether a model-derived estimate of CP is representative of physiological CP. Until that happens, a model calculated value carries a presumption of accuracy when it may not be accurate. For example, a model might yield an inflated estimate of CP, above the true physiological CP as measured in a lab, so that exercising at the estimate leads to a loss of homeostasis.


CP and W' are not just parameters of a fitting model. That they have some underpinning physiological basis is shown by intervention studies designed to manipulate either one independently of the other. For example, studies show that training adaptations are specific to either CP or W'.  Nutritional and external gaseous interventions also affect the parameters.

Broadly, some of the studies and their references are listed below for further exploration :

1. Endurance training in normal subjects results in an increase in critical power with little or no change in W' [16, 17, 18].

2. Endurance training enhances critical power and end test power in a 3min all out test [19, 20].

3. Sprint cycle training with long rest intervals improves W′ [21].

4. W' is sensitive to, and modified by resistance training with no change in CP [22, 23].

5. W' is sensitive to creatine supplementation [24, 25, 26, 27].

6. Hypoxia systematically reduces CP with no significant impact on W' [28]. Conversely, hyperoxia improves CP [31].

7. Supra-CP fatigue inducing work with different recovery durations affects the reconstitution dynamics of W' in different ways, without having an effect on CP [29].

8. Glycogen depletion has been shown to result in a decrease in W' [30].


1) Multi-duration testing : The established lab practice to model CP uses several bouts of constant load exercise performed to failure at varying durations over several days. These bouts are administered in random order and the recommended exercise durations to exhaustion range from 1-20 minutes. The time to exhaustion in these exercises is plotted against power output. The hyperbolic 2-parameter Whipp model, when fit through this data, yields CP and W', where CP is the horizontal asymptote of the curve and W' is the area between the curve and CP, representing a fixed quantity of work that can be done above CP before approaching complete exhaustion. However, the choice of durations would need to be scrutinized to yield a critical power that resembles a severe intensity workload.

2) A 3 minute all out test (3MAOT) has been scientifically established to point towards critical power. The idea with this test is that it is possible to deplete W’ in reasonably short time. Therefore, the idea of the test is to perform work all-out in a span of 3 minutes and deplete W'. The last 30 seconds of the 3 min all out test is supposedly close to the critical power.

There are indications from the scientific community that the 3MAOT field test overestimates CP and underestimates W'. Therefore, it is not a reliable measure of capacity in "well trained athletes".

CP calculated from a 3MAOT test. Source [4].
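A minimal Python sketch of how the 3MAOT numbers might be pulled out of a recorded power trace is shown below. It takes the mean power of the last 30 seconds as the CP estimate, as described above. Treating the work done above that end power as the W' estimate is my assumption of the usual convention, and the one-second sampling and the power trace itself are hypothetical.

```python
def three_min_all_out_estimates(power_1hz):
    """Return (end_power_watts, work_above_end_power_joules) from a 3 minute all-out trace.

    power_1hz: 180 power samples in watts, one per second (assumed sampling rate).
    """
    if len(power_1hz) != 180:
        raise ValueError("expected 180 one-second samples (3 minutes)")
    # CP estimate: mean power over the final 30 seconds of the test
    end_power = sum(power_1hz[-30:]) / 30
    # W' estimate (assumption): work above end power; each sample spans 1 s, so W x 1 s = J
    work_above_end_power = sum(p - end_power for p in power_1hz)
    return end_power, work_above_end_power


# Hypothetical, heavily simplified trace: power decays in three steps
trace = [500] * 60 + [350] * 60 + [250] * 60
cp_est, w_prime_est = three_min_all_out_estimates(trace)
print(cp_est, w_prime_est)   # 250.0 W and 21000.0 J for this made-up trace
```

Given the caveat about overestimated CP and underestimated W' in well trained athletes, numbers produced this way are best treated as a first approximation.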

3) Software CP modeling : Traditionally, CP is determined by acquiring power-time series data over several visits and fitting a chosen model to the data. However, lab visits are expensive and time consuming. With the proliferation of GPS and power meters, the same data can be reproduced by acquiring mean maximal power and duration data in a given sport over a time span such as recent weeks or months. Once that data is acquired, software can be used to plot it as a p-t chart. Model fitting is then done to solve for the parameters.
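The "mean maximal power" step mentioned above can be sketched in a few lines of Python. This is only an illustration of the idea, assuming a power stream sampled once per second with no gaps; the function name and the sample data are hypothetical and do not reflect any particular software package.

```python
def mean_maximal_power(power_1hz, window_s):
    """Highest average power over any contiguous window of window_s seconds.

    power_1hz: power samples in watts, one per second (assumed, no gaps).
    """
    if window_s <= 0 or window_s > len(power_1hz):
        raise ValueError("window must be between 1 and the length of the recording")
    # Sliding window sum: add the newest sample, drop the oldest
    window_sum = sum(power_1hz[:window_s])
    best_sum = window_sum
    for i in range(window_s, len(power_1hz)):
        window_sum += power_1hz[i] - power_1hz[i - window_s]
        best_sum = max(best_sum, window_sum)
    return best_sum / window_s


# Hypothetical ride: 60 s easy, 30 s hard, 60 s easy
ride = [100] * 60 + [300] * 30 + [100] * 60
print(mean_maximal_power(ride, 30))   # 300.0
print(mean_maximal_power(ride, 60))   # 200.0
```

Repeating this for several durations across recent weeks or months of files gives the power-duration points that the model is then fit through. Whether those efforts were truly maximal is a separate question that the software cannot answer.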


1) Predicting Time to Exhaustion :  The most fundamental application for the critical power (or velocity)  model is to help determine the time to exhaustion during work performed above CP. The very purpose of modeling is to find out parameters that can be used with the power output to determine time to exhaustion.

With the simple 2 parameter hyperbolic form using power and work done, time to exhaustion can be represented as :

Tlim = W′ /(P − CP)

As an example, setting W' = 20 kJ, CP = 250W, P = 300W :
Tlim = 20,000 J / (300W - 250W) = 400s = 6.67 minutes.

Similarly, in the distance and speed domain :

Tlim = (D - D')/CS

Setting Critical Speed (CS) = 6 m/s, D = 1600m, D' = 200m :
Tlim = (1600m - 200m) / 6m/s = 233.3s = 3.89 minutes.

This way, the time duration using a given estimated critical power or speed can be predicted.
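The two calculations above can be wrapped into small Python helpers for quick what-if checks. This is a minimal sketch; the function names are my own, and the inputs are the same example values used in the text.

```python
def tlim_from_power(w_prime_joules, cp_watts, power_watts):
    """Power domain: Tlim = W' / (P - CP), in seconds."""
    if power_watts <= cp_watts:
        raise ValueError("power must be above CP for the model to predict exhaustion")
    return w_prime_joules / (power_watts - cp_watts)


def tlim_from_distance(distance_m, d_prime_m, critical_speed_ms):
    """Distance domain: Tlim = (D - D') / CS, in seconds."""
    if critical_speed_ms <= 0:
        raise ValueError("critical speed must be positive")
    return (distance_m - d_prime_m) / critical_speed_ms


t_power = tlim_from_power(20_000, 250, 300)      # 400.0 s
t_speed = tlim_from_distance(1600, 200, 6)       # about 233.3 s
print(f"{t_power:.1f} s = {t_power / 60:.2f} min")
print(f"{t_speed:.1f} s = {t_speed / 60:.2f} min")
```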

2) Training Zone Descriptions : Once the critical power (or critical speed) has been determined, training descriptions can be communicated to an athlete.

The following training levels described by Dr. Skiba can be a decent start. These levels may have to be modified depending on the athlete and race performances and/or tests.

Recovery (Light):                          Less than 56% (or go by feel)
Level 2 (Moderate), Endurance :  56-75% of CP
Level 3 (Heavy), Tempo :             76-90% CP
Level 4 (Very Heavy), Critical Power : 91-105% of CP
Level 5 (Severe), VO2max :         106-120% of CP
Anaerobic Capacity (Extreme) :    Greater than 120% of CP
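To turn the percentages above into watts for a given athlete, a short Python sketch follows. The zone boundaries are copied from the list above; the function name and the example CP of 250 W are hypothetical.

```python
# (zone name, lower % of CP, upper % of CP); None marks an open-ended bound
ZONES_PCT_OF_CP = [
    ("Recovery (Light)", None, 56),
    ("Level 2 (Moderate), Endurance", 56, 75),
    ("Level 3 (Heavy), Tempo", 76, 90),
    ("Level 4 (Very Heavy), Critical Power", 91, 105),
    ("Level 5 (Severe), VO2max", 106, 120),
    ("Anaerobic Capacity (Extreme)", 120, None),
]


def zones_in_watts(cp_watts):
    """Convert the percentage bounds into watts for one athlete's CP."""
    def to_watts(pct):
        return None if pct is None else cp_watts * pct / 100
    return [(name, to_watts(lo), to_watts(hi)) for name, lo, hi in ZONES_PCT_OF_CP]


for name, lo, hi in zones_in_watts(250):   # hypothetical CP of 250 W
    print(name, lo, hi)
```

For a CP of 250 W, Level 4 works out to 227.5-262.5 W. The same function applies to critical speed if the input is given in m/s instead of watts.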

Bettina Karsten in a well-written thesis summed up CP training zones or intensity domains as defined in the literature. She gave extensive scientific references for these "zones". Background reading can be done beginning at section 2.3.1 in reference [32].

As she also highlighted in her work, exercise is a continuum and therefore the absolute "strictness" of these demarcation markers have not been fully demonstrated within research literature to date [32].

Broad CP based training intensity domains. Source [32]. 

Training zones and exercise intensity domains. Source [32]. 

3) Interval Training Prescription : One of the promising areas for using the critical power concept is promoting targeted anaerobic and aerobic effects in athletes.  HIIT training can be prescribed for individuals in proportion to their D' or W'.  By setting intervals to deplete a fixed percentage of W' and controlling the rest, individuals can complete a fixed distance at different speeds relative to their critical speed or power. Examples of approaches are provided in [15].

4) Race Pacing Strategy : Pacing prescription may also be set for races where the use of running power is prevalent. A 10K race for a talented runner may be targeted using 95-100% CP. A 5K race performance may be targeted within a range of 100-105% CP. Again, experimentation is necessary with these ranges and no guidance set in stone can be offered, as courses are different and CP itself may exhibit small day-to-day variations.
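As a rough sketch of the "deplete a fixed percentage of W'" idea, the Python snippet below sizes a single work interval from the 2 parameter relation. It is my own simplification and not the method laid out in [15]: it assumes W' is fully restored before each interval and ignores reconstitution dynamics during the rest. The athlete values are hypothetical.

```python
def interval_duration(w_prime_joules, cp_watts, interval_power_watts, fraction_of_w_prime):
    """Seconds at interval_power needed to use up the given fraction of W'.

    Assumes W' is full at the start of the interval (a simplification).
    """
    if not 0 < fraction_of_w_prime <= 1:
        raise ValueError("fraction must be in the range (0, 1]")
    if interval_power_watts <= cp_watts:
        raise ValueError("W' is only depleted when working above CP")
    return fraction_of_w_prime * w_prime_joules / (interval_power_watts - cp_watts)


# Hypothetical athlete: W' = 20 kJ, CP = 250 W, intervals at 300 W targeting 50 % of W'
print(interval_duration(20_000, 250, 300, 0.5))   # 200.0 seconds
```

Two athletes with the same CP but different W' would get different interval lengths for the same target power, which is the point of individualizing the prescription.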

For prolonged duration high intensity events, CP is purported to decrease over time. If that is true, it is not clear how effectively one could employ CP to set pace prescription [12]. However, the studies reveal that carbohydrate feeding of around 60g/hour should be an important strategy to negate considerable decreases in CP over long durations [12]. I also suggest the use of a multi-pronged approach for marathons and ultra-marathons, involving the use of pace, heart rate and perceived exertion.

5) Potential in anti-doping : A paper was published arguing that the CP model could be useful for doping detection mainly based on the predictable sensitivities of its parameters to ergogenic aids and other performance-enhancing interventions [13]. I understand this proposal is still in its early stages and needs to be vetted.

6) Educative value : Critical power models have educative value behind them. They can teach concepts underpinning human endurance and record performances.

Curve fitting is easily done in Microsoft Excel.

Felipe Mattioni Maturana, a PhD candidate, built an app on R Shiny which allows you to model CP using a number of time to exhaustion trials. This would be a good tool to play around with for what-if analyses.

Using that app, I ran a simple example of how two different combinations of time and power data yield two different values of CP and W' estimate when using a simple linear CP model. The power  and time to exhaustion data was taken from the reference in [9] and the duration combinations used as inputs to the two scenarios were 3+20 minutes and 12+20 minutes. 

Model output showing how different combinations of power duration data can yield different values of CP. 
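The same kind of what-if comparison can be sketched in plain Python using the linear work-time model (work = W' + CP x time), fit by ordinary least squares. The three trials below are made-up numbers for illustration only. They are not the data from [9], and this is not the code behind the app.

```python
def fit_linear_cp(trials):
    """Least-squares fit of work = W' + CP * time.

    trials: list of (duration_seconds, mean_power_watts), at least two distinct durations.
    Returns (cp_watts, w_prime_joules).
    """
    if len(trials) < 2:
        raise ValueError("need at least two trials")
    times = [t for t, _ in trials]
    works = [t * p for t, p in trials]           # work in joules = watts x seconds
    mean_t = sum(times) / len(times)
    mean_w = sum(works) / len(works)
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("trial durations must differ")
    sxy = sum((t - mean_t) * (w - mean_w) for t, w in zip(times, works))
    cp = sxy / sxx                               # slope of the work-time line
    w_prime = mean_w - cp * mean_t               # y-intercept
    return cp, w_prime


# Hypothetical trials to exhaustion: 3 min @ 380 W, 12 min @ 290 W, 20 min @ 270 W
t3, t12, t20 = (180, 380), (720, 290), (1200, 270)
for label, combo in (("3+20 min", [t3, t20]), ("12+20 min", [t12, t20])):
    cp, w_prime = fit_linear_cp(combo)
    print(f"{label}: CP = {cp:.1f} W, W' = {w_prime / 1000:.1f} kJ")
```

With these made-up numbers, the combination that includes the 3 minute trial gives a CP about 10 W higher and a noticeably smaller W' than the 12+20 minute combination, which is the direction of the duration dependency discussed earlier.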


While there are several exercise concepts out there, the critical power model has been one of the most rigorously studied ones in scientific literature, with several lab studies validating the model for athletes. The number of parameters is small (CP and W') and they have physiological meanings.

In this post, only one form of this model - the hyperbolic 2 parameter model - was described in a somewhat broad manner. There are several other models including 3 parameter and extended CP models. In future, this post will be expanded to include a treatment of those other models.

The concern over test protocol, quality of data and error propagation carries across to any CP model. The practitioner must be careful in the use of these models to advise exercise prescription, especially to talented elite athletes. Lab based physiological profiles will be better suited to making informed decisions in these athletes.

However, in a vast majority of recreational athletes, proper use of the field based testing protocol and the modeling based on the data will yield a useful approximation of the endurance capacity of an individual. That it is conceptually the highest power output or speed at physiological steady state is useful in training prescription. Practitioners will also be pleased to be using a scientifically well vetted training concept.

What remains to be seen is how the critical power concept marries with the central nervous system theory of fatigue. That the ultimate limiter of exercise performance is not the muscle but the brain was introduced more than a century ago by scientists.

Implicit in the effectiveness of applying the critical power concept is the idea that the performance being analyzed must be maximal in nature, implying that the central drive must be maximal for that performance. The role of motivation and internal drive is significant enough to warrant further investigation as part of the critical power concept.

Readers are advised to expand on their knowledge and read the papers referenced below.


1. Jones, A. M., Vanhatalo, A., Burnley, M., Morton, R. H., & Poole, D. C. (2010). Critical power: implications for determination of VO2max and exercise tolerance. Med Sci Sports Exerc, 42(10), 1876-90.

2. Brickley, G., Doust, J., & Williams, C. (2002). Physiological responses during exercise to exhaustion at critical power. European journal of applied physiology, 88(1-2), 146-151.

3. Langsetmo, I., Weigle, G. E., Fedde, M. R., Erickson, H. H., Barstow, T. J., & Poole, D. C. (1997). VO2 kinetics in the horse during moderate and heavy exercise. Journal of Applied Physiology, 83(4), 1235-1241

4. Miller, M. C., & Macdermid, P. W. (2015). Predictive validity of critical power, the onset of blood lactate and anaerobic capacity for cross-country mountain bike race performance. Sport Exerc Med Open J, 1(4), 105-110.

5. Morton, R.H. The critical power and related whole-body bioenergetic models. Eur J Appl Physiol 96, 339–354 (2006).

6. Vandewalle, Henry & Vautier, J-F & Kachouri, M & Lechevalier, J & Monod, H. (1997). Work-exhaustion time relationships and the critical power concept. A critical review. The Journal of sports medicine and physical fitness. 37. 89-102.

7. H. Monod & J. Scherrer (1965) The Work Capacity Of a Synergic Muscular Group, Ergonomics, 8:3, 329-338, DOI: 10.1080/00140136508930810

8. Mark Burnley & Andrew M. Jones (2018) Power–duration relationship: Physiology, fatigue, and the limits of human performance, European Journal of Sport Science, 18:1, 1-12, DOI: 10.1080/17461391.2016.1249524

9. Mattioni Maturana, Felipe & Fontana, Federico & Pogliaghi, Silvia & Passfield, Louis & Murias, Juan. (2017). Critical power: How different protocols and models affect its determination. Journal of Science and Medicine in Sport. 21. 10.1016/j.jsams.2017.11.015.

10. Puchowicz, Michael & Baker, Jonathan & Clarke, David. (2020). Development and field validation of an omni-domain power-duration model. Journal of Sports Sciences. 38. 1-13. 10.1080/02640414.2020.1735609.

11. Jones, Andrew & Burnley, Mark & Black, Matthew & Poole, David & Vanhatalo, Anni. (2019). The maximal metabolic steady state: redefining the ‘gold standard’. Physiological Reports. 7. 10.14814/phy2.14098.

12. Clark, Ida & Vanhatalo, Anni & Thompson, Christopher & Joseph, Charlotte & Black, Matthew & Blackwell, Jamie & Wylie, Lee & Tan, Rachel & Bailey, Stephen & Wilkins, Brad & Kirby, Brett & Jones, Andrew. (2019). Dynamics of the power-duration relationship during prolonged endurance exercise and influence of carbohydrate ingestion. Journal of Applied Physiology. 127. 10.1152/japplphysiol.00207.2019.

13. Puchowicz, M. J., Mizelman, E., Yogev, A., Koehle, M. S., Townsend, N. E., & Clarke, D. C. (2018). The Critical Power Model as a Potential Tool for Anti-doping. Frontiers in physiology, 9, 643.

14. Mitchell, Emma & Martin, Neil & Bailey, Stephen & Ferguson, Richard. (2018). Critical power is positively related to skeletal muscle capillarity and type I muscle fibers in endurance trained individuals. Journal of Applied Physiology. 125. 10.1152/japplphysiol.01126.2017.

15. Pettitt, Robert. (2016). Applying the Critical Speed Concept to Racing Strategy and Interval Training Prescription. International Journal of Sports Physiology and Performance. 11. 10.1123/ijspp.2016-0001.

16. Porszasz, Janos & Emtner, Margareta & Goto, Shinichi & Somfay, Attila & Whipp, Brian & Casaburi, Richard. (2005). Exercise training decreases ventilatory requirements and exercise-induced hyperinflation at submaximal intensities in patients with COPD. Chest. 128. 2025-34. 10.1378/chest.128.4.2025.

17. Gaesser GA, Wilson LA. Effects of continuous and interval training on the parameters of the power-endurance time relationship for high-intensity exercise. International Journal of Sports Medicine. 1988 Dec;9(6):417-421. DOI: 10.1055/s-2007-1025043.

18. Poole, David & Ward, Susan & Whipp, Brian. (1990). The effects of training on the metabolic and respiratory profile of high-intensity cycle ergometer exercise. European journal of applied physiology and occupational physiology. 59. 421-9. 10.1007/BF02388623.

19. Jenkins, David & Quigley, Brian. (1992). Endurance training enhances critical power. Medicine and science in sports and exercise. 24. 1283-9. 10.1249/00005768-199211000-00014.

20. Vanhatalo, Anni & Doust, Jonathan & Burnley, Mark. (2008). A 3-min All-out Cycling Test Is Sensitive to a Change in Critical Power. Medicine and science in sports and exercise. 40. 1693-9. 10.1249/MSS.0b013e318177871a.

21. Jenkins, David & Quigley, Brian. (1993). The influence of high-intensity exercise training on the Wlim-Tlim relationship. Medicine and science in sports and exercise. 25. 275-82. 10.1249/00005768-199302000-00019.

22. Bishop, David John & Jenkins, D. (1996). The influence of resistance training on the critical power function & time to fatigue at critical power. Australian journal of science and medicine in sport. 28. 101-5.

23. Sawyer, Brandon & Stokes, David & Womack, Christopher & Morton, Richard & Weltman, Arthur & Gaesser, Glenn. (2013). Strength Training Increases Endurance Time to Exhaustion During High-Intensity Exercise Despite No Change in Critical Power. Journal of strength and conditioning research / National Strength & Conditioning Association. 28. 10.1519/JSC.0b013e31829e113b.

24. Vanhatalo, Anni & Jones, Andrew. (2009). Influence of Creatine Supplementation on the Parameters of the “All-Out Critical Power Test”. Journal of Exercise Science & Fitness - J EXERC SCI FIT. 7. 9-17. 10.1016/S1728-869X(09)60002-2.

25. Fukuda, David & Smith-Ryan, Abbie & Kendall, Kristina & Dwyer, Teddi & Kerksick, Chad & Beck, Travis & Cramer, Joel & Stout, Jeffrey. (2010). The Effects of Creatine Loading and Gender on Anaerobic Running Capacity. Journal of strength and conditioning research / National Strength & Conditioning Association. 24. 1826-33. 10.1519/JSC.0b013e3181e06d0e.

26. Smith, Jimmy & Stephens, Daniel & Hall, Emily & Jackson, Allen & Earnest, Conrad. (1998). Effect of oral creatine ingestion on parameters of the work rate-time relationship and time to exhaustion in high-intensity cycling. European journal of applied physiology and occupational physiology. 77. 360-5. 10.1007/s004210050345.

27. Miura, Akira & Kino, Fumiko & Kajitani, Saori & Sato, Haruhiko & Fukuba, Yoshiyuki. (1999). The Effect of Oral Creatine Supplementation on the Curvature Constant Parameter of the Power-Duration Curve for Cycle Ergometry in Humans. The Japanese journal of physiology. 49. 169-74. 10.2170/jjphysiol.49.169.

28. Dekerle, Jeanne & Mucci, Patrick & Carter, H. (2011). Influence of moderate hypoxia on tolerance to high-intensity exercise. European journal of applied physiology. 112. 327-35. 10.1007/s00421-011-1979-z.

29. Ferguson, Carrie & Rossiter, Harry & Whipp, B & Cathcart, A & Murgatroyd, Scott & Ward, Susan. (2010). Effect of recovery duration from prior exhaustive exercise on the parameters of the power-duration relationship. Journal of applied physiology (Bethesda, Md. : 1985). 108. 866-74. 10.1152/japplphysiol.91425.2008.

30. Miura, Akira & Sato, Haruhiko & Whipp, B & Fukuba, Yoshiyuki. (2000). The effect of glycogen depletion on the curvature constant parameter of the power-duration curve for cycle ergometry. Ergonomics. 43. 133-41. 10.1080/001401300184693.

31. Goulding, Richie & Roche, Denise & Marwood, Simon. (2019). Effect of Hyperoxia on Critical Power and V̇O2 Kinetics during Upright Cycling. Medicine & Science in Sports & Exercise. 52. 1. 10.1249/MSS.0000000000002234.

32. Karsten, Bettina. (2014). Analysis of Reliability and Validity of Critical Power Testing in the Field. Thesis Paper.

33. Pethick, Jamie; Winter, Samantha L.; Burnley, Mark Physiological Evidence that the Critical Torque Is a Phase Transition Not a Threshold, Medicine & Science in Sports & Exercise: May 4, 2020 - Volume Publish Ahead of Print - Issue - doi: 10.1249/MSS.0000000000002389