Sunday, February 7, 2021

Reverse Engineering Zwift Physics - A Fun Look


Last year, I performed a simple exercise to reverse engineer a cycling ride done on the Fulgaz app to understand the in-game variables being employed. I described it in this post. It was nice to receive a message from Fulgaz suggesting that I'd come very close to what they actually use for their sanctioned events.

Overall, the Fulgaz app seemed very in tune with the physics behavior we cyclists normally expect to encounter outdoors. Zwift, on the other hand, is tricky. None of the in-game physics constants and parameters are known to the public.

Given many of us outdoor cyclists find the discrepancies between in-game Zwift physics and real world confusing, it is apt to ask whether Zwift is even based on an "earth" model.

With that fun question - which world is Zwift in? - I make an attempt to understand what makes the app work the way it does. It's far from perfect, but hopefully it stimulates your own thinking and you can go off and try something similar.

Just don't send me any hate mail. My approach may be far from perfect.


The modeling tool is the same one I used to model the Fulgaz performance. It takes more than 30 variables and allows breaking a course into many small segments for steady-speed analysis.

For Zwift, I included some adjustments so that the model would accept different g constants, drag coefficients of bikes, altitude de-rates to power depending on whether a cyclist is acclimatized or not, and segment-by-segment rolling resistances and drag areas depending on the heading of the cyclist, the direction of the wind and the known characteristics of the road.

In short, there's a lot of "handles" that I can pull in order to understand the whacky world of Zwift.
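At the heart of any such tool is a steady-state speed solve: given a power, find the speed at which demand equals supply. Here's a minimal sketch in Python; all parameter values are illustrative choices of mine, not Zwift's.

```python
# A minimal steady-state speed solve. Parameter values are illustrative.
def steady_speed(power_w, mass_kg, grade, cda=0.22, crr=0.003,
                 rho=1.225, g=9.81, eta=0.975):
    """Find the steady speed (m/s) where power demand equals supplied power.

    Demand = (aero drag + rolling resistance + gravity) * v / drivetrain eta.
    """
    def demand(v):
        aero = 0.5 * rho * cda * v ** 3
        roll = crr * mass_kg * g * v
        climb = mass_kg * g * grade * v
        return (aero + roll + climb) / eta

    lo, hi = 0.0, 40.0            # 0 to 144 km/h brackets any realistic case
    for _ in range(60):           # bisection: demand rises monotonically with v here
        mid = (lo + hi) / 2
        if demand(mid) < power_w:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

v = steady_speed(150, mass_kg=64.8 + 6.8, grade=0.0)
print(f"{v * 3.6:.1f} km/h at 150 W on the flat")
```

Run segment by segment with different g, rho, CdA and Crr "handles", this is essentially what the rest of this exercise turns.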


I rode one lap of Mighty Metropolitan using my real bike hooked up to a Computrainer. The in-game bike chosen was the 2021 Canyon Aeroad with a Zipp 202 wheelset. My weight and height were set to 64.8 kg and 173 cm respectively. The bike weight was assumed to be a race-ready 6.8 kg, a figure that came off some forums on the internet.

Fig 1 : A crazy twisty, windy course! Course details on Veloviewer

Power output was dual recorded: the primary source was a set of dual-sided power pedals, while the Racermate held the trainer load at a constant 150 W. This also yielded second-by-second transmission differences between the two points of power measurement, depending on where on the course I changed gear or cadence. Cadence was freely chosen.
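Comparing the two recordings is just a matter of aligning the streams second by second and looking at the difference. The numbers below are made up purely to illustrate the bookkeeping, not my actual data:

```python
# Hypothetical second-by-second comparison of the two power sources.
pedal_w = [152, 149, 155, 151, 148]   # dual-sided pedal power (illustrative)
hub_w   = [148, 146, 151, 147, 144]   # trainer-side load (illustrative)

# per-second difference between the two measurement points
diffs = [p - h for p, h in zip(pedal_w, hub_w)]
# the mean difference, as a fraction of pedal power, reads as transmission loss
mean_loss_pct = 100 * sum(diffs) / sum(pedal_w)
print(f"mean transmission loss ≈ {mean_loss_pct:.1f}%")
```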

Using a script to process the GPX file, the course was broken into 60 segments to capture all the features of the terrain - flats, downhills and uphill sections. Specific sections with glass road surfaces were marked for rolling resistance adjustments.
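The bucketing step can be sketched roughly as follows. The interpolation approach and the toy course below are mine for illustration, not the actual script I used:

```python
# Bucket a course into equal-length segments, assuming the GPX has already
# been flattened to (cumulative_distance_m, elevation_m) points.
def segment_course(points, n_segments):
    """Return (length_m, avg_gradient) for each of n equal segments."""
    total = points[-1][0]
    seg_len = total / n_segments
    segments = []
    for i in range(n_segments):
        d0, d1 = i * seg_len, (i + 1) * seg_len
        e0, e1 = _elev_at(points, d0), _elev_at(points, d1)
        segments.append((d1 - d0, (e1 - e0) / (d1 - d0)))
    return segments

def _elev_at(points, d):
    """Linear interpolation of elevation at distance d."""
    for (da, ea), (db, eb) in zip(points, points[1:]):
        if da <= d <= db:
            t = (d - da) / (db - da) if db > da else 0.0
            return ea + t * (eb - ea)
    return points[-1][1]

# Tiny illustrative course: 4 km with a climb in the middle
pts = [(0, 100), (1000, 100), (2000, 180), (3000, 180), (4000, 120)]
for length, grad in segment_course(pts, 4):
    print(f"{length:.0f} m at {grad * 100:+.1f}%")
```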

Model variables were tweaked as far as practical to match the model segment times to the Zwift-recorded segment times and speeds, which came off the raw GPX file. The strategies were:

1) Minimize the difference between average model speed and Zwift reported average speed for the course. 


2) Minimize the difference between segment model speed and Zwift reported segment speed individually for uphills, flats and downhills. 

The exercise often seemed to be a tradeoff between the two. In principle, the two should be intertwined, but the way I derived segment-by-segment information from the GPX data could have introduced errors. For example, the average speed in a segment was derived from data that may have been acutely affected by outliers within that segment.

The best solution would balance the accuracy in average course speeds with the match within each of the segments. I ran a few scenarios to check the sensitivity of the model. 
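One way to make that balancing act explicit is to roll both strategies into a single weighted objective. The weighting and the data shapes below are a sketch of my own, not a claim about what is optimal:

```python
# Combine course-average error and per-segment error into one objective.
def combined_error(model_segs, zwift_segs, w_course=0.5):
    """model_segs/zwift_segs: lists of (time_s, distance_m) per segment."""
    mt = sum(t for t, _ in model_segs)
    zt = sum(t for t, _ in zwift_segs)
    dist = sum(d for _, d in zwift_segs)
    # relative error in course-average speed
    course_err = abs(dist / mt - dist / zt) / (dist / zt)
    # mean relative error in segment speeds
    seg_errs = [abs(d / tm - d / tz) / (d / tz)
                for (tm, d), (tz, _) in zip(model_segs, zwift_segs)]
    seg_err = sum(seg_errs) / len(seg_errs)
    return w_course * course_err + (1 - w_course) * seg_err
```

Tuning variables to minimize this one number, for a chosen weight, is the same tradeoff I was making by hand.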


A CdA of 0.22 sq.m was modeled from the frontal area from Bassett et al. (Med Sci Sports Exerc 1999; 31:1665-1676) using my height and weight, with the coefficient of drag Cd computed from Heil as:

Cd = 4.45 x mass (kg)^-0.45

CdA factors were set to 90% on the flat and downhill sections assuming an aero position but 100% on the uphills. 
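Putting the two regressions together reproduces the estimate. The Heil Cd fit is the one quoted above; the frontal-area fit is the aero-position regression I believe Bassett et al. report, so treat its coefficients as an assumption:

```python
# CdA estimate from two published regressions. The frontal-area
# coefficients are my reading of Bassett et al. (1999) - an assumption.
def estimate_cda(height_m, mass_kg):
    frontal_area = 0.0293 * height_m ** 0.725 * mass_kg ** 0.425 + 0.0604
    cd = 4.45 * mass_kg ** -0.45          # Heil's Cd regression, as quoted
    return cd * frontal_area

cda = estimate_cda(1.73, 64.8)
print(f"CdA ≈ {cda:.2f} sq.m")   # lands near the 0.22 sq.m used in the model
```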

Is this CdA representative of the Zwift world? Probably. Earlier, I did some aero testing using the Chung "regression method" with the same bike and weight settings on another course called Queen's Highway. The resulting CdA was around 0.21 sq.m (Fig 2). The Crr values were bonkers, so I instead tried the Chung "virtual elevation method" and achieved better results (Fig 3).

In the VE method, the CdA was more like 0.28 sq.m and Crr around 0.0032. To achieve that, I had to constrain Crr to 0.0032 and used Goal Seek to home in on the CdA value that made the virtual elevation match the expected elevation profile. Since CdA and Crr trade off against each other in the estimation, I can only find out how sensitive CdA is to a given Crr by changing the Crr value. For this exercise, I simply fixed Crr at 0.0032.

Fig 2 : (click to view) Results of aero testing done by Chung method. Spreadsheet courtesy of Alex Simmons , Google Wattage Group

Fig 3 : (click to view) Results of the Chung virtual elevation method. Spreadsheet courtesy of Alex Simmons, Google Wattage Group.
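For the curious, the bookkeeping behind the VE method can be sketched in a few lines: for trial CdA and Crr values, back out the elevation change implied by power and speed each second, then (as Goal Seek did in my spreadsheet) nudge CdA until the virtual profile overlays the known one. The mass, speeds and constants below are illustrative, not my actual test data:

```python
# Core bookkeeping of the Chung virtual elevation method.
def virtual_elevation(power_w, speed_ms, dt, mass_kg,
                      cda, crr, rho=1.225, g=9.81):
    """Elevation profile implied by power and speed for trial CdA/Crr."""
    elev = [0.0]
    for i in range(1, len(speed_ms)):
        v = speed_ms[i]
        # kinetic energy change between samples
        dke = 0.5 * mass_kg * (speed_ms[i] ** 2 - speed_ms[i - 1] ** 2)
        # aero + rolling losses over the sample interval
        losses = (crr * mass_kg * g * v + 0.5 * rho * cda * v ** 3) * dt
        # whatever energy is left over must have gone into (virtual) height
        dh = (power_w[i] * dt - dke - losses) / (mass_kg * g)
        elev.append(elev[-1] + dh)
    return elev

# sanity check: if power exactly balances the losses at constant speed,
# the virtual elevation stays flat
flat_p = 0.003 * 71.6 * 9.81 * 10 + 0.5 * 1.225 * 0.25 * 10.0 ** 3
ve = virtual_elevation([flat_p] * 60, [10.0] * 60, 1.0, 71.6, 0.25, 0.003)
print(f"VE drift over 60 s: {ve[-1]:.6f} m")
```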

From the above results, it appears my simulation tests using a CdA between 0.22 and 0.29 sq.m and a Crr of 0.003 were alright. However, I'm pretty unsure of the modification factors used for the uphill, downhill and flat sections. Slope is hardly the reason for a rider to change position; in fact it must be speed that sets the body position. I've therefore assumed speed to be low on the uphills, increasing the CdA factor, and high on the downhills and flats.

The Crr of tarmac was chosen as 0.003, the same value I used for Fulgaz. A Crr of 0.003 is representative of a high quality tire on a smooth road. Note that setting the Crr for glassy segments at 65% of tarmac's is arbitrary and hypothetical.

Other factors like the acceleration due to gravity, relative humidity, altitude and air temperature in Mighty Metropolitan were all based on data from NYC.


I've attached the results from the tabulation below. 

Fig 4 : (click to view) Tabulation of different case runs with the variables chosen to run the model. The first two runs are representative of earth while the rest are whackier attempts with whimsical variables.

Fig 5 : (click to view) Model performance compared to the "virtual" performance in Zwift for segmented distances of the course using the given variables in Case# 1. 


As Fig 5 shows, the model did well to bring down the error in overall course time and speed but struggled to match the Zwift-recorded speed and time data for individual segments. In the best "earth" scenario, Case #1 (see Fig 4), the model flew down the downhills but rode the uphills and flats slower. Trying to improve segment performance traded off against course average performance, as shown in Case #2.

The segment-specific speed and time matching was not that great. This could also stem from errors in the speed and time calculations within the segments themselves, which in turn stem from irregularities in the original GPX file. Moreover, since the GPX file is a continuous speed run of my avatar, with one segment's exit speed feeding into the next (such as steep downhill speeds leading into a climb), the transient nature would differ from a purely steady-state analysis in the model.

In the other whacky attempts (#3-#6), I employed a "non-earth" scenario by manipulating air density and the acceleration due to gravity. The best whacky attempt (#6) gave a near-identical error in course average speed and time to Case #1, but with improvements in the segment-specific matching. These results are "whacky" because a change in the acceleration due to gravity and air density should also affect the derived CdA values; in other words, they are all dependent on each other. However, I've ignored that obvious complexity and stopped further attempts here.


I tried to reverse engineer the physics variables and parameters set in the whacky world of Zwift. The model came close, but a closer match could be achieved only with inputs of whimsical numbers. A purely steady-state cycling power model is perhaps not the best tool for matching running-segment data; overall, however, the model course time matched the Zwift course time closely.

If you don't know what to make of this fun attempt, don't worry. I'm just as puzzled about how the world of Zwift works. And perhaps I just said it. The world of Zwift may not even be a world on earth. We would like to assume so, but it might just be that we're in a parallel world to earth, with similar city names and hypothetical physics. Perhaps the flying cars and glassy climbs, where riders seem able to climb at 40 kph and descend at 80 kph, are proof enough.

As a cyclist who has spent over 15 years in tune with the world around him and how his bike rides in it, the way Zwift overreports speeds on flats and downhills seems overly flattering. That said, I love my in-game CdA and rolling resistances and whatever other Zwifty physics constants there might be. The enjoyment of Zwifting far outweighs the eccentricities of the software.


*  *  *

Sunday, September 20, 2020

The Race of a Lifetime : Tadej POGAČAR's Stage 20 Time Trial Analysis


Great comebacks are always a fascination for sports observers, both from an entertainment and a statistics perspective. Don't we all live for that moment when fortunes are reversed and the underdog wins? Social psychology even has a special name for the pleasure spectators take in the favorite's downfall - schadenfreude.

Such a reversal of fortune happened during the 36.2 km individual time trial of Stage 20 of the Tour de France, when the 21 year old Tadej Pogačar clawed back nearly 2 minutes on his nearest rival Primoz Roglič, all but securing the coveted yellow jersey and taking home 500,000 Euros in hard won prize money.

This was a rare feat to witness 20 days into the 3500 km Tour de France, and many had made up their minds that 57 seconds was too large a chunk of time to win back from a highly motivated Primoz, who had been sitting in yellow for 11 days in a row. In the aftermath, the sport's pundits will be looking closely at how the youngster accomplished this, beating just about every veteran of the time trial format who contested that day.

Allow me to devote a brief section below to the analysis of the actual time trial performance and the corresponding power demands without going too much into the mathematics of it all. Please note this analysis remains to be validated since the official performance data from Team UAE Emirates is unavailable to the public as of today. Sources of my information are highlighted below and where required, educated guesses are employed. I also discuss my results towards the end of the article.

Assumptions & Considerations

I've used the following assumptions & considerations in this first order analysis :

  • Weight/Height : 66 kg/176 cm (Source)
  • Assumed Drag Area, CdA : T1/T2/T3/Finish = 0.22/0.24/0.3/0.3 sq.m (arbitrary but educated)
  • Assumed Rolling Resistance Co-efficient, Crr : 0.002-0.0023, 25mm width (Vittoria Corsa tubeless)
  • Assumed drivetrain efficiency : 98%
  • Bike T1-T2 : TT bike w/ rim profile 60mm/Full Disc at 8.3 kg
  • Bike T2-Finish : Road bike w/ rim profile 30mm/30mm at 6.8 kg (current UCI limit, source)
  • Gear : Aerodynamic skin-suit and streamlined TT helmet
  • Weather : Historical weather for 3-5pm local French time w/ winds 8.5-12 kph at 93-105 degrees range.
  • Roads : Good (smooth asphalt) w/ mountainous terrain
  • Course GPX source : Ritchie Porte's Strava data  
  • Performance time data : Pro Cycling Stats 
  • Model used : A widely cited & validated general purpose model of human power requirements in cycling
  • Secondary power data for comparison : Thomas de Gendt's Strava data for Stage 20


The race course was broken into 4 segments corresponding to the official time checkpoints for the stage. A first-order physics model was used in combination with the official timings at those checkpoints to reverse calculate a suitable matching power output. I say "suitable" because the numbers could move up or down depending on the actual conditions; from the potential range of power outputs, this is a workable number for the rider, as I validate below.
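Under the steady-speed assumption, the reverse calculation for one segment is direct: the split time fixes the speed, and the power balance then fixes the power. Here is a sketch; the 16-minute split, the 1 kg kit allowance and the mountain air density are my illustrative assumptions, not official data:

```python
# Reverse calculation for one checkpoint segment, steady-speed assumption.
# All numbers are illustrative, not official team data.
def segment_power(dist_m, time_s, grade, mass_kg, cda,
                  crr=0.0023, rho=1.10, g=9.81, eta=0.98):
    v = dist_m / time_s                  # steady speed over the segment
    aero = 0.5 * rho * cda * v ** 3      # air resistance power
    roll = crr * mass_kg * g * v         # rolling resistance power
    climb = mass_kg * g * grade * v      # power against gravity
    return (aero + roll + climb) / eta   # divide out drivetrain losses

# final-climb style segment: 5.9 km at ~8%; rider 66 kg + 6.8 kg bike
# + an assumed 1 kg of kit; assumed 16-minute split
p = segment_power(5900, 960, 0.08, mass_kg=66 + 6.8 + 1.0, cda=0.30)
print(f"≈ {p:.0f} W, {p / 66:.1f} W/kg")
```

With these assumed inputs, the number lands in the same neighbourhood as the figures discussed further down, which is reassuring but not proof: a different assumed split time moves it.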

Stage 20 ITT course profile


The modeling indicates that for the first two sections totaling 30.3 km, the use of a special purpose TT bike weighing in at an assumed 8.3 kg and a body shape of CdA 0.22 sq.m required an average power output of 427 Watts. The results indicate a positive split with an average power of 451 Watts for the 1st segment until T1 and 402 Watts for the 2nd segment T1-T2. 

Near T2 at 30.3 km, a bike change happened: the TT bike was exchanged for a lighter road bike, necessitated by the gradient. This climb averages 8%, kicking up to 20% in places. The bike change cost anywhere from 6-8 seconds in total, depending on how you start and stop the watch; this time cost is factored into the overall performance time.

Thus, over the last 5.9 km of this climb, the assumed 6.8 kg road bike required an approximate average of 412 Watts, an estimated 6.2 W/kg (power to rider weight). The power demands for T2-T3 and T3-Finish, approximately 3.3 and 2.6 km respectively, were 432 and 392 W (6.5 and 5.9 W/kg respectively).

The results are plotted in the image below :

(Click to zoom) : Actual performance times along with corresponding modeled average power outputs for Tadej Pogacar in the final individual time trial of Stage 20 of the 2020 Tour de France.


This is an unverified analysis based on checkpoint timings obtained from Pro Cycling Stats and other publicly available information. As per the modeling, an average power output of 419 Watts was required for this performance. What remains in question is the pacing profile over the duration, which needs to be validated with real data.

Such a power output is not totally unrealistic for Tadej: we know that in the 140 km mountain Stage 8 of the Tour, he displayed 428 Watts over the Col de Peyresourde, climbing it in one of the fastest times recorded in recent history at an estimated power to weight ratio of over 6.5 W/kg. This was after 2 massive climbs before it and 120 km in the legs.

The modeled power output of 412W on the final 5.9 km climb equates to a power to weight ratio of 6.2 Watts/kg. Compare this to Thomas de Gendt's data from the same stage where he rode with an average of 405W at a power to weight of 5.9 Watts/kg. This is consistent with Thomas' performance data that shows he climbed 1:51 minutes slower than Tadej. 

The overall data indicates a positive split of power across elapsed time duration. I justify this with two potentially valid points : 

1. High motivation at the start, giving the rider the urge to ride hard in the first half. Tadej was in fact chasing what looked like an improbable target, a 57 second deficit, to win the Tour de France. He might have purposely fired on all cylinders, banking time against the bike change and any other unforeseen events on the climb.

2. The decrease in power output in the second half might be attributed to a combination of accumulated fatigue and the sudden change in power demand and feel from switching to a lighter bike on a steep climb. The "sudden" change to a new bike, with no objective power data from the absent head unit, meant that Tadej had to gauge his effort carefully. It could be that despite a drop in power and cadence, Tadej maintained the "same" or even a "greater" level of perceived effort compared to the flat sections earlier in the course. However, this is just my speculation.

The choices of tire rolling resistance and drag area, although arbitrary, are not totally wild guesses. We know that Team UAE Emirates is sponsored by Vittoria in 2020, whose tubeless tires have reportedly exhibited some of the lowest rolling resistances at race speeds. Therefore, I started with an ideal case of 0.002, increasing it to 0.0023 for the climb. I figured the weaving on the climb at slow speeds, combined with the quality of the road on the gradient, poses less than ideal conditions, justifying the small increase in Crr.

Reported co-efficients of rolling resistance for some bicycle racing tires at race speeds. Source : Aerocoach

Professional TT riders are known to be slippery, exhibiting well under 0.25 sq.m of drag area in ideal conditions (smaller riders reportedly present less than 0.2 sq.m!). I started with an ideal scenario of 0.22 sq.m in the TT position, given Tadej's height and weight, increasing this to 0.3 sq.m on the climb, corresponding to a climbing position with the hands on the hoods. Again, these numbers are arbitrarily chosen and there is no way at present to verify what the real numbers in open terrain might be. I do have some references from a Twitter conversation suggesting my choices are conservative for a top professional rider.

CFD simulation results showing the individual contributions of wheels, bicycle and rider to CdA as well as the net CdA. Source : Fabio Malizia, Katholieke Universiteit, Leuven

The total system weight with rider and all accessories is an unknown. A premium TT bike setup of 8.3 kg and a lightweight road bike setup of 6.8 kg are not unexpected and match recorded observations on the internet. However, the weights of his kit, shoes, helmet, bottle etc. are unknowns. I have reason to believe these total under 1 kg; still, the uncertainty in the final-climb analysis stems from the uncertainties in system weight and rolling resistance. Regardless, the modeled power outputs are likely not very far off from the actual numbers.


I titled this race as the "race of a lifetime". Indeed, performances like these are hard to come by simply due to the immense difficulty of turning around such time advantages over a pile of fatigue and mental exhaustion 20 days into the Tour de France.

In some respects, Tadej's race performance has been likened to the pivotal moment in 1989 when the American Greg LeMond, bustling with energy and ready to try new technologies, beat the yellow jersey holder Laurent Fignon with the use of aerodynamic gear, in turn winning the Tour de France.

Whether Tadej's victory was a matter of such marginal gains at the end of the day is debatable. Yes, two purpose-built bikes were used in the time trial in an unusual manner, but this is increasingly common in top races these days. Moreover, unlike in 1989, Primoz and Tadej were arguably evenly matched in technology, funding and the competent attention required to apply that technology. In fact, on race day they both undertook bike changes before the 6 km climb, so any small variations in equipment came down to supply differences from the equipment sponsors.

Did Tadej just ride his usual top race, as he does every time, while Primoz slowed and fizzled out? Well, I think that is clear to see. A race is indeed won by the rider who slows the least. And what prompted this spectacular fall when the day demanded the best? Whether it was the massive pressure upon Primoz's shoulders, or the failure of his power pacing model, or the fatigue, or ALL of the above, we will not know for sure.

What speaks to me from this performance is that marginal gains did not win; something else did. Certainly Tadej rode the time trial of his life and converted the opportunity of a lifetime into a magnificent victory. And I think in that moment, the individual qualities that make one rider better than another in the heat of the moment won. It really is a victory for the human element.

Years after his crushing defeat in the 1989 Tour, Laurent Fignon would write that despite getting over it, "you never stop grieving over an event like that; the best you can manage is to contain the effect it has on your mind." I hope that Primoz, as amazing a rider as he has been to reach this level, is able to contain the effect of this race outcome on his mind and move on. He has more than a few good years of top-level fighting left in him. But an able and worthy opponent stands beside him to check that, in the form of Tadej Pogačar.

Thanks for reading. Comments and observations welcome below.

Sunday, August 30, 2020

Fulgaz App : Validating Model Prediction & Performance Results


After my self-inflicted Poor Man's Tour de France that ended on 22nd August, I took a week of recovery and plunged into the Fulgaz app's 3 week fundraising campaign called the French Tour. This "curated" Tour campaign features most of the celebrated climbs of the Tour de France in the high Alps, along with other famous circuits in and around France. With a real-time leaderboard and 381 virtual kilometres with 14,000 m of climbing over 21 stages, it's a challenging event to keep my mind occupied while the actual Tour de France plays out.

When riding the Tour in Fulgaz, you see a beautiful HD or 4K video of the course taken by a volunteer rider with a high resolution camera. The volunteer obviously rides the course at their own capacity, so in the app the speed of the footage can be set to "reactive": it effectively speeds up or slows down based on how closely you match the recording (for example, 1x, 0.9x, 0.8x or >1x).

Being new to Fulgaz, I was quite impressed with the app's built-in features and sliders to "tune" nearly everything that has an appreciable impact on the ride - for example system weight, rolling resistance, drag area, even wind speed and direction. The app loads extremely fast, about a second on my Windows 10 PC with 32 GB of RAM. What's more, you can download all the high resolution videos to stave off any buffering troubles. A download takes about 15 minutes for a full HD video on my modest internet connection.

All this was fascinating, given that a) I'm quite new to indoor cycling apps and b) Zwift, another leading indoor cycling app of which I'm a paying customer, keeps a lot of these variables under tight secrecy so effectively you have little clue what is driving the model. 


Some time ago, I built myself a cycling performance model for personal use. It is based on Martin's power model for cycling, which I like to use for personal and coaching-related estimation purposes. I can build as many segments of a course as I like in the model and make tweaks to inspect how they change my performance. This is handy for climbing and TT predictions and even drafting simulations.
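The chaining idea behind such a segment model is simple: solve the Martin-style power balance for speed on each segment, then sum the segment times. A bare-bones sketch with made-up segment data, not the actual Galibier inputs:

```python
# Chain a steady-state speed solve across course segments. All values
# below are illustrative, not the actual Galibier inputs.
def solve_speed(power_w, mass_kg, grade, cda, crr, rho, g=9.81, eta=0.975):
    """Bisect for the speed where the power balance meets the given power."""
    lo, hi = 0.1, 30.0
    for _ in range(60):
        v = (lo + hi) / 2
        demand = (0.5 * rho * cda * v ** 3
                  + (crr + grade) * mass_kg * g * v) / eta
        lo, hi = (v, hi) if demand < power_w else (lo, v)
    return (lo + hi) / 2

# (length_m, grade) per segment of a made-up climb
course = [(1000, 0.07), (1000, 0.085), (1000, 0.06)]
total_s = sum(d / solve_speed(230, 75, s, cda=0.3, crr=0.003, rho=1.0)
              for d, s in course)
print(f"predicted time: {total_s / 60:.1f} min")
```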

For Stage 3 of the French Tour, registrants had to climb the 1200 m vertical of the Col du Galibier. I was interested in how my performance results in Fulgaz would compare with the model predictions for the same power input. So I did the Galibier this morning, staying completely aerobic, sweating buckets, powers tuned to steady perfection with a Computrainer erg controller and a second Powertap pedal (those curious how I used the Computrainer with Fulgaz can send me an email or comment). Once I had my performance, I fed the same powers into the model along with the same driving variables I'd input into the app. Results are below.


Given the assumptions I used (stated in the graphic below), the app performance results and the model predictions converged very well, which I'm pleased with. This gives me further trust in the app. Note how the positive and negative errors negate each other over time. Also note that the errors are generally within 5%, and the average error over the 17.9 km of segments I manually built is -0.26%.

(Click to view) "Virtual" performance in the Fulgaz app compared to model predictions for given power. The ride was done on a Computrainer using Fulgaz app. Strava results :

I believe the errors are partly due to :

1) The chosen granularity of the course, which is one km at a "constant gradient". In reality, the road might step up or down several times within a kilometre. For my purposes, however, this suffices.

2) I did not ride the km segments at "constant power". In fact, I modulated it based on how I felt.

3) I've assumed a constant rolling resistance per segment of 0.003. If the Fulgaz app changes rolling resistance in real time based on the segment you're on, that could affect the speed slightly. 

Also note that I have not applied an altitude power-attenuation ("de-rate") in the model, because this is a virtual environment. However, we know for a fact, from tested runners and cyclists alike, that aerobic capacity drops at moderate to high altitudes, by how much depending on acclimatization levels and individual attributes. So in reality, actual times are very likely going to be slower. How much slower is another conversation; I hope to tackle that in an upcoming post.
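For what it's worth, a de-rate could be bolted onto the model as simply as below. The 6% per 1000 m above a 1000 m threshold is a round-number placeholder of mine, not a validated physiological curve:

```python
# Placeholder altitude de-rate: sea-level aerobic power falls a fixed
# fraction per 1000 m above a threshold. The 6%/1000 m figure is a
# round-number assumption, not physiology.
def derate_power(sea_level_w, altitude_m, pct_per_km=0.06, threshold_m=1000):
    excess_km = max(0.0, (altitude_m - threshold_m) / 1000)
    return sea_level_w * (1 - pct_per_km * excess_km)

print(f"{derate_power(250, 2642):.1f} W near the Galibier summit")
```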


The close agreement between my model and the actual performance on Fulgaz makes the Fulgaz app a reliable training tool, in so far as it is used for constant, steady-speed climbing (I have yet to test it for solo TT efforts against the wind). It also validates the Martin model (which has probably been done several times before by several people). When I shared this article with Mike Clucas of Fulgaz, he essentially confirmed that my reverse engineering closely matches the inputs that drive their model, at least in curated events such as the French Tour.

This post generally speaks to the need for indoor cycling training apps to make transparent to customers what drives their in-game physics. If the in-game physics is not transparent, predictions based on widely used open source models will differ vastly from in-app performance.

If the variables that impact the in-game performance are not transparent, you can't effectively do "what-if" predictions as almost all cyclists do in real life ("if I ride with x equipment and/or shed a few pounds, how would that affect my performance?"). This does not help those who take their training and racing very seriously and like to pre-plan for the event.

One might argue that indoor cycling apps are built like "games", and hence the physics can deviate to an extent simply because it is a game. But there can be impacts. Depending on the magnitude of the deviation, a host of things can be affected, ranging from perceived exertion, fatigue, and CP and W' dynamics to, most importantly, the nutritional needs an app-based performance requires. Either way, if you can't predict something with physics, it's unverifiable, unpredictable and, might I add, possibly unstable.

If on the other hand, all indoor cycling apps used a verifiable model, one could effectively standardize/minimize/account for a source of variability while the rest of the differentiation can be in the graphics, software performance and other perks unique to each app. 

I look forward to actually climbing the beautiful Galibier in reality, if I'm lucky enough to put together my coin collection and go to France. Huge thanks to everyone at Fulgaz for keeping me entertained during this tumultuous time.


Sunday, August 23, 2020

The Poor Man's Tour de France : Virtual Stage Racing in GT Mimicry

Readers might recall that last year I attempted the Poor Man's Giro d'Italia, a tongue-in-cheek name for a stage racing simulation in which the objective was to follow the Giro while riding "short" stages pretty much every day by myself on local roads. The main motivation behind the exercise was to collect data and compare it to research studies on Grand Tours and Grand Tour racers.

I'd wanted to replicate something like that this year but with some additional realism to racing. Obviously for this to happen, the intensities would have to be high and I'd have to race with other people. With the whole Covid-19 situation demolishing the race calendar throughout the world, I turned to Zwift for the obvious solution. 

And thereby, I began another self-inflicted stage racing attempt called Poor Man's Tour de France in July. 

I have a few points to make on this mini-adventure before I share the data :

1. The races began on 17th July and lasted up to 21st August. All race results are recorded in my Zwift power user profile. I started Zwift as a beginner in the E/D category and moved up to C by Stage 13. (To download my data in Excel .csv format, click on the plot in Figure 2 below, which links to the tabulated data.)

2. All races were done with a single sided pedal based power meter and a heart rate monitor on a non-smart trainer. 

3. The trainer used was the Feedback Sports Omnium Over-drive unit. This is a roller unit which is extremely portable, perhaps the most portable of all trainers. Owing to the direct contact between tire and roller, rolling friction and the dynamics of tire pressure become a bit more important than on direct-drive units. The unit has minimal inertia; therefore, there is little to no way to coast during racing. If you stop pedaling, you lose power and stop very quickly. On the plus side, riding this trainer has considerably improved my pedaling conditioning. The direct wheel-on-roller contact also gave me instant audible feedback on stomping vs smooth pedaling patterns.

4. For lack of a direct drive setup, I was constrained by the inherent power curve of the above trainer. With the gearing available to me and that power curve, I rarely pushed cruise powers past 200 Watts for fear of damaging the rollers (I'd already damaged one earlier this year and lost nearly 3 weeks waiting for a warranty replacement to be shipped from Hong Kong!). This also limited my short maximal sprint power outputs to within 300 Watts (I was generally not interested in sprinting).

5. Choice of daily distances were variable. Terrain type was a mix between rolling hill races, mountain stages, few crits and uphill time trials. I skewed the race stages more towards rolling hilly races. 

6. Racing every day on Zwift while maneuvering around my time constraints as a parent of a 2 year old wasn't easy. Therefore, I took a few more recovery days between stages than would be standard for a Tour. In general, I didn't go more than 3 days without a race, but the norm became racing every second day as fatigue started accumulating.

Figure 1 : The author's "pain cave", cobbled together during Covid-19 shelter in place restrictions in UAE. Materials used : book shelf, baby high chair, normal chair, ironing stand, yoga block, a weighing machine, Feedback Omnium Overdrive trainer, road bike, laptop and a 19 inch wide screen monitor. 

Below is the interactive data for all 21 stages of the Poor Man's Tour. Note that data is presented against a logarithmic y-axis to make the plot more readable. Scrolling over the data lines should show data points.


Table 1 : List of races done during 21 day Poor Man's Tour de France on Zwift

Figure 2 : Data for 21 days from a self-inflicted stage racing simulation called The Poor Man's Tour de France (top to bottom) - Total heartbeats, calculated calories, Zwift reported calories, work done, elevation, Trimp points, average heart rate, bike stress, normalized power, average power, average cadence, distance, RPE, TSS/km & duration. Note that BikeStress is a training metric native to Golden Cheetah which establishes race intensities as a function of duration and intensity. Click on the line to view the data.


1. Total Distance, Elevation & Calories : Over the 21 stages, I completed 600 km of racing with a net ascent of 8666 m, burning an estimated 12000-13000 kcal. This equates to 17% of the actual Tour de France distance, with an elevation gain nearly the height of Mt. Everest. These are modest numbers.

2. Heart rate : The range of racing heart rates was 151-194 bpm across the 21 stages. The highest heart rates featured in stages with high stochasticity in pacing effort. For example, the two crits I attempted on Stage 9 and Stage 14, both on rolling terrain, showed the highest heart rates. However, other crits I attempted did not feature high heart rates (for example Stage 18). Although the normalized power for those stages was also high, that alone does not explain the higher heart rates. Perhaps cadence offers a clue, meaning stages with higher cadence could feature higher heart rates. There may also be a hidden confounder outside this data (diet, sleep, fatigue, other activities in life...). 

3. Power : Normalized power ranged from 117 W to 193 W over the 21 days of racing. Interestingly, in the early stages I was just getting used to riding at high intensities on a trainer and not wholly happy with the cooling air flow available to me, so those stages featured low powers at high heart rates. As the stages progressed, I got fitter in terms of delivering higher power to the pedals at similar heart rates. I also got myself a bigger industrial-size fan which could push out more air volume! Powers plateaued as racing progressed, which I attribute to day-to-day fatigue and the inherent power curve limitations of the non-smart trainer. 

4. Aggregate stress : Over 21 days of riding, the aggregate TRIMP-based stress was 3008, for a daily stress of 143 AU/day. The aggregate BikeStress (a correlate of TSS) was around 2124, giving a daily figure of 101 AU/day. These were all calculated in Golden Cheetah. Total work done was 12036 kJ, an average of 573 kJ/day. On a per-day basis, these numbers are higher than the same data from the Poor Man's Giro d'Italia. 
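As a quick arithmetic check, the per-day figures above follow directly from the totals (a minimal sketch; the numbers are the ones reported in this paragraph):

```python
# Totals over the 21-day block, as reported above
days = 21
trimp_total = 3008       # TRIMP-based stress, AU
bikestress_total = 2124  # BikeStress (TSS correlate), AU
kj_total = 12036         # total work, kJ

print(round(trimp_total / days))       # -> 143 AU/day
print(round(bikestress_total / days))  # -> 101 AU/day
print(round(kj_total / days))          # -> 573 kJ/day
```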

5. Distance specific intensity : In terms of TSS/km and Trimp/km, two metrics that may be indicative of ride intensity per unit distance, the highest values came from stages that featured either a mountain climb time trial, a mountain race or a high intensity crit. For example, Stage 3 (L'Etape du Tour Stage 3) and Stage 7 (Alpe du Zwift TT) posted the highest values of intensity per distance. This again agrees with my previous findings from the Poor Man's Giro d'Italia and the research data I posted there from the Sanders investigation of Grand Tour racing. With distance and per-day stress metrics stated as above, one area of inquiry is whether the numbers differ between indoor and outdoor racing. With the constraints of air flow and cooling indoors, one might expect higher race intensities indoors. Comparing the Zwift racing data with last year's Poor Man's Giro, the distance specific intensity metric Trimp/km is definitely higher this year on Zwift. This is not exactly an apples-to-apples comparison, however, because I did not do a true "stage racing simulation" last year. Still, the argument that indoor intensities should be higher is a rational one and something to discuss and explore further. 

6. RPE : Across 21 days of racing, RPE varied from a low of 6 to 10! The hardest day was Stage 2 (L'Etape du Tour), which featured a mountain ascent of 1538 m. Because this was one of the earlier stages, I was in no shape to climb continuously for 3 hours with poor air flow. In the final 20 minutes, I hopped off the bike once to take a break, thinking I was going to have a heart attack. Part of the challenge that day in my pain cave was the lack of air flow to cool myself for that long! The standard deviation in RPE across the stages was quite low, however, indicating that intensity was more or less similar from day to day. 

7. Cadence : My average cadence across all stages was 83 rpm. The highest cadence came during Stage 19, a rolling-hills ITT of 28 kilometers, where I staved off fatigue by riding at 90+ rpm. I reckon the stages with high cadence were excellent stimulus for the VO2max region of training intensities. One thing I'm pleased with from this challenge is that I became quite experienced at regulating my cadence to tune my perceived effort in different racing situations. This factor may also have affected my heart rates over the course of each day's race. 

8. Nutrition : In general, racing every day on Zwift means relying heavily on carbohydrates; success seems to depend on how well the stores of glycogen are topped up between races. My fuel system is biased towards carbohydrates, which may be partly explained by the fact that I'm a habitual carbohydrate consumer. There were some days where I didn't have the luxury of managing the diet well enough to feel fully topped up before the next race. On days where I felt I needed an extra "boost", I used the top ergogenic aid known to man, that's right - Coca Cola! Caffeine works. 

9. Race Competition : In general, I have only good things to say about Zwift as it is a tremendous motivational tool. During the 21 stages, I enjoyed many days sitting in the peloton and sharing the effort that got us all across the line with good times. But I was in no way a match for those who could exploit some of the "gaming" aspects of virtual racing.

I think Zwift has to figure out some way to weed out sandbaggers. Although the final listings on the Zwift Power website exclude those racing below their actual categories, race dynamics are affected by the presence of these individuals. For example, it is often the first 2-3 minutes of an e-race where your placement is made or broken, owing to massive power surges to find position. More able riders racing below their category can compel others to ride just as hard to get on their wheel; as a result, gaps form that disadvantage the lower-order riders who have "missed the draft". This point may be moot. 

Figure 3 : The author in a "break" of select group of riders from the Namibian Race League
during Stage 17. 


The results discussed above generally agree with the data from Grand Tours that mountain stages are where the action really is in terms of stress and intensity. Although race intensities were high and the day-to-day monotony was also high, Zwift provided a great way to beat that monotony with the ability to select from numerous races spread across different maps with different competitors. For example, I found South Africans and Japanese racers subtly different from Brits! Maybe that is an imaginative observation, but it is still an observation. 

Overall, while I found I was making improvements in the duration specific power outputs as the races progressed, I found myself hitting a plateau due to a combination of fatigue and power curve limitations on the trainer. In other words, there were diminishing returns after a point. 

From the Poor Man's Tour de France racing challenge, I quickly learned which e-races suited me and which didn't. Therefore, the choice of many rolling hilly races was intentional. I also included mountain stages. Flat, all-out races were few. 

If I did the Poor Man's Tour de France again, I'd figure out a way to balance the percentage of race distance spread between mountains, flats and rollers. But I can't say the actual Tour de France, traditionally or even this year, has been balanced either! We often hear that Tour stages are deliberately designed to suit some of the top French stars. That doesn't seem any different this year. 

In conclusion, given the constraints on my time, I think this was sufficient racing stimulus. Due to the plateauing of power and the accumulating fatigue as the stages progressed, I had to draw the line somewhere to minimize the losses. 

With a few days left before the actual Tour de France, I will be able to smugly soak in the racing footage and maybe even pretend to relate it to my own experience, ha!

Wednesday, July 15, 2020

Tour de France : Key Statistics

The following is an easily accessible graph showing key statistics for the Tour de France from the years 1903-2019.

I've highlighted some epochs in the data, namely the two World Wars and the 1990s doping era culminating in the Armstrong saga. For the war years, the plot makes it seem as if racing took place, but this is just a visual interpolation effect. No racing took place during those years.

To go along with this data, you might also like my previous post on modern bicycles and cycling speeds. There, I explored whether bicycles themselves have made any appreciable impact to speeds.

Hopefully this datasheet can be used in future years as a live plot.

All data obtained from and compiled with Datawrapper.

Wednesday, April 29, 2020

Functional Threshold Power : A Scientific Scrutiny

Certain entities in the world have originated competing claims about cycling performance concepts, test protocols, and the training zones the rest of the world must adhere to. The astute athlete-cum-observer would want to find out which of these stand up to scientific scrutiny, separating myth from fact.

In that spirit, this post is an appraisal of the definition and estimation techniques of Functional Threshold Power (FTP) which are at the core of this power-based training concept. It follows from the last post, where I explored a scientifically vetted threshold concept called critical power (CP) and most of its nuances, including application related issues.

This post attempts to explain with simple arguments and scientific references why FTP, although as "useful" a performance metric as it may be to some people, is a pseudo-scientific concept at best. 

I. Introduction to FTP : Definition and Estimation

FTP was conceptualized as a field-based, practical method of estimating a threshold phenomenon using cycling power meter technology. Like Critical Power, it is used as an endurance index, to design training prescription, and to classify cycling talent. A threshold, as a reminder, is an intensity marker just above which physiological responses sharply change, whereas below it they attain a steady state within a tolerance band. 

The training concept was formalized by Coggan in the book Training and Racing with a Power Meter (TARWAPM) in the early 2000s and an ecosystem was built around FTP consisting of sister metrics (NP, IF, TSS, etc) and software marketed by Training Peaks group. Some of the history behind how this all came to be is documented on TARWAPM's blog page.

Andrew Coggan PhD signing copies of TARWAPM books. Source : TARWAPM's blog page. 

Let me cut to the chase and quote the 3rd edition of TARWAPM's definition of FTP, marking in red some key terms that I will explore further : 

"FTP is the highest power that a rider can sustain in a quasi-steady state without fatiguing. When power exceeds FTP, fatigue will occur much sooner (generally after approximately one hour in well-trained cyclists), whereas power just below FTP can be maintained considerably longer. [1]"  --- (1)

The text lists seven different methods to estimate FTP.

1) From a power & time-frequency distribution chart from cycling training and racing data.
2) From routine steady power intervals, repeats or longer climbs.
3) From normalized power (NP) during hard mass-start races of approximately one hour.
4) From a one-hour time trial by inspecting a smoothed time-series plot of power.
5) From a power-duration model obtained by testing for CP, where the resulting model-derived value of CP is suggested to be interchangeable with FTP.
6) From the proprietary mFTP model in WKO4.
7) From the FTP testing protocol consisting of a 28-minute warmup, a main set of 20 minutes, and a cooldown of 10-15 minutes. 

The 7th method is probably the most notoriously proliferated in the cycling lexicon. The premise is that subtracting 5% from the average power of the 20-minute main-set time trial, done after a hard warm-up, estimates FTP (hereafter called FTP20 for simplicity). Only recently has A. Coggan distanced himself from this estimation technique, saying that it was Hunter Allen's contribution and not his.
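For concreteness, the arithmetic behind two of these estimates, normalized power (method 3) and the FTP20 correction (method 7), can be sketched as follows. This is a minimal illustration assuming 1 Hz power samples, not the full test protocol:

```python
def normalized_power(power_1hz, window=30):
    """Normalized Power: take a 30 s rolling average of 1 Hz power
    samples, raise each to the 4th power, average, then take the
    4th root."""
    rolled = [
        sum(power_1hz[i:i + window]) / window
        for i in range(len(power_1hz) - window + 1)
    ]
    return (sum(p ** 4 for p in rolled) / len(rolled)) ** 0.25

def ftp20_estimate(avg_power_20min, correction=0.95):
    """FTP20 method: subtract 5% from the average power of a
    20-minute all-out test done after a hard warm-up."""
    return correction * avg_power_20min
```

For a perfectly steady effort, NP equals average power and rises above it as the effort becomes more variable. The 0.95 factor is the book's suggested correction; it is precisely this factor whose general applicability is questioned later in this post.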

There are other methods of estimating FTP coded into popular programs like Zwift and TrainerRoad, and yet another confusing bunch of "new" test protocols on Training Peaks' website. The validity of these techniques is in question, besides the obvious danger of under- or overestimating some individuals. Therefore, this post is purely focused on the original concept of FTP and its test protocols as codified in TARWAPM. 

II. A Deeper Inspection of FTP

Let me inspect in slightly more detail the terms highlighted in red in the definition of FTP in (1). 

A) The Issue of "Quasi-steady State" 

A quasi-steady state describes a transient situation where physiological variables such as blood lactate and VO2 are rising but remain within a tolerance band. Using field power estimates, a quasi-steady state can be judged from the variation of power values over the duration. 

Scientific studies demonstrate the ability to work at a "quasi-steady state" at critical power, where workloads are on the order of 10-15 Watts higher than what can be sustained for one hour. Critical power, from my previous post, corresponds to a workload approximately midway between lactate threshold (or gas exchange threshold) and VO2max.

Moreover, the time to exhaustion at that workload is shorter, around 24 minutes or so. This has been demonstrated in both physically active subjects and competitive cyclists (see Figures 1, 2 below). Furthermore, even a shorter maximal time trial lasting around 30 minutes was demonstrated to show quasi-steady-state behavior in both power output and physiological variables (see Figures 2B, 2C). 

Therefore, the FTP concept systematically underestimates the wattage, or workload, that can be achieved at a quasi-steady state. From a lab testing standpoint, the same observation was noted with MLSS, suggesting that a steady state in VO2 can be achieved even beyond the power at MLSS [15].

Figure 1 : The Poole study showed that in physically active subjects, constant-load exercise at critical power, a metabolic demand much higher than that at lactate threshold, resulted in steady-state VO2. Time to exhaustion in these subjects was 17.7 +/- 1.2 minutes. From research reference [6].
Figure 2 : The de Lucas study in competitive cyclists showed that quasi-steady-state VO2 was achieved at workloads at CP. Here again, the group mean time to exhaustion was 22.9 +/- 7.5 minutes. From research reference [7].
Figure 2B : Quasi-steady state power outputs were shown in intense 30 min TTs conducted on well-trained triathletes. See reference [12]. 

Figure 2C : Quasi-steady states in metabolic demand and blood lactate values were shown in a much more intense, shorter TT lasting 30 minutes in well-trained triathletes. Moreover, the study demonstrates that subjects can sustain very high blood lactate values for extended time, some > 10 mM. See reference [12].

B) The Issue of "Absence of Fatigue"

The highest workload one can sustain for about an hour cannot occur in the absence of fatigue.

A well-documented study of well-trained cyclists who completed 4, 20 and 40 km time trials demonstrated that central and peripheral fatigue occurred at all distances, including the 40 km TT which took approximately 65 minutes to complete. The pattern shifts from peripheral-dominant fatigue over 4 km towards central fatigue over 40 km. In other words, the decline in the ability to produce force residing within the central nervous system was greater in the longer time trials. Given this data, exercising at the highest workload corresponding to quasi-steady state in the "absence of fatigue" is notionally incorrect. 
Figure 3 : Exercise induced impairment in the ability to produce muscular force measured in well trained cyclists who executed 4km, 20km and 40km time trials. Fatigue, central and peripheral, are prominent in all duration time trials with central fatigue being highest during longer time trials. Reference and adapted from [8]. 

C) The Issue of "Highest Power" at Quasi-steady State 

Following the arguments from A), the power output that can be sustained for "approximately" an hour does not correspond to the "highest power" at which a quasi-steady state can be achieved. The study referenced in Figure 3 shows that lactate values in the 20 km TT also stabilized around the 8 km mark and barely increased until the end spurt, despite the power being 15-20 W higher than the 40 km TT wattage.

FTP arbitrarily pegs "highest power at quasi-steady state" to a duration of "approximately" one hour, which is not actually the case. What is obvious, though, is that the workload at FTP is unambiguously a steady state and therefore not the highest "quasi-steady state" intensity. 

Figure 4 : Lactate values in a 20 km TT stabilize at the 8 km mark and barely increase until the end spurt, despite the power being 15-20 W higher than in a 40 km TT. The latter took 65 min to complete, which by the FTP definition would correspond to "approximately" one hour. Reference and adapted from [8]

D) The Issue of "Functional Threshold" As a Surrogate for Laboratory-Based Testing

FTP originated as a practical field-based alternative to lactate threshold testing. "Threshold" in concept refers to sharp distinctions in physiological responses associated with exercise slightly below and above a specific intensity value.  

However, as we have seen previously, the maximum intensity of "quasi-steady state" exercise has sustainable durations much shorter than approximately an hour. So it is questionable why FTP should lay claim to accurately representing threshold in a wide group of people. 

Having stated that, how do comparisons between FTP and laboratory indicators of threshold match up in scientific studies? 

Let's take a look : 

1) FTP compared against individual anaerobic threshold (IAT) power:  Although empirical demonstrations have shown FTP20 and IAT are close, authors of one recent study stated: "…it is difficult to accept FTP as a thoroughly valid concept. We found large limits of agreement between most variables, suggesting a high level of inter-individual variability in the relationship between FTP20 vs. FTP60 and between both measurements vs. IAT (me: stepwise lactate profile test)."  [2]  

In other words, the wide limits of agreement in a Bland-Altman plot show that any agreement between the surrogate method (FTP) and a laboratory-based marker, here IAT, is ambiguous.
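The limits-of-agreement calculation behind such plots is simple to reproduce. A minimal sketch with made-up numbers (not data from the cited study):

```python
import statistics

def bland_altman_limits(method_a, method_b):
    """Mean bias and 95% limits of agreement (bias +/- 1.96 SD of
    the paired differences) between two measurement methods."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical FTP20 vs IAT powers (W) for five riders
ftp20 = [250, 230, 275, 210, 260]
iat = [245, 238, 260, 215, 250]
bias, lower, upper = bland_altman_limits(ftp20, iat)
# A small mean bias can coexist with wide limits of agreement,
# which is exactly the pattern the authors flag.
```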

Figure 5 : Bland-Altman plot of FTP20 compared to individual anaerobic threshold (IAT) in 23 well-trained cyclists. See reference [2].

2) Against maximum lactate steady state (MLSS) power: The same authors compared FTP with another threshold concept, MLSS, and found generally good agreement. However, even in this study, wide limits of agreement were observed between FTP20 and MLSS among groups of cyclists with different training statuses, implying ambiguous agreement between the two in heterogeneous samples [3].

Figure 6: Comparison of FTP20 with MLSS in 15 cyclists - 7 trained and 8 well trained. See reference [3].

Another study that examined validity against MLSS concluded: "The results indicate that the PO at FTP95% is different to MLSS, and that changes in the PO at MLSS after training were not reflected by FTP95%.  Even when using an adjusted percentage (ie, 88% rather than 95% of FTP20), the large variability in the data is such that it would not be advisable to use this as a representation of MLSS." [14]

3) Against Lactate Threshold (LT) power: Valenzuela, Foster and colleagues came to a similar conclusion as the previous two studies when comparing LT and FTP20. They wrote: ".....caution should be taken when using the FTP interchangeably with the LT as the bias between markers seems to depend on the athletes’ fitness status. Whereas the FTP provides a good estimate of the LT in trained cyclists, in recreational cyclists, FTP may underestimate LT." [4]

Figure 7 : Limits of agreement between FTP and lactate threshold studied in 20 healthy cyclists. See reference [4]. 

4) FTP Compared Against A Range of Blood Lactate Threshold Markers: One study compared FTP20 with a range of laboratory-based blood lactate measurements: lactate threshold (LT), LT at 4 mmol blood lactate (LT4.0), Dmax-derived LT, and IAT. The main objective was to find the best correlate of FTP in a single study. 

They demonstrated that all computations resulted in numbers that differed significantly from FTP20. Despite the strongest correlation being between FTP and LT4.0, a large dispersion of approximately 100 Watts was found in the inter-individual data, questioning their equivalence. The study concluded: "...we suggest that FTP does not have an equivalent physiological basis to any of the tests used herein and, therefore, cannot be used interchangeably." [9] 

Figure 8 : FTP compared to a host of lactate parameters in 20 competitive cyclists. See reference [9].

The overall picture from these studies is that claiming FTP can be used as an accurate surrogate for laboratory-based measures of threshold is, at best, unfounded.

E) The Issue of FTP20 method and "False Sense of Precision" 

As a matter of practical convenience, the second and third editions of TARWAPM suggested the FTP20 method as a way to estimate 60 minute FTP.

The issue with this technique is that the 95% factor is probably an average for a large group of cyclists and not exactly applicable to you or me, mainly due to inter-individual variability [5]. Some people will be at 93%, some at 90%, some at 85%. This was also shown by A. Coggan himself (see Figure 9).

One prominent exercise physiologist told me: "A value of 92-93% is probably closer on average, whereas a value of 95% would, therefore bring the estimated threshold back towards 30MMP (30 minute mean maximal power)!"

Figure 9 : 95% of 20min power is not necessarily one hour FTP. Source : Facebook fan-page of TARWAPM.

As published by A. Coggan in a whitepaper in March 2003, the real effect of employing an arbitrary correction factor to 20 min power may simply be to convey a false sense of precision [10].

While it's understood that he would like to distance himself from the FTP20 method, I would add that continuing to perpetuate the false sense of precision in the TARWAPM book does not make it go away. Besides, the entire discussion of whether the correction factor should be 95% or 90% does not take away from the fact that FTP is arbitrarily linked to "approximately one hour" with an unfounded claim to being the "highest workload" at quasi-steady state. Will two wrongs make a right?

F) The Issue of FTP derived as CP from W-time plot

In my previous post, we looked into several research studies showing how critical power defines the boundary between the heavy and severe intensity domains. In numerous studies, work rates at threshold markers such as LT and MLSS were found to be lower than the work rate at CP. In fact, CP falls somewhere midway between LT and VO2max, depending on which study you look at. 

With that information available, CP is a high-intensity workload that may be sustained for only approximately 30 minutes or less. Therefore, approximately-one-hour power (FTP) and CP should not be considered interchangeable in principle without data. As one research team noted, a fresh study involving a wider cohort of subjects is worthwhile to continue to test this idea of interchangeability [11]. 

In a study conducted by Morgan and colleagues, FTP20 and CP correlated with each other, but the limits of agreement were relatively large (+10.9 to -13.1%), such that the authors argued: "...limits of agreement between CP and FTP in this study may be too large to be practically meaningful for athletes and coaches, and that the agreement between the two variables may be coincidental." [5]

The idea advanced in TARWAPM Ed.3 that FTP can be estimated from a linearized Work-Time plot and considered interchangeable with Critical Power is unfounded.
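For reference, the linearized work-time method itself is just an ordinary least-squares fit of W = W' + CP*t across several maximal efforts, with CP as the slope. A minimal sketch with hypothetical test data:

```python
def cp_from_work_time(durations_s, powers_w):
    """Fit W = W' + CP * t by least squares over maximal efforts.
    The slope is CP (W) and the intercept is W' (J)."""
    work = [p * t for p, t in zip(powers_w, durations_s)]
    n = len(durations_s)
    mean_t = sum(durations_s) / n
    mean_w = sum(work) / n
    cp = (sum((t - mean_t) * (w - mean_w)
              for t, w in zip(durations_s, work))
          / sum((t - mean_t) ** 2 for t in durations_s))
    w_prime = mean_w - cp * mean_t
    return cp, w_prime

# Hypothetical maximal efforts of 3, 5 and 10 minutes
cp, w_prime = cp_from_work_time([180, 300, 600], [350, 310, 280])
# -> CP = 250 W, W' = 18000 J for these made-up numbers
```

A clean fit, of course, does not make the resulting parameter interchangeable with FTP; that is the point of the section above.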

G) The Issue of Secret Sauce In Modeled FTP

In TARWAPM, one of the methods to estimate FTP is to model it from a collection of mean maximal power (MMP) values collected in a specific time window. The value of FTP is the resulting parameter solved from the fit, called modeled FTP or mFTP.

However, mFTP modeling is only available in the proprietary software WKO4 (now WKO5). On the Wattage forums, A. Coggan has claimed that data from over 200 MMP fits show mFTP to be sustainable for 60 +/- 13 min, and he has used this to argue that FTP is sustained for "approximately" an hour.

TARWAPM calls the modeling technique the "secret sauce", implying that the proprietary fitting method is not available for open scrutiny; only its outputs are. This roadblock might explain why most studies have used the FTP20 estimation technique to explore their research questions. Compare this to the CP concept, which is essentially open-source and amenable to research, advancing our understanding across wide groups of people and sporting activities.

H) The Issue of FTP Based Stress Metrics and One Hour 

While TARWAPM's definition that FTP is based on "approximately an hour" continues to be proliferated, other metrics in the FTP ecosystem, such as the Banister-style "Training Stress Score" (TSS), are still arbitrarily pegged to an hour. The math in the TSS formula has been designed such that 1 hour at FTP = 100 TSS. This indicates the formula was designed around an arbitrarily fixed value for convenience rather than physiological reality. Since training prescription and fitness-performance charts in the FTP ecosystem are based on TSS, the flaws propagate throughout the mathematical chain.
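To make the pegging explicit, the published TSS formula can be written out (a sketch; IF is the intensity factor, NP/FTP):

```python
def tss(duration_s, np_w, ftp_w):
    """Training Stress Score: (t * NP * IF) / (FTP * 3600) * 100,
    with IF = NP / FTP. One hour at FTP gives 100 by construction."""
    intensity_factor = np_w / ftp_w
    return (duration_s * np_w * intensity_factor) / (ftp_w * 3600) * 100

print(tss(3600, 250, 250))  # -> 100.0, whatever the rider's FTP
print(tss(1800, 250, 250))  # -> 50.0 for half an hour at FTP
```

The hour is baked into the denominator (3600 s), which is the arbitrary anchoring discussed above.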

III. Conclusion

FTP was borne out of a perceived need for field testing convenience and one might add, an entrepreneurial excitement to build a quantification ecosystem when power-meters hit the market beginning in the late 1990s. 

As a purely performance-based metric, FTP is "useful", just as the critical power concept and modeling for CP are useful. However, in comparison to CP, the number of papers scrutinizing FTP has been remarkably small. Many of them demonstrate that the validity of FTP is in question. 

I conclude with a summary of reasons why FTP must be approached with caution by whomsoever is using it or plans to adopt it : 

1) FTP's definition as the "highest" workload one can sustain in a quasi-steady state is not demonstrated in studies. It might systematically underestimate the intensity at which quasi-steady states can be achieved. This also implies that FTP is an intensity at which one is unambiguously at steady state. 

2) FTP's claim to be a valid and accurate surrogate for lab-based testing for a range of thresholds is unfounded. Besides, any claims that the concepts like critical power and FTP can be interchanged through modeling work is unfounded and probably a serious error. There have been recent calls by scientists to consider CP alone as the gold standard when the goal is to define maximum lactate steady state [13].

3) FTP's claim that it is approximately one hour of power that can be sustained without fatigue is most definitely incorrect. 

4) Despite acknowledgement of variability, accompanying metrics in the FTP ecosystem like the Training Stress Score continue to be arbitrarily pegged to an hour (1 hour at FTP = 100 TSS). This perpetuates the already widespread confusion that FTP is 1-hour power, which it is not. 

5) Widely proliferated estimation techniques for FTP, such as the FTP20 method, are inexact. As the originator of the FTP concept describes, the method simply yields a false sense of precision. However, the proliferation of this false sense in the TARWAPM book does not make it go away.

Regardless of its conceptual flaws, I acknowledge that FTP has found favor with coaches and athletes who use it simply for its training value. However, testimonials and anecdotal evidence are separate from science. Claims made about FTP and its accompanying ecosystem warrant additional scientific scrutiny. The collection of knowledge we currently have from research suggests that those claims are weak and not based on scientific fact.


1. Allen, Hunter & Coggan, Andrew. Training and Racing with a Power Meter. VeloPress. Kindle Edition. 

2. Borszcz, Fernando & Tramontin, Artur & Bossi, Arthur & Carminatti, Lorival & Costa, Vitor. (2018). Functional Threshold Power in Cyclists: Validity of the Concept and Physiological Responses. International Journal of Sports Medicine. 39. 10.1055/s-0044-101546. 

3. Borszcz, Fernando & Tramontin, Artur & Costa, Vitor. (2019). Is the Functional Threshold Power Interchangeable With the Maximal Lactate Steady State in Trained Cyclists?. International Journal of Sports Physiology and Performance. 14. 1029-1035. 10.1123/ijspp.2018-0572. 

4. Valenzuela, Pedro L. & Morales, Javier S. & Foster, Carl & Lucia, Alejandro & de la Villa, Pedro. (2018). Is the Functional Threshold Power (FTP) a Valid Surrogate of the Lactate Threshold?. International Journal of Sports Physiology and Performance. 13. 10.1123/ijspp.2018-0008. 

5. Morgan, Paul & Black, Matthew & Bailey, Stephen & Jones, Andrew & Vanhatalo, Anni. (2018). Road cycle TT performance: Relationship to the power-duration model and association with FTP. Journal of Sports Sciences. 10.1080/02640414.2018.1535772. 

6. Poole, David & Ward, Susan & Gardner, Gerald & Whipp, Brian. (1988). Metabolic and respiratory profile of the upper limit for prolonged exercise in man. Ergonomics. 31. 1265-79. 10.1080/00140138808966766. 

7. de Lucas, Ricardo & Mendes de Souza, Kristopher & Costa, Vitor & Grossl, Talita & Guglielmo, Luiz Guilherme. (2013). Time to exhaustion at and above critical power in trained cyclists: The relationship between heavy and severe intensity domains. Science & Sports. 28. e9- e14. 10.1016/j.scispo.2012.04.004. 

8. Thomas, Kevin & Goodall, Stuart & Stone, Mark & Howatson, Glyn & Gibson, Alan & Ansley, Les. (2014). Central and Peripheral Fatigue in Male Cyclists after 4-, 20-, and 40-km Time Trials. Medicine and science in sports and exercise. 47. 10.1249/MSS.0000000000000448. 

9. Jeffries, Owen & Simmons, Richard & Patterson, Stephen & Waldron, Mark. (2019). Functional Threshold Power Is Not Equivalent to Lactate Parameters in Trained Cyclists. Journal of Strength and Conditioning Research. 1. 10.1519/JSC.0000000000003203. 

10. Coggan, Andrew. (2003). Training and racing using a power meter: an introduction. 

11. McGRATH, Eanna & Mahony, Nick & Fleming, Neil & Donne, Bernard. (2019). Is the FTP Test a Reliable, Reproducible and Functional Assessment Tool in Highly-Trained Athletes?. International journal of exercise science. 12. 1334-1345.

12. Perrey, Stephane & Grappe, Fred & Girard, A & Bringard, Aurélien & Alain, Groslambert & William, Bertucci & Rouillon, J. (2003). Physiological and Metabolic Responses of Triathletes to a Simulated 30-min Time-Trial in Cycling at Self-Selected Intensity. International journal of sports medicine. 24. 138-43. 10.1055/s-2003-38200. 

13. Jones, Andrew & Burnley, Mark & Black, Matthew & Poole, David & Vanhatalo, Anni. (2019). The maximal metabolic steady state: redefining the ‘gold standard’. Physiological Reports. 7. 10.14814/phy2.14098.

14. Inglis, Erin Calaine & Iannetta, Danilo & Passfield, Louis & Murias, Juan. (2019). Maximal Lactate Steady State Versus the 20-Minute Functional Threshold Power Test in Well-Trained Individuals: “Watts” the Big Deal?. International Journal of Sports Physiology and Performance. 1-7. 10.1123/ijspp.2019-0214. 

15. Bräuer, Elisabeth & Smekal, Gerhard. (2020). VO2 Steady State at and Just Above Maximum Lactate Steady State Intensity. International Journal of Sports Medicine. 41. 10.1055/a-1100-7253.