Optimization


Written by Axiom Admin

Last updated about 1 month ago


Optimization in this context means changing your strategy's parameters — setup conditions, entry triggers, exit levels, allocation sizes, confirmation counts, cooldown bars — and observing how the results change. That sounds like a rational thing to do. Often it is. But optimization is also the fastest way to fool yourself into believing you found something real when you actually found something that fits the past and will not survive the future.

This page is about how to tune honestly. Not how to avoid tuning altogether — tuning is part of the work — but how to tell the difference between understanding your strategy better and accidentally sculpting it to match the historical data.


What is safe to tune

Some parameters have a clear relationship to a market reality you can reason about independently of the backtest results:

  • Commission and slippage — tune these to match your actual broker and asset, not to make the equity curve look better. These are calibration, not optimization.

  • Position sizing — adjusting allocation percent to match your actual risk tolerance is calibration. Adjusting it until the equity curve peaks is fitting.

  • Direction mode — choosing Long Only vs. Short Only vs. Swing Mode based on your thesis about the market is a legitimate decision. Choosing it based on which mode produces the prettiest backtest is not.

  • Risk controls — setting max drawdown and intraday loss limits to match your actual risk policy is calibration. Loosening them until the strategy no longer gets halted is fitting.

The test for whether a tuning decision is safe: can you explain why this value should be this number without referencing the backtest results? If the answer is "because my broker charges 0.1% commission," that is calibration. If the answer is "because 0.08% commission makes the equity curve look better," that is fitting.


What is dangerous to tune

Some parameters have a loose relationship to market reality and a tight relationship to the historical data:

  • Setup confirmation counts. Adding a 3-bar confirmation filter sounds disciplined. But if you chose 3 because 2 had too many trades and 4 had too few, you fitted the confirmation to the historical trade distribution, not to a thesis about how long a signal needs to persist. The safe version: "I want the trend to establish for at least three bars because my experience says one-bar spikes reverse." The dangerous version: "I tried 1 through 5 and 3 gave the best profit factor."

  • Cooldown bars. A 5-bar cooldown after cancel sounds reasonable — until you realize you chose 5 because it happens to avoid the specific re-entry patterns that hurt in the test window. Cooldown should reflect something about the market behavior you are modeling (a settling period after a false breakout, a time window after a news spike), not something about what makes the equity curve smoother.

  • Entry TTL (time to live). Shortening or lengthening the expiration on limit orders to match the specific pace of the historical data.

  • Number of setups. Adding a third or fourth setup to capture the trades that the first two missed. Each new setup is a new degree of freedom. Each degree of freedom makes it easier for the backtest to look good and harder for the strategy to generalize.

  • Take profit and stop loss levels. Adjusting percentages or price offsets until the reward-to-risk ratio looks optimal in the test window.

  • Expression conditions. Tightening or loosening threshold values (e.g., RSI < 30 vs. RSI < 25) to match the specific historical turning points.

None of these are wrong to adjust. The danger is adjusting them based on what makes the equity curve look best rather than what makes the rules make sense for the market you are trading.
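The danger compounds, because every tunable parameter multiplies the number of strategy variants you are implicitly testing. A minimal sketch of that arithmetic (the parameter names and search ranges below are illustrative, not values from the tool):

```python
from itertools import product

# Hypothetical hand-tuning ranges for the parameters listed above.
confirmation_bars = range(1, 6)          # 1..5  -> 5 choices
cooldown_bars = range(0, 11, 2)          # 0..10 -> 6 choices
entry_ttl = range(2, 12, 2)              # 2..10 -> 5 choices
rsi_threshold = range(20, 36, 5)         # 20..35 -> 4 choices
take_profit_pct = [1.0, 1.5, 2.0, 3.0]   # 4 choices

variants = list(product(confirmation_bars, cooldown_bars,
                        entry_ttl, rsi_threshold, take_profit_pct))
print(len(variants))  # 5 * 6 * 5 * 4 * 4 = 2400 implicit trials
```

Even if you never run an explicit grid search, trying values by hand walks through the same space. With thousands of implicit trials, the best-looking configuration tends to be the luckiest one, not the most real.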


What overfitting looks like here

Overfitting is not a single event. It is a gradual process where your strategy's rules become increasingly tuned to the historical data you are testing against, without becoming more likely to work in the future. Here is what that process looks like in this tool:

Scenario: The optimization false positive

You start with a reasonable thesis: buy when the price is above a moving average and RSI is below a threshold, sell at a fixed percentage take profit.

Run 1: 60 trades, profit factor 1.2, max drawdown 18%. Decent but not exciting.

You adjust: add a setup with a volume filter. Run 2: 35 trades, profit factor 1.5, max drawdown 12%. Better.

You adjust: add a second take profit at a tighter level for partial exit. Run 3: 35 trades, profit factor 1.7, max drawdown 10%. Looking good.

You adjust: tighten the RSI threshold from 30 to 25 and add a 3-bar confirmation to the setup. Run 4: 22 trades, profit factor 2.1, max drawdown 7%.

The equity curve is smooth. The numbers are strong. You are tempted to say you found something.

But look at what happened: you went from 60 trades to 22 by adding filters and thresholds. Each adjustment removed the trades that hurt and kept the trades that helped — in the test window. The 22 remaining trades are the subset of historical trades that survived your increasingly specific filter stack. They may represent a real pattern, or they may represent the specific 22 moments in this particular data set where your exact combination of conditions happened to work.

The test

Take the settings from Run 4 and apply them to a different time window — either earlier history or a held-out segment you did not look at during development. Do not adjust anything. Run the exact same settings on the new window.

Three things can happen:

The profit factor holds within a reasonable range (say, Run 4 showed 2.1 and the validation window shows 1.4–1.8). The pattern may be real. The degradation is expected — every strategy performs somewhat worse on unseen data — but the core shape survived. This is the best outcome you can realistically hope for.

The profit factor drops significantly (2.1 on the development window, 0.8–1.0 on validation). The optimization found noise. The filters and thresholds you added were tuned to the specific patterns in the development window, and those patterns did not repeat. You are back to the drawing board — not to re-optimize, but to ask what about your thesis was wrong or whether the thesis needs fewer parameters to express.

The profit factor is actually higher on the validation window. Be suspicious. This usually means the development window was unusually difficult for the strategy, and the validation window was unusually favorable. It is not proof that the strategy is robust — it is evidence that the strategy's performance is regime-dependent, and the two windows happened to represent different regimes.

This is walk-forward reasoning: develop on one window, validate on another. It is not a guarantee. But it is the difference between knowing you found something that survived at least two windows versus something that survived only the one you stared at.
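The walk-forward check above can be sketched with plain per-trade results. Assuming you can export each window's trade PnLs (the `profit_factor` helper and the numbers below are illustrative, not output from the tool):

```python
def profit_factor(pnls):
    """Gross profit divided by gross loss for a list of per-trade PnLs."""
    gross_profit = sum(p for p in pnls if p > 0)
    gross_loss = -sum(p for p in pnls if p < 0)
    return float("inf") if gross_loss == 0 else gross_profit / gross_loss

# Hypothetical per-trade PnLs from the SAME settings on two windows.
development = [120, -50, 80, -40, 200, -60, 90, -30]  # the tuned window
validation = [60, -70, 40, -90, 110, -50]             # the held-out window

pf_dev = profit_factor(development)
pf_val = profit_factor(validation)

# Some degradation is expected; a collapse toward or below 1.0
# suggests the tuning found noise rather than a repeatable pattern.
print(f"development PF={pf_dev:.2f}, validation PF={pf_val:.2f}")
```

The key discipline is in the data, not the code: the validation PnLs must come from a window you did not look at while tuning, run with settings frozen exactly as they were.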


How to tell signal from noise

The sensitivity question

Change one parameter at a time by a small amount in both directions. If the results change dramatically, the strategy is sensitive to that parameter. High sensitivity to a single parameter is a warning: it means the profitability depends on hitting a narrow sweet spot that may not exist in the future.

A robust strategy is one where reasonable variations in parameters produce reasonable variations in results — not one where moving a threshold by 5 points turns profit into loss.
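One way to make the sensitivity check concrete. The `run_backtest` function here is a hypothetical stand-in for re-running the strategy at a given threshold, and the profit factors in it are invented for illustration:

```python
def run_backtest(rsi_threshold):
    """Stand-in for re-running the strategy; returns a profit factor."""
    results = {20: 0.9, 25: 1.6, 30: 1.5}  # hypothetical numbers
    return results[rsi_threshold]

base = 25
base_pf = run_backtest(base)

for value in (20, 30):  # one small step in each direction
    pf = run_backtest(value)
    change = (pf - base_pf) / base_pf
    verdict = "fragile" if abs(change) > 0.3 else "robust"
    print(f"RSI<{value}: PF={pf:.2f}, change={change:+.0%} -> {verdict}")
```

In this invented example, moving the threshold from 25 to 30 barely changes the result, but moving it to 20 cuts the profit factor nearly in half: the profitability lives in a narrow sweet spot on one side of the parameter, which is exactly the warning sign described above.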

The trade-count question

Be suspicious of any optimization that improves results by reducing trade count. Fewer trades mean fewer data points. A strategy with 15 high-quality trades looks great — but 15 trades is not enough to distinguish a real edge from a lucky streak. You need enough trades to have statistical weight behind the numbers.

There is no magic number, but as a rough guide: if your strategy produces fewer than 30–50 trades in the test window, treat the results with extra skepticism, no matter how good they look.
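The statistical point can be made concrete with a confidence interval on the win rate. A sketch using the normal approximation (the trade counts are illustrative; for very small samples a Wilson interval would be more careful):

```python
import math

def win_rate_interval(wins, trades, z=1.96):
    """Approximate 95% confidence interval for the true win rate."""
    p = wins / trades
    half_width = z * math.sqrt(p * (1 - p) / trades)
    return max(0.0, p - half_width), min(1.0, p + half_width)

# Same observed win rate (~67%), different sample sizes.
for wins, trades in [(10, 15), (40, 60)]:
    lo, hi = win_rate_interval(wins, trades)
    print(f"{wins}/{trades} winners: true win rate likely in "
          f"[{lo:.0%}, {hi:.0%}]")
```

With 15 trades the interval spans roughly 43% to 91%: the data cannot distinguish a coin flip from a strong edge. Quadrupling the sample roughly halves the interval, which is why a smaller, prettier trade set is weaker evidence, not stronger.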

The regime question

Does the strategy make money across different market regimes, or does all the profit cluster in one type of period? If the equity curve goes up during a trending period and flat or down during a range, you do not have a strategy that works — you have a strategy that bets on trending conditions. That can be valid, but only if you also have a regime filter and a plan for what happens when the regime changes.
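A quick way to check for regime clustering, assuming each trade can be labeled with the regime it occurred in (the labels and PnLs below are hypothetical):

```python
from collections import defaultdict

# Hypothetical per-trade (regime, pnl) records.
trades = [("trend", 120), ("trend", -40), ("trend", 90), ("trend", 150),
          ("range", -60), ("range", -30), ("range", 20), ("range", -50)]

pnl_by_regime = defaultdict(float)
for regime, pnl in trades:
    pnl_by_regime[regime] += pnl

for regime, total in sorted(pnl_by_regime.items()):
    print(f"{regime}: {total:+.0f}")
```

If all of the profit sits in one bucket, as it does here, you are looking at a regime bet rather than an all-weather strategy, and you need the regime filter and contingency plan described above.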

The complexity question

Can you explain what each setup, each filter, and each threshold is supposed to capture in plain language? If you cannot articulate why a parameter is set to the value it is without pointing at the backtest, you have not understood the parameter — you have memorized the results.

Complexity should be earned through understanding, not accumulated through optimization. A strategy with two setups that you understand deeply is stronger than one with six setups you assembled because the equity curve improved each time you added one.


A practical optimization process

  1. Start with calibration, not optimization. Set commission, slippage, and position sizing to match your actual trading conditions. These are not variables to optimize — they are constraints to calibrate.

  2. Define your thesis first. Before touching parameters, write down in plain language what your strategy is trying to capture. "Buy when momentum shifts in an uptrend, exit at a target or when momentum reverses." This is your anchor. Every parameter change should serve this thesis.

  3. Change one thing at a time. If you change three parameters simultaneously and the results improve, you do not know which change caused the improvement. Isolate your changes.

  4. Test sensitivity, not just performance. After finding a setting that works, move it ±10–20% and see what happens. If the results survive the variation, the setting is robust. If they collapse, the setting is fragile.

  5. Validate on a different window. Split your test period into a development segment and a validation segment. Develop on one, validate on the other. If you cannot afford to hold out data, at least shift the window forward by a few months and retest.

  6. Watch trade count. If optimization reduces your trades below 30, the statistical significance of the improvement is questionable regardless of how good the numbers look.

  7. Stop when you understand. The goal of optimization is not the best equity curve. It is a set of parameters you understand well enough to explain, defend, and recognize when they stop working.

See Operating Checklist for how optimization fits into the full build-test-iterate cycle.