Flat infographic illustration of many small interlocking gears merging into one precise larger movement, rendered in deep navy, off-white, and accent green. Represents the algorithmic precision and systematic decision-making behind a tax-loss harvesting engine that ranks, filters, and audits each portfolio decision.
A well-designed tax-loss harvesting engine operates as an interconnected system: each component — candidate ranking, replacement scoring, wash-sale checking, and audit logging — feeds into the next, producing decisions that may be traced and reviewed rather than simply trusted.

Most tax-loss harvesting products ask you to trust the user interface.

HarvestEngine is trying to earn trust a different way: by making the algorithm itself worth inspecting.

That matters because tax-loss harvesting is not just a pretty dashboard problem. It is a portfolio-engine problem. The product has to decide which names to hold, when a loss is meaningful enough to act on, which replacement is close enough to preserve exposure, and how to do all of that without drifting into tax mistakes or hand-wavy black-box behavior.

The core idea: the algorithm is not there to sound smart. It is there to make each harvest, replacement, and risk-control decision more defensible, more consistent, and easier to audit.

What the engine actually does

What does the HarvestEngine algorithm actually do with a taxable portfolio at a high level?

HarvestEngine organizes the taxable account into three sleeves (beta, long, and an optional short), with a continuous tax-loss-harvesting layer running across them.

  1. Beta sleeve: broad-market ETFs (VOO, VTI, IXUS, BND, AGG) for cheap, efficient market exposure. Limited harvest surface but excellent tracking and simplicity.
  2. Long sleeve: the direct-index portfolio of single stocks tracking your chosen benchmark. The engine ranks the eligible universe with a four-leg composite model. This is where most of the harvest surface lives, because each stock can be harvested independently.
  3. Short sleeve, gated: an optional advanced overlay for users who want extra harvest surface and the ability to reshape risk without selling appreciated longs. Adds margin, borrow, and tax complexity, so it is opt-in only.

The tax-loss-harvesting layer watches existing positions across all three sleeves continuously, finds losses worth realizing, and swaps into the most similar wash-safe replacement. It is the behavior that ties the sleeves together — not a sleeve of its own.

That may sound abstract, so here is the practical meaning: the product is not randomly swapping one stock for another. It is using an explicit scoring and filtering system that can be explained line by line.

The long-side model is factor-grounded, not vibes-based

How does HarvestEngine score and rank stocks in the long-sleeve universe rather than just replicating an index?

Every stock in the chosen index universe gets scored on four legs:

  • Value: how cheap or expensive the business looks using yield-based and balance-sheet-aware measures
  • Momentum: whether the stock's medium-term trend is helping or hurting
  • Quality: profitability, leverage, and margin strength
  • Idiosyncratic volatility: how noisy the stock is after stripping out the general market move

The current production composite uses this weighting:

Leg | Weight | Why it matters
Value | 25% | Prevents the engine from blindly hugging expensive names just because they are in the index
Momentum | 30% | Helps avoid stepping in front of obvious deterioration
Quality | 30% | Biases the sleeve toward stronger businesses
Idiosyncratic volatility | 15% | Penalizes the noisiest names after market exposure is removed
Long-side composite
composite = 0.25·V + 0.30·M + 0.30·Q + 0.15·I
Each leg V, M, Q, I is a 0–1 score from cross-sectional rank-Z normalization within peer group.
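To make the weighting concrete, here is a minimal sketch of the composite as a plain weighted sum. Only the four weights come from the table above. The function name, the dictionary keys, and the assumption that every leg is already oriented so that higher is better are illustrative, not HarvestEngine's actual code.

```python
# Weights from the table above; names and structure are illustrative.
LONG_WEIGHTS = {"value": 0.25, "momentum": 0.30, "quality": 0.30, "ivol": 0.15}


def long_composite(legs: dict[str, float]) -> float:
    """Weighted sum of the four 0-1 leg scores (higher is assumed better)."""
    for name in LONG_WEIGHTS:
        score = legs[name]
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"{name} score {score} is outside the 0-1 range")
    return sum(weight * legs[name] for name, weight in LONG_WEIGHTS.items())


# Hypothetical leg scores for a single stock.
example_legs = {"value": 0.6, "momentum": 0.8, "quality": 0.7, "ivol": 0.5}
example_composite = long_composite(example_legs)
print(round(example_composite, 3))
```

Because the weights sum to 1 and each leg sits between 0 and 1, the composite also stays between 0 and 1, which keeps scores comparable across stocks.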

This is one of the first reasons customers should care. The engine is not just trying to replicate an index cheaply. It is trying to build a better-behaved direct-index sleeve while staying inside benchmark shape.

The normalization is one of the real moats

Why does cross-sectional normalization within peer groups matter more than universal scoring thresholds?

A lot of retail finance software uses universal thresholds that sound reasonable but fail in practice. A utility company should not be judged on the same raw scale as a software company. A 15-times earnings multiple means different things in different sectors. So does an 8% return on equity.

HarvestEngine fixes that by ranking metrics within the best available peer group first, then converting those ranks into normalized scores. In plain English: a stock gets judged against the right neighbors, not against the entire market at once.
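A small sketch shows what "judged against the right neighbors" can look like. It uses a plain within-group percentile rank as a simpler stand-in for the rank-Z normalization mentioned earlier. The tickers, the sector labels, and the function name are all hypothetical.

```python
from collections import defaultdict


def peer_rank_scores(metric: dict[str, float], peer_group: dict[str, str]) -> dict[str, float]:
    """Score each ticker 0-1 by its rank within its own peer group."""
    groups: dict[str, list[str]] = defaultdict(list)
    for ticker in metric:
        groups[peer_group[ticker]].append(ticker)

    scores: dict[str, float] = {}
    for members in groups.values():
        ordered = sorted(members, key=lambda t: metric[t])
        n = len(ordered)
        for rank, ticker in enumerate(ordered):
            # Lowest value in the group maps to 0.0, highest to 1.0.
            # A group with one member gets a neutral 0.5. Ties are not averaged here.
            scores[ticker] = rank / (n - 1) if n > 1 else 0.5
    return scores


# Hypothetical return-on-equity figures: the same 8% lands very differently.
roe = {"UTIL_A": 0.08, "UTIL_B": 0.10, "UTIL_C": 0.06,
       "SOFT_A": 0.08, "SOFT_B": 0.25, "SOFT_C": 0.30}
sector = {"UTIL_A": "utilities", "UTIL_B": "utilities", "UTIL_C": "utilities",
          "SOFT_A": "software", "SOFT_B": "software", "SOFT_C": "software"}
scores = peer_rank_scores(roe, sector)
print(scores["UTIL_A"], scores["SOFT_A"])
```

In this toy data an 8% return on equity is mid-pack for the utilities but the weakest reading among the software names, which is exactly the distinction a universal threshold would miss.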

Customers should care because this makes the engine less naive. It is one of the differences between generic screening logic and something closer to institutional portfolio construction.

Two side-by-side comparison cards. Left card with a coral header chip labeled UNIVERSAL THRESHOLD shows a single flat ruler line cutting across geometric stock-shape icons from different sectors — one shape is incorrectly labeled CHEAP. Right card with a blue header chip labeled PEER-GROUP NORMALIZATION shows the same shapes grouped into sector clusters, each cluster with its own ruler line, and a gold checkmark above the correctly-scored shape in each cluster.
Universal threshold scoring applies one fixed cutoff to all stocks, potentially producing misleading signals across sectors with structurally different valuations. Peer-group normalization first ranks each stock within its own sector, then converts that rank into a score — so a utility company is compared against other utilities, and a software company against software peers.

The harvest trigger is volatility-aware

Why does a fixed percentage-loss threshold fail as a harvest trigger across stocks with different volatility profiles?

Many products implicitly act like every 5% loss means the same thing. It does not.

A 5% move in a low-volatility stock may be meaningful. A 5% move in a very high-volatility stock may be noise. HarvestEngine uses a dynamic thresholding framework so the trigger reflects the position's own recent behavior, not just one universal number.

Vol-adjusted harvest floors (Phase B5)
floor_5d  = max(−15%,  −2σ · √5)
floor_20d = max(−25%,  −2σ · √20)
σ is the EWMA daily-log-return σ (5-day half-life). The absolute cliffs (−15% / −25%) are belt-and-suspenders against thinly-traded low-σ names.
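The two floors are easy to reproduce by hand. The sketch below is one way to do it, assuming a zero-mean EWMA estimator for σ (the source does not say how the mean is treated) and hypothetical function names. Only the −2σ multiplier, the square-root-of-time scaling, the 5-day half-life, and the two cliffs come from the formulas above.

```python
import math


def ewma_sigma(log_returns: list[float], half_life: float = 5.0) -> float:
    """EWMA standard deviation of daily log returns, assuming a zero mean."""
    decay = 0.5 ** (1.0 / half_life)
    variance = log_returns[0] ** 2
    for r in log_returns[1:]:
        variance = decay * variance + (1.0 - decay) * r ** 2
    return math.sqrt(variance)


def harvest_floor(sigma: float, horizon_days: int, cliff: float) -> float:
    """max(cliff, -2 * sigma * sqrt(horizon)); both arguments are negative."""
    return max(cliff, -2.0 * sigma * math.sqrt(horizon_days))


def floors(sigma: float) -> tuple[float, float]:
    """Return the (5-day, 20-day) floors for a given daily sigma."""
    return harvest_floor(sigma, 5, -0.15), harvest_floor(sigma, 20, -0.25)


low_vol = floors(0.008)   # defensive-style name
high_vol = floors(0.030)  # high-volatility-style name
print(low_vol, high_vol)
```

With these inputs the low-volatility floors come out near −3.6% and −7.2%, while the high-volatility 20-day floor is capped at the −25% cliff because 2σ·√20 is larger than 25% in magnitude.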

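Two worked examples: at σ = 0.8% (a KO-style defensive name) the 5-day floor is about −3.6%, so a drop of a few percent in five days is already meaningful. At σ = 3.0% (an NVDA-style high-volatility name) the 5-day floor is about −13.4%, so even −8% in five days is routine.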

That matters to customers because it reduces dumb churn. The engine should be harvesting real opportunities, not just reacting to every twitch in a noisy name.

The replacement logic is where trust is usually won or lost

How does HarvestEngine select a replacement after selling a losing position without tripping the wash-sale rule?

After a loss is detected, the product has to answer the hard question: what should replace the sold name so the portfolio stays aligned without tripping the wash-sale rule?

HarvestEngine scores candidates using a similarity model built from four pieces:

Replacement dimension | Weight | Purpose
GICS business similarity | 50% | Keep the business and industry shape close
Beta proximity | 20% | Keep market sensitivity close
Correlation proxy | 20% | Keep the stock's behavior profile close
Volatility match | 10% | Avoid swapping into a completely different risk shape
Replacement similarity score (V3)
similarity = 0.50·GICS + 0.20·β_prox + 0.20·corr + 0.10·vol_match
Each term is 0–1. GICS dominates because keeping the business shape close is the cheapest way to preserve exposure without tripping wash-sale risk.

Before ranking, candidates also have to survive a quality filter. If the candidate is in an earnings window, has just suffered an unusually bad short-term move, or carries other obvious warning signs, the engine can drop or penalize it.
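The filter-then-rank flow described above can be sketched in a few lines. The weights are the ones in the table. The Candidate fields, the two boolean warning flags, and the choice to drop flagged names outright (the engine may instead penalize them) are assumptions made for illustration.

```python
from dataclasses import dataclass

SIM_WEIGHTS = {"gics": 0.50, "beta_prox": 0.20, "corr": 0.20, "vol_match": 0.10}


@dataclass
class Candidate:
    ticker: str
    gics: float        # 0-1 business similarity to the sold name
    beta_prox: float   # 0-1 closeness of beta
    corr: float        # 0-1 correlation proxy
    vol_match: float   # 0-1 closeness of volatility
    in_earnings_window: bool = False
    recent_sharp_drop: bool = False


def similarity(c: Candidate) -> float:
    """Weighted similarity score using the table's weights."""
    return (SIM_WEIGHTS["gics"] * c.gics
            + SIM_WEIGHTS["beta_prox"] * c.beta_prox
            + SIM_WEIGHTS["corr"] * c.corr
            + SIM_WEIGHTS["vol_match"] * c.vol_match)


def rank_replacements(candidates: list[Candidate]) -> list[Candidate]:
    """Drop flagged candidates, then sort the rest by similarity, best first."""
    eligible = [c for c in candidates
                if not (c.in_earnings_window or c.recent_sharp_drop)]
    return sorted(eligible, key=similarity, reverse=True)


# Hypothetical candidates for one sold position.
pool = [
    Candidate("AAA", gics=0.9, beta_prox=0.8, corr=0.7, vol_match=0.6),
    Candidate("BBB", gics=0.6, beta_prox=0.9, corr=0.9, vol_match=0.9),
    Candidate("CCC", gics=1.0, beta_prox=1.0, corr=1.0, vol_match=1.0,
              in_earnings_window=True),
]
ranked = rank_replacements(pool)
print([c.ticker for c in ranked])
```

Note how CCC never reaches the ranking despite a perfect similarity score: the quality filter runs first, so a near-identical name inside an earnings window cannot win on similarity alone.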

This is why customers should care: the replacement is not "close enough, probably." It is a structured attempt to preserve exposure and avoid obvious own-goals.

Artificial intelligence is used with guardrails, not as a magic trick

How does HarvestEngine use AI in the ranking process while preventing inconsistent or unpredictable outputs?

HarvestEngine does use a language model as an optional re-ranker on top of the deterministic candidate list. But it does so with two important protections:

  • Deterministic cache: the same input set returns the same output, rather than a new whimsical ranking each time
  • Confidence gate: if the model is not confident enough, the system falls back to the deterministic ranking

That matters because customers should never have to wonder whether the product is making different decisions on the same facts just because a model felt different today.
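Here is one way those two protections could fit together, as a sketch rather than the production design. The model interface, the 0.7 threshold, and the extra check that the model returned exactly the same tickers are all assumptions introduced for the example.

```python
from typing import Callable

# Hypothetical model interface: receives the deterministic ticker order and
# returns (proposed order, confidence between 0 and 1).
Reranker = Callable[[tuple[str, ...]], tuple[list[str], float]]


class GuardedReranker:
    """Wrap a re-ranking model with a cache and a confidence gate."""

    def __init__(self, model: Reranker, min_confidence: float = 0.7):
        self.model = model
        self.min_confidence = min_confidence
        self._cache: dict[tuple[str, ...], list[str]] = {}

    def rerank(self, deterministic: list[str]) -> list[str]:
        key = tuple(deterministic)
        if key in self._cache:
            # Same inputs: return the stored answer without calling the model.
            return list(self._cache[key])

        proposed, confidence = self.model(key)
        # Extra safety (an assumption): the model may only reorder, never
        # add or remove candidates.
        is_permutation = sorted(proposed) == sorted(deterministic)
        if confidence >= self.min_confidence and is_permutation:
            result = list(proposed)
        else:
            result = list(deterministic)  # fall back to the deterministic order

        self._cache[key] = result
        return list(result)
```

The cache stores the final decision, including fallbacks, so a low-confidence answer on Monday cannot turn into a different ranking on Tuesday for the same inputs.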

Four connected nodes in a horizontal flow. A blue chip labeled DETERMINISTIC LIST feeds right into a purple chip labeled LLM RE-RANKER with a sub-label OPTIONAL. The output passes to a gold diamond shape labeled CONFIDENCE GATE. When confidence is sufficient the flow continues right to a green chip labeled AI-ADJUSTED RANKING. A coral feedback arrow arcs from the confidence gate back to the deterministic list node, labeled LOW CONFIDENCE FALL BACK.
HarvestEngine uses a language model as an optional re-ranker on top of a deterministic candidate list. A deterministic cache ensures the same inputs always produce the same output. A confidence gate drops the model's adjustment when confidence falls below a threshold, returning to the deterministic ranking — so inconsistent model behavior cannot propagate into production decisions.

The short sleeve is academically grounded, not just the inverse of the long model

Why does HarvestEngine use a separate academically grounded model for short candidates rather than inverting the long ranking?

For advanced users, HarvestEngine also has a gated short-overlay path. This is important because many systems make the lazy mistake of assuming the worst long ideas are automatically the best shorts. Academic evidence says that is not good enough.

Instead, the short sleeve uses a dedicated model built around four anomaly families:

  • Accruals: whether earnings are being flattered by working-capital accounting rather than cash (Sloan 1996, The Accounting Review)
  • Net issuance: whether the company is diluting shareholders aggressively (Pontiff & Woodgate 2008, Journal of Finance)
  • Beneish M-score: whether the accounting statements look suspicious enough to deserve caution (Beneish 1999, Financial Analysts Journal)
  • Idiosyncratic volatility: whether the stock is unusually noisy in a way that has historically mattered (Ang, Hodrick, Xing & Zhang 2006, Journal of Finance)
Short-side composite
short_score = 0.40·accruals + 0.30·issuance + 0.15·Beneish + 0.15·IVol
Higher = stronger short candidate. Weights of any unavailable component are redistributed proportionally.
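Proportional redistribution is simpler than it sounds: divide the weighted sum of the components you have by the total weight of those components. The sketch below assumes each component has already been converted to a 0-1 score and uses None for a missing one. The function and key names are illustrative.

```python
from typing import Optional

SHORT_WEIGHTS = {"accruals": 0.40, "issuance": 0.30, "beneish": 0.15, "ivol": 0.15}


def short_score(components: dict[str, Optional[float]]) -> Optional[float]:
    """Weighted score over available components, with weights renormalized."""
    available = {name: value for name, value in components.items()
                 if name in SHORT_WEIGHTS and value is not None}
    total_weight = sum(SHORT_WEIGHTS[name] for name in available)
    if total_weight == 0:
        return None  # nothing to score
    weighted = sum(SHORT_WEIGHTS[name] * value for name, value in available.items())
    # Dividing by the available weight spreads the missing weight
    # proportionally across the components that are present.
    return weighted / total_weight


full = short_score({"accruals": 0.8, "issuance": 0.6, "beneish": 0.4, "ivol": 0.2})
no_beneish = short_score({"accruals": 0.8, "issuance": 0.6, "beneish": None, "ivol": 0.2})
print(round(full, 3), round(no_beneish, 3))
```

When the Beneish component is missing, the remaining weights effectively become 0.40/0.85, 0.30/0.85, and 0.15/0.85, so their relative importance to one another is unchanged.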
Beneish M-score (1999)
M = −4.84 + 0.92·DSRI + 0.528·GMI + 0.404·AQI + 0.892·SGI
     + 0.115·DEPI − 0.172·SGAI + 4.679·TATA − 0.327·LVGI
M > −1.78 flags potential earnings manipulation. Composed of 8 ratios over 2 fiscal periods: DSRI (days sales in receivables), GMI (gross margin index), AQI (asset quality), SGI (sales growth), DEPI (depreciation index), SGAI (SG&A index), TATA (total accruals to assets), LVGI (leverage index).
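As a quick check on the formula, the sketch below computes M from the eight ratios and applies the −1.78 cutoff. The coefficients and the threshold are the ones quoted above. The function names and the dictionary layout are illustrative, and computing the ratios themselves from financial statements is left out.

```python
BENEISH_INTERCEPT = -4.84
BENEISH_COEFFS = {
    "DSRI": 0.92, "GMI": 0.528, "AQI": 0.404, "SGI": 0.892,
    "DEPI": 0.115, "SGAI": -0.172, "TATA": 4.679, "LVGI": -0.327,
}
BENEISH_THRESHOLD = -1.78


def beneish_m(ratios: dict[str, float]) -> float:
    """Linear combination of the eight Beneish ratios."""
    missing = set(BENEISH_COEFFS) - set(ratios)
    if missing:
        raise ValueError(f"missing ratios: {sorted(missing)}")
    return BENEISH_INTERCEPT + sum(
        coeff * ratios[name] for name, coeff in BENEISH_COEFFS.items()
    )


def is_flagged(ratios: dict[str, float]) -> bool:
    """True when M exceeds the threshold quoted in the text."""
    return beneish_m(ratios) > BENEISH_THRESHOLD


# A "nothing changed" company: every index equals 1 and accruals are zero.
steady = {"DSRI": 1.0, "GMI": 1.0, "AQI": 1.0, "SGI": 1.0,
          "DEPI": 1.0, "SGAI": 1.0, "TATA": 0.0, "LVGI": 1.0}
steady_m = beneish_m(steady)
print(round(steady_m, 2), is_flagged(steady))
```

A company whose ratios are all unchanged year over year scores about −2.48, comfortably below the cutoff. The large TATA coefficient means a jump in accruals moves the score faster than any other single input.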
IVol — Ang-Hodrick-Xing-Zhang 2006
Step 1 — OLS regression: r_stock = α + β · r_market + ε
Step 2 — IVol = σ(ε)
Daily log-returns over a 252-trading-day window, market = SPY. The standard deviation of the residuals strips out market exposure and leaves the stock's own noise.
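The two steps translate directly into code. This sketch uses only the standard library and fits the single-factor regression by hand. The function name is hypothetical, and using n − 2 degrees of freedom for the residual standard deviation is an assumption, since the source does not specify it.

```python
import math
from statistics import fmean


def idiosyncratic_vol(stock_returns: list[float], market_returns: list[float]) -> float:
    """Std. dev. of residuals from regressing stock returns on market returns."""
    n = len(stock_returns)
    if n != len(market_returns) or n < 3:
        raise ValueError("need two equal-length series with at least 3 points")

    mean_s = fmean(stock_returns)
    mean_m = fmean(market_returns)
    var_m = sum((m - mean_m) ** 2 for m in market_returns)
    if var_m == 0:
        raise ValueError("market returns have no variation")

    # Step 1: ordinary least squares for beta and alpha.
    cov = sum((m - mean_m) * (s - mean_s)
              for m, s in zip(market_returns, stock_returns))
    beta = cov / var_m
    alpha = mean_s - beta * mean_m

    # Step 2: residual standard deviation (n - 2 degrees of freedom).
    residuals = [s - (alpha + beta * m)
                 for m, s in zip(market_returns, stock_returns)]
    return math.sqrt(sum(e * e for e in residuals) / (n - 2))


# Toy data: the stock is 2x the market plus noise uncorrelated with it.
market = [-0.01, 0.01, -0.01, 0.01]
noise = [0.01, 0.01, -0.01, -0.01]
stock = [2.0 * m + e for m, e in zip(market, noise)]
toy_ivol = idiosyncratic_vol(stock, market)
print(round(toy_ivol, 5))
```

In production the inputs would be the 252 daily log returns described above rather than four toy points, but the arithmetic is the same.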

Most customers will never use this sleeve, and that is fine. The reason it still matters is what it says about product philosophy: the engine does not fake sophistication by calling the bottom of the long rank a short book. It uses a separate model when the problem changes.

The tax gates are part of the product, not an afterthought

How does HarvestEngine build tax-compliance rules into the engine rather than treating them as an afterthought?

Customers should especially care about this part. Many tools make attractive promises about potential tax savings without showing how the rule layer works. HarvestEngine builds rule gates directly into the engine.

That includes wash-sale awareness on the long side, straddle-aware checks, and caution around qualified-dividend timing issues in the advanced overlay. The point is not to show off tax-code trivia. The point is that a harvest is only useful if it stands up when you look back at it later.

Vertical sequential flow diagram. A blue chip at the top labeled CANDIDATE HARVEST flows down through three gold gate chips stacked vertically: WASH-SALE WINDOW with a calendar icon, STRADDLE AWARENESS with a balance scale icon, and QUALIFIED DIVIDEND with a clock icon. A green chip at the bottom labeled HARVEST APPROVED with a checkmark marks the successful outcome when all gates pass.
Tax-compliance gates run in sequence before any harvest executes. The wash-sale window check scans all household accounts and lots for purchases within 30 days before or after the sale. The straddle-awareness check detects offsetting positions that may defer losses. The qualified-dividend timing check flags positions held fewer than 61 days during the 121-day window around the ex-dividend date. A harvest proceeds only after all three gates pass.
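The gate sequence can be sketched as three small predicates applied to a proposal. Everything about the data model here is hypothetical: the field names, the boolean straddle flag, and the precomputed holding-day count are simplifications, and real tax rules have more edge cases than this covers. The 30-day window and the 61-day threshold are the only figures taken from the rules themselves.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

WASH_SALE_DAYS = 30           # window on each side of the sale date
MIN_DIVIDEND_HOLD_DAYS = 61   # holding-period threshold for the dividend gate


@dataclass
class HarvestProposal:
    ticker: str
    sale_date: date
    # Purchase dates of the same or substantially identical security in any
    # household account, other than the lot being sold (hypothetical field).
    household_buys: list[date] = field(default_factory=list)
    has_offsetting_position: bool = False
    # None means no dividend is at issue for this lot.
    days_held_for_dividend: Optional[int] = None


def wash_sale_gate(p: HarvestProposal) -> bool:
    return all(abs((buy - p.sale_date).days) > WASH_SALE_DAYS
               for buy in p.household_buys)


def straddle_gate(p: HarvestProposal) -> bool:
    return not p.has_offsetting_position


def dividend_gate(p: HarvestProposal) -> bool:
    held = p.days_held_for_dividend
    return held is None or held >= MIN_DIVIDEND_HOLD_DAYS


GATES = (("wash_sale", wash_sale_gate),
         ("straddle", straddle_gate),
         ("dividend_timing", dividend_gate))


def evaluate_gates(p: HarvestProposal) -> dict[str, bool]:
    """Run every gate in order and record each result by name."""
    return {name: gate(p) for name, gate in GATES}


def harvest_approved(p: HarvestProposal) -> bool:
    return all(evaluate_gates(p).values())
```

Recording every gate result, rather than stopping at the first failure, is a deliberate choice in this sketch: it gives the audit trail a complete picture of which rules were checked.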

The algorithm writes an audit trail

Why is the audit trail one of the most practical differentiators in HarvestEngine's design?

This may be the most practical differentiator of all.

Each important decision can be logged with its provenance: why the candidate survived, why another candidate was dropped, what score the replacement received, and which risk gates were active. That means the customer is not just buying automation. They are buying something closer to inspectable decision infrastructure.

That is a very different experience from a product that simply says, "Trust us, we optimized it."

HARVEST DECISION: AUDIT RECORD
Symbol sold: XYZ · lot acquired 2024-03-12 · basis $148.40
Replacement selected: ABC · similarity 0.84 (GICS 0.90, β 0.81)
Composite score: 0.71 (V=0.68, M=0.74, Q=0.70, IVol=0.76)
Gates evaluated: Wash-sale PASS · Straddle PASS · Dividend-timing PASS
Decided at: 2026-05-14T14:02:31Z · pipeline phase: optimization
Ranking model: composite v3 · similarity + factor + risk
Outcome: HARVEST APPROVED, proposal queued for user review
Each harvest decision produces a structured audit record capturing the symbol and lot sold, the replacement selected and its similarity score, the composite ranking score, which tax-compliance gates were evaluated, and the ranking model responsible. This record persists so the investor can review the reasoning for any past trade — not just its result.
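A record like this maps naturally onto an immutable data structure that is serialized once and appended to a log. The sketch below is one possible shape, not HarvestEngine's schema: the class name, field names, and JSON-lines format are assumptions, and the sample values simply mirror the example record.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AuditRecord:
    symbol_sold: str
    lot_acquired: str      # ISO date
    basis: float
    replacement: str
    similarity: float
    composite: float
    gates: dict[str, str]  # gate name -> "PASS" or "FAIL"
    decided_at: str        # ISO 8601 timestamp, UTC
    ranking_model: str
    outcome: str


def to_json_line(record: AuditRecord) -> str:
    """Serialize one record as a single JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


sample = AuditRecord(
    symbol_sold="XYZ",
    lot_acquired="2024-03-12",
    basis=148.40,
    replacement="ABC",
    similarity=0.84,
    composite=0.71,
    gates={"wash_sale": "PASS", "straddle": "PASS", "dividend_timing": "PASS"},
    decided_at="2026-05-14T14:02:31Z",
    ranking_model="composite v3",
    outcome="HARVEST APPROVED",
)
sample_line = to_json_line(sample)
print(sample_line)
```

Freezing the dataclass prevents fields from being reassigned after the decision is made, and sorted keys keep the serialized lines easy to diff when two decisions are compared.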

So why should a customer care?

Why does the quality of the underlying algorithm determine whether a TLH product is worth trusting with real assets?

Because the algorithm determines whether the product is just convenient or actually worth trusting with meaningful taxable assets.

A stronger engine means:

  • better odds that harvested losses are meaningful, not noisy
  • better replacements after a sale
  • fewer dumb exposure changes
  • fewer tax-rule mistakes
  • clearer explanations when you want to know why the product acted

In other words, the algorithm is not a technical curiosity. It is the thing turning a tax-loss harvesting app into a tax-aware portfolio system.

The simple takeaway

What is the simplest case for why the algorithm behind a TLH product matters more than its interface?

Most products want you to trust the interface. HarvestEngine wants to be the first product where a thoughtful customer can trust the engine too.

That is why the algorithm matters. Not because it sounds advanced, but because it makes the product more understandable, more auditable, and more defensible when real money and real taxes are involved.

