
Customer Lifetime Value

BG/NBD · Gamma-Gamma · Buy-Til-You-Die

Estimate per-customer expected future value from past recency-frequency-monetary (RFM) data using the Buy-Til-You-Die family of models — BG/NBD for transaction frequency (with a Beta-Geometric drop-out process) and gamma-gamma for spend per transaction. CLV underpins acquisition-budget allocation, retention-treatment targeting, and personalisation reward shaping. Foundational paper: Fader, Hardie & Lee (2005).

Why it matters

Customer-centric retail starts with knowing what a customer is worth

Cost ratio of acquiring a new customer vs retaining an existing one — the structural reason CLV is the right currency for marketing decisions.
Source: Bain & Company; HBR retention research (long-cited industry benchmark).
~80/20
Pareto split of CLV in most retail bases — top 20% of customers generate ~80% of margin. Identifying them shapes loyalty and personalisation budgets.
Source: Fader (2012), Customer Centricity.
Pareto/NBD
Schmittlein-Morrison-Colombo (1987) original model; BG/NBD (Fader et al. 2005) is the easier-to-estimate alternative that practitioners commonly use.
Source: Schmittlein, Morrison & Colombo (1987), Management Science.
+10–30%
Marketing-ROI lift documented when retention spend is allocated proportional to CLV vs. uniform-per-customer or revenue-only.
Source: Reinartz & Kumar (2003), Journal of Marketing.

Where the decision sits

Strategic input to acquisition, retention, loyalty, and personalisation

CLV is not itself a decision — it is an estimate that downstream decisions consume. The classic uses are: acquisition (don’t pay more than expected CLV to acquire a customer), retention (allocate save-the-customer spend in proportion to expected CLV), loyalty (which segments are worth a contract), and personalisation (CLV-aware reward shaping in the recommender). Decisions live in loyalty design, personalisation, promotional planning; CLV gives them a common scoring axis.

Transaction history (x, t_x, T per customer) → Estimate CLV (BG/NBD + GG) → Score & rank (CLV percentile) → Allocate budget (retention, loyalty)

Problem & formulation

BG/NBD frequency model + gamma-gamma spend model

Decision support: Estimation, not optimisation
Statistical model: BG/NBD + Gamma-Gamma
Inputs per customer: RFM (x, t_x, T)
Reference: Fader-Hardie-Lee (2005)

Per-customer summary statistics (RFM)

Symbol | Meaning | Unit
\(x\) | Number of repeat transactions observed (frequency) | integer
\(t_x\) | Time of last transaction (recency) | weeks
\(T\) | Length of observation window | weeks
\(\bar{m}\) | Average spend per transaction (monetary) | $
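
As a rough illustration of how these summaries can be derived from a raw purchase log, the sketch below computes \((x, t_x, T, \bar m)\) for one customer. It assumes, following the usual Buy-Til-You-Die convention, that the clock starts at the customer's first purchase, that only repeat purchases count toward \(x\) and \(\bar m\), and that the customer has at least one purchase. The function name `rfm_summary` and the sample log are hypothetical.

```python
from datetime import date

def rfm_summary(purchases, period_end, days_per_unit=7.0):
    """Return (x, t_x, T, m_bar) for one customer, with time in weeks.

    purchases: list of (date, amount) tuples, at least one entry.
    Assumption: time is measured from the customer's first purchase.
    """
    ordered = sorted(purchases)
    first_date = ordered[0][0]
    repeats = ordered[1:]                       # the first purchase starts the clock
    x = len(repeats)                            # frequency: repeat purchases only
    T = (period_end - first_date).days / days_per_unit
    t_x = (repeats[-1][0] - first_date).days / days_per_unit if repeats else 0.0
    m_bar = sum(amount for _, amount in repeats) / x if x else 0.0
    return x, t_x, T, m_bar

# Hypothetical purchase log for a single customer.
log = [(date(2024, 1, 1), 40.0), (date(2024, 1, 15), 25.0), (date(2024, 2, 12), 35.0)]
x, t_x, T, m_bar = rfm_summary(log, period_end=date(2024, 3, 25))
print(x, t_x, T, m_bar)  # 2 6.0 12.0 30.0
```

Other conventions exist (for example, including the first purchase in the spend average), so the choice should match whatever the fitting library expects.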

BG/NBD frequency model

Each customer has a latent purchase rate \(\lambda\) (Poisson) and a latent drop-out probability \(p\) per transaction (Beta-Geometric). Heterogeneity across customers is modelled by gamma and beta priors:

$$\lambda \sim \mathrm{Gamma}(r, \alpha) \qquad p \sim \mathrm{Beta}(a, b)$$

After a customer makes a transaction, they drop out with probability \(p\); otherwise they remain “alive”. The expected number of future transactions in a window of length \(t^{\ast}\), given history \((x, t_x, T)\), has a closed form (Fader et al. 2005, Eq. 7):

$$\mathbb{E}\bigl[Y(t^{\ast}) \,\big|\, x, t_x, T\bigr] \;=\; \frac{\dfrac{a + b + x - 1}{a - 1} \left[1 - \left(\dfrac{\alpha + T}{\alpha + T + t^{\ast}}\right)^{r+x} {}_2 F_1\!\left(r + x,\; b + x;\; a + b + x - 1;\; \dfrac{t^{\ast}}{\alpha + T + t^{\ast}}\right)\right]}{1 + \mathbb{1}\{x > 0\} \cdot \dfrac{a}{b + x - 1} \left(\dfrac{\alpha + T}{\alpha + t_x}\right)^{r+x}}$$

Practitioners typically rely on an existing implementation such as the lifetimes Python package or the BTYD package in R, which evaluate the Gaussian hypergeometric function \(_2 F_1\) numerically. For exposition, the solver below uses a simplified expected-rate proxy instead.
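
For readers who want to see the conditional expectation evaluated directly, here is a minimal standard-library sketch, assuming the form published in Fader et al. (2005) with ratio \((\alpha + T)/(\alpha + T + t^{\ast})\) and \(_2 F_1\) arguments \((r + x,\; b + x;\; a + b + x - 1;\; t^{\ast}/(\alpha + T + t^{\ast}))\). The helper names and the parameter values are illustrative only and are not fitted to any data; a production system would use a tested library instead.

```python
def hyp2f1(a, b, c, z, tol=1e-12, max_terms=10_000):
    """Gauss hypergeometric function via its power series (valid for |z| < 1)."""
    term, total = 1.0, 1.0
    for n in range(max_terms):
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
        total += term
        if abs(term) < tol * abs(total):
            break
    return total

def expected_future_transactions(t_star, x, t_x, T, r, alpha, a, b):
    """BG/NBD conditional expected transactions in a future window t_star."""
    z = t_star / (alpha + T + t_star)           # always in [0, 1)
    ratio = (alpha + T) / (alpha + T + t_star)
    f = hyp2f1(r + x, b + x, a + b + x - 1, z)
    numerator = (a + b + x - 1) / (a - 1) * (1 - ratio ** (r + x) * f)
    denominator = 1.0
    if x > 0:                                   # indicator term for repeat buyers
        denominator += a / (b + x - 1) * ((alpha + T) / (alpha + t_x)) ** (r + x)
    return numerator / denominator

# Illustrative parameters (assumed, not estimated here); time in weeks.
params = dict(r=0.25, alpha=4.0, a=0.8, b=2.5)
recent = expected_future_transactions(52, x=2, t_x=38.0, T=39.0, **params)
lapsed = expected_future_transactions(52, x=2, t_x=5.0, T=39.0, **params)
print(recent, lapsed)
```

Two customers with the same frequency but different recency get different forecasts: the one whose last purchase is long past is more likely to have dropped out, so the expectation is lower.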

Gamma-gamma spend model

Spend per transaction is modelled separately. Each customer has a latent average-spend parameter, gamma-distributed; observed spends are gamma noise around it:

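$$\mathbb{E}[\bar M \,|\, x, \bar m] \;=\; \frac{(p \cdot \bar m \cdot x) + (q - 1) \cdot \nu}{p \cdot x + q - 1}, \qquad \nu = \frac{p \, \gamma}{q - 1}$$

Here \(p\) is the shape parameter of the transaction-level gamma (unrelated to the drop-out probability \(p\) in the BG/NBD model), \(q\) and \(\gamma\) are the parameters of the gamma prior on each customer's latent spend parameter, and \(\nu\) is the population mean spend, which is finite for \(q > 1\). The result is a Bayesian shrinkage estimator: blend the customer’s observed average spend \(\bar m\) (weighted by \(p \cdot x\), proportional to evidence) with the population mean \(\nu\) (weighted by \(q - 1\), the prior). Customers with few transactions get pulled toward the population mean; high-frequency customers keep their own average.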
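
The shrinkage behaviour is easy to check numerically. The sketch below writes the estimator in the \((q - 1)\)-weighted form given by Fader and Hardie for the gamma-gamma model, with the population mean computed as \(p \gamma / (q - 1)\); the parameter \(\gamma\), the function name, and the numeric values are assumptions made for illustration.

```python
def expected_avg_spend(x, m_bar, p, q, gamma):
    """Gamma-gamma conditional expectation of a customer's mean spend.

    p, q, gamma are gamma-gamma parameters (illustrative values below);
    this p is not the BG/NBD drop-out probability.
    """
    if q <= 1:
        raise ValueError("the population mean is only finite for q > 1")
    population_mean = p * gamma / (q - 1)
    weight = p * x / (p * x + q - 1)            # weight on the customer's own data
    return weight * m_bar + (1 - weight) * population_mean

# Hypothetical parameters giving a population mean of 30.
p, q, gamma = 6.0, 4.0, 15.0
for x in (0, 1, 20):
    print(x, round(expected_avg_spend(x, 60.0, p, q, gamma), 2))
```

With an observed average of 60 against a population mean of 30, a customer with no repeat purchases is scored at the population mean, one purchase moves the estimate to 50, and twenty purchases leave it close to 60.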

Combined CLV (discounted)

$$\mathrm{CLV}_i \;=\; \mathbb{E}[Y_i(t^{\ast})] \cdot \mathbb{E}[\bar M_i] \cdot \mathrm{margin} \cdot \mathrm{DiscountFactor}(t^{\ast})$$

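CLV = expected future transactions × expected spend per transaction × gross margin, discounted at the firm’s cost of capital. In continuous time with discount rate \(\delta\), a unit cash flow per unit time over the horizon has present value \(\int_0^{t^{\ast}} e^{-\delta t} \, dt = (1 - e^{-\delta t^{\ast}}) / \delta\). In discrete time with annual discount rate \(d\), a cash flow received \(k\) years out is multiplied by \((1+d)^{-k}\); with \(t^{\ast}\) measured in weeks, a single end-of-horizon factor is \((1+d)^{-t^{\ast}/52}\).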
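
A small worked example may help fix the arithmetic. The sketch below assumes a single end-of-horizon discount factor with the horizon in weeks, and also shows the continuous-time integral in closed form; the function names and the input numbers are hypothetical.

```python
import math

def discrete_factor(annual_rate, weeks):
    """Present-value factor for a cash flow received `weeks` from now."""
    return (1.0 + annual_rate) ** (-weeks / 52.0)

def continuous_annuity(delta, horizon):
    """Integral of exp(-delta * t) over [0, horizon]."""
    if delta == 0:
        return float(horizon)
    return (1.0 - math.exp(-delta * horizon)) / delta

def combined_clv(expected_txns, expected_spend, margin, annual_rate, weeks):
    """Expected transactions x expected spend x margin x discount factor."""
    return expected_txns * expected_spend * margin * discrete_factor(annual_rate, weeks)

# 4 expected purchases of $50 at 30% margin over one year, discounted at 10%.
value = combined_clv(4.0, 50.0, 0.30, 0.10, 52)
print(round(value, 2))  # 54.55
```

Discounting everything at the end of the horizon is conservative for purchases that occur early in the window; spreading the cash flow over the window with the continuous factor gives a slightly higher value.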

Decision-time use

Sort customers by CLV; allocate retention/loyalty budget proportional to (predicted CLV) − (cost-to-treat), pruning the tail where the cost dominates. The classical rule: spend up to (CLV × treatment-effect probability) per customer.
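
One possible reading of this rule, sketched under stated assumptions: each customer has a predicted CLV, a hypothetical probability that the treatment actually saves them (`effect_prob`), and a fixed treatment cost. Customers whose cost exceeds CLV × treatment-effect probability are pruned, and the rest are funded greedily by net expected value until the budget runs out. The field names and figures are invented for illustration.

```python
def plan_retention(customers, budget):
    """Greedy retention plan; returns (funded customer ids, total spend)."""
    worthwhile = []
    for c in customers:
        cap = c["clv"] * c["effect_prob"]       # most we should spend on this customer
        if c["cost"] <= cap:                    # prune the tail where cost dominates
            worthwhile.append((cap - c["cost"], c))
    worthwhile.sort(key=lambda pair: pair[0], reverse=True)

    funded, spent = [], 0.0
    for _net_value, c in worthwhile:
        if spent + c["cost"] <= budget:
            funded.append(c["id"])
            spent += c["cost"]
    return funded, spent

# Hypothetical customers.
customers = [
    {"id": "A", "clv": 400.0, "effect_prob": 0.10, "cost": 10.0},
    {"id": "B", "clv": 50.0, "effect_prob": 0.10, "cost": 10.0},
    {"id": "C", "clv": 200.0, "effect_prob": 0.20, "cost": 15.0},
    {"id": "D", "clv": 120.0, "effect_prob": 0.15, "cost": 10.0},
]
funded, spent = plan_retention(customers, budget=30.0)
print(funded, spent)  # ['A', 'C'] 25.0
```

A greedy pass is not guaranteed to be optimal when costs differ widely; it is used here only because it makes the pruning and ranking steps explicit.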

Interactive solver

200 simulated customers · BG/NBD-style proxy + gamma-gamma spend

CLV estimator: heterogeneous customer base, CLV histogram + Pareto-share KPI (closed-form proxy).
Inputs: future window for CLV; margin (fraction of spend that is profit); cost of capital.
Outputs: total CLV ($); average CLV per customer ($); top 20% share of CLV; predicted dropouts; 90th-percentile CLV ($); retention budget @ 50% CLV.
Chart: CLV histogram with cumulative CLV (Lorenz curve) and the top-20%-of-customers cutoff.

Under the hood

For each simulated customer, we draw a latent purchase rate from \(\mathrm{Gamma}(r=2, \alpha = r / \lambda^{\text{mean}})\), so that the mean rate is \(\lambda^{\text{mean}}\), and an alive/dropout state from \(\mathrm{Bernoulli}(0.85)\). The expected number of future transactions in window \(t^{\ast}\) is \(\lambda \cdot t^{\ast} \cdot \mathbb{1}\{\text{alive}\}\) (a deliberately simplified BG/NBD proxy; the full closed form involves \(_2 F_1\)). Per-transaction spend is drawn from \(\mathrm{Gamma}(\nu, \nu / m^{\text{mean}})\), where \(\nu\) is here a shape parameter and the mean spend is \(m^{\text{mean}}\). CLV is the product of expected transactions, expected spend, and margin, discounted to present value by \((1+d)^{-t^{\ast}/52}\), where \(d\) is the annual cost of capital and \(t^{\ast}\) is in weeks. The histogram and Lorenz curve summarise the full base; the “top 20%” KPI reports how concentrated CLV is in the simulated base.
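
The procedure described above can be sketched in a few lines of standard-library Python. This is a reconstruction from the description, not the page's actual solver code: the default mean purchase rate, mean spend, spend shape, margin, and discount rate are all assumed values, and Python's `random.gammavariate` takes a shape and a scale, so the rate parameters are inverted.

```python
import random

def simulate_clv(n=200, mean_rate=0.1, r=2.0, mean_spend=40.0, spend_shape=4.0,
                 p_alive=0.85, weeks=52, margin=0.30, annual_rate=0.10, seed=0):
    """Simulate per-customer CLV with the simplified expected-rate proxy."""
    rng = random.Random(seed)
    discount = (1.0 + annual_rate) ** (-weeks / 52.0)
    values = []
    for _ in range(n):
        # Gamma(shape r, rate r / mean_rate) has mean mean_rate; scale = 1 / rate.
        lam = rng.gammavariate(r, mean_rate / r)
        alive = rng.random() < p_alive
        spend = rng.gammavariate(spend_shape, mean_spend / spend_shape)
        txns = lam * weeks if alive else 0.0    # proxy: rate x window if alive
        values.append(txns * spend * margin * discount)
    return values

def top_share(values, fraction=0.2):
    """Share of total CLV held by the top `fraction` of customers."""
    ranked = sorted(values, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0

clv_values = simulate_clv()
print(round(sum(clv_values), 2), round(top_share(clv_values), 3))
```

Changing `seed` gives a different simulated base; the top-20% share moves with the heterogeneity parameters `r` and `spend_shape` rather than with margin or discount rate, which only rescale every customer.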

Reading the solution

Three patterns to watch for

  • Heavy right tail. CLV histograms are almost always right-skewed: most customers are low-value, a small minority are high-value. The mean is misleading; use the median or percentile cutoffs.
  • Pareto share. The top 20% typically owns 60-85% of CLV. The right number depends on category — subscriptions concentrate more, grocery less.
  • Dropouts dominate. Many customers never come back after the first purchase; the BG/NBD model encodes this directly via the dropout probability \(p\).

Sensitivity questions

  • Discount rate up (10% → 25%)? — CLV shrinks; the top tail compresses.
  • Horizon longer (1 year → 3 years)? — CLV grows but more uncertain; consider truncating.
  • Margin halved? — total CLV halves; allocation rank doesn’t change — budget per customer should scale by margin.
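
The discount-rate and margin questions can be checked with a toy calculation. The sketch below assumes the product form of CLV with a single end-of-horizon discount factor and uses three invented customers; it is meant only to show that margin rescales values without reordering them.

```python
def clv(txns, spend, margin, annual_rate, weeks=52):
    """Expected transactions x expected spend x margin, discounted once."""
    return txns * spend * margin * (1.0 + annual_rate) ** (-weeks / 52.0)

# Hypothetical (expected transactions, expected spend) pairs for three customers.
customers = [(6.0, 30.0), (1.5, 80.0), (12.0, 12.0)]

def score_all(margin, annual_rate):
    return [clv(t, s, margin, annual_rate) for t, s in customers]

def ranking(values):
    """Customer indices ordered from highest to lowest CLV."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)

base = score_all(margin=0.30, annual_rate=0.10)
high_rate = score_all(margin=0.30, annual_rate=0.25)
half_margin = score_all(margin=0.15, annual_rate=0.10)

print(round(sum(base), 2), round(sum(high_rate), 2), round(sum(half_margin), 2))
print(ranking(base) == ranking(half_margin))  # True
```

In this simplified setting the higher discount rate also leaves the ranking unchanged, because every customer shares the same horizon; rankings can shift once customers' cash flows are spread differently over time.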

Model extensions

Pareto/NBD (original)

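Schmittlein-Morrison-Colombo 1987 model: dropout can occur at any point in continuous time (exponentially distributed lifetimes) rather than only immediately after a purchase, as in BG/NBD. Its likelihood requires several evaluations of the Gaussian hypergeometric function, which makes parameter estimation numerically harder.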

Contract / subscription CLV

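For subscription businesses with a constant per-period retention rate and no discounting, CLV simplifies to per-period margin / (1 − retention rate), because the expected lifetime is 1 / (1 − retention rate) periods. This is a different model: discrete-time geometric churn.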
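
The geometric-churn formula can be verified against a brute-force sum. The sketch below assumes the first payment is received at time zero with certainty, a constant retention rate, and an optional per-period discount rate; the function names and numbers are hypothetical.

```python
def subscription_clv(margin_per_period, retention, discount=0.0):
    """Closed-form CLV under constant retention and per-period discounting."""
    if not 0.0 <= retention < 1.0:
        raise ValueError("retention must be in [0, 1)")
    return margin_per_period * (1.0 + discount) / (1.0 + discount - retention)

def subscription_clv_sum(margin_per_period, retention, discount=0.0, periods=2000):
    """Same quantity by direct summation, as a check on the closed form."""
    survival_ratio = retention / (1.0 + discount)
    return sum(margin_per_period * survival_ratio ** t for t in range(periods))

undiscounted = subscription_clv(20.0, 0.8)        # 20 / (1 - 0.8)
discounted = subscription_clv(20.0, 0.8, 0.01)    # 1% discount per period
print(round(undiscounted, 2), round(discounted, 2))  # 100.0 96.19
```

With no discounting the closed form reduces to margin / (1 − retention rate); adding a discount rate lowers the value, and the effect grows as retention approaches 1.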

CLV with covariates

Time-varying covariates (channel, season, marketing exposure) added as multiplicative effects on \(\lambda\). Bayesian hierarchical extension.
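
A multiplicative effect on \(\lambda\) is commonly written with a log-linear link, so that each covariate scales the baseline rate by a constant factor. The sketch below assumes that form; the covariate names and coefficient values are invented, and in practice the coefficients would be estimated jointly with the rest of the model.

```python
import math

def purchase_rate(base_rate, coefficients, covariates):
    """Baseline rate scaled by exp(sum of coefficient x covariate value)."""
    linear = sum(coefficients[name] * covariates.get(name, 0.0) for name in coefficients)
    return base_rate * math.exp(linear)

# Hypothetical coefficients and one week's covariate values.
coefficients = {"holiday_season": 0.40, "email_exposures": 0.15}
quiet_week = {"holiday_season": 0.0, "email_exposures": 0.0}
busy_week = {"holiday_season": 1.0, "email_exposures": 2.0}

print(purchase_rate(0.10, coefficients, quiet_week))
print(purchase_rate(0.10, coefficients, busy_week))
```

Because the effects multiply, a covariate with coefficient 0.40 raises the rate by a factor of about 1.49 regardless of the customer's baseline.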

Bayesian CLV (Cohort)

Posterior distribution over CLV per customer, not just point estimate. Useful for calibrated decision-making under uncertainty.

CLV-aware acquisition

Cap acquisition cost at expected new-customer CLV. Gupta-Lehmann 2003 acquisition / retention LP.

CLV-aware loyalty design

Loyalty program rewards calibrated to retain high-CLV segments. Ties directly to loyalty program design.

RL with CLV reward

Train a recommendation policy with long-term CLV as the reward signal, not single-session revenue. Active research frontier.

Customer-base valuation

Aggregate CLV is a balance-sheet asset (customer equity). Important in M&A and valuation analyses. Gupta-Lehmann-Stuart 2004.

Key references

Fader, P. S., Hardie, B. G. S. & Lee, K. L. (2005).
RFM and CLV: Using iso-value curves for customer base analysis.
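Journal of Marketing Research 42(4): 415–430. doi:10.1509/jmkr.2005.42.4.415 (Gamma-gamma spend model and CLV iso-value curves; the BG/NBD model itself appears in Fader, Hardie & Lee (2005), “Counting your customers” the easy way: An alternative to the Pareto/NBD model, Marketing Science 24(2): 275–284.)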
Schmittlein, D. C., Morrison, D. G. & Colombo, R. (1987).
Counting your customers: Who are they and what will they do next?
Management Science 33(1): 1–24. doi:10.1287/mnsc.33.1.1 (Pareto/NBD original.)
Fader, P. S. & Hardie, B. G. S. (2009).
Probability models for customer-base analysis.
Journal of Interactive Marketing 23(1): 61–69. (Best practitioner overview.)
Reinartz, W. J. & Kumar, V. (2003).
The impact of customer relationship characteristics on profitable lifetime duration.
Journal of Marketing 67(1): 77–99.
Gupta, S., Lehmann, D. R. & Stuart, J. A. (2004).
Valuing customers.
Journal of Marketing Research 41(1): 7–18. doi:10.1509/jmkr.41.1.7.25084
Fader, P. S. (2012, 2nd ed. 2020).
Customer Centricity: Focus on the Right Customers for Strategic Advantage.
Wharton Digital Press.
Pfeifer, P. E., Haskins, M. E. & Conroy, R. M. (2005).
Customer lifetime value, customer profitability, and the treatment of acquisition spending.
Journal of Managerial Issues 17(1): 11–25.
Berger, P. D. & Nasr, N. I. (1998).
Customer lifetime value: Marketing models and applications.
Journal of Interactive Marketing 12(1): 17–30.

Back to the retail domain

CLV sits in the Promotion × Strategic cell — the customer-equity score that turns marketing budget into the most valuable currency: future cash flow.

Educational solver · simplified BG/NBD-proxy and gamma-gamma · production CLV uses the full hypergeometric closed form (e.g., the lifetimes Python package).