Customer Lifetime Value

BG/NBD · Gamma-Gamma · Buy-Til-You-Die

Estimate per-customer expected future value from past recency-frequency-monetary (RFM) data using the Buy-Til-You-Die family of models — BG/NBD for transaction frequency (with a Beta-Geometric drop-out process) and gamma-gamma for spend per transaction. CLV underpins acquisition-budget allocation, retention-treatment targeting, and personalisation reward shaping. Foundational paper: Fader, Hardie & Lee (2005).

Why it matters

Customer-centric retail starts with knowing what a customer is worth

5×

Cost ratio of acquiring a new customer vs retaining an existing one — the structural reason CLV is the right currency for marketing decisions.

Source: Bain & Company; HBR retention research (long-cited industry benchmark).

~80/20

Pareto split of CLV in most retail bases — top 20% of customers generate ~80% of margin. Identifying them shapes loyalty and personalisation budgets.

Source: Fader (2012), Customer Centricity.

Pareto/NBD

Schmittlein-Morrison-Colombo (1987) original model; BG/NBD (Fader et al. 2005) is the easier-to-estimate alternative, with closed-form expressions, that practitioners typically use.

Source: Schmittlein, Morrison & Colombo (1987), Management Science.

+10–30%

Marketing-ROI lift documented when retention spend is allocated proportional to CLV vs. uniform-per-customer or revenue-only.

Source: Reinartz & Kumar (2003), Journal of Marketing.

Where the decision sits

Strategic input to acquisition, retention, loyalty, and personalisation

CLV is not itself a decision — it is an estimate that downstream decisions consume. The classic uses are: acquisition (don’t pay more than expected CLV to acquire a customer), retention (allocate save-the-customer spend in proportion to expected CLV), loyalty (which segments are worth a contract), and personalisation (CLV-aware reward shaping in the recommender). Decisions live in loyalty design, personalisation, promotional planning; CLV gives them a common scoring axis.

Transaction history ($x$, $t_x$, $T$ per customer) → Estimate CLV (BG/NBD + GG) → Score & rank (CLV percentile) → Allocate budget (retention, loyalty)

Problem & formulation

BG/NBD frequency model + gamma-gamma spend model

Decision support: Estimation, not optimisation

Statistical model: BG/NBD + Gamma-Gamma

Inputs per customer: RFM ($x$, $t_x$, $T$)

Reference: Fader-Hardie-Lee (2005)

Per-customer summary statistics (RFM)

Symbol	Meaning	Unit
$x$	Number of repeat transactions observed (frequency)	integer
$t_x$	Time of last repeat transaction, measured from the first purchase (recency)	weeks
$T$	Length of observation window, measured from the first purchase	weeks
$\bar{m}$	Average spend per transaction (monetary)	$
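A minimal sketch of how these summary statistics might be derived from a raw transaction log. It assumes a hypothetical list of `(customer_id, week, amount)` tuples and an observation window ending at `window_end`; the function name `rfm_summary` is illustrative. It follows the usual BG/NBD convention that the first purchase starts the clock and is not counted as a repeat, and it assumes $\bar{m}$ is averaged over repeat transactions only.

```python
from collections import defaultdict

def rfm_summary(transactions, window_end):
    """Collapse (customer_id, week, amount) rows into x, t_x, T, m_bar per customer."""
    by_customer = defaultdict(list)
    for customer_id, week, amount in transactions:
        by_customer[customer_id].append((week, amount))

    summary = {}
    for customer_id, rows in by_customer.items():
        rows.sort()                       # chronological order
        first_week = rows[0][0]
        repeats = rows[1:]                # the first purchase is not a repeat
        x = len(repeats)
        t_x = repeats[-1][0] - first_week if repeats else 0.0
        T = window_end - first_week
        m_bar = sum(amount for _, amount in repeats) / x if x else 0.0
        summary[customer_id] = {"x": x, "t_x": t_x, "T": T, "m_bar": m_bar}
    return summary

# Hypothetical log: customer A buys in weeks 0, 4 and 10; customer B only in week 2.
log = [("A", 0, 30.0), ("A", 4, 20.0), ("A", 10, 40.0), ("B", 2, 55.0)]
summary = rfm_summary(log, window_end=12)
```

Customer B has no repeat transactions, so $x = 0$ and $t_x = 0$; only $T$ carries information for such customers.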

BG/NBD frequency model

Each customer has a latent purchase rate $\lambda$ (Poisson) and a latent drop-out probability $p$ per transaction (Beta-Geometric). Heterogeneity across customers is modelled by gamma and beta priors:

$$\lambda \sim \mathrm{Gamma}(r, \alpha) \qquad p \sim \mathrm{Beta}(a, b)$$

After a customer makes a transaction, they drop out with probability $p$; otherwise they remain “alive”. The expected number of future transactions in a window of length $t^{\ast}$, given history $(x, t_x, T)$, has a closed form (Fader et al. 2005, Eq. 10):

$$\mathbb{E}\bigl[Y(t^{\ast}) \,\big|\, x, t_x, T\bigr] \;=\; \frac{\dfrac{a + b + x - 1}{a - 1} \left[1 - \left(\dfrac{\alpha + T}{\alpha + T + t^{\ast}}\right)^{r+x} {}_2F_1\!\left(r + x,\; b + x;\; a + b + x - 1;\; \dfrac{t^{\ast}}{\alpha + T + t^{\ast}}\right)\right]}{1 + \mathbb{1}\{x > 0\} \, \dfrac{a}{b + x - 1} \left(\dfrac{\alpha + T}{\alpha + t_x}\right)^{r+x}}$$

Practitioners typically use the lifetimes Python package or R's BTYD package rather than coding this by hand. The Gaussian hypergeometric function $_2 F_1$ has no elementary closed form and is evaluated numerically. For exposition, the solver below uses a simplified expected-rate proxy.
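For readers who want to see the full expression evaluated, here is a minimal pure-Python sketch of the BG/NBD conditional expectation, with the hypergeometric argument $t^{\ast} / (\alpha + T + t^{\ast})$. The helper `hyp2f1_series` is a hypothetical stand-in that sums the defining power series (valid because the argument is always below 1); in practice `scipy.special.hyp2f1` would be the usual choice. The parameter values are illustrative assumptions, not fitted to any data in this article.

```python
def hyp2f1_series(a, b, c, z, tol=1e-12, max_terms=10_000):
    """Gauss hypergeometric 2F1 summed from its power series; valid for |z| < 1."""
    term, total = 1.0, 1.0
    for n in range(max_terms):
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
        total += term
        if abs(term) < tol * abs(total):
            break
    return total

def expected_future_transactions(t_star, x, t_x, T, r, alpha, a, b):
    """BG/NBD conditional expectation E[Y(t*) | x, t_x, T]; undefined at a == 1."""
    z = t_star / (alpha + T + t_star)
    hyp = hyp2f1_series(r + x, b + x, a + b + x - 1, z)
    bracket = 1.0 - ((alpha + T) / (alpha + T + t_star)) ** (r + x) * hyp
    numerator = (a + b + x - 1) / (a - 1) * bracket
    if x == 0:
        return numerator                  # the indicator term vanishes
    ratio = ((alpha + T) / (alpha + t_x)) ** (r + x)
    return numerator / (1.0 + a / (b + x - 1) * ratio)

params = dict(r=0.243, alpha=4.414, a=0.793, b=2.426)   # illustrative values
e_y = expected_future_transactions(t_star=39, x=2, t_x=30.43, T=38.86, **params)
e_y_longer = expected_future_transactions(t_star=78, x=2, t_x=30.43, T=38.86, **params)
```

When $a < 1$ both the leading fraction and the bracket are negative, so their product is still a positive expectation; this is worth knowing before "fixing" a sign that is not wrong.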

Gamma-gamma spend model

Spend per transaction is modelled separately. Each customer has a latent average-spend parameter, gamma-distributed; observed spends are gamma noise around it:

$$\mathbb{E}[\bar M \,|\, x, \bar m] \;=\; \frac{p \, x \, \bar m + (q - 1)\, \nu}{p \, x + q - 1}, \qquad \nu = \frac{p \, \gamma}{q - 1}$$

Here $p$, $q$ and $\gamma$ are the gamma-gamma parameters; this $p$ is a shape parameter, not the BG/NBD drop-out probability. The result is a Bayesian shrinkage estimator: it blends the customer’s observed average spend $\bar m$ (weighted by $p \cdot x$, which grows with the evidence) with the population mean $\nu$ (weighted by $q - 1$, the strength of the prior). Customers with few transactions get pulled toward the population mean; high-frequency customers keep their own average.
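The shrinkage behaviour is easy to check numerically. The sketch below uses the standard gamma-gamma conditional mean, in which the prior weight is $q - 1$ and the population mean is $p\gamma / (q - 1)$. The function name and the parameter values are assumptions chosen for illustration, and `p` here is the gamma-gamma shape parameter rather than the drop-out probability.

```python
def expected_avg_spend(x, m_bar, p, q, gamma):
    """Gamma-gamma conditional mean spend per transaction, written as shrinkage."""
    population_mean = p * gamma / (q - 1)        # requires q > 1
    weight_own = p * x / (p * x + q - 1)         # weight on the customer's own average
    return weight_own * m_bar + (1 - weight_own) * population_mean

p, q, gamma = 6.25, 3.74, 15.44                  # illustrative values
population_mean = p * gamma / (q - 1)
one_purchase = expected_avg_spend(x=1, m_bar=100.0, p=p, q=q, gamma=gamma)
many_purchases = expected_avg_spend(x=20, m_bar=100.0, p=p, q=q, gamma=gamma)
```

With an observed average of 100 and a much lower population mean, the one-purchase customer is pulled noticeably toward the population, while the twenty-purchase customer stays close to 100.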

Combined CLV (discounted)

$$\mathrm{CLV}_i \;=\; \mathbb{E}[Y_i(t^{\ast})] \cdot \mathbb{E}[\bar M_i] \cdot \mathrm{margin} \cdot \mathrm{DiscountFactor}(t^{\ast})$$

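CLV = expected future transactions × expected spend per transaction × gross margin, discounted at the firm’s cost of capital. In continuous time with discount rate $\delta$, a unit cash flow spread evenly over the horizon has present value $\int_0^{t^{\ast}} e^{-\delta t} \, dt = (1 - e^{-\delta t^{\ast}}) / \delta$. In discrete time with annual discount rate $d$ (written $d$ to avoid a clash with the Gamma shape $r$), discount each cash flow by $(1+d)^{-1}$ per year elapsed.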
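To make the discounting concrete, the sketch below compares two simple conventions: applying a single end-of-horizon factor, and spreading the expected margin evenly across weeks and discounting each week separately. Both are illustrative assumptions about cash-flow timing rather than a prescribed method, and the function names and inputs are hypothetical.

```python
def discount_factor(weeks, annual_rate):
    """Discrete-time factor for a cash flow received `weeks` from now."""
    return (1.0 + annual_rate) ** (-weeks / 52.0)

def clv_end_of_horizon(exp_transactions, exp_spend, margin, horizon_weeks, annual_rate):
    """Treat all margin as arriving at the end of the horizon (most conservative)."""
    return exp_transactions * exp_spend * margin * discount_factor(horizon_weeks, annual_rate)

def clv_spread_evenly(exp_transactions, exp_spend, margin, horizon_weeks, annual_rate):
    """Spread expected margin evenly over the weeks and discount each week."""
    weekly_margin = exp_transactions * exp_spend * margin / horizon_weeks
    return sum(weekly_margin * discount_factor(week, annual_rate)
               for week in range(1, horizon_weeks + 1))

undiscounted = 4.0 * 50.0 * 0.3    # 4 transactions x 50 spend x 30% margin
late = clv_end_of_horizon(4.0, 50.0, 0.3, horizon_weeks=52, annual_rate=0.10)
spread = clv_spread_evenly(4.0, 50.0, 0.3, horizon_weeks=52, annual_rate=0.10)
```

The evenly spread figure sits between the undiscounted value and the end-of-horizon value, because on average the cash arrives about halfway through the window.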

Decision-time use

Sort customers by CLV; allocate retention/loyalty budget proportional to (predicted CLV) − (cost-to-treat), pruning the tail where the cost dominates. The classical rule: spend up to (CLV × treatment-effect probability) per customer.
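One way to read the rule above as code, assuming a flat cost per treatment and a per-customer treatment-effect probability. The record fields `clv` and `p_effect`, the function name, and the numbers are all hypothetical.

```python
def retention_plan(customers, treatment_cost):
    """Rank by CLV and treat only where CLV x effect probability exceeds the cost."""
    plan = []
    for customer in sorted(customers, key=lambda c: c["clv"], reverse=True):
        cap = customer["clv"] * customer["p_effect"]   # most it is worth spending
        if cap > treatment_cost:                       # prune where the cost dominates
            plan.append((customer["id"], treatment_cost, cap - treatment_cost))
    return plan

customers = [
    {"id": "A", "clv": 400.0, "p_effect": 0.10},
    {"id": "B", "clv": 60.0, "p_effect": 0.10},
    {"id": "C", "clv": 900.0, "p_effect": 0.05},
]
plan = retention_plan(customers, treatment_cost=20.0)
```

Each tuple holds the customer id, the spend, and the expected net gain. Customer B is pruned because the spending cap of 6 is below the treatment cost of 20.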

Interactive solver

200 simulated customers · BG/NBD-style proxy + gamma-gamma spend

CLV estimator

Heterogeneous customer base · CLV histogram + Pareto-share KPI

Closed-form proxy

Inputs: Customers; Horizon $t^{\ast}$ (weeks), the future window for CLV; Gross margin, the fraction of spend that is profit; Annual discount rate (cost of capital); Mean purchase rate (per year); Mean spend per transaction; Seed.

Outputs: Total CLV ($); Avg CLV per customer ($); Top 20% share of CLV; Predicted dropouts; 90th-percentile CLV ($); Retention budget @ 50% CLV.

Charts: CLV histogram; cumulative CLV (Lorenz curve) with the top-20%-of-customers cutoff.

Under the hood

For each simulated customer, we draw a latent purchase rate $\lambda$ from $\mathrm{Gamma}(r=2, \alpha = r / \lambda^{\text{mean}})$ and an alive indicator from $\mathrm{Bernoulli}(0.85)$. Expected future transactions in window $t^{\ast}$ are $\lambda \cdot t^{\ast} \cdot \mathbb{1}\{\text{alive}\}$, with $\lambda$ and $t^{\ast}$ expressed in the same time unit (a deliberately simplified BG/NBD proxy; the full closed form involves $_2 F_1$). Per-transaction spend is drawn from $\mathrm{Gamma}(k, k / m^{\text{mean}})$, where $k$ is a shape parameter, so the mean spend is $m^{\text{mean}}$. CLV is the product of expected transactions, expected spend and margin, discounted to present value by $(1+d)^{-t^{\ast}/52}$, where $d$ is the annual discount rate. The histogram and Lorenz curve summarise the full base; the “top 20%” KPI shows how concentrated CLV is in the simulated base.
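The simulation described above can be approximated in a few lines of standard-library Python. This is a sketch of the same idea, not the solver's actual code: the shape parameters (2.0 for both rate and spend), the default inputs, and the function names are assumptions, and the yearly purchase rate is converted to a weekly rate so that it matches the horizon's unit.

```python
import random

def simulate_clv(n=200, horizon_weeks=52, margin=0.3, annual_rate=0.10,
                 mean_rate_per_year=6.0, mean_spend=40.0, seed=1):
    """Simulate per-customer CLV with a gamma rate, Bernoulli alive flag, gamma spend."""
    rng = random.Random(seed)
    shape_rate, shape_spend = 2.0, 2.0            # assumed shapes for this sketch
    discount = (1.0 + annual_rate) ** (-horizon_weeks / 52.0)
    clvs = []
    for _ in range(n):
        # random.gammavariate takes (shape, scale); scale = mean / shape
        lam_week = rng.gammavariate(shape_rate, mean_rate_per_year / shape_rate) / 52.0
        alive = rng.random() < 0.85
        spend = rng.gammavariate(shape_spend, mean_spend / shape_spend)
        clvs.append(lam_week * horizon_weeks * alive * spend * margin * discount)
    return clvs

def top_share(values, fraction=0.2):
    """Share of the total held by the top `fraction` of customers."""
    ranked = sorted(values, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)

clvs = simulate_clv()
share = top_share(clvs)
```

Fixing the seed makes the run reproducible, which is what the solver's Seed input is for; changing it gives a feel for how much the top-20% share moves from sample to sample.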

Reading the solution

Three patterns to watch for

Heavy right tail. CLV histograms are almost always right-skewed: most customers are low-value, a small minority are high-value. The mean is misleading; use the median or percentile cutoffs.
Pareto share. The top 20% typically accounts for 60–85% of CLV. The exact figure depends on category: subscriptions concentrate more, grocery less.
Dropouts dominate. Many customers never come back after the first purchase; the BG/NBD model encodes this directly via the dropout probability $p$.

Sensitivity questions

Discount rate up (10% → 25%)? CLV shrinks; the top tail compresses.
Horizon longer (1 year → 3 years)? CLV grows but becomes more uncertain; consider truncating the horizon.
Margin halved? Total CLV halves and the allocation ranking is unchanged, so the budget per customer should scale with margin.
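A small sketch can confirm two of these answers under the simplified CLV product with a single end-of-horizon discount factor. The three customers and the helper names are hypothetical.

```python
def clv(exp_transactions, exp_spend, margin, horizon_weeks, annual_rate):
    """Simplified CLV: transactions x spend x margin x end-of-horizon discount."""
    discount = (1.0 + annual_rate) ** (-horizon_weeks / 52.0)
    return exp_transactions * exp_spend * margin * discount

# Hypothetical (expected transactions, expected spend) pairs.
base = [(5.0, 40.0), (1.0, 120.0), (12.0, 15.0)]

def score(margin, annual_rate):
    return [clv(n, s, margin, 52, annual_rate) for n, s in base]

def ranking(values):
    """Customer indices ordered from highest to lowest CLV."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)

full = score(margin=0.30, annual_rate=0.10)
half = score(margin=0.15, annual_rate=0.10)
high_rate = score(margin=0.30, annual_rate=0.25)
```

Halving the margin halves every CLV and leaves the ranking untouched, while raising the discount rate lowers the total.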

Model extensions

Pareto/NBD (original)

Schmittlein-Morrison-Colombo 1987 model: drop-out can happen at any moment in continuous time, whereas BG/NBD only allows it immediately after a transaction. Parameter estimation is numerically heavier than for BG/NBD.

Contract / subscription CLV

For subscription businesses with a constant per-period retention rate, undiscounted CLV simplifies to per-period revenue / (1 − retention rate). This is a different model: discrete-time geometric churn.
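The simplification is a geometric series. As a sketch, assume a per-period revenue $m$, a constant retention rate $\rho$, payment at the start of each period, and (for the discounted version) a per-period discount rate $d$; these symbols are introduced here for illustration:

$$\mathrm{CLV} \;=\; \sum_{k=0}^{\infty} m \, \rho^{k} \;=\; \frac{m}{1 - \rho}, \qquad \mathrm{CLV}_{\text{disc}} \;=\; \sum_{k=0}^{\infty} m \left(\frac{\rho}{1 + d}\right)^{k} \;=\; \frac{m \,(1 + d)}{1 + d - \rho}$$

For example, with $m = 20$ and $\rho = 0.9$ the undiscounted value is $20 / 0.1 = 200$; adding $d = 0.01$ per period gives $20 \times 1.01 / 0.11 \approx 183.6$.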

CLV with covariates

Time-varying covariates (channel, season, marketing exposure) added as multiplicative effects on $\lambda$. Bayesian hierarchical extension.

Bayesian CLV (Cohort)

Posterior distribution over CLV per customer, not just point estimate. Useful for calibrated decision-making under uncertainty.

CLV-aware acquisition

Cap acquisition cost at expected new-customer CLV. Gupta-Lehmann 2003 acquisition / retention LP.

CLV-aware loyalty design

Loyalty program rewards calibrated to retain high-CLV segments. Ties directly to loyalty program design.

RL with CLV reward

Train a recommendation policy with long-term CLV as the reward signal, not single-session revenue. Active research frontier.

Customer-base valuation

Aggregate CLV is a balance-sheet asset (customer equity). Important in M&A and valuation analyses. Gupta-Lehmann-Stuart 2004.

Key references

Fader, P. S., Hardie, B. G. S. & Lee, K. L. (2005).

RFM and CLV: Using iso-value curves for customer base analysis.

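Journal of Marketing Research 42(4): 415–430. doi:10.1509/jmkr.2005.42.4.415 (Applies Pareto/NBD with gamma-gamma spend. The BG/NBD model itself is introduced in Fader, Hardie & Lee (2005), “Counting your customers” the easy way: An alternative to the Pareto/NBD model, Marketing Science 24(2): 275–284.)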

Schmittlein, D. C., Morrison, D. G. & Colombo, R. (1987).

Counting your customers: Who are they and what will they do next?

Management Science 33(1): 1–24. doi:10.1287/mnsc.33.1.1 (Pareto/NBD original.)

Fader, P. S. & Hardie, B. G. S. (2009).

Probability models for customer-base analysis.

Journal of Interactive Marketing 23(1): 61–69. (Best practitioner overview.)

Reinartz, W. J. & Kumar, V. (2003).

The impact of customer relationship characteristics on profitable lifetime duration.

Journal of Marketing 67(1): 77–99.

Gupta, S., Lehmann, D. R. & Stuart, J. A. (2004).

Valuing customers.

Journal of Marketing Research 41(1): 7–18. doi:10.1509/jmkr.41.1.7.25084

Fader, P. S. (2012, 2nd ed. 2020).

Customer Centricity: Focus on the Right Customers for Strategic Advantage.

Wharton Digital Press.

Pfeifer, P. E., Haskins, M. E. & Conroy, R. M. (2005).

Customer lifetime value, customer profitability, and the treatment of acquisition spending.

Journal of Managerial Issues 17(1): 11–25.

Berger, P. D. & Nasr, N. I. (1998).

Customer lifetime value: Marketing models and applications.

Journal of Interactive Marketing 12(1): 17–30.

Back to the retail domain

CLV sits in the Promotion × Strategic cell — the customer-equity score that turns marketing budget into the most valuable currency: future cash flow.

Educational solver · simplified BG/NBD-proxy and gamma-gamma · production CLV uses the full hypergeometric closed form (e.g., the lifetimes Python package).
