Stochastic Differential Equations & Ito Formula
Comprehensive Interview Cheatsheet - 50 Key Topics
Topic 1: Definition of a Stochastic Process and Key Examples (Brownian Motion, Poisson Process)
Stochastic Process: A stochastic process is a collection of random variables \( \{X_t\}_{t \in T} \) indexed by a set \( T \) (often representing time), defined on a common probability space \( (\Omega, \mathcal{F}, \mathbb{P}) \). For each \( t \in T \), \( X_t \) is a random variable, and for each \( \omega \in \Omega \), \( X_t(\omega) \) is a realization (or sample path) of the process.
- State Space: The set of possible values that \( X_t \) can take (e.g., \( \mathbb{R} \), \( \mathbb{Z} \)).
- Index Set \( T \): Typically \( T = [0, \infty) \) (continuous-time) or \( T = \{0, 1, 2, \dots\} \) (discrete-time).
- Filtration \( \{\mathcal{F}_t\}_{t \in T} \): A family of \( \sigma \)-algebras such that \( \mathcal{F}_s \subseteq \mathcal{F}_t \subseteq \mathcal{F} \) for \( s \leq t \). Represents the "information" available up to time \( t \). A stochastic process \( \{X_t\} \) is adapted to \( \{\mathcal{F}_t\} \) if \( X_t \) is \( \mathcal{F}_t \)-measurable for all \( t \).
Brownian Motion (Wiener Process): A continuous-time stochastic process \( \{W_t\}_{t \geq 0} \) with the following properties:
- Initial Condition: \( W_0 = 0 \) almost surely (a.s.).
- Independent Increments: For any \( 0 \leq t_1 < t_2 < \dots < t_n \), the increments \( W_{t_2} - W_{t_1}, W_{t_3} - W_{t_2}, \dots, W_{t_n} - W_{t_{n-1}} \) are independent.
- Gaussian Increments: \( W_t - W_s \sim \mathcal{N}(0, t - s) \) for \( 0 \leq s < t \).
- Continuous Paths: The sample paths \( t \mapsto W_t \) are continuous a.s.
Brownian motion is a martingale with respect to its natural filtration, meaning \( \mathbb{E}[W_t | \mathcal{F}_s] = W_s \) for \( s \leq t \).
Key Properties of Brownian Motion:
- Mean: \( \mathbb{E}[W_t] = 0 \).
- Variance: \( \text{Var}(W_t) = t \).
- Covariance: \( \text{Cov}(W_s, W_t) = \min(s, t) \).
- Quadratic Variation: \( \langle W \rangle_t = t \).
\[ \text{For } s \leq t, \quad \text{Cov}(W_s, W_t) = \mathbb{E}[W_s W_t] = s. \]
\[ \text{Quadratic variation: } \lim_{\|\Pi\| \to 0} \sum_{i=1}^n (W_{t_i} - W_{t_{i-1}})^2 = t \quad \text{a.s.}, \] where \( \Pi = \{0 = t_0 < t_1 < \dots < t_n = t\} \) is a partition of \( [0, t] \) and \( \|\Pi\| = \max_i (t_i - t_{i-1}) \).
Example: Simulating Brownian Motion
To simulate a Brownian motion path on \( [0, T] \) with \( N \) steps:
- Discretize time: \( t_i = i \Delta t \), where \( \Delta t = T/N \).
- Generate independent increments: \( \Delta W_i \sim \mathcal{N}(0, \Delta t) \).
- Construct the path: \( W_{t_i} = \sum_{j=1}^i \Delta W_j \).
Python pseudocode:
import numpy as np

rng = np.random.default_rng()   # seedable random generator
T = 1.0    # time horizon
N = 1000   # number of steps
dt = T / N
dW = np.sqrt(dt) * rng.standard_normal(N)   # increments ~ N(0, dt)
W = np.concatenate(([0.0], np.cumsum(dW)))  # path with W_0 = 0
Poisson Process: A continuous-time stochastic process \( \{N_t\}_{t \geq 0} \) counting the number of events occurring in the interval \( [0, t] \). It has the following properties:
- Initial Condition: \( N_0 = 0 \) a.s.
- Independent Increments: For any \( 0 \leq t_1 < t_2 < \dots < t_n \), the increments \( N_{t_2} - N_{t_1}, \dots, N_{t_n} - N_{t_{n-1}} \) are independent.
- Stationary Increments: \( N_{t + s} - N_t \sim \text{Poisson}(\lambda s) \) for \( s, t \geq 0 \), where \( \lambda > 0 \) is the intensity (or rate) of the process.
- Jump Process: The sample paths are right-continuous with left limits (càdlàg), and jumps of size 1 occur at event times.
Key Properties of Poisson Process:
- Mean: \( \mathbb{E}[N_t] = \lambda t \).
- Variance: \( \text{Var}(N_t) = \lambda t \).
- Interarrival Times: The times between jumps \( \tau_i = T_i - T_{i-1} \) are i.i.d. \( \text{Exp}(\lambda) \), where \( T_i \) is the time of the \( i \)-th jump.
- Probability of \( k \) events in \( [0, t] \): \( \mathbb{P}(N_t = k) = \dfrac{e^{-\lambda t} (\lambda t)^k}{k!} \) for \( k = 0, 1, 2, \dots \).
Conditional on \( N_t = n \), the jump times \( T_1, \dots, T_n \) are distributed as the order statistics of \( n \) i.i.d. \( \text{Uniform}(0, t) \) random variables.
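The conditional-uniformity property above gives a second way to simulate jump times (a minimal sketch; NumPy assumed, parameter values illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
lam, T = 2.0, 1.0  # illustrative intensity and horizon

# Draw the number of events on [0, T], then place the jump times as
# sorted i.i.d. Uniform(0, T) draws (the order-statistics property).
n = rng.poisson(lam * T)
jump_times = np.sort(rng.uniform(0.0, T, size=n))
```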
Example: Simulating a Poisson Process
To simulate a Poisson process path on \( [0, T] \):
- Generate interarrival times \( \tau_i \sim \text{Exp}(\lambda) \).
- Compute jump times: \( T_i = \sum_{j=1}^i \tau_j \).
- Stop when \( T_i > T \).
- Construct the path: \( N_t = \max \{ i : T_i \leq t \} \).
Python pseudocode:
import numpy as np

rng = np.random.default_rng()   # seedable random generator
lambda_ = 2.0   # intensity
T = 1.0         # time horizon
# Interarrival times ~ Exp(lambda_), i.e. mean 1/lambda_; draw a generous
# batch, then keep only the jump times that land in [0, T].
tau = rng.exponential(1 / lambda_, size=1000)
T_jumps = np.cumsum(tau)
T_jumps = T_jumps[T_jumps <= T]
N = np.arange(len(T_jumps) + 1)   # N_t values: 0 before the first jump, then 1, 2, ...
Compound Poisson Process: A generalization of the Poisson process where jumps are not necessarily of size 1. Let \( \{Y_i\}_{i=1}^\infty \) be i.i.d. random variables (jump sizes) independent of \( \{N_t\} \). The compound Poisson process is defined as:
\[ X_t = \sum_{i=1}^{N_t} Y_i. \]
If \( \mathbb{E}[Y_i] = \mu \) and \( \text{Var}(Y_i) = \sigma^2 \), then:
- Mean: \( \mathbb{E}[X_t] = \lambda \mu t \).
- Variance: \( \text{Var}(X_t) = \lambda (\sigma^2 + \mu^2) t \).
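A quick Monte Carlo check of these two moment formulas (a sketch with illustrative parameters; jump sizes taken to be \( \mathcal{N}(\mu, \sigma^2) \)):

```python
import numpy as np

rng = np.random.default_rng(1)
lam, T, mu, sigma = 3.0, 1.0, 0.5, 0.2  # illustrative parameters
n_paths = 200_000

# N_T ~ Poisson(lam*T) per path; X_T is the sum of N_T i.i.d. N(mu, sigma^2) jumps.
N = rng.poisson(lam * T, size=n_paths)
jumps = rng.normal(mu, sigma, size=N.sum())
path_id = np.repeat(np.arange(n_paths), N)        # which path each jump belongs to
X = np.bincount(path_id, weights=jumps, minlength=n_paths)

print(X.mean())  # close to lam * mu * T = 1.5
print(X.var())   # close to lam * (sigma**2 + mu**2) * T = 0.87
```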
Characteristic Function of Compound Poisson Process:
\[ \mathbb{E}[e^{iuX_t}] = \exp \left( \lambda t (\phi_Y(u) - 1) \right), \] where \( \phi_Y(u) = \mathbb{E}[e^{iuY_1}] \) is the characteristic function of \( Y_1 \).
Important Notes and Pitfalls:
- Brownian Motion:
- Brownian motion is almost surely nowhere differentiable, and its sample paths have unbounded variation on every interval.
- The quadratic variation \( \langle W \rangle_t = t \) is a key property used in Itô calculus.
- Brownian motion is a Markov process and a martingale.
- Do not confuse Brownian motion with a random walk (which is discrete-time); a suitably rescaled random walk converges to Brownian motion (Donsker's theorem).
- Poisson Process:
- The Poisson process is a counting process with jumps of size 1. For compound Poisson processes, jumps can be of arbitrary size.
- The interarrival times are memoryless: \( \mathbb{P}(\tau_i > t + s | \tau_i > s) = \mathbb{P}(\tau_i > t) \).
- The Poisson process is a Lévy process (a process with stationary and independent increments).
- For large \( \lambda t \), \( N_t \) can be approximated by \( \mathcal{N}(\lambda t, \lambda t) \) (normal approximation).
- General Stochastic Processes:
- A stochastic process is stationary if its finite-dimensional distributions are invariant under time shifts (i.e., \( \{X_{t+s}\}_{t \in T} \) has the same distribution as \( \{X_t\}_{t \in T} \) for all \( s \)).
- A process is ergodic if time averages converge to ensemble averages (e.g., \( \lim_{T \to \infty} \frac{1}{T} \int_0^T X_t \, dt = \mathbb{E}[X_0] \)).
- Always specify the filtration when discussing martingales or adapted processes.
Example: Poisson Process as a Limit of Binomial Processes
Consider a sequence of binomial processes \( \{N_t^{(n)}\}_{n=1}^\infty \) where:
- Time is discretized into \( n \) intervals of length \( \Delta t = t/n \).
- In each interval, an event occurs with probability \( p = \lambda \Delta t = \lambda t / n \).
- \( N_t^{(n)} \) is the number of events in \( [0, t] \).
Then \( N_t^{(n)} \sim \text{Binomial}(n, \lambda t / n) \), and as \( n \to \infty \), \( N_t^{(n)} \) converges in distribution to \( \text{Poisson}(\lambda t) \). This is an example of the Poisson limit theorem.
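Before the derivation, this convergence is easy to confirm numerically (a minimal sketch using only the standard library; \( \lambda t = 2 \) and \( k = 3 \) are illustrative choices):

```python
from math import comb, exp, factorial

lam_t, k = 2.0, 3  # illustrative: lambda * t = 2, event count k = 3

poisson_pk = exp(-lam_t) * lam_t**k / factorial(k)   # ~0.1804
for n in (10, 100, 10_000):
    p = lam_t / n
    binom_pk = comb(n, k) * p**k * (1 - p)**(n - k)
    print(n, binom_pk)   # approaches the Poisson probability as n grows

print("Poisson:", poisson_pk)
```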
Derivation:
\[ \mathbb{P}(N_t^{(n)} = k) = \binom{n}{k} \left( \frac{\lambda t}{n} \right)^k \left( 1 - \frac{\lambda t}{n} \right)^{n - k}. \]
Taking \( n \to \infty \):
\[ \lim_{n \to \infty} \mathbb{P}(N_t^{(n)} = k) = \frac{e^{-\lambda t} (\lambda t)^k}{k!}. \]
Practical Applications:
- Finance:
- Brownian motion is the foundation of the Black-Scholes model for option pricing. The stock price \( S_t \) is modeled as: \[ dS_t = \mu S_t \, dt + \sigma S_t \, dW_t, \] where \( \mu \) is the drift and \( \sigma \) is the volatility.
- Poisson processes are used to model jump diffusions (e.g., Merton's jump-diffusion model) or credit risk (e.g., default times).
- Compound Poisson processes model insurance claims or operational risk.
- Physics:
- Brownian motion describes the random motion of particles suspended in a fluid (e.g., pollen grains in water).
- Poisson processes model radioactive decay or photon emissions.
- Queueing Theory:
- Poisson processes model arrival times of customers in a queue (e.g., M/M/1 queue).
- Compound Poisson processes model batch arrivals.
- Biology:
- Poisson processes model neuron spike trains or mutation occurrences.
- Engineering:
- Stochastic processes are used in signal processing (e.g., filtering noisy signals) and reliability theory (e.g., failure times of components).
Common Interview Questions:
- What is the difference between a stochastic process and a random variable?
- Explain why Brownian motion is not differentiable.
- Derive the mean and variance of a Poisson process.
- What is the distribution of the first jump time of a Poisson process?
- How would you simulate a Brownian motion path?
- Explain the Markov property of Brownian motion.
- What is the quadratic variation of Brownian motion, and why is it important?
- How does the compound Poisson process generalize the Poisson process?
- What is the relationship between the Poisson process and the exponential distribution?
- Explain how Brownian motion is used in the Black-Scholes model.
- What is the difference between a martingale and a Markov process?
- How would you model a stock price with jumps using a Poisson process?
- What is the characteristic function of a compound Poisson process, and how is it derived?
- Explain the Poisson limit theorem.
- What is the probability that a Brownian motion stays within \( [-a, a] \) for all \( t \in [0, T] \)? (Hint: Use the reflection principle.)
Topic 2: Filtration and Adapted Processes in Probability Spaces
Probability Space: A probability space is a mathematical triplet \((\Omega, \mathcal{F}, \mathbb{P})\) where:
- \(\Omega\) is the sample space, representing all possible outcomes.
- \(\mathcal{F}\) is a \(\sigma\)-algebra of subsets of \(\Omega\), representing the collection of events to which probabilities can be assigned.
- \(\mathbb{P}\) is a probability measure, assigning probabilities to events in \(\mathcal{F}\).
Filtration: A filtration \(\{\mathcal{F}_t\}_{t \geq 0}\) is a family of \(\sigma\)-algebras on \(\Omega\) such that for all \(s \leq t\), \(\mathcal{F}_s \subseteq \mathcal{F}_t \subseteq \mathcal{F}\). Intuitively, \(\mathcal{F}_t\) represents the information available up to time \(t\).
A probability space equipped with a filtration is called a filtered probability space, denoted \((\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \geq 0}, \mathbb{P})\).
Adapted Process: A stochastic process \(\{X_t\}_{t \geq 0}\) is adapted to the filtration \(\{\mathcal{F}_t\}_{t \geq 0}\) if, for every \(t \geq 0\), \(X_t\) is \(\mathcal{F}_t\)-measurable. This means that the value of \(X_t\) is known given the information available up to time \(t\).
Natural Filtration: The natural filtration \(\{\mathcal{F}_t^X\}_{t \geq 0}\) of a stochastic process \(\{X_t\}_{t \geq 0}\) is the smallest filtration to which \(X_t\) is adapted. It is defined as:
\[ \mathcal{F}_t^X = \sigma(X_s : 0 \leq s \leq t), \] where \(\sigma(X_s : 0 \leq s \leq t)\) is the \(\sigma\)-algebra generated by the random variables \(X_s\) for \(0 \leq s \leq t\).
Progressively Measurable Process: A stochastic process \(\{X_t\}_{t \geq 0}\) is progressively measurable with respect to the filtration \(\{\mathcal{F}_t\}_{t \geq 0}\) if, for every \(t \geq 0\), the mapping \((s, \omega) \mapsto X_s(\omega)\) from \([0, t] \times \Omega\) to \(\mathbb{R}\) is \(\mathcal{B}([0, t]) \otimes \mathcal{F}_t\)-measurable. Here, \(\mathcal{B}([0, t])\) is the Borel \(\sigma\)-algebra on \([0, t]\).
Note: Every progressively measurable process is adapted, but not every adapted process is progressively measurable.
Example: Brownian Motion and Its Natural Filtration
Let \(\{W_t\}_{t \geq 0}\) be a standard Brownian motion on \((\Omega, \mathcal{F}, \mathbb{P})\). The natural filtration \(\{\mathcal{F}_t^W\}_{t \geq 0}\) of \(W_t\) is:
\[ \mathcal{F}_t^W = \sigma(W_s : 0 \leq s \leq t). \]
By definition, \(W_t\) is adapted to \(\{\mathcal{F}_t^W\}_{t \geq 0}\). Moreover, Brownian motion is progressively measurable with respect to its natural filtration.
Properties of Filtrations:
- Right-Continuity: A filtration \(\{\mathcal{F}_t\}_{t \geq 0}\) is right-continuous if for all \(t \geq 0\), \[ \mathcal{F}_t = \bigcap_{s > t} \mathcal{F}_s. \]
- Completion: A filtration \(\{\mathcal{F}_t\}_{t \geq 0}\) is complete if \(\mathcal{F}_0\) contains all \(\mathbb{P}\)-null sets of \(\mathcal{F}\).
Derivation: Why is Brownian Motion Adapted to Its Natural Filtration?
Let \(\{W_t\}_{t \geq 0}\) be a standard Brownian motion. We want to show that \(W_t\) is \(\mathcal{F}_t^W\)-measurable for all \(t \geq 0\).
- By definition, \(\mathcal{F}_t^W = \sigma(W_s : 0 \leq s \leq t)\).
- The \(\sigma\)-algebra \(\sigma(W_s : 0 \leq s \leq t)\) is the smallest \(\sigma\)-algebra such that all \(W_s\) for \(0 \leq s \leq t\) are measurable.
- Since \(W_t\) is one of the generators of \(\mathcal{F}_t^W\), it is trivially \(\mathcal{F}_t^W\)-measurable.
Thus, \(W_t\) is adapted to \(\{\mathcal{F}_t^W\}_{t \geq 0}\).
Practical Applications:
- Stochastic Calculus: Filtrations and adapted processes are fundamental in stochastic calculus, particularly in the definition of the Itô integral. The Itô integral \(\int_0^T H_s \, dW_s\) is defined for processes \(H_t\) that are adapted to the filtration generated by \(W_t\).
- Option Pricing: In the Black-Scholes model, the price of a financial derivative is modeled as an adapted process to the filtration generated by the underlying asset's price process. This ensures that the price at time \(t\) depends only on information available up to time \(t\).
- Martingale Theory: A martingale is a stochastic process \(\{M_t\}_{t \geq 0}\) adapted to a filtration \(\{\mathcal{F}_t\}_{t \geq 0}\) such that \(\mathbb{E}[|M_t|] < \infty\) and \(\mathbb{E}[M_t | \mathcal{F}_s] = M_s\) for all \(s \leq t\). Martingales are central to the theory of stochastic processes and financial mathematics.
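The option-pricing application above can be illustrated at \( t = 0 \), where conditioning on \( \mathcal{F}_0 \) reduces the price to a plain expectation under the risk-neutral measure: \( V_0 = \mathbb{E}[e^{-rT}(S_T - K)^+] \). A minimal Monte Carlo sketch (risk-neutral GBM dynamics and all parameter values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(42)
S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0  # illustrative parameters
n_paths = 500_000

# Under the risk-neutral measure: S_T = S0 * exp((r - sigma^2/2) T + sigma sqrt(T) Z).
Z = rng.standard_normal(n_paths)
ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)

# V_0 = E[e^{-rT} (S_T - K)^+], the t = 0 case of the pricing formula.
V0 = np.exp(-r * T) * np.maximum(ST - K, 0.0).mean()
print(V0)  # close to the Black-Scholes value, about 10.45 for these inputs
```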
Example: Adapted Process in Option Pricing
Consider a European call option with maturity \(T\) and strike price \(K\) on a stock \(S_t\). The payoff at time \(T\) is \((S_T - K)^+\). The price of the option at time \(t \leq T\) is given by:
\[ V_t = \mathbb{E}\left[ e^{-r(T-t)}(S_T - K)^+ \mid \mathcal{F}_t \right], \] where \(r\) is the risk-free rate, and \(\{\mathcal{F}_t\}_{t \geq 0}\) is the filtration generated by \(S_t\). The process \(V_t\) is adapted to \(\{\mathcal{F}_t\}_{t \geq 0}\) because it depends only on information available up to time \(t\).
Common Pitfalls and Important Notes:
- Adapted vs. Predictable: An adapted process is not necessarily predictable. A process \(\{H_t\}_{t \geq 0}\) is predictable if it is measurable with respect to the \(\sigma\)-algebra generated by all left-continuous adapted processes. Predictability is a stronger condition than adaptedness and is required for defining the Itô integral for general integrands.
- Right-Continuity of Filtrations: In many applications, particularly in stochastic calculus, the filtration is assumed to be right-continuous (part of the "usual conditions"). Right-continuity ensures that events observable immediately after time \(t\) already belong to \(\mathcal{F}_t\), which is important for the theory of stopping times (e.g., guaranteeing that first hitting times are stopping times) and for the strong Markov property.
- Completion of Filtrations: Completing a filtration by including all null sets is often necessary to ensure that conditional expectations are well-defined and that martingales have regular versions.
- Natural Filtration vs. Augmented Filtration: The natural filtration of a process may not be complete or right-continuous. The augmented filtration is obtained by completing the natural filtration and making it right-continuous. For Brownian motion, the augmented filtration is both complete and right-continuous.
Key Takeaways:
- A filtration \(\{\mathcal{F}_t\}_{t \geq 0}\) represents the flow of information over time.
- An adapted process \(\{X_t\}_{t \geq 0}\) is one where \(X_t\) is \(\mathcal{F}_t\)-measurable for all \(t \geq 0\), meaning its value at time \(t\) depends only on information available up to time \(t\).
- The natural filtration \(\{\mathcal{F}_t^X\}_{t \geq 0}\) of a process \(\{X_t\}_{t \geq 0}\) is the smallest filtration to which \(X_t\) is adapted.
- Progressive measurability is a stronger condition than adaptedness and is often required in stochastic calculus.
Topic 3: Martingales, Submartingales, and Supermartingales
Stochastic Process: A collection of random variables \( \{X_t\}_{t \in T} \) defined on a common probability space \( (\Omega, \mathcal{F}, \mathbb{P}) \), indexed by a set \( T \) (often representing time).
Filtration: A family of \( \sigma \)-algebras \( \{\mathcal{F}_t\}_{t \geq 0} \) such that \( \mathcal{F}_s \subseteq \mathcal{F}_t \subseteq \mathcal{F} \) for all \( s \leq t \). It represents the information available up to time \( t \).
Adapted Process: A stochastic process \( \{X_t\}_{t \geq 0} \) is adapted to the filtration \( \{\mathcal{F}_t\}_{t \geq 0} \) if \( X_t \) is \( \mathcal{F}_t \)-measurable for all \( t \geq 0 \).
Martingale: An adapted stochastic process \( \{M_t\}_{t \geq 0} \) with \( \mathbb{E}[|M_t|] < \infty \) for all \( t \geq 0 \) is a martingale if for all \( s \leq t \),
\[ \mathbb{E}[M_t \mid \mathcal{F}_s] = M_s \quad \text{a.s.} \]
A martingale represents a "fair game" where the expected future value, given all past information, is equal to the current value.
Submartingale: An adapted stochastic process \( \{X_t\}_{t \geq 0} \) with \( \mathbb{E}[|X_t|] < \infty \) for all \( t \geq 0 \) is a submartingale if for all \( s \leq t \),
\[ \mathbb{E}[X_t \mid \mathcal{F}_s] \geq X_s \quad \text{a.s.} \]
A submartingale represents a process that, on average, tends to increase over time (e.g., a "favorable game").
Supermartingale: An adapted stochastic process \( \{X_t\}_{t \geq 0} \) with \( \mathbb{E}[|X_t|] < \infty \) for all \( t \geq 0 \) is a supermartingale if for all \( s \leq t \),
\[ \mathbb{E}[X_t \mid \mathcal{F}_s] \leq X_s \quad \text{a.s.} \]
A supermartingale represents a process that, on average, tends to decrease over time (e.g., an "unfavorable game").
Martingale Property (Discrete Time): For a discrete-time process \( \{M_n\}_{n \in \mathbb{N}} \), the martingale property is:
\[ \mathbb{E}[M_{n+1} \mid \mathcal{F}_n] = M_n \quad \text{a.s.} \]
Submartingale and Supermartingale Properties (Discrete Time):
Submartingale:
\[ \mathbb{E}[X_{n+1} \mid \mathcal{F}_n] \geq X_n \quad \text{a.s.} \]
Supermartingale:
\[ \mathbb{E}[X_{n+1} \mid \mathcal{F}_n] \leq X_n \quad \text{a.s.} \]
Doob Decomposition: In discrete time, any submartingale \( \{X_n\}_{n \in \mathbb{N}} \) can be uniquely decomposed as:
\[ X_n = M_n + A_n, \]
where \( \{M_n\} \) is a martingale and \( \{A_n\} \) is a predictable, non-decreasing process with \( A_0 = 0 \). The continuous-time analogue (the Doob-Meyer decomposition) holds under additional regularity conditions.
Example 1: Martingale (Symmetric Random Walk)
Let \( \{Z_n\}_{n \in \mathbb{N}} \) be a sequence of i.i.d. random variables with \( \mathbb{E}[Z_n] = 0 \). Define the random walk \( M_n = \sum_{k=1}^n Z_k \). Then \( \{M_n\}_{n \in \mathbb{N}} \) is a martingale with respect to the natural filtration \( \mathcal{F}_n = \sigma(Z_1, \dots, Z_n) \).
Proof:
- Adaptedness: \( M_n \) is \( \mathcal{F}_n \)-measurable by construction.
- Integrability: \( \mathbb{E}[|M_n|] \leq \sum_{k=1}^n \mathbb{E}[|Z_k|] < \infty \).
- Martingale property: \[ \mathbb{E}[M_{n+1} \mid \mathcal{F}_n] = \mathbb{E}[M_n + Z_{n+1} \mid \mathcal{F}_n] = M_n + \mathbb{E}[Z_{n+1} \mid \mathcal{F}_n] = M_n + \mathbb{E}[Z_{n+1}] = M_n. \]
Example 2: Submartingale (Stock Price under Risk-Neutral Measure)
Consider a stock price \( S_t \) following geometric Brownian motion under the risk-neutral measure \( \mathbb{Q} \):
\[ dS_t = r S_t dt + \sigma S_t dW_t, \]where \( r \) is the risk-free rate, \( \sigma \) is volatility, and \( W_t \) is a \( \mathbb{Q} \)-Brownian motion. The discounted stock price \( \tilde{S}_t = e^{-rt} S_t \) is a martingale under \( \mathbb{Q} \). However, the undiscounted stock price \( S_t \) is a submartingale because:
\[ \mathbb{E}^\mathbb{Q}[S_t \mid \mathcal{F}_s] = e^{r(t-s)} S_s \geq S_s \quad \text{for } t \geq s \text{ (assuming } r \geq 0\text{)}. \]
Example 3: Supermartingale (Fractional Betting)
Consider a gambler with initial wealth \( X_0 \) who bets $1 on a fair coin flip at each time step. The wealth process \( \{X_n\}_{n \in \mathbb{N}} \) is a martingale. However, if the gambler bets a fraction \( f \) of their wealth at each step (with \( 0 < f < 1 \)), the wealth process \( Y_n \) is a supermartingale:
\[ Y_{n+1} = \begin{cases} Y_n (1 + f) & \text{with probability } 1/2, \\ Y_n (1 - f) & \text{with probability } 1/2. \end{cases} \]
Then:
\[ \mathbb{E}[Y_{n+1} \mid \mathcal{F}_n] = \frac{1}{2} Y_n (1 + f) + \frac{1}{2} Y_n (1 - f) = Y_n, \]
so with a fair coin the wealth process is still a martingale (and hence also a supermartingale). If the bets are unfavorable (winning probability \( p < 1/2 \)), then \( \mathbb{E}[Y_{n+1} \mid \mathcal{F}_n] = (1 + f(2p - 1)) Y_n < Y_n \), and \( Y_n \) is a strict supermartingale.
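The unfavorable betting scheme above can be checked by simulation: the sample mean of the wealth should track the geometric decay \( \mathbb{E}[Y_n] = (1 + f(2p - 1))^n \). A sketch with illustrative values \( p = 0.45 \), \( f = 0.1 \):

```python
import numpy as np

rng = np.random.default_rng(7)
p, f, n_steps, n_paths = 0.45, 0.1, 50, 100_000  # illustrative unfavorable game

wins = rng.random((n_paths, n_steps)) < p        # Bernoulli(p) win indicators
factors = np.where(wins, 1 + f, 1 - f)           # per-step wealth multipliers
Y = np.cumprod(factors, axis=1)                  # wealth paths, Y_0 = 1

# E[Y_n] = (1 + f*(2p - 1))**n decays geometrically: a strict supermartingale.
print(Y[:, -1].mean(), (1 + f * (2 * p - 1))**n_steps)
```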
Optional Stopping Theorem (OST): Let \( \{M_t\}_{t \geq 0} \) be a martingale and \( \tau \) a stopping time with \( \mathbb{P}(\tau < \infty) = 1 \). If either:
- \( \tau \) is bounded a.s., or
- \( \mathbb{E}[|M_\tau|] < \infty \) and \( \lim_{t \to \infty} \mathbb{E}[|M_t| \mathbb{1}_{\tau > t}] = 0 \),
then:
\[ \mathbb{E}[M_\tau] = \mathbb{E}[M_0]. \]
For submartingales, \( \mathbb{E}[X_\tau] \geq \mathbb{E}[X_0] \), and for supermartingales, \( \mathbb{E}[X_\tau] \leq \mathbb{E}[X_0] \), under analogous conditions.
Example 4: Application of OST (Gambler's Ruin)
Consider a gambler with initial wealth \( a \) playing a fair game (martingale) until they either reach wealth \( N \) or go broke. Let \( \tau = \inf\{n \geq 0 : X_n = 0 \text{ or } X_n = N\} \). By the OST:
\[ \mathbb{E}[X_\tau] = X_0 = a. \]
But \( \mathbb{E}[X_\tau] = N \cdot \mathbb{P}(X_\tau = N) + 0 \cdot \mathbb{P}(X_\tau = 0) = N \cdot \mathbb{P}(X_\tau = N) \). Thus:
\[ \mathbb{P}(X_\tau = N) = \frac{a}{N}, \quad \mathbb{P}(X_\tau = 0) = 1 - \frac{a}{N}. \]
Martingale Convergence Theorem: Let \( \{X_n\}_{n \in \mathbb{N}} \) be a submartingale such that \( \sup_n \mathbb{E}[X_n^+] < \infty \). Then \( X_n \) converges almost surely to an integrable random variable \( X_\infty \).
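The ruin probabilities of Example 4 can be verified by direct simulation (a minimal sketch using only the standard library; \( a = 3 \), \( N = 10 \) are illustrative):

```python
import random

random.seed(3)
a, N = 3, 10           # start at wealth 3, absorb at 0 or 10
n_games = 20_000

hits_N = 0
for _ in range(n_games):
    x = a
    while 0 < x < N:
        x += random.choice((-1, 1))  # fair +/-1 bet: a martingale
    hits_N += (x == N)

print(hits_N / n_games)  # close to a / N = 0.3
```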
Important Notes and Pitfalls:
- Filtration Matters: The martingale property depends on the chosen filtration. A process may be a martingale with respect to one filtration but not another.
- Integrability: The condition \( \mathbb{E}[|X_t|] < \infty \) is crucial. Without it, the conditional expectations may not be well-defined.
- Optional Stopping Theorem Conditions: The OST does not hold for arbitrary stopping times. The conditions (boundedness or uniform integrability) are essential. For example, in the gambler's ruin problem, if \( N \to \infty \), the stopping time may not satisfy the OST conditions.
- Sub/Supermartingales and Drift: In continuous time, an Itô process \( dX_t = \mu_t dt + \sigma_t dW_t \) is a submartingale if \( \mu_t \geq 0 \) a.s. and a supermartingale if \( \mu_t \leq 0 \) a.s., provided the stochastic integral \( \int_0^t \sigma_s \, dW_s \) is a true martingale (e.g., if \( \mathbb{E} \int_0^t \sigma_s^2 \, ds < \infty \)).
- Local Martingales: A process \( \{M_t\}_{t \geq 0} \) is a local martingale if there exists a sequence of stopping times \( \tau_n \uparrow \infty \) such that \( \{M_{t \wedge \tau_n}\}_{t \geq 0} \) is a martingale for each \( n \). Local martingales are not necessarily martingales: the classic example is the inverse Bessel process \( M_t = 1/\|B_t\| \), where \( B_t \) is a three-dimensional Brownian motion started away from the origin, which is a strict local martingale. (By contrast, \( \exp(W_t - t/2) \) is a true martingale.)
- Martingale Representation Theorem: In a Brownian filtration, any martingale can be represented as a stochastic integral with respect to Brownian motion. This is a powerful tool in mathematical finance (e.g., hedging in complete markets).
Martingale Characterization of Brownian Motion: A continuous local martingale \( \{M_t\}_{t \geq 0} \) with \( M_0 = 0 \) and quadratic variation \( \langle M \rangle_t = t \) is a standard Brownian motion.
Practical Applications:
- Mathematical Finance:
- Martingales are central to the theory of arbitrage pricing. The Fundamental Theorem of Asset Pricing states that a market is arbitrage-free if and only if there exists an equivalent martingale measure (EMM).
- The discounted stock price under the risk-neutral measure is a martingale, which is the basis for the Black-Scholes model and derivative pricing.
- Submartingales and supermartingales are used to model processes with drift (e.g., stock prices under the physical measure).
- Stochastic Control: Martingales are used in the dynamic programming approach to stochastic control problems, where the value function is often a martingale (or sub/supermartingale) under optimal control.
- Probability Theory:
- Martingales provide a unified framework for studying convergence theorems (e.g., Law of Large Numbers, Central Limit Theorem).
- The Martingale Convergence Theorem is a key tool in proving almost sure convergence of stochastic processes.
- Statistics: Martingales are used in sequential analysis, where stopping times and the Optional Stopping Theorem play a crucial role in hypothesis testing.
- Queueing Theory: Martingales are used to analyze the stability and performance of queueing systems.
Key Relationships:
- If \( \{M_t\} \) is a martingale, then \( \{|M_t|\} \) is a submartingale.
- If \( \{X_t\} \) is a submartingale, then \( \{X_t^+\} \) (where \( X_t^+ = \max(X_t, 0) \)) is also a submartingale.
- If \( \{X_t\} \) is a supermartingale and \( \phi \) is a concave, non-decreasing function, then \( \{\phi(X_t)\} \) is a supermartingale (provided \( \mathbb{E}[|\phi(X_t)|] < \infty \)).
Topic 4: Quadratic Variation of a Stochastic Process
Definition: The quadratic variation of a stochastic process \( X_t \) over the interval \([0, T]\) is a measure of the "total variability" of the process due to its continuous fluctuations. It is defined as the limit in probability of the sum of squared increments of the process over a partition of the interval as the mesh size of the partition tends to zero.
Formally, for a partition \( \Pi = \{0 = t_0 < t_1 < \dots < t_n = T\} \) of \([0, T]\), the quadratic variation \([X]_T\) is given by:
\[ [X]_T = \lim_{\|\Pi\| \to 0} \sum_{i=1}^n (X_{t_i} - X_{t_{i-1}})^2, \] where \( \|\Pi\| = \max_{1 \leq i \leq n} (t_i - t_{i-1}) \) is the mesh size of the partition.
Key Formulas for Quadratic Variation:
- Quadratic Variation of a Brownian Motion: \[ [W]_t = t, \] where \( W_t \) is a standard Brownian motion.
- Quadratic Variation of an Itô Process: For an Itô process \( X_t \) defined by: \[ dX_t = \mu_t \, dt + \sigma_t \, dW_t, \] the quadratic variation is: \[ [X]_t = \int_0^t \sigma_s^2 \, ds. \]
- Cross-Quadratic Variation (Quadratic Covariation): For two Itô processes \( X_t \) and \( Y_t \), the cross-quadratic variation is: \[ [X, Y]_t = \int_0^t \sigma_s^X \sigma_s^Y \, ds, \] where \( dX_t = \mu_t^X \, dt + \sigma_t^X \, dW_t \) and \( dY_t = \mu_t^Y \, dt + \sigma_t^Y \, dW_t \).
- Properties of Quadratic Variation:
- \( [X]_t \) is non-decreasing in \( t \).
- \( [X, Y]_t = \frac{1}{4} \left( [X + Y]_t - [X - Y]_t \right) \).
- If \( X_t \) is of finite variation, then \( [X]_t = 0 \).
Finite Variation vs. Quadratic Variation:
- A process \( X_t \) is said to be of finite variation if its total variation over \([0, T]\) is finite, i.e., \[ \sup_{\Pi} \sum_{i=1}^n |X_{t_i} - X_{t_{i-1}}| < \infty, \] where the supremum is taken over all partitions \( \Pi \) of \([0, T]\).
- If \( X_t \) is of finite variation, then its quadratic variation \( [X]_t = 0 \).
- Brownian motion \( W_t \) is not of finite variation but has quadratic variation \( [W]_t = t \).
Example 1: Quadratic Variation of Brownian Motion
Let \( W_t \) be a standard Brownian motion. We compute its quadratic variation over \([0, T]\).
Consider a partition \( \Pi = \{0 = t_0 < t_1 < \dots < t_n = T\} \). The sum of squared increments is:
\[ \sum_{i=1}^n (W_{t_i} - W_{t_{i-1}})^2. \]
As \( \|\Pi\| \to 0 \), this sum converges in \( L^2 \) (and hence in probability) to \( T \). Thus,
\[ [W]_T = T. \]
Intuition: The quadratic variation of Brownian motion grows linearly with time, reflecting its highly irregular, non-differentiable paths.
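The convergence can be observed numerically: the sum of squared increments settles near \( T \) as the partition is refined (a sketch; NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(5)
T = 1.0

# Sum of squared Brownian increments over finer and finer partitions of [0, T].
for n in (10, 1_000, 100_000):
    dW = rng.normal(0.0, np.sqrt(T / n), size=n)  # increments ~ N(0, T/n)
    print(n, np.sum(dW**2))  # approaches [W]_T = T = 1 as the mesh shrinks
```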
Example 2: Quadratic Variation of an Itô Process
Let \( X_t \) be an Itô process defined by:
\[ dX_t = \mu_t \, dt + \sigma_t \, dW_t, \]
where \( \mu_t \) and \( \sigma_t \) are adapted processes. We derive its quadratic variation. The increment \( X_{t_i} - X_{t_{i-1}} \) can be approximated as:
\[ X_{t_i} - X_{t_{i-1}} \approx \mu_{t_{i-1}} (t_i - t_{i-1}) + \sigma_{t_{i-1}} (W_{t_i} - W_{t_{i-1}}). \]
Squaring this and summing over the partition:
\[ \sum_{i=1}^n (X_{t_i} - X_{t_{i-1}})^2 \approx \sum_{i=1}^n \sigma_{t_{i-1}}^2 (W_{t_i} - W_{t_{i-1}})^2 + \text{terms that vanish as } \|\Pi\| \to 0. \]
The first term converges to \( \int_0^T \sigma_s^2 \, ds \), while the other terms (involving \( dt \) or \( dt \cdot dW_t \)) vanish in the limit. Thus,
\[ [X]_T = \int_0^T \sigma_s^2 \, ds. \]
Example 3: Cross-Quadratic Variation of Two Itô Processes
Let \( X_t \) and \( Y_t \) be two Itô processes:
\[ dX_t = \mu_t^X \, dt + \sigma_t^X \, dW_t, \quad dY_t = \mu_t^Y \, dt + \sigma_t^Y \, dW_t. \]
The cross-quadratic variation \( [X, Y]_t \) is computed as follows.
Using the polarization identity:
\[ [X, Y]_t = \frac{1}{4} \left( [X + Y]_t - [X - Y]_t \right). \]
Compute \( [X + Y]_t \) and \( [X - Y]_t \):
\[ d(X + Y)_t = (\mu_t^X + \mu_t^Y) \, dt + (\sigma_t^X + \sigma_t^Y) \, dW_t \implies [X + Y]_t = \int_0^t (\sigma_s^X + \sigma_s^Y)^2 \, ds, \]
\[ d(X - Y)_t = (\mu_t^X - \mu_t^Y) \, dt + (\sigma_t^X - \sigma_t^Y) \, dW_t \implies [X - Y]_t = \int_0^t (\sigma_s^X - \sigma_s^Y)^2 \, ds. \]
Thus,
\[ [X, Y]_t = \frac{1}{4} \left( \int_0^t (\sigma_s^X + \sigma_s^Y)^2 \, ds - \int_0^t (\sigma_s^X - \sigma_s^Y)^2 \, ds \right) = \int_0^t \sigma_s^X \sigma_s^Y \, ds. \]
Important Notes and Common Pitfalls:
- Quadratic Variation vs. Variance:
- Quadratic variation \( [X]_t \) is a pathwise property, defined for each realization of the process.
- Variance \( \text{Var}(X_t) \) is an expectation over all possible paths. For Brownian motion, \( \text{Var}(W_t) = t \), which coincides with \( [W]_t \), but this is not true for general processes.
- Finite Variation Processes:
- Processes with finite variation (e.g., differentiable functions) have zero quadratic variation. This is why the \( dt \) term in an Itô process does not contribute to the quadratic variation.
- Brownian motion has infinite total variation but finite, nonzero quadratic variation; any continuous process with nonzero quadratic variation must have unbounded variation, consistent with Brownian paths being nowhere differentiable.
- Itô's Formula and Quadratic Variation:
- Quadratic variation is central to Itô's formula. For a twice continuously differentiable function \( f \), the Itô formula is: \[ df(X_t) = f'(X_t) \, dX_t + \frac{1}{2} f''(X_t) \, d[X]_t. \]
- The term \( d[X]_t \) arises from the quadratic variation of \( X_t \).
- Discrete-Time Analogue:
- In discrete time, the quadratic variation of a process \( X_n \) is simply the sum of squared increments: \[ [X]_N = \sum_{n=1}^N (X_n - X_{n-1})^2. \]
- This is useful for intuition and numerical approximations (e.g., in volatility modeling).
- Common Mistake: Ignoring Quadratic Variation in Integrals:
- When computing stochastic integrals, it is crucial to account for the quadratic variation term. For example, \( \int_0^t W_s \, dW_s \) is not \( \frac{1}{2} W_t^2 \) (as in deterministic calculus) but \( \frac{1}{2} (W_t^2 - t) \), where the \( -t \) term comes from \( d[W]_t = dt \).
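The identity \( \int_0^t W_s \, dW_s = \frac{1}{2}(W_t^2 - t) \) can be checked with a left-endpoint (Itô) Riemann sum (a sketch; NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(11)
T, n = 1.0, 1_000_000
dt = T / n

dW = np.sqrt(dt) * rng.standard_normal(n)
W = np.concatenate(([0.0], np.cumsum(dW)))   # Brownian path with W_0 = 0

# Ito convention: evaluate the integrand at the *left* endpoint of each interval.
ito_sum = np.sum(W[:-1] * dW)
print(ito_sum, 0.5 * (W[-1]**2 - T))  # the two values agree closely
```

Evaluating at the left endpoint is exactly what makes the \( -t \) correction appear; a midpoint (Stratonovich) sum would instead converge to \( \frac{1}{2} W_t^2 \).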
Practical Applications:
- Stochastic Calculus and Itô's Formula:
- Quadratic variation is a fundamental concept in stochastic calculus, appearing in Itô's formula and stochastic differential equations (SDEs).
- Itô's formula relies on the quadratic variation of the underlying process to derive the correct form of the chain rule for stochastic processes.
- Financial Mathematics:
- In the Black-Scholes model, the quadratic variation of the log-price process determines the volatility of the asset. For a geometric Brownian motion \( S_t = S_0 e^{\sigma W_t + (\mu - \frac{1}{2} \sigma^2)t} \), the quadratic variation of \( \log S_t \) is \( \sigma^2 t \).
- Quadratic variation is used in the derivation of the Black-Scholes PDE and in the calculation of hedging strategies.
- Volatility Modeling:
- In stochastic volatility models (e.g., Heston model), the quadratic variation of the asset price process is used to define the volatility process.
- Realized volatility, a key concept in empirical finance, is an estimator of the quadratic variation of the log-price process over a given time interval.
- Martingale Representation Theorem:
- The quadratic variation process is used in the martingale representation theorem, which states that any martingale adapted to the filtration of a Brownian motion can be represented as a stochastic integral with respect to that Brownian motion.
- Numerical Methods for SDEs:
- In the Euler-Maruyama method for simulating SDEs, the quadratic variation of the driving noise process (e.g., Brownian motion) is used to discretize the stochastic integral.
- Higher-order numerical schemes (e.g., Milstein scheme) explicitly account for the quadratic variation term to achieve better convergence properties.
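The realized-volatility point above can be illustrated with a short simulation. A sketch with assumed parameters: the sum of squared log-returns of a simulated geometric Brownian motion estimates the quadratic variation \( \sigma^2 T \) of the log-price.

```python
import numpy as np

# Realized variance (sum of squared log-returns) of simulated GBM estimates
# the quadratic variation sigma^2 * T of the log-price process.
rng = np.random.default_rng(1)

S0, mu, sigma, T, n = 100.0, 0.05, 0.2, 1.0, 50_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), size=n)
log_increments = (mu - 0.5 * sigma**2) * dt + sigma * dW   # exact GBM log steps

realized_var = np.sum(log_increments**2)   # estimator of [log S]_T
print(realized_var, sigma**2 * T)          # both near 0.04
```

The drift contributes only \( O(dt) \) terms to each squared increment, which is why realized variance isolates the diffusion part.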
Common Quant Interview Questions on Quadratic Variation:
- Question: What is the quadratic variation of a standard Brownian motion \( W_t \) over the interval \([0, T]\)?
Answer: The quadratic variation of \( W_t \) over \([0, T]\) is \( [W]_T = T \). This is a fundamental result and can be derived by showing that the sum of squared increments converges to \( T \) in \( L^2 \).
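The \( L^2 \) convergence in this answer can be seen numerically. A minimal sketch with synthetic increments and assumed partition sizes: the sum of squared increments over a uniform partition of \([0, T]\) concentrates around \( T \) as the mesh shrinks.

```python
import numpy as np

# Sum of squared Brownian increments over [0, T] for uniform partitions;
# the sum concentrates around T as the partition is refined.
rng = np.random.default_rng(2)

T = 2.0
for n in (100, 10_000):
    dW = rng.normal(0.0, np.sqrt(T / n), size=n)
    qv = np.sum(dW**2)
    print(n, qv)   # approaches T = 2.0
```

Since each \( (\Delta W_i)^2 \) has mean \( \Delta t \) and variance \( 2 \Delta t^2 \), the sum has mean \( T \) and variance \( 2T^2/n \), confirming the \( L^2 \) convergence.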
- Question: Let \( X_t = W_t^2 \), where \( W_t \) is a standard Brownian motion. Compute the quadratic variation \( [X]_t \).
Answer: Using Itô's formula, we have: \[ dX_t = 2W_t \, dW_t + dt. \] Thus \( X_t \) is an Itô process with diffusion coefficient \( 2W_t \), and its quadratic variation is: \[ [X]_t = \int_0^t (2W_s)^2 \, ds = 4 \int_0^t W_s^2 \, ds. \] Alternatively, this follows from the heuristic rule \( d[X]_t = (dX_t)^2 = (2W_t \, dW_t + dt)^2 = 4W_t^2 \, dt \), using \( (dW_t)^2 = dt \) and discarding terms of higher order than \( dt \).
- Question: Suppose \( X_t \) is a process of finite variation. What is its quadratic variation?
Answer: If \( X_t \) is of finite variation, then its quadratic variation \( [X]_t = 0 \). This is because the sum of squared increments is dominated by the square of the total variation, which tends to zero as the mesh size of the partition tends to zero.
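To see this numerically, here is a sketch using a hypothetical smooth (hence finite-variation) path, assumed to be \( x(t) = \sin t \): its squared-increment sums are of order (mesh size) and vanish in the refinement limit.

```python
import numpy as np

# For a differentiable path x(t) = sin(t), the sum of squared increments
# over a uniform partition is O(1/n), so its quadratic variation is zero.
T = 1.0
for n in (100, 1_000, 10_000):
    t = np.linspace(0.0, T, n + 1)
    qv = np.sum(np.diff(np.sin(t)) ** 2)
    print(n, qv)   # shrinks roughly like 1/n
```

Contrast this with Brownian motion, whose squared-increment sum stays near \( T \) no matter how fine the partition.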
- Question: Let \( X_t \) and \( Y_t \) be two Itô processes driven by the same Brownian motion \( W_t \). Express the quadratic variation \( [X, Y]_t \) in terms of their diffusion coefficients.
Answer: If \( dX_t = \mu_t^X \, dt + \sigma_t^X \, dW_t \) and \( dY_t = \mu_t^Y \, dt + \sigma_t^Y \, dW_t \), then the cross-quadratic variation is: \[ [X, Y]_t = \int_0^t \sigma_s^X \sigma_s^Y \, ds. \] This follows from the polarization identity and the properties of quadratic variation.
- Question: Why does the \( dt \) term in an Itô process not contribute to the quadratic variation?
Answer: The \( dt \) term corresponds to the finite variation part of the process. Since quadratic variation measures the "roughness" of the process due to continuous fluctuations (e.g., Brownian motion), and finite variation processes have zero quadratic variation, the \( dt \) term does not contribute to \( [X]_t \).
- Question: Compute \( \int_0^t W_s \, dW_s \) using Itô's formula and explain the role of quadratic variation in the result.
Answer: Let \( f(x) = \frac{1}{2} x^2 \). By Itô's formula: \[ df(W_t) = f'(W_t) \, dW_t + \frac{1}{2} f''(W_t) \, d[W]_t = W_t \, dW_t + \frac{1}{2} dt. \] Integrating both sides from \( 0 \) to \( t \): \[ \frac{1}{2} W_t^2 = \int_0^t W_s \, dW_s + \frac{1}{2} t \implies \int_0^t W_s \, dW_s = \frac{1}{2} (W_t^2 - t). \] The term \( -\frac{1}{2} t \) arises from the quadratic variation \( [W]_t = t \). This is in contrast to deterministic calculus, where \( \int_0^t W_s \, dW_s \) would be \( \frac{1}{2} W_t^2 \).
- Question: Let \( X_t = \int_0^t \sigma_s \, dW_s \), where \( \sigma_t \) is an adapted process. What is the quadratic variation \( [X]_t \)?
Answer: The process \( X_t \) is a martingale with \( dX_t = \sigma_t \, dW_t \). Its quadratic variation is: \[ [X]_t = \int_0^t \sigma_s^2 \, ds. \] This follows directly from the definition of quadratic variation for Itô processes.
Topic 5: Itô's Definition of Stochastic Integrals (for Simple Processes)
Stochastic Integral: A stochastic integral is a generalization of the deterministic integral to incorporate randomness. Itô's stochastic integral is defined for a class of stochastic processes called integrands with respect to another stochastic process called the integrator, typically a Brownian motion \( W_t \).
Simple Process: A stochastic process \( \phi(t, \omega) \) is called simple if it can be written as a finite linear combination of indicator functions of time intervals and random variables. Formally,
\[ \phi(t, \omega) = \sum_{i=0}^{n-1} \phi_i(\omega) \cdot \mathbb{I}_{[t_i, t_{i+1})}(t), \] where \( 0 = t_0 < t_1 < \dots < t_n = T \) is a partition of \([0, T]\), \( \phi_i(\omega) \) is \( \mathcal{F}_{t_i} \)-measurable, and \( \mathbb{I}_{[t_i, t_{i+1})}(t) \) is the indicator function for the interval \([t_i, t_{i+1})\). The \( \mathcal{F}_{t_i} \)-measurability ensures that \( \phi_i \) is known at time \( t_i \) based on the information available up to that time.
Itô Integral for Simple Processes: Let \( \phi(t, \omega) \) be a simple process as defined above. The Itô integral of \( \phi \) with respect to Brownian motion \( W_t \) over \([0, T]\) is defined as:
\[ \int_0^T \phi(t, \omega) \, dW_t = \sum_{i=0}^{n-1} \phi_i(\omega) \cdot (W_{t_{i+1}} - W_{t_i}). \] Here, \( W_{t_{i+1}} - W_{t_i} \) is the increment of the Brownian motion over \([t_i, t_{i+1})\).
Example: Let \( \phi(t, \omega) \) be a simple process defined on \([0, 2]\) with partition \( t_0 = 0, t_1 = 1, t_2 = 2 \) and \( \phi_0(\omega) = W_0 \), \( \phi_1(\omega) = W_1 \). Compute the Itô integral \( \int_0^2 \phi(t, \omega) \, dW_t \).
Solution:
The simple process is given by:
\[ \phi(t, \omega) = W_0 \cdot \mathbb{I}_{[0, 1)}(t) + W_1 \cdot \mathbb{I}_{[1, 2)}(t). \] The Itô integral is: \[ \int_0^2 \phi(t, \omega) \, dW_t = W_0 \cdot (W_1 - W_0) + W_1 \cdot (W_2 - W_1). \] Simplifying, we get: \[ \int_0^2 \phi(t, \omega) \, dW_t = W_0 W_1 - W_0^2 + W_1 W_2 - W_1^2. \] Since \( W_0 = 0 \) a.s., this reduces to \( W_1 W_2 - W_1^2 \).
Properties of the Itô Integral for Simple Processes:
- Linearity: For simple processes \( \phi \) and \( \psi \), and constants \( a, b \), \[ \int_0^T (a \phi(t) + b \psi(t)) \, dW_t = a \int_0^T \phi(t) \, dW_t + b \int_0^T \psi(t) \, dW_t. \]
- Martingale Property: The Itô integral \( I_t = \int_0^t \phi(s) \, dW_s \) is a martingale with respect to the filtration \( \mathcal{F}_t \), provided \( \mathbb{E}\left[\int_0^T \phi(s)^2 \, ds\right] < \infty \). This means: \[ \mathbb{E}[I_t | \mathcal{F}_s] = I_s \quad \text{for} \quad s \leq t. \]
- Itô Isometry: For a simple process \( \phi \), \[ \mathbb{E}\left[\left(\int_0^T \phi(t) \, dW_t\right)^2\right] = \mathbb{E}\left[\int_0^T \phi(t)^2 \, dt\right]. \]
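The martingale and isometry properties can be checked on the worked example above. A Monte Carlo sketch (synthetic normal draws): for \( \phi = W_0 \) on \([0,1)\) and \( W_1 \) on \([1,2)\), the integral \( W_0(W_1 - W_0) + W_1(W_2 - W_1) \) has mean \( 0 \) and variance \( \mathbb{E}[\int_0^2 \phi(t)^2\,dt] = 0 \cdot 1 + 1 \cdot 1 = 1 \).

```python
import numpy as np

# Monte Carlo check of the worked example: the simple-process Ito integral
# has mean 0 (martingale property) and variance 1 (Ito isometry).
rng = np.random.default_rng(3)

m = 200_000
W1 = rng.normal(0.0, 1.0, size=m)         # W_1 ~ N(0, 1); W_0 = 0
W2 = W1 + rng.normal(0.0, 1.0, size=m)    # independent increment on [1, 2)
I = 0.0 * (W1 - 0.0) + W1 * (W2 - W1)     # the simple-process Ito integral

print(I.mean(), I.var())                  # near 0 and 1
```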
Derivation of Itô Isometry (for Simple Processes):
Let \( \phi(t, \omega) = \sum_{i=0}^{n-1} \phi_i(\omega) \cdot \mathbb{I}_{[t_i, t_{i+1})}(t) \). The Itô integral is:
\[ I_T = \int_0^T \phi(t) \, dW_t = \sum_{i=0}^{n-1} \phi_i \cdot (W_{t_{i+1}} - W_{t_i}). \] The square of the integral is: \[ I_T^2 = \left(\sum_{i=0}^{n-1} \phi_i \cdot (W_{t_{i+1}} - W_{t_i})\right)^2 = \sum_{i=0}^{n-1} \phi_i^2 (W_{t_{i+1}} - W_{t_i})^2 + 2 \sum_{i < j} \phi_i \phi_j (W_{t_{i+1}} - W_{t_i})(W_{t_{j+1}} - W_{t_j}). \] Taking expectations and using the independence of increments of Brownian motion: \[ \mathbb{E}[I_T^2] = \sum_{i=0}^{n-1} \mathbb{E}[\phi_i^2] \mathbb{E}[(W_{t_{i+1}} - W_{t_i})^2] + 2 \sum_{i < j} \mathbb{E}[\phi_i \phi_j (W_{t_{i+1}} - W_{t_i})] \mathbb{E}[(W_{t_{j+1}} - W_{t_j})]. \] Since \( \mathbb{E}[(W_{t_{i+1}} - W_{t_i})] = 0 \) and \( \mathbb{E}[(W_{t_{i+1}} - W_{t_i})^2] = t_{i+1} - t_i \), the second term vanishes, and we get: \[ \mathbb{E}[I_T^2] = \sum_{i=0}^{n-1} \mathbb{E}[\phi_i^2] (t_{i+1} - t_i) = \mathbb{E}\left[\int_0^T \phi(t)^2 \, dt\right]. \]
Important Notes and Common Pitfalls:
- Adaptedness: The integrand \( \phi(t, \omega) \) must be adapted to the filtration \( \mathcal{F}_t \). This means \( \phi(t, \omega) \) must be \( \mathcal{F}_t \)-measurable for each \( t \). Failing to ensure adaptedness is a common mistake.
- Non-Anticipativity: The Itô integral is non-anticipative: the integrand is evaluated at the left endpoint of each interval, so the value of the integral at time \( t \) depends only on the path of \( W_s \) for \( s \leq t \). This is a key difference from the Stratonovich integral, which evaluates the integrand at the midpoint of each interval and consequently obeys the ordinary chain rule rather than Itô's formula.
- Limitations of Simple Processes: The definition of the Itô integral for simple processes is a stepping stone to defining the integral for more general processes (e.g., predictable processes with \( \mathbb{E}\left[\int_0^T \phi(t)^2 \, dt\right] < \infty \)). Simple processes are not sufficient for many practical applications, but they provide the foundation for the general theory.
- Quadratic Variation: The Itô integral has non-zero quadratic variation, which is a critical property in stochastic calculus. For the Itô integral \( I_t = \int_0^t \phi(s) \, dW_s \), the quadratic variation is: \[ [I, I]_t = \int_0^t \phi(s)^2 \, ds. \]
- Differentiability: Unlike deterministic integrals, the Itô integral is not differentiable in the classical sense. This is because Brownian motion is nowhere differentiable.
Practical Applications:
- Financial Mathematics: The Itô integral is fundamental in modeling the evolution of asset prices in continuous-time finance. For example, the Black-Scholes model for option pricing relies on stochastic integrals to describe the dynamics of stock prices: \[ dS_t = \mu S_t \, dt + \sigma S_t \, dW_t, \] where \( S_t \) is the stock price, \( \mu \) is the drift, \( \sigma \) is the volatility, and \( W_t \) is a Brownian motion. The solution to this SDE involves an Itô integral.
- Stochastic Control: In stochastic control theory, the Itô integral is used to model systems subject to random perturbations. The goal is to optimize a cost functional, often involving stochastic integrals, over a set of admissible controls.
- Filtering Theory: The Itô integral appears in the derivation of the Kalman-Bucy filter, which is used to estimate the state of a linear dynamical system from noisy observations.
- Physics: In statistical mechanics and quantum field theory, stochastic integrals are used to model systems with random fluctuations, such as the motion of particles in a fluid (Brownian motion).
Quant Interview Question: Let \( W_t \) be a standard Brownian motion. Compute \( \mathbb{E}\left[\left(\int_0^1 W_t \, dW_t\right)^2\right] \).
Solution:
First, note that \( \phi(t) = W_t \) is not a simple process, but it is adapted and satisfies \( \mathbb{E}\left[\int_0^1 W_t^2 \, dt\right] < \infty \), so it lies in the \( L^2 \) class for which the Itô integral is defined and the Itô isometry applies directly:
\[ \mathbb{E}\left[\left(\int_0^1 W_t \, dW_t\right)^2\right] = \mathbb{E}\left[\int_0^1 W_t^2 \, dt\right]. \] To compute \( \mathbb{E}\left[\int_0^1 W_t^2 \, dt\right] \), we use Fubini's theorem to interchange the expectation and integral: \[ \mathbb{E}\left[\int_0^1 W_t^2 \, dt\right] = \int_0^1 \mathbb{E}[W_t^2] \, dt. \] Since \( W_t \sim \mathcal{N}(0, t) \), we have \( \mathbb{E}[W_t^2] = t \). Thus: \[ \int_0^1 \mathbb{E}[W_t^2] \, dt = \int_0^1 t \, dt = \frac{1}{2}. \] Therefore: \[ \mathbb{E}\left[\left(\int_0^1 W_t \, dW_t\right)^2\right] = \frac{1}{2}. \]
Topic 6: Extension of Itô Integrals to General Integrands (L² Theory)
Itô Integral for Simple Processes: The Itô integral was initially defined for simple processes, which are stochastic processes of the form:
\[ \phi(t) = \sum_{i=0}^{n-1} \phi_i \cdot \mathbb{I}_{[t_i, t_{i+1})}(t), \] where \(0 = t_0 < t_1 < \dots < t_n = T\) is a partition of \([0, T]\), \(\phi_i\) is \(\mathcal{F}_{t_i}\)-measurable, and \(\mathbb{I}\) is the indicator function. The Itô integral of \(\phi\) with respect to a Brownian motion \(W_t\) is: \[ \int_0^T \phi(t) \, dW_t = \sum_{i=0}^{n-1} \phi_i (W_{t_{i+1}} - W_{t_i}). \]
L² Space of Integrands: The space \(L^2([0, T] \times \Omega)\) consists of all progressively measurable processes \(\phi(t, \omega)\) such that: \[ \mathbb{E} \left[ \int_0^T \phi(t)^2 \, dt \right] < \infty. \] This space is equipped with the norm: \[ \|\phi\|_{L^2} = \left( \mathbb{E} \left[ \int_0^T \phi(t)^2 \, dt \right] \right)^{1/2}. \]
Isometry Property for Simple Processes: For a simple process \(\phi\), the Itô integral satisfies the Itô isometry: \[ \mathbb{E} \left[ \left( \int_0^T \phi(t) \, dW_t \right)^2 \right] = \mathbb{E} \left[ \int_0^T \phi(t)^2 \, dt \right]. \] This property is key to extending the Itô integral to general \(L^2\) integrands.
Extension to General \(L^2\) Integrands: For any \(\phi \in L^2([0, T] \times \Omega)\), there exists a sequence of simple processes \(\{\phi_n\}\) such that: \[ \|\phi - \phi_n\|_{L^2} \to 0 \quad \text{as} \quad n \to \infty. \] The Itô integral of \(\phi\) is defined as the \(L^2\)-limit: \[ \int_0^T \phi(t) \, dW_t = \lim_{n \to \infty} \int_0^T \phi_n(t) \, dW_t, \] where the limit is taken in \(L^2(\Omega)\).
Itô Isometry for General \(L^2\) Integrands: The Itô isometry extends to all \(\phi \in L^2([0, T] \times \Omega)\): \[ \mathbb{E} \left[ \left( \int_0^T \phi(t) \, dW_t \right)^2 \right] = \mathbb{E} \left[ \int_0^T \phi(t)^2 \, dt \right]. \]
Example: Verifying \(L^2\) Convergence
Let \(\phi(t) = W_t\), the Brownian motion itself. We show that \(W_t \in L^2([0, T] \times \Omega)\) and compute its Itô integral.
- Check \(L^2\) condition: \[ \mathbb{E} \left[ \int_0^T W_t^2 \, dt \right] = \int_0^T \mathbb{E}[W_t^2] \, dt = \int_0^T t \, dt = \frac{T^2}{2} < \infty. \] Thus, \(W_t \in L^2([0, T] \times \Omega)\).
- Approximate by simple processes: Define \(\phi_n(t) = \sum_{i=0}^{n-1} W_{t_i} \cdot \mathbb{I}_{[t_i, t_{i+1})}(t)\), where \(t_i = iT/n\). Then: \[ \|\phi - \phi_n\|_{L^2}^2 = \mathbb{E} \left[ \int_0^T (W_t - \phi_n(t))^2 \, dt \right] \to 0 \quad \text{as} \quad n \to \infty. \]
- Compute the Itô integral: The Itô integral of \(W_t\) is: \[ \int_0^T W_t \, dW_t = \frac{W_T^2}{2} - \frac{T}{2}, \] which can be verified using the Itô formula.
Martingale Property of Itô Integrals: For \(\phi \in L^2([0, T] \times \Omega)\), the process: \[ M_t = \int_0^t \phi(s) \, dW_s, \quad t \in [0, T], \] is a continuous square-integrable martingale with respect to the filtration \(\{\mathcal{F}_t\}\). That is:
- \(\mathbb{E}[M_t^2] < \infty\) for all \(t \in [0, T]\),
- \(M_t\) is \(\mathcal{F}_t\)-measurable,
- \(\mathbb{E}[M_t \mid \mathcal{F}_s] = M_s\) for all \(s \leq t\).
Derivation: Extension of Itô Integral to \(L^2\) Integrands
- Density of Simple Processes: Simple processes are dense in \(L^2([0, T] \times \Omega)\). For any \(\phi \in L^2([0, T] \times \Omega)\), there exists a sequence \(\{\phi_n\}\) of simple processes such that: \[ \|\phi - \phi_n\|_{L^2} \to 0. \]
- Cauchy Sequence in \(L^2(\Omega)\): For \(m, n \geq 1\), the Itô isometry for simple processes gives: \[ \mathbb{E} \left[ \left( \int_0^T (\phi_m(t) - \phi_n(t)) \, dW_t \right)^2 \right] = \mathbb{E} \left[ \int_0^T (\phi_m(t) - \phi_n(t))^2 \, dt \right] = \|\phi_m - \phi_n\|_{L^2}^2. \] Since \(\{\phi_n\}\) is Cauchy in \(L^2([0, T] \times \Omega)\), \(\{\int_0^T \phi_n(t) \, dW_t\}\) is Cauchy in \(L^2(\Omega)\).
- Existence of Limit: \(L^2(\Omega)\) is complete, so there exists a random variable \(I \in L^2(\Omega)\) such that: \[ \mathbb{E} \left[ \left( \int_0^T \phi_n(t) \, dW_t - I \right)^2 \right] \to 0 \quad \text{as} \quad n \to \infty. \] We define \(\int_0^T \phi(t) \, dW_t = I\).
- Uniqueness: The limit \(I\) is unique up to almost sure equality, as \(L^2\)-limits are unique.
Practical Applications:
- Stochastic Differential Equations (SDEs): The \(L^2\) theory allows us to define solutions to SDEs of the form: \[ dX_t = \mu(t, X_t) \, dt + \sigma(t, X_t) \, dW_t, \] where \(\sigma(t, X_t) \in L^2([0, T] \times \Omega)\). This is essential in modeling asset prices, interest rates, and other financial quantities.
- Martingale Representation Theorem: The \(L^2\) theory is foundational for the martingale representation theorem, which states that any square-integrable martingale adapted to the filtration of a Brownian motion can be represented as an Itô integral. This is critical in hedging and pricing derivatives.
- Numerical Methods: The \(L^2\) extension justifies the use of numerical approximations (e.g., Euler-Maruyama method) for SDEs, as it ensures convergence of discrete approximations to the true solution in \(L^2\).
- Quantitative Finance: In the Black-Scholes framework, the \(L^2\) theory ensures that the stochastic integral representing the gain from a self-financing trading strategy is well-defined and square-integrable.
Common Pitfalls and Important Notes:
- Progressive Measurability: The integrand \(\phi(t, \omega)\) must be progressively measurable (i.e., jointly measurable in \((t, \omega)\) and adapted to \(\{\mathcal{F}_t\}\)) for the Itô integral to be well-defined. Failure to ensure this can lead to non-measurable or ill-defined integrals.
- \(L^2\) Condition: The condition \(\mathbb{E}[\int_0^T \phi(t)^2 \, dt] < \infty\) is crucial. Integrands not satisfying this may lead to integrals that are not square-integrable or even infinite with positive probability. For example, \(\phi(t) = 1/t\) for \(t \in (0, T]\) is not in \(L^2([0, T] \times \Omega)\).
- Non-Anticipativity: The Itô integral is defined for non-anticipative integrands (i.e., \(\phi(t)\) depends only on information up to time \(t\)). Anticipative integrals (e.g., Skorokhod integrals) require different theories.
- Convergence of Approximations: When approximating a general \(L^2\) integrand by simple processes, the choice of approximation (e.g., left-endpoint, right-endpoint, or midpoint) can affect the limit in more general settings (e.g., Stratonovich integrals). For Itô integrals, left-endpoint approximations are standard.
- Martingale Property: The martingale property of Itô integrals holds only for integrands in \(L^2([0, T] \times \Omega)\). For integrands outside this space, the integral may not be a martingale (e.g., it could be a local martingale or a semimartingale).
- Relation to Riemann-Stieltjes Integrals: Unlike Riemann-Stieltjes integrals, the Itô integral does not satisfy integration by parts in the classical sense. Instead, the Itô formula provides the correct "stochastic" integration by parts.
Example: Itô Integral of a Deterministic Function
Let \(\phi(t)\) be a deterministic function in \(L^2([0, T])\), i.e., \(\int_0^T \phi(t)^2 \, dt < \infty\). The Itô integral \(\int_0^T \phi(t) \, dW_t\) is a Gaussian random variable with mean 0 and variance \(\int_0^T \phi(t)^2 \, dt\).
- Approximation by Simple Processes: Let \(\phi_n(t) = \sum_{i=0}^{n-1} \phi(t_i) \cdot \mathbb{I}_{[t_i, t_{i+1})}(t)\), where \(t_i = iT/n\). Then: \[ \int_0^T \phi_n(t) \, dW_t = \sum_{i=0}^{n-1} \phi(t_i) (W_{t_{i+1}} - W_{t_i}). \] This is a sum of independent Gaussian random variables, hence Gaussian with mean 0 and variance: \[ \sum_{i=0}^{n-1} \phi(t_i)^2 (t_{i+1} - t_i) \to \int_0^T \phi(t)^2 \, dt \quad \text{as} \quad n \to \infty. \]
- Limit in \(L^2\): By the \(L^2\) theory, \(\int_0^T \phi_n(t) \, dW_t \to \int_0^T \phi(t) \, dW_t\) in \(L^2(\Omega)\). The limit of Gaussian random variables is Gaussian, so \(\int_0^T \phi(t) \, dW_t \sim \mathcal{N}\left(0, \int_0^T \phi(t)^2 \, dt\right)\).
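This Gaussian result is straightforward to verify by simulation. A sketch with the assumed integrand \( \phi(t) = t \): the Wiener integral should be Gaussian with mean \( 0 \) and variance \( \int_0^T t^2 \, dt = T^3/3 \), here approximated by left-endpoint sums across many paths.

```python
import numpy as np

# Wiener integral of the deterministic function phi(t) = t, approximated by
# left-endpoint sums; its sample mean and variance match N(0, T^3/3).
rng = np.random.default_rng(4)

T, n, m = 1.0, 200, 20_000
dt = T / n
t_left = np.arange(n) * dt                       # left endpoints t_i
dW = rng.normal(0.0, np.sqrt(dt), size=(m, n))   # m independent paths
I = dW @ t_left                                  # sum_i phi(t_i) dW_i per path

print(I.mean(), I.var())   # near 0 and T^3/3 = 1/3
```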
Key Takeaways:
- The Itô integral is initially defined for simple processes and extended to general \(L^2\) integrands via \(L^2\)-limits.
- The Itô isometry \(\mathbb{E}[(\int_0^T \phi(t) \, dW_t)^2] = \mathbb{E}[\int_0^T \phi(t)^2 \, dt]\) is central to the \(L^2\) theory.
- The Itô integral of an \(L^2\) integrand is a continuous square-integrable martingale.
- Progressive measurability and the \(L^2\) condition are essential for the integral to be well-defined.
Topic 7: Itô's Isometry and Its Proof
Itô’s Isometry: Itô’s Isometry is a fundamental result in stochastic calculus that provides an isometric relationship between the space of square-integrable stochastic processes and the space of square-integrable random variables. Specifically, it states that the expected value of the square of the Itô integral of a process is equal to the expected value of the integral of the square of the process.
This property is crucial for ensuring that the Itô integral is well-defined and for computing variances of stochastic integrals.
For a stochastic process \( f(t) \) that is adapted to the filtration generated by a standard Brownian motion \( W_t \), and satisfies the integrability condition:
\[ \mathbb{E}\left[\int_0^T f(t)^2 \, dt\right] < \infty, \]
Itô’s Isometry states that:
\[ \mathbb{E}\left[\left(\int_0^T f(t) \, dW_t\right)^2\right] = \mathbb{E}\left[\int_0^T f(t)^2 \, dt\right]. \]
Square-Integrable Process: A stochastic process \( f(t) \) is called square-integrable if it satisfies:
\[ \mathbb{E}\left[\int_0^T f(t)^2 \, dt\right] < \infty. \]
Proof of Itô’s Isometry
The proof of Itô’s Isometry relies on the properties of the Itô integral and the martingale property of the Brownian motion. Below is a step-by-step derivation:
- Step 1: Simple Process Approximation. First, consider a simple (step) process \( f_n(t) \) that approximates \( f(t) \). A simple process is of the form:
\[ f_n(t) = \sum_{i=0}^{n-1} f(t_i) \mathbf{1}_{[t_i, t_{i+1})}(t), \]
where \( 0 = t_0 < t_1 < \dots < t_n = T \) is a partition of \([0, T]\), and \( f(t_i) \) is \( \mathcal{F}_{t_i} \)-measurable.
- Step 2: Itô Integral for Simple Processes. The Itô integral of \( f_n(t) \) with respect to \( W_t \) is given by:
\[ \int_0^T f_n(t) \, dW_t = \sum_{i=0}^{n-1} f(t_i) (W_{t_{i+1}} - W_{t_i}). \]
- Step 3: Compute the Square of the Itô Integral. Square the Itô integral and take expectations:
\[ \mathbb{E}\left[\left(\int_0^T f_n(t) \, dW_t\right)^2\right] = \mathbb{E}\left[\left(\sum_{i=0}^{n-1} f(t_i) (W_{t_{i+1}} - W_{t_i})\right)^2\right]. \]
Expanding the square, we get:
\[ \mathbb{E}\left[\sum_{i=0}^{n-1} f(t_i)^2 (W_{t_{i+1}} - W_{t_i})^2 + 2 \sum_{i < j} f(t_i) f(t_j) (W_{t_{i+1}} - W_{t_i})(W_{t_{j+1}} - W_{t_j})\right]. \]
- Step 4: Use Independence and Martingale Properties. For \( i < j \), the increment \( W_{t_{j+1}} - W_{t_j} \) is independent of \( \mathcal{F}_{t_j} \), which contains \( f(t_i) \), \( f(t_j) \), and \( W_{t_{i+1}} - W_{t_i} \). Since \( \mathbb{E}[W_{t_{j+1}} - W_{t_j}] = 0 \), the expectation of each cross term vanishes:
\[ \mathbb{E}\left[2 \sum_{i < j} f(t_i) f(t_j) (W_{t_{i+1}} - W_{t_i})(W_{t_{j+1}} - W_{t_j})\right] = 0. \]
Thus, we are left with:
\[ \mathbb{E}\left[\sum_{i=0}^{n-1} f(t_i)^2 (W_{t_{i+1}} - W_{t_i})^2\right]. \]
- Step 5: Quadratic Variation of Brownian Motion. Each increment satisfies \( \mathbb{E}\left[(W_{t_{i+1}} - W_{t_i})^2\right] = t_{i+1} - t_i \), and \( W_{t_{i+1}} - W_{t_i} \) is independent of the \( \mathcal{F}_{t_i} \)-measurable \( f(t_i) \). Substituting this in, we get:
\[ \mathbb{E}\left[\sum_{i=0}^{n-1} f(t_i)^2 (t_{i+1} - t_i)\right] = \mathbb{E}\left[\int_0^T f_n(t)^2 \, dt\right]. \]
- Step 6: Limit as \( n \to \infty \). As the partition becomes finer, \( f_n \to f \) in \( L^2([0, T] \times \Omega) \), and the corresponding Itô integrals converge in \( L^2(\Omega) \). Both sides of the isometry for simple processes pass to the limit, yielding Itô’s Isometry:
\[ \mathbb{E}\left[\left(\int_0^T f(t) \, dW_t\right)^2\right] = \mathbb{E}\left[\int_0^T f(t)^2 \, dt\right]. \]
Example: Application of Itô’s Isometry
Consider the Itô integral \( \int_0^T W_t \, dW_t \), where \( W_t \) is a standard Brownian motion. Compute its variance using Itô’s Isometry.
Solution:
Here, \( f(t) = W_t \). First, verify the integrability condition:
\[ \mathbb{E}\left[\int_0^T W_t^2 \, dt\right] = \int_0^T \mathbb{E}[W_t^2] \, dt = \int_0^T t \, dt = \frac{T^2}{2} < \infty. \]
Now, apply Itô’s Isometry:
\[ \mathbb{E}\left[\left(\int_0^T W_t \, dW_t\right)^2\right] = \mathbb{E}\left[\int_0^T W_t^2 \, dt\right] = \frac{T^2}{2}. \]
Since the Itô integral has mean zero (it is a martingale started at 0), its variance is \( \frac{T^2}{2} \).
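A numerical cross-check of this example (synthetic samples, assumed \( T = 2 \)): since \( \int_0^T W_t \, dW_t = \frac{1}{2}(W_T^2 - T) \) by Itô's formula, we can sample the closed form directly instead of discretizing the integral.

```python
import numpy as np

# E[(int_0^T W dW)^2] = T^2/2, sampled via the closed form (W_T^2 - T)/2.
rng = np.random.default_rng(5)

T, m = 2.0, 400_000
WT = rng.normal(0.0, np.sqrt(T), size=m)  # W_T ~ N(0, T)
I = 0.5 * (WT**2 - T)
second_moment = np.mean(I**2)

print(second_moment)   # near T^2/2 = 2.0
```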
Generalized Itô’s Isometry: For two square-integrable processes \( f(t) \) and \( g(t) \), the following holds:
\[ \mathbb{E}\left[\left(\int_0^T f(t) \, dW_t\right) \left(\int_0^T g(t) \, dW_t\right)\right] = \mathbb{E}\left[\int_0^T f(t) g(t) \, dt\right]. \]
Practical Applications
- Variance Calculation: Itô’s Isometry is widely used to compute the variance of stochastic integrals, which is essential in risk management and option pricing.
- Stochastic Control and Filtering: In stochastic control theory and filtering (e.g., Kalman-Bucy filter), Itô’s Isometry helps in deriving the evolution of error variances.
- Martingale Representation: Itô’s Isometry is used to show that certain stochastic integrals are martingales, which is a key property in the martingale representation theorem.
- Numerical Methods: In Monte Carlo simulations of stochastic differential equations (SDEs), Itô’s Isometry ensures the convergence of discrete approximations to the true solution.
Common Pitfalls and Important Notes
- Adaptedness: Itô’s Isometry only holds if \( f(t) \) is adapted to the filtration generated by \( W_t \). If \( f(t) \) depends on future information, the isometry does not apply.
- Square-Integrability: The process \( f(t) \) must be square-integrable. If \( \mathbb{E}\left[\int_0^T f(t)^2 \, dt\right] = \infty \), the Itô integral may not be well-defined, and Itô’s Isometry does not hold.
- Cross Terms: When expanding the square of a sum of Itô integrals, remember that cross terms do not vanish unless the integrands are orthogonal in the sense that \( \mathbb{E}\left[\int_0^T f(t) g(t) \, dt\right] = 0 \). Itô’s Isometry specifically addresses the case of a single integral.
- Generalization to Multidimensional Case: Itô’s Isometry can be extended to multidimensional Brownian motion. For a vector of independent Brownian motions \( W_t = (W_t^1, \dots, W_t^d) \) and a matrix-valued process \( f(t) \), the isometry becomes:
\[ \mathbb{E}\left[\left\|\int_0^T f(t) \, dW_t\right\|^2\right] = \mathbb{E}\left[\int_0^T \|f(t)\|_F^2 \, dt\right], \]
where \( \| \cdot \|_F \) is the Frobenius norm.
- Connection to \( L^2 \) Space: Itô’s Isometry shows that the Itô integral is an isometry from the space of square-integrable adapted processes (with the norm \( \sqrt{\mathbb{E}\left[\int_0^T f(t)^2 \, dt\right]} \)) to the space of square-integrable random variables (with the \( L^2 \) norm). This is a key reason why the Itô integral is well-behaved.
Topic 8: Itô's Lemma (1D and Multidimensional Versions)
Definition (Itô Process): An Itô process \( X_t \) is a stochastic process that can be expressed as:
\[ dX_t = \mu(t, X_t) dt + \sigma(t, X_t) dW_t \]
where \( W_t \) is a Wiener process (Brownian motion), \( \mu(t, X_t) \) is the drift term, and \( \sigma(t, X_t) \) is the diffusion term.
Definition (Itô’s Lemma): Itô’s Lemma is the stochastic calculus counterpart of the chain rule in ordinary calculus. It provides a way to compute the differential of a function of an Itô process.
1D Itô’s Lemma
Itô’s Lemma (1D): Let \( f(t, x) \) be a twice continuously differentiable function of \( t \) and \( x \), and let \( X_t \) be an Itô process. Then, the process \( Y_t = f(t, X_t) \) is also an Itô process, and its differential is given by:
\[ df(t, X_t) = \left( \frac{\partial f}{\partial t} + \mu \frac{\partial f}{\partial x} + \frac{1}{2} \sigma^2 \frac{\partial^2 f}{\partial x^2} \right) dt + \sigma \frac{\partial f}{\partial x} dW_t \]
where \( \mu = \mu(t, X_t) \) and \( \sigma = \sigma(t, X_t) \).
Example (Geometric Brownian Motion): Let \( X_t \) follow a geometric Brownian motion:
\[ dX_t = \mu X_t dt + \sigma X_t dW_t \]
Let \( f(t, X_t) = \ln(X_t) \). Compute \( d(\ln(X_t)) \).
Solution: We apply Itô’s Lemma with \( f(t, x) = \ln(x) \). The partial derivatives are:
\[ \frac{\partial f}{\partial t} = 0, \quad \frac{\partial f}{\partial x} = \frac{1}{x}, \quad \frac{\partial^2 f}{\partial x^2} = -\frac{1}{x^2} \]
Substituting into Itô’s Lemma:
\[ d(\ln(X_t)) = \left( 0 + \mu X_t \cdot \frac{1}{X_t} + \frac{1}{2} \sigma^2 X_t^2 \cdot \left( -\frac{1}{X_t^2} \right) \right) dt + \sigma X_t \cdot \frac{1}{X_t} dW_t \]
\[ d(\ln(X_t)) = \left( \mu - \frac{1}{2} \sigma^2 \right) dt + \sigma dW_t \]
Note: The additional term \( \frac{1}{2} \sigma^2 \frac{\partial^2 f}{\partial x^2} \) in Itô’s Lemma arises due to the non-zero quadratic variation of the Wiener process \( (dW_t)^2 = dt \). This term is absent in ordinary calculus.
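The \( -\frac{1}{2}\sigma^2 \) correction can be recovered from a plain Euler-Maruyama simulation of the original SDE. A sketch with assumed parameters: the mean log-return comes out near \( (\mu - \frac{1}{2}\sigma^2)T \), not \( \mu T \).

```python
import numpy as np

# Euler-Maruyama simulation of dS = mu S dt + sigma S dW; the mean of
# log(S_T / S_0) matches the Ito drift (mu - sigma^2/2) T, not mu * T.
rng = np.random.default_rng(6)

mu, sigma, T = 0.08, 0.3, 1.0
n, m = 500, 20_000
dt = T / n

S = np.full(m, 100.0)
for _ in range(n):
    dW = rng.normal(0.0, np.sqrt(dt), size=m)
    S = S + mu * S * dt + sigma * S * dW

mean_log_return = np.mean(np.log(S / 100.0))
print(mean_log_return)   # near mu - sigma^2/2 = 0.035
```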
Multidimensional Itô’s Lemma
Definition (Multidimensional Itô Process): A multidimensional Itô process \( \mathbf{X}_t = (X_t^1, X_t^2, \dots, X_t^n) \) is a vector of Itô processes, where each component satisfies:
\[ dX_t^i = \mu_i(t, \mathbf{X}_t) dt + \sum_{j=1}^m \sigma_{ij}(t, \mathbf{X}_t) dW_t^j \]
Here, \( \mathbf{W}_t = (W_t^1, W_t^2, \dots, W_t^m) \) is an \( m \)-dimensional Wiener process with independent components, and \( \sigma = (\sigma_{ij}) \) is the volatility matrix.
Multidimensional Itô’s Lemma: Let \( f(t, \mathbf{x}) \) be a twice continuously differentiable function of \( t \) and \( \mathbf{x} = (x_1, x_2, \dots, x_n) \), and let \( \mathbf{X}_t \) be an \( n \)-dimensional Itô process. Then, the process \( Y_t = f(t, \mathbf{X}_t) \) is also an Itô process, and its differential is given by:
\[ df(t, \mathbf{X}_t) = \left( \frac{\partial f}{\partial t} + \sum_{i=1}^n \mu_i \frac{\partial f}{\partial x_i} + \frac{1}{2} \sum_{i=1}^n \sum_{j=1}^n \sum_{k=1}^m \sigma_{ik} \sigma_{jk} \frac{\partial^2 f}{\partial x_i \partial x_j} \right) dt + \sum_{i=1}^n \sum_{j=1}^m \sigma_{ij} \frac{\partial f}{\partial x_i} dW_t^j \]
where \( \mu_i = \mu_i(t, \mathbf{X}_t) \) and \( \sigma_{ij} = \sigma_{ij}(t, \mathbf{X}_t) \).
Example (Correlated Itô Processes): Let \( \mathbf{X}_t = (X_t^1, X_t^2) \) be a 2D Itô process with:
\[ dX_t^1 = \mu_1 dt + \sigma_1 dW_t^1 \] \[ dX_t^2 = \mu_2 dt + \sigma_2 (\rho dW_t^1 + \sqrt{1 - \rho^2} dW_t^2) \]
where \( W_t^1 \) and \( W_t^2 \) are independent Wiener processes, and \( \rho \) is the correlation coefficient. Let \( f(t, \mathbf{X}_t) = X_t^1 X_t^2 \). Compute \( d(X_t^1 X_t^2) \).
Solution: We apply the multidimensional Itô’s Lemma with \( f(t, x_1, x_2) = x_1 x_2 \). The partial derivatives are:
\[ \frac{\partial f}{\partial t} = 0, \quad \frac{\partial f}{\partial x_1} = x_2, \quad \frac{\partial f}{\partial x_2} = x_1 \] \[ \frac{\partial^2 f}{\partial x_1^2} = 0, \quad \frac{\partial^2 f}{\partial x_2^2} = 0, \quad \frac{\partial^2 f}{\partial x_1 \partial x_2} = 1 \]
The volatility matrix \( \sigma \) is:
\[ \sigma = \begin{pmatrix} \sigma_1 & 0 \\ \rho \sigma_2 & \sqrt{1 - \rho^2} \sigma_2 \end{pmatrix} \]
Substituting into Itô’s Lemma:
\[ d(X_t^1 X_t^2) = \left( 0 + \mu_1 X_t^2 + \mu_2 X_t^1 + \frac{1}{2} \cdot 2 \cdot \sigma_1 \cdot \rho \sigma_2 \cdot 1 \right) dt + \sigma_1 X_t^2 dW_t^1 + \rho \sigma_2 X_t^1 dW_t^1 + \sqrt{1 - \rho^2} \sigma_2 X_t^1 dW_t^2 \]
\[ d(X_t^1 X_t^2) = \left( \mu_1 X_t^2 + \mu_2 X_t^1 + \rho \sigma_1 \sigma_2 \right) dt + (\sigma_1 X_t^2 + \rho \sigma_2 X_t^1) dW_t^1 + \sqrt{1 - \rho^2} \sigma_2 X_t^1 dW_t^2 \]
Note: In the multidimensional case, the quadratic variation terms \( dW_t^i dW_t^j \) are given by:
\[ dW_t^i dW_t^j = \delta_{ij} dt \]
where \( \delta_{ij} \) is the Kronecker delta. For correlated Wiener processes, the covariance structure must be explicitly accounted for in the volatility matrix \( \sigma \).
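The drift in \( d(X_t^1 X_t^2) \) can be verified in expectation. A Monte Carlo sketch (assumed constants, \( X_0^1 = X_0^2 = 0 \)): integrating the drift \( \mu_1 X^2 + \mu_2 X^1 + \rho\sigma_1\sigma_2 \) gives \( \mathbb{E}[X_T^1 X_T^2] = \mu_1\mu_2 T^2 + \rho\sigma_1\sigma_2 T \), where the last term is the cross-variation contribution.

```python
import numpy as np

# E[X1_T X2_T] = mu1*mu2*T^2 + rho*sigma1*sigma2*T for the correlated pair,
# sampled directly from the exact terminal distributions (X0 = 0).
rng = np.random.default_rng(7)

mu1, mu2, s1, s2, rho, T, m = 0.1, 0.2, 0.3, 0.4, 0.5, 1.0, 400_000
W1 = rng.normal(0.0, np.sqrt(T), size=m)
W2 = rng.normal(0.0, np.sqrt(T), size=m)

X1 = mu1 * T + s1 * W1
X2 = mu2 * T + s2 * (rho * W1 + np.sqrt(1 - rho**2) * W2)

prod_mean = np.mean(X1 * X2)
theory = mu1 * mu2 * T**2 + rho * s1 * s2 * T
print(prod_mean, theory)   # both near 0.08
```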
Derivation of Itô’s Lemma (1D)
Step-by-Step Derivation: The derivation of Itô’s Lemma relies on a Taylor expansion of \( f(t, X_t) \) and the properties of the Wiener process.
- Taylor Expansion: Expand \( f(t + dt, X_t + dX_t) \) around \( (t, X_t) \) up to second order:
\[ f(t + dt, X_t + dX_t) = f(t, X_t) + \frac{\partial f}{\partial t} dt + \frac{\partial f}{\partial x} dX_t + \frac{1}{2} \frac{\partial^2 f}{\partial t^2} (dt)^2 + \frac{\partial^2 f}{\partial t \partial x} dt dX_t + \frac{1}{2} \frac{\partial^2 f}{\partial x^2} (dX_t)^2 + \text{higher-order terms} \]
- Substitute \( dX_t \): Replace \( dX_t \) with \( \mu dt + \sigma dW_t \):
\[ dX_t = \mu dt + \sigma dW_t \] \[ (dX_t)^2 = \mu^2 (dt)^2 + 2 \mu \sigma dt dW_t + \sigma^2 (dW_t)^2 \]
- Simplify Using Wiener Process Properties: Recall that \( (dW_t)^2 = dt \), while \( dt dW_t \) is of order \( (dt)^{3/2} \) (since \( dW_t \) is of order \( \sqrt{dt} \)). Terms of order higher than \( dt \), such as \( (dt)^2 \) and \( dt dW_t \), are negligible:
\[ (dX_t)^2 = \sigma^2 dt + \text{negligible terms} \]
- Combine Terms: Substitute back into the Taylor expansion and retain only terms of order \( dt \) or \( dW_t \):
\[ df(t, X_t) = \frac{\partial f}{\partial t} dt + \frac{\partial f}{\partial x} (\mu dt + \sigma dW_t) + \frac{1}{2} \frac{\partial^2 f}{\partial x^2} \sigma^2 dt \] \[ df(t, X_t) = \left( \frac{\partial f}{\partial t} + \mu \frac{\partial f}{\partial x} + \frac{1}{2} \sigma^2 \frac{\partial^2 f}{\partial x^2} \right) dt + \sigma \frac{\partial f}{\partial x} dW_t \]
Practical Applications
Application 1: Black-Scholes PDE
The Black-Scholes equation for option pricing can be derived using Itô’s Lemma. Let \( S_t \) follow geometric Brownian motion:
\[ dS_t = \mu S_t dt + \sigma S_t dW_t \]Consider a European call option \( V(t, S_t) \). Applying Itô’s Lemma to \( V(t, S_t) \):
\[ dV = \left( \frac{\partial V}{\partial t} + \mu S_t \frac{\partial V}{\partial S} + \frac{1}{2} \sigma^2 S_t^2 \frac{\partial^2 V}{\partial S^2} \right) dt + \sigma S_t \frac{\partial V}{\partial S} dW_t \]By constructing a risk-free portfolio and applying no-arbitrage arguments, the Black-Scholes PDE is obtained:
\[ \frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + r S \frac{\partial V}{\partial S} - r V = 0 \]
Application 2: Change of Measure (Girsanov’s Theorem)
Itô’s Lemma is used in the derivation of Girsanov’s Theorem, which describes how the dynamics of a stochastic process change under a change of measure. For example, transforming the drift of a Brownian motion under the risk-neutral measure.
Application 3: Interest Rate Models
In short-rate models (e.g., Vasicek or CIR model), Itô’s Lemma is used to derive the dynamics of bond prices or other interest rate derivatives. For example, in the Vasicek model:
\[ dr_t = a(b - r_t) dt + \sigma dW_t \]The price of a zero-coupon bond \( P(t, T) \) can be expressed as a function of \( r_t \), and Itô’s Lemma is applied to derive its dynamics.
Common Pitfalls and Important Notes
Pitfall 1: Ignoring the Second-Order Term
A common mistake is to apply the ordinary chain rule and ignore the second-order term \( \frac{1}{2} \sigma^2 \frac{\partial^2 f}{\partial x^2} \). This term is crucial in stochastic calculus due to the non-zero quadratic variation of the Wiener process.
Pitfall 2: Incorrect Handling of Correlations
In the multidimensional case, failing to account for correlations between Wiener processes can lead to incorrect results. Always verify the covariance structure of the Wiener processes and adjust the volatility matrix \( \sigma \) accordingly.
Pitfall 3: Differentiability Assumptions
Itô’s Lemma requires the function \( f(t, x) \) to be twice continuously differentiable. If \( f \) is not smooth (e.g., \( f(x) = \max(x - K, 0) \) for a call option payoff), Itô’s Lemma cannot be directly applied. In such cases, the Itô-Tanaka formula or other techniques may be required.
Important Note: Stratonovich vs. Itô Calculus
Itô’s Lemma is specific to Itô calculus, where the integral is defined in the Itô sense. In Stratonovich calculus, the chain rule takes a different form (similar to ordinary calculus), and the additional second-order term does not appear. Always clarify which calculus is being used.
Important Note: Quadratic Variation
The key insight behind Itô’s Lemma is the non-zero quadratic variation of the Wiener process. For a process \( X_t \), the quadratic variation \( [X]_t \) is defined as:
\[ [X]_t = \lim_{\|\Pi\| \to 0} \sum_{i=1}^n (X_{t_i} - X_{t_{i-1}})^2 \]For a Wiener process, \( [W]_t = t \). This property is fundamental to the additional term in Itô’s Lemma.
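The convergence \( [W]_t = t \) is easy to see numerically. A minimal sketch (illustrative parameters): sum the squared increments of a simulated Brownian path over a fine partition; for contrast, the first variation (sum of absolute increments) diverges as the mesh shrinks:

```python
import numpy as np

# Realized quadratic variation of a Brownian path on [0, t] over a fine
# uniform partition. Parameters are illustrative.
rng = np.random.default_rng(1)
t, n = 2.0, 1_000_000
dt = t / n
dw = rng.standard_normal(n) * np.sqrt(dt)   # increments W_{t_i} - W_{t_{i-1}}

qv = np.sum(dw**2)        # quadratic variation: concentrates near t
fv = np.sum(np.abs(dw))   # first variation: grows like sqrt(n), i.e. diverges
```

Here `qv` lands very close to \( t = 2 \), while `fv` is already on the order of a thousand — Brownian paths have finite quadratic variation but infinite first variation.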
Topic 9: Derivation of Itô's Lemma via Taylor Expansion
Itô’s Lemma (Itô’s Formula): A fundamental result in stochastic calculus that provides a way to compute the differential of a function of a stochastic process. It generalizes the chain rule from ordinary calculus to stochastic processes driven by Brownian motion.
Stochastic Process: A collection of random variables indexed by time, often denoted \( X_t \). In this context, we focus on Itô processes, which are solutions to stochastic differential equations (SDEs).
Brownian Motion (Wiener Process): A continuous-time stochastic process \( W_t \) with the following properties:
- \( W_0 = 0 \) almost surely.
- Independent increments: \( W_t - W_s \) is independent of \( \mathcal{F}_s \) for \( 0 \leq s < t \), where \( \mathcal{F}_s \) is the filtration up to time \( s \).
- Gaussian increments: \( W_t - W_s \sim \mathcal{N}(0, t-s) \).
- Continuous paths: \( t \mapsto W_t \) is continuous almost surely.
Itô Process: A stochastic process \( X_t \) that satisfies the SDE: \[ dX_t = \mu(t, X_t) \, dt + \sigma(t, X_t) \, dW_t, \] where \( \mu(t, X_t) \) is the drift term, \( \sigma(t, X_t) \) is the diffusion term, and \( W_t \) is a Brownian motion.
Taylor Expansion (Multivariable): For a twice continuously differentiable function \( f(t, x) \), the Taylor expansion around a point \( (t, x) \) is: \[ f(t + \Delta t, x + \Delta x) = f(t, x) + \frac{\partial f}{\partial t} \Delta t + \frac{\partial f}{\partial x} \Delta x + \frac{1}{2} \frac{\partial^2 f}{\partial t^2} (\Delta t)^2 + \frac{\partial^2 f}{\partial t \partial x} \Delta t \Delta x + \frac{1}{2} \frac{\partial^2 f}{\partial x^2} (\Delta x)^2 + \text{higher-order terms}. \]
Itô’s Lemma: Let \( X_t \) be an Itô process given by \( dX_t = \mu \, dt + \sigma \, dW_t \), and let \( f(t, x) \) be a twice continuously differentiable function. Then, the process \( Y_t = f(t, X_t) \) is also an Itô process, and its differential is given by: \[ df(t, X_t) = \left( \frac{\partial f}{\partial t} + \mu \frac{\partial f}{\partial x} + \frac{1}{2} \sigma^2 \frac{\partial^2 f}{\partial x^2} \right) dt + \sigma \frac{\partial f}{\partial x} dW_t. \] In integral form: \[ f(t, X_t) = f(0, X_0) + \int_0^t \left( \frac{\partial f}{\partial s} + \mu \frac{\partial f}{\partial x} + \frac{1}{2} \sigma^2 \frac{\partial^2 f}{\partial x^2} \right) ds + \int_0^t \sigma \frac{\partial f}{\partial x} dW_s. \]
Quadratic Variation: For a stochastic process \( X_t \), the quadratic variation \( [X]_t \) is defined as the limit (in probability) of the sum of squared increments: \[ [X]_t = \lim_{\|\Pi\| \to 0} \sum_{i=1}^n (X_{t_i} - X_{t_{i-1}})^2, \] where \( \Pi = \{0 = t_0 < t_1 < \dots < t_n = t\} \) is a partition of \( [0, t] \), and \( \|\Pi\| = \max_i (t_i - t_{i-1}) \). For Brownian motion, \( [W]_t = t \).
Step-by-Step Derivation of Itô’s Lemma:
Start with the Taylor Expansion: Consider a function \( f(t, X_t) \). For small increments \( \Delta t \) and \( \Delta X_t \), the Taylor expansion is: \[ \Delta f = f(t + \Delta t, X_t + \Delta X_t) - f(t, X_t) \approx \frac{\partial f}{\partial t} \Delta t + \frac{\partial f}{\partial x} \Delta X_t + \frac{1}{2} \frac{\partial^2 f}{\partial x^2} (\Delta X_t)^2 + \frac{\partial^2 f}{\partial t \partial x} \Delta t \Delta X_t + \frac{1}{2} \frac{\partial^2 f}{\partial t^2} (\Delta t)^2. \]
Substitute the Itô Process: Replace \( \Delta X_t \) with the increment of the Itô process: \[ \Delta X_t = \mu \Delta t + \sigma \Delta W_t, \] where \( \Delta W_t \sim \mathcal{N}(0, \Delta t) \). Thus: \[ \Delta f \approx \frac{\partial f}{\partial t} \Delta t + \frac{\partial f}{\partial x} (\mu \Delta t + \sigma \Delta W_t) + \frac{1}{2} \frac{\partial^2 f}{\partial x^2} (\mu \Delta t + \sigma \Delta W_t)^2. \]
Expand and Simplify: Expand the squared term: \[ (\mu \Delta t + \sigma \Delta W_t)^2 = \mu^2 (\Delta t)^2 + 2 \mu \sigma \Delta t \Delta W_t + \sigma^2 (\Delta W_t)^2. \] Substitute back into \( \Delta f \): \[ \Delta f \approx \frac{\partial f}{\partial t} \Delta t + \frac{\partial f}{\partial x} \mu \Delta t + \frac{\partial f}{\partial x} \sigma \Delta W_t + \frac{1}{2} \frac{\partial^2 f}{\partial x^2} \left( \mu^2 (\Delta t)^2 + 2 \mu \sigma \Delta t \Delta W_t + \sigma^2 (\Delta W_t)^2 \right). \]
Take Limits and Use Properties of Brownian Motion: As \( \Delta t \to 0 \), the following hold:
- \( (\Delta t)^2 \to 0 \) and \( \Delta t \Delta W_t \to 0 \) (since \( \Delta W_t \) is of order \( \sqrt{\Delta t} \), the product is of order \( (\Delta t)^{3/2} \)).
- \( (\Delta W_t)^2 \to \Delta t \) in \( L^2 \) (due to the quadratic variation of Brownian motion).
Pass to Differential Form: Taking the limit as \( \Delta t \to 0 \), we obtain the differential form of Itô’s Lemma: \[ df(t, X_t) = \left( \frac{\partial f}{\partial t} + \mu \frac{\partial f}{\partial x} + \frac{1}{2} \sigma^2 \frac{\partial^2 f}{\partial x^2} \right) dt + \sigma \frac{\partial f}{\partial x} dW_t. \]
Example: Application of Itô’s Lemma to Geometric Brownian Motion
Let \( X_t \) follow a geometric Brownian motion (GBM): \[ dX_t = \mu X_t \, dt + \sigma X_t \, dW_t. \] Let \( f(x) = \ln x \). We apply Itô’s Lemma to \( f(X_t) \):
- Compute the partial derivatives: \[ \frac{\partial f}{\partial t} = 0, \quad \frac{\partial f}{\partial x} = \frac{1}{x}, \quad \frac{\partial^2 f}{\partial x^2} = -\frac{1}{x^2}. \]
- Substitute into Itô’s Lemma: \[ d(\ln X_t) = \left( 0 + \mu X_t \cdot \frac{1}{X_t} + \frac{1}{2} \sigma^2 X_t^2 \cdot \left( -\frac{1}{X_t^2} \right) \right) dt + \sigma X_t \cdot \frac{1}{X_t} dW_t. \] Simplify: \[ d(\ln X_t) = \left( \mu - \frac{1}{2} \sigma^2 \right) dt + \sigma dW_t. \]
- Integrate both sides to obtain the solution for \( \ln X_t \): \[ \ln X_t = \ln X_0 + \left( \mu - \frac{1}{2} \sigma^2 \right) t + \sigma W_t. \] Exponentiating both sides gives the solution for \( X_t \): \[ X_t = X_0 \exp \left( \left( \mu - \frac{1}{2} \sigma^2 \right) t + \sigma W_t \right). \]
General Multivariate Itô’s Lemma: Let \( \mathbf{X}_t = (X_t^1, \dots, X_t^n) \) be an \( n \)-dimensional Itô process with dynamics: \[ dX_t^i = \mu_i \, dt + \sum_{j=1}^m \sigma_{ij} \, dW_t^j, \quad i = 1, \dots, n, \] where \( W_t^1, \dots, W_t^m \) are independent Brownian motions. Let \( f(t, \mathbf{x}) \) be a twice continuously differentiable function. Then: \[ df(t, \mathbf{X}_t) = \left( \frac{\partial f}{\partial t} + \sum_{i=1}^n \mu_i \frac{\partial f}{\partial x_i} + \frac{1}{2} \sum_{i=1}^n \sum_{j=1}^n \sum_{k=1}^m \sigma_{ik} \sigma_{jk} \frac{\partial^2 f}{\partial x_i \partial x_j} \right) dt + \sum_{i=1}^n \sum_{j=1}^m \sigma_{ij} \frac{\partial f}{\partial x_i} dW_t^j. \]
Key Notes and Common Pitfalls:
- Non-Zero Quadratic Variation: The critical difference between Itô’s Lemma and the ordinary chain rule is the \( \frac{1}{2} \sigma^2 \frac{\partial^2 f}{\partial x^2} \) term, which arises because \( (dW_t)^2 = dt \). This term is often forgotten in applications.
- Differentiability Requirements: Itô’s Lemma requires \( f(t, x) \) to be twice continuously differentiable in \( x \) and once in \( t \). If \( f \) is not smooth enough, the lemma may not apply.
- Interpretation of \( dW_t \): The term \( dW_t \) represents an infinitesimal increment of Brownian motion and should not be treated as a deterministic differential. It has mean 0 and variance \( dt \).
- Higher-Order Terms: In the Taylor expansion, terms of order \( (\Delta t)^2 \), \( \Delta t \Delta W_t \), and \( (\Delta W_t)^3 \) vanish as \( \Delta t \to 0 \). Only \( (\Delta W_t)^2 \) contributes to the limit.
- Correlation in Multivariate Case: In the multivariate Itô’s Lemma, the quadratic covariation terms \( dW_t^i dW_t^j = \rho_{ij} dt \) (where \( \rho_{ij} \) is the correlation between \( W_t^i \) and \( W_t^j \)) must be accounted for if the Brownian motions are correlated.
- Stratonovich vs. Itô Calculus: Itô’s Lemma is specific to Itô calculus. In Stratonovich calculus, the chain rule resembles the ordinary chain rule, but the interpretation of the SDE is different. The two calculi are related by a correction term.
Practical Applications of Itô’s Lemma:
Option Pricing: Itô’s Lemma is used to derive the Black-Scholes PDE for European options. For example, applying Itô’s Lemma to the price of a stock \( S_t \) and a derivative \( V(t, S_t) \) allows one to eliminate the stochastic term and derive a deterministic PDE.
Change of Measure: In the derivation of the Girsanov theorem, Itô’s Lemma is used to compute the dynamics of the Radon-Nikodym derivative (the change of measure process).
Stochastic Control: Itô’s Lemma is used in dynamic programming and the Hamilton-Jacobi-Bellman (HJB) equation to derive optimal control policies for stochastic systems.
Interest Rate Models: In models like the Vasicek or CIR model, Itô’s Lemma is used to derive the dynamics of bond prices or other interest rate derivatives.
Portfolio Optimization: Itô’s Lemma is applied to the wealth process of a portfolio to derive conditions for optimal investment strategies (e.g., Merton’s portfolio problem).
Quant Interview Question: Derive the Dynamics of \( Y_t = X_t^2 \)
Problem: Let \( X_t \) be an Itô process given by \( dX_t = \mu \, dt + \sigma \, dW_t \). Use Itô’s Lemma to find the SDE for \( Y_t = X_t^2 \).
Solution:
- Define \( f(x) = x^2 \). Compute the partial derivatives: \[ \frac{\partial f}{\partial t} = 0, \quad \frac{\partial f}{\partial x} = 2x, \quad \frac{\partial^2 f}{\partial x^2} = 2. \]
- Apply Itô’s Lemma: \[ dY_t = d(f(X_t)) = \left( 0 + \mu \cdot 2X_t + \frac{1}{2} \sigma^2 \cdot 2 \right) dt + \sigma \cdot 2X_t \, dW_t. \] Simplify: \[ dY_t = (2 \mu X_t + \sigma^2) dt + 2 \sigma X_t \, dW_t. \]
- Express in terms of \( Y_t \): \[ dY_t = (2 \mu \sqrt{Y_t} + \sigma^2) dt + 2 \sigma \sqrt{Y_t} \, dW_t. \] (Note: This form is valid only when \( X_t \geq 0 \). For general \( X_t \), the square root is not well-defined, and the SDE is typically left in terms of \( X_t \).)
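The SDE for \( Y_t = X_t^2 \) can be sanity-checked through its first moment: taking expectations of \( dY_t = (2\mu X_t + \sigma^2)dt + 2\sigma X_t dW_t \) and using \( \mathbb{E}[X_s] = X_0 + \mu s \) gives \( \mathbb{E}[Y_t] = X_0^2 + (2\mu X_0 + \sigma^2)t + \mu^2 t^2 \). A quick Monte Carlo check (illustrative parameters):

```python
import numpy as np

# Check E[X_t^2] against the moment implied by the derived SDE for Y = X^2.
# Parameter values are illustrative.
rng = np.random.default_rng(2)
mu, sigma, x0, t, n = 0.2, 0.5, 1.0, 1.0, 500_000

# X_t = x0 + mu*t + sigma*W_t can be sampled exactly at time t.
x_t = x0 + mu * t + sigma * np.sqrt(t) * rng.standard_normal(n)

mc = (x_t**2).mean()
exact = x0**2 + (2 * mu * x0 + sigma**2) * t + mu**2 * t**2
```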
Topic 10: Geometric Brownian Motion (GBM) and Its SDE
Geometric Brownian Motion (GBM): A continuous-time stochastic process used to model the evolution of stock prices, interest rates, and other financial variables. It is characterized by a drift term (representing the average growth rate) and a volatility term (representing the random fluctuations). GBM ensures that the modeled quantity remains positive, making it particularly suitable for financial applications.
Stochastic Differential Equation (SDE) for GBM: The SDE describing GBM is given by:
\[ dS_t = \mu S_t \, dt + \sigma S_t \, dW_t \] where:
- \(S_t\) is the stochastic process (e.g., stock price) at time \(t\),
- \(\mu\) is the drift coefficient (average growth rate),
- \(\sigma\) is the volatility coefficient,
- \(W_t\) is a standard Wiener process (Brownian motion).
Key Formulas for GBM
1. Solution to the GBM SDE:
\[ S_t = S_0 \exp\left( \left(\mu - \frac{\sigma^2}{2}\right)t + \sigma W_t \right) \]This is the closed-form solution to the GBM SDE, where \(S_0\) is the initial value of the process.
2. Expected Value of \(S_t\):
\[ \mathbb{E}[S_t] = S_0 e^{\mu t} \]The expected value grows exponentially at rate \(\mu\).
3. Variance of \(S_t\):
\[ \text{Var}(S_t) = S_0^2 e^{2\mu t} \left(e^{\sigma^2 t} - 1\right) \]The variance depends on both \(\mu\) and \(\sigma\).
4. Log-Normal Distribution:
\[ \log\left(\frac{S_t}{S_0}\right) \sim \mathcal{N}\left( \left(\mu - \frac{\sigma^2}{2}\right)t, \sigma^2 t \right) \]The logarithm of the price ratio follows a normal distribution.
Derivation of the GBM Solution
We derive the solution to the GBM SDE using Itô's Lemma. Start with the SDE:
\[ dS_t = \mu S_t \, dt + \sigma S_t \, dW_t \]
Step 1: Apply Itô's Lemma to \(f(S_t) = \log(S_t)\)
Itô's Lemma states that for a function \(f(S_t, t)\):
\[ df(S_t, t) = \left( \frac{\partial f}{\partial t} + \mu S_t \frac{\partial f}{\partial S} + \frac{1}{2} \sigma^2 S_t^2 \frac{\partial^2 f}{\partial S^2} \right) dt + \sigma S_t \frac{\partial f}{\partial S} dW_t \]For \(f(S_t) = \log(S_t)\), the partial derivatives are:
\[ \frac{\partial f}{\partial S} = \frac{1}{S_t}, \quad \frac{\partial^2 f}{\partial S^2} = -\frac{1}{S_t^2}, \quad \frac{\partial f}{\partial t} = 0 \]Substituting into Itô's Lemma:
\[ d(\log S_t) = \left( 0 + \mu S_t \cdot \frac{1}{S_t} + \frac{1}{2} \sigma^2 S_t^2 \cdot \left(-\frac{1}{S_t^2}\right) \right) dt + \sigma S_t \cdot \frac{1}{S_t} dW_t \]Simplifying:
\[ d(\log S_t) = \left( \mu - \frac{\sigma^2}{2} \right) dt + \sigma dW_t \]
Step 2: Integrate Both Sides
Integrate from \(0\) to \(t\):
\[ \int_0^t d(\log S_u) = \int_0^t \left( \mu - \frac{\sigma^2}{2} \right) du + \int_0^t \sigma dW_u \]This yields:
\[ \log S_t - \log S_0 = \left( \mu - \frac{\sigma^2}{2} \right) t + \sigma W_t \]Exponentiating both sides gives the solution:
\[ S_t = S_0 \exp\left( \left(\mu - \frac{\sigma^2}{2}\right)t + \sigma W_t \right) \]
Example: Simulating GBM Paths
To simulate a path of GBM, discretize time into small intervals \(\Delta t\) and use the following recursive formula:
\[ S_{t + \Delta t} = S_t \exp\left( \left(\mu - \frac{\sigma^2}{2}\right) \Delta t + \sigma \sqrt{\Delta t} \, Z \right) \]where \(Z \sim \mathcal{N}(0, 1)\) is a standard normal random variable. Here’s a step-by-step simulation:
- Set \(S_0 = 100\), \(\mu = 0.05\), \(\sigma = 0.2\), \(T = 1\), and \(\Delta t = 0.01\) (100 steps).
- For each step \(i = 1\) to \(100\):
- Generate \(Z \sim \mathcal{N}(0, 1)\).
- Compute \(S_{i \Delta t} = S_{(i-1)\Delta t} \exp\left( \left(0.05 - \frac{0.2^2}{2}\right) \cdot 0.01 + 0.2 \sqrt{0.01} \, Z \right)\).
- Plot \(S_t\) over time to visualize the GBM path.
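The steps above translate directly into code; a minimal sketch using the same parameters (the plotting step is left as a comment):

```python
import numpy as np

# GBM path simulation with the parameters from the example:
# S0 = 100, mu = 0.05, sigma = 0.2, T = 1, dt = 0.01 (100 steps).
rng = np.random.default_rng(3)
s0, mu, sigma, T, dt = 100.0, 0.05, 0.2, 1.0, 0.01
n_steps = int(T / dt)

path = [s0]
for _ in range(n_steps):
    z = rng.standard_normal()  # Z ~ N(0, 1)
    path.append(path[-1] * np.exp((mu - 0.5 * sigma**2) * dt
                                  + sigma * np.sqrt(dt) * z))
path = np.array(path)

# path[i] is S at time i*dt; to visualize, e.g.:
#   import matplotlib.pyplot as plt
#   plt.plot(np.arange(n_steps + 1) * dt, path)
```

Because each step multiplies by an exponential, every simulated price stays strictly positive, matching the positivity property of GBM.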
Applications of GBM
- Black-Scholes Model: GBM is the underlying assumption for the stock price in the Black-Scholes option pricing model. The solution to the GBM SDE is used to derive the Black-Scholes formula for European options.
- Risk Management: GBM is used to model asset prices for Value-at-Risk (VaR) calculations and stress testing.
- Portfolio Optimization: GBM helps in modeling the dynamics of asset returns for mean-variance optimization and other portfolio strategies.
- Interest Rate Modeling: While more complex models (e.g., Vasicek, CIR) are often used, GBM can serve as a simple model for interest rate dynamics in some contexts.
- Real Options Analysis: GBM is used to model the underlying asset in real options problems, such as investment timing or abandonment decisions.
Common Pitfalls and Important Notes
- Positivity of \(S_t\): GBM ensures \(S_t > 0\) for all \(t\), which is a desirable property for modeling prices. However, this is not guaranteed for all SDEs (e.g., arithmetic Brownian motion can yield negative values).
- Log-Normality: The log-returns \(\log(S_t / S_0)\) are normally distributed, but the returns \(S_t / S_0 - 1\) are not. This is a common source of confusion.
- Drift and Volatility: The drift \(\mu\) is the exponential growth rate of \(\mathbb{E}[S_t]\), while the volatility \(\sigma\) is the standard deviation of the log-returns per unit time. The term \(\mu - \sigma^2/2\) is the drift of the log-price \(\log S_t\); in risk-neutral derivative pricing, \(\mu\) is replaced by the risk-free rate \(r\), making the log-drift \(r - \sigma^2/2\).
- Itô's Lemma: The derivation of the GBM solution relies heavily on Itô's Lemma. Misapplying Itô's Lemma (e.g., forgetting the second-order term) is a common mistake.
- Numerical Simulation: When simulating GBM, ensure that the time step \(\Delta t\) is small enough to avoid discretization errors. For large \(\Delta t\), the simulation may deviate significantly from the true GBM path.
- Mean Reversion: GBM does not exhibit mean reversion, which may not be realistic for some assets (e.g., interest rates or commodities). In such cases, models like the Ornstein-Uhlenbeck process may be more appropriate.
- Correlated GBMs: When modeling multiple assets with GBM, their Brownian motions may be correlated. This is handled by introducing a correlation matrix for the Wiener processes.
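The Cholesky construction mentioned in the last point can be sketched as follows (illustrative parameters; `rho` is the target correlation of the driving Wiener processes): multiplying i.i.d. standard normals by the Cholesky factor of the correlation matrix yields correlated increments.

```python
import numpy as np

# Two correlated GBMs driven by Wiener processes with correlation rho,
# simulated via the exact log-normal update. Parameters are illustrative.
rng = np.random.default_rng(4)
rho = 0.7
corr = np.array([[1.0, rho], [rho, 1.0]])
chol = np.linalg.cholesky(corr)        # lower triangular; chol @ chol.T = corr

s0 = np.array([100.0, 50.0])
mu = np.array([0.05, 0.03])
sigma = np.array([0.2, 0.3])
T, n_steps, n_paths = 1.0, 252, 20_000
dt = T / n_steps

s = np.tile(s0, (n_paths, 1))
log_ret = np.zeros((n_paths, 2))
for _ in range(n_steps):
    z = rng.standard_normal((n_paths, 2)) @ chol.T  # rows ~ N(0, corr)
    step = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    log_ret += step
    s *= np.exp(step)

# Sample correlation of the log-returns should be close to rho.
sample_rho = np.corrcoef(log_ret[:, 0], log_ret[:, 1])[0, 1]
```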
Quant Interview Question: GBM and Option Pricing
Question: Consider a stock whose price \(S_t\) follows a GBM with drift \(\mu = 0.1\) and volatility \(\sigma = 0.3\). The current stock price is \(S_0 = 50\). What is the probability that the stock price will be above \(60\) in one year?
Solution:
From the GBM solution, we know that:
\[ \log\left(\frac{S_t}{S_0}\right) \sim \mathcal{N}\left( \left(\mu - \frac{\sigma^2}{2}\right)t, \sigma^2 t \right) \]For \(t = 1\), \(\mu = 0.1\), and \(\sigma = 0.3\):
\[ \log\left(\frac{S_1}{50}\right) \sim \mathcal{N}\left( \left(0.1 - \frac{0.3^2}{2}\right) \cdot 1, 0.3^2 \cdot 1 \right) = \mathcal{N}(0.055, 0.09) \]We want to find \(P(S_1 > 60)\):
\[ P(S_1 > 60) = P\left(\log\left(\frac{S_1}{50}\right) > \log\left(\frac{60}{50}\right)\right) = P\left(\log\left(\frac{S_1}{50}\right) > 0.1823\right) \]Standardize the normal variable:
\[ Z = \frac{\log(S_1 / 50) - 0.055}{\sqrt{0.09}} \sim \mathcal{N}(0, 1) \]Thus:
\[ P(S_1 > 60) = P\left(Z > \frac{0.1823 - 0.055}{0.3}\right) = P(Z > 0.4243) \]Using standard normal tables or a calculator, \(P(Z > 0.4243) \approx 0.3355\). So, the probability is approximately 33.55%.
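The calculation can be reproduced in a few lines; the standard normal tail probability is written below via the complementary error function (`erfc`), and the variable names are illustrative:

```python
from math import log, sqrt, erfc

# P(S_1 > 60) for GBM with S0 = 50, mu = 0.1, sigma = 0.3, t = 1.
s0, k, mu, sigma, t = 50.0, 60.0, 0.1, 0.3, 1.0

m = (mu - 0.5 * sigma**2) * t     # mean of log(S_t / S_0) = 0.055
s = sigma * sqrt(t)               # std  of log(S_t / S_0) = 0.3
z = (log(k / s0) - m) / s         # standardized threshold, about 0.4244

p = 0.5 * erfc(z / sqrt(2))       # P(Z > z) for Z ~ N(0, 1), about 0.3356
```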
Topic 11: Solution to GBM: Log-Normal Distribution of Stock Prices
Geometric Brownian Motion (GBM): A continuous-time stochastic process used to model stock prices in the Black-Scholes framework. It is defined by the stochastic differential equation (SDE):
\[ dS_t = \mu S_t dt + \sigma S_t dW_t \]where:
- \( S_t \) is the stock price at time \( t \),
- \( \mu \) is the drift (expected return),
- \( \sigma \) is the volatility,
- \( W_t \) is a Wiener process (standard Brownian motion).
Log-Normal Distribution: A random variable \( X \) is log-normally distributed if \( \ln(X) \) is normally distributed. The probability density function (PDF) of \( X \) is:
\[ f_X(x) = \frac{1}{x \sigma \sqrt{2 \pi}} \exp \left( -\frac{(\ln x - \mu)^2}{2 \sigma^2} \right), \quad x > 0 \]where \( \mu \) and \( \sigma \) are the mean and standard deviation of \( \ln(X) \), respectively.
Solution to GBM: The solution to the GBM SDE is given by:
\[ S_t = S_0 \exp \left( \left( \mu - \frac{\sigma^2}{2} \right) t + \sigma W_t \right) \]where \( S_0 \) is the initial stock price.
Log-Normal Distribution of \( S_t \): The stock price \( S_t \) at time \( t \) follows a log-normal distribution with parameters:
\[ \ln(S_t) \sim \mathcal{N} \left( \ln(S_0) + \left( \mu - \frac{\sigma^2}{2} \right) t, \sigma^2 t \right) \]Equivalently, \( S_t \) can be expressed as:
\[ S_t = S_0 \exp \left( \left( \mu - \frac{\sigma^2}{2} \right) t + \sigma \sqrt{t} Z \right), \quad Z \sim \mathcal{N}(0,1) \]
Derivation of the GBM Solution:
We solve the GBM SDE \( dS_t = \mu S_t dt + \sigma S_t dW_t \) using Itô's Lemma.
- Define \( X_t = \ln(S_t) \). Apply Itô's Lemma to \( X_t \): \[ dX_t = \frac{\partial X_t}{\partial S_t} dS_t + \frac{1}{2} \frac{\partial^2 X_t}{\partial S_t^2} (dS_t)^2 \]
- Compute the partial derivatives: \[ \frac{\partial X_t}{\partial S_t} = \frac{1}{S_t}, \quad \frac{\partial^2 X_t}{\partial S_t^2} = -\frac{1}{S_t^2} \]
- Substitute \( dS_t \) and \( (dS_t)^2 = \sigma^2 S_t^2 dt \): \[ dX_t = \frac{1}{S_t} (\mu S_t dt + \sigma S_t dW_t) + \frac{1}{2} \left( -\frac{1}{S_t^2} \right) \sigma^2 S_t^2 dt \] \[ dX_t = \mu dt + \sigma dW_t - \frac{\sigma^2}{2} dt \] \[ dX_t = \left( \mu - \frac{\sigma^2}{2} \right) dt + \sigma dW_t \]
- Integrate both sides from \( 0 \) to \( t \): \[ X_t - X_0 = \left( \mu - \frac{\sigma^2}{2} \right) t + \sigma (W_t - W_0) \]
- Since \( W_0 = 0 \) and \( X_0 = \ln(S_0) \), we have: \[ \ln(S_t) = \ln(S_0) + \left( \mu - \frac{\sigma^2}{2} \right) t + \sigma W_t \]
- Exponentiate both sides to solve for \( S_t \): \[ S_t = S_0 \exp \left( \left( \mu - \frac{\sigma^2}{2} \right) t + \sigma W_t \right) \]
Example: Expected Value and Variance of \( S_t \)
Using the log-normal distribution of \( S_t \), we can compute its moments.
- Expected Value: \[ \mathbb{E}[S_t] = S_0 e^{\mu t} \]
- Variance: \[ \text{Var}(S_t) = S_0^2 e^{2 \mu t} \left( e^{\sigma^2 t} - 1 \right) \]
Derivation: Since \( \ln(S_t) \sim \mathcal{N}(\ln(S_0) + (\mu - \sigma^2/2)t, \sigma^2 t) \), the expected value of \( S_t \) is:
\[ \mathbb{E}[S_t] = \exp \left( \mathbb{E}[\ln(S_t)] + \frac{\text{Var}(\ln(S_t))}{2} \right) \] \[ \mathbb{E}[S_t] = \exp \left( \ln(S_0) + \left( \mu - \frac{\sigma^2}{2} \right) t + \frac{\sigma^2 t}{2} \right) = S_0 e^{\mu t} \]
Derivation: The variance of \( S_t \) is given by:
\[ \text{Var}(S_t) = \mathbb{E}[S_t^2] - (\mathbb{E}[S_t])^2 \] \[ \mathbb{E}[S_t^2] = \exp \left( 2 \mathbb{E}[\ln(S_t)] + 2 \text{Var}(\ln(S_t)) \right) = S_0^2 e^{2 \mu t + \sigma^2 t} \] \[ \text{Var}(S_t) = S_0^2 e^{2 \mu t + \sigma^2 t} - S_0^2 e^{2 \mu t} = S_0^2 e^{2 \mu t} \left( e^{\sigma^2 t} - 1 \right) \]
Practical Applications:
- Black-Scholes Option Pricing: The log-normal distribution of stock prices is a fundamental assumption in the Black-Scholes model for pricing European options.
- Risk Management: Value-at-Risk (VaR) calculations often assume log-normal returns to estimate potential losses.
- Portfolio Optimization: The log-normal distribution is used to model asset returns in mean-variance optimization frameworks.
- Monte Carlo Simulations: GBM is widely used in Monte Carlo simulations to generate future stock price paths for option pricing and risk assessment.
Common Pitfalls and Important Notes:
- Drift Adjustment: The term \( \mu - \sigma^2/2 \) in the exponent is crucial. Forgetting the \( -\sigma^2/2 \) term leads to incorrect expected values and variances.
- Log-Normal vs. Normal: Stock prices are log-normally distributed, not normally distributed. Assuming normality can lead to negative prices, which are nonsensical.
- Volatility Scaling: The variance of \( \ln(S_t) \) scales linearly with time \( t \), but the variance of \( S_t \) grows exponentially with \( t \). This is important for long-term risk assessments.
- Initial Condition: The solution \( S_t \) depends on the initial condition \( S_0 \). Ensure \( S_0 \) is correctly specified in calculations.
- Numerical Stability: When simulating GBM paths, use the log-normal form \( S_t = S_0 \exp(\cdot) \) rather than discretizing the SDE directly to avoid numerical instability and negative values.
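Following the last point, the moment formulas can be verified by sampling \( S_t \) from its exact log-normal form rather than discretizing the SDE (parameters below are illustrative):

```python
import numpy as np

# Monte Carlo check of E[S_t] = S0 e^{mu t} and
# Var(S_t) = S0^2 e^{2 mu t} (e^{sigma^2 t} - 1), sampling S_t exactly.
# Parameter values are illustrative.
rng = np.random.default_rng(5)
s0, mu, sigma, t, n = 100.0, 0.08, 0.25, 2.0, 1_000_000

z = rng.standard_normal(n)
s_t = s0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * np.sqrt(t) * z)

mean_mc, var_mc = s_t.mean(), s_t.var()
mean_exact = s0 * np.exp(mu * t)
var_exact = s0**2 * np.exp(2 * mu * t) * (np.exp(sigma**2 * t) - 1)
```

Note that the \( -\sigma^2/2 \) correction inside the exponent is exactly what makes the sample mean match \( S_0 e^{\mu t} \); dropping it inflates the mean by a factor \( e^{\sigma^2 t / 2} \).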
Common Quant Interview Questions:
- Derive the solution to the GBM SDE.
Expected Answer: Use Itô's Lemma on \( \ln(S_t) \) to transform the SDE into a solvable form, then exponentiate to recover \( S_t \). See the derivation section above.
- Why are stock prices modeled as log-normal rather than normal?
Expected Answer: Stock prices cannot be negative, and log-normal distributions ensure positivity. Additionally, log-normal distributions better capture the multiplicative nature of returns.
- What is the expected value of \( S_t \) under GBM?
Expected Answer: \( \mathbb{E}[S_t] = S_0 e^{\mu t} \). This is derived from the properties of the log-normal distribution.
- Explain the role of the \( -\sigma^2/2 \) term in the GBM solution.
Expected Answer: The \( -\sigma^2/2 \) term arises from the quadratic variation of the Itô process. It ensures that the expected value of \( S_t \) is \( S_0 e^{\mu t} \), as the exponential of a normal random variable includes a variance adjustment.
- How would you simulate a GBM path?
Expected Answer: Discretize time into small intervals \( \Delta t \), generate independent normal random variables \( Z_i \sim \mathcal{N}(0,1) \), and iteratively compute:
\[ S_{t+\Delta t} = S_t \exp \left( \left( \mu - \frac{\sigma^2}{2} \right) \Delta t + \sigma \sqrt{\Delta t} Z_i \right) \]
Topic 12: Ornstein-Uhlenbeck (OU) Process and Mean-Reverting SDE
Ornstein-Uhlenbeck (OU) Process: A stochastic process that models mean-reverting behavior, commonly used in finance to describe interest rates, volatility, and other quantities that tend to return to a long-term average. It is a solution to the following linear stochastic differential equation (SDE):
\[ dX_t = \theta (\mu - X_t) dt + \sigma dW_t \] where:
- \(X_t\) is the process value at time \(t\),
- \(\theta > 0\) is the speed of mean reversion,
- \(\mu\) is the long-term mean,
- \(\sigma > 0\) is the volatility,
- \(W_t\) is a standard Wiener process (Brownian motion).
Mean-Reverting SDE: A general class of SDEs where the drift term pushes the process toward a long-term equilibrium level. The OU process is the simplest and most widely used example of a mean-reverting SDE.
Solution to the OU Process: The OU process has an explicit solution given by:
\[ X_t = X_0 e^{-\theta t} + \mu (1 - e^{-\theta t}) + \sigma \int_0^t e^{-\theta (t-s)} dW_s \] where \(X_0\) is the initial value of the process.
Mean and Variance of the OU Process: For \(X_t\) as defined above:
- Mean: \(\mathbb{E}[X_t] = X_0 e^{-\theta t} + \mu (1 - e^{-\theta t})\)
- Variance: \(\text{Var}(X_t) = \frac{\sigma^2}{2\theta} (1 - e^{-2\theta t})\)
As \(t \to \infty\), the process becomes stationary with:
- Long-term mean: \(\mathbb{E}[X_t] \to \mu\)
- Long-term variance: \(\text{Var}(X_t) \to \frac{\sigma^2}{2\theta}\)
Derivation of the OU Process Solution
The OU SDE is:
\[ dX_t = \theta (\mu - X_t) dt + \sigma dW_t \]This is a linear SDE of the form:
\[ dX_t = (a(t) X_t + b(t)) dt + c(t) dW_t \] where \(a(t) = -\theta\), \(b(t) = \theta \mu\), and \(c(t) = \sigma\).
Step 1: Homogeneous Solution
Solve the homogeneous equation:
\[ dX_t = -\theta X_t dt \]The solution is:
\[ X_t^H = X_0 e^{-\theta t} \]Step 2: Particular Solution
Assume a particular solution of the form \(X_t^P = C\) (constant). Substituting into the SDE:
\[ 0 = \theta (\mu - C) dt \implies C = \mu \]Thus, \(X_t^P = \mu\).
Step 3: General Solution via an Integrating Factor
Multiply \(X_t\) by the integrating factor \(e^{\theta t}\) and apply Itô's product rule (there is no cross term, since \(e^{\theta t}\) is deterministic):
\[ d(e^{\theta t} X_t) = \theta e^{\theta t} X_t \, dt + e^{\theta t} dX_t = \theta \mu e^{\theta t} dt + \sigma e^{\theta t} dW_t \]Integrate from \(0\) to \(t\):
\[ e^{\theta t} X_t - X_0 = \mu (e^{\theta t} - 1) + \sigma \int_0^t e^{\theta s} dW_s \]Multiplying both sides by \(e^{-\theta t}\) recovers the homogeneous and particular pieces plus the stochastic convolution term.
Final Solution:
\[ X_t = X_0 e^{-\theta t} + \mu (1 - e^{-\theta t}) + \sigma \int_0^t e^{-\theta (t-s)} dW_s \]
Derivation of Mean and Variance
Mean:
Take expectations of the solution:
\[ \mathbb{E}[X_t] = X_0 e^{-\theta t} + \mu (1 - e^{-\theta t}) + \sigma \mathbb{E}\left[\int_0^t e^{-\theta (t-s)} dW_s\right] \]Since the Itô integral has zero mean:
\[ \mathbb{E}[X_t] = X_0 e^{-\theta t} + \mu (1 - e^{-\theta t}) \]Variance:
Compute \(\text{Var}(X_t) = \mathbb{E}[(X_t - \mathbb{E}[X_t])^2]\). From the solution:
\[ X_t - \mathbb{E}[X_t] = \sigma \int_0^t e^{-\theta (t-s)} dW_s \]Using Itô isometry:
\[ \text{Var}(X_t) = \sigma^2 \mathbb{E}\left[\left(\int_0^t e^{-\theta (t-s)} dW_s\right)^2\right] = \sigma^2 \int_0^t e^{-2\theta (t-s)} ds = \frac{\sigma^2}{2\theta} (1 - e^{-2\theta t}) \]
Example: Simulating an OU Process
Consider an OU process with parameters \(\theta = 1.5\), \(\mu = 0.05\), \(\sigma = 0.1\), and \(X_0 = 0.03\). Simulate \(X_t\) over \(t \in [0, 1]\) with \(\Delta t = 0.01\).
Euler-Maruyama Discretization:
\[ X_{t + \Delta t} = X_t + \theta (\mu - X_t) \Delta t + \sigma \sqrt{\Delta t} Z_t, \quad Z_t \sim \mathcal{N}(0, 1) \]This discretization approximates the continuous SDE for small \(\Delta t\).
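A minimal implementation of this discretization with the parameters above, repeated over many paths so the sample moments at \( t = 1 \) can be compared against the closed-form mean and variance:

```python
import numpy as np

# Euler-Maruyama simulation of the OU process with the example's parameters:
# theta = 1.5, mu = 0.05, sigma = 0.1, X0 = 0.03, dt = 0.01, t in [0, 1].
rng = np.random.default_rng(6)
theta, mu, sigma, x0 = 1.5, 0.05, 0.1, 0.03
T, dt, n_paths = 1.0, 0.01, 200_000
n_steps = int(T / dt)

x = np.full(n_paths, x0)
for _ in range(n_steps):
    z = rng.standard_normal(n_paths)
    x = x + theta * (mu - x) * dt + sigma * np.sqrt(dt) * z

# Closed-form moments at T for comparison with x.mean() and x.var().
mean_exact = x0 * np.exp(-theta * T) + mu * (1 - np.exp(-theta * T))
var_exact = sigma**2 / (2 * theta) * (1 - np.exp(-2 * theta * T))
```

The residual gap between sample and exact moments shrinks with \( \Delta t \) (discretization bias) and with the number of paths (Monte Carlo error).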
Practical Applications
- Interest Rate Modeling: The OU process is used in the Vasicek model to describe the evolution of interest rates, where rates tend to revert to a long-term mean.
- Volatility Modeling: The OU process can model stochastic volatility, where volatility reverts to a mean level over time (e.g., in the Heston model).
- Commodity Prices: Commodity prices often exhibit mean-reverting behavior due to supply and demand dynamics, making the OU process a natural choice.
- Ornstein-Uhlenbeck in Physics: Originally derived to describe the velocity of a particle under friction, the OU process is fundamental in statistical mechanics.
First Passage Time: The expected time for the OU process to reach a level \(a\) starting from \(X_0\) is given by:
\[ \mathbb{E}[\tau_a] = \frac{2}{\sigma^2} \int_{X_0}^a e^{\theta (y - \mu)^2 / \sigma^2} \int_{-\infty}^y e^{-\theta (z - \mu)^2 / \sigma^2} dz \, dy \quad (a > X_0) \]This is useful for risk management (e.g., time to default or barrier crossing).
Common Pitfalls and Important Notes
- Stationarity: The OU process admits a stationary distribution only if \(\theta > 0\) (and is strictly stationary only when started from \(X_0 \sim \mathcal{N}(\mu, \frac{\sigma^2}{2\theta})\)). If \(\theta = 0\), it reduces to arithmetic Brownian motion (non-stationary).
- Parameter Estimation: Estimating \(\theta\), \(\mu\), and \(\sigma\) from data requires care. Maximum likelihood estimation (MLE) is commonly used, but numerical optimization is often needed.
- Discretization Errors: The Euler-Maruyama method introduces discretization errors. For accurate simulation, use small \(\Delta t\) or higher-order methods (e.g., Milstein scheme).
- Negative Values: The OU process can take negative values, which may not be realistic for some applications (e.g., interest rates). Truncated or reflected OU processes are used in such cases.
- Mean Reversion vs. Momentum: Unlike geometric Brownian motion (used in Black-Scholes), the OU process exhibits mean reversion, making it suitable for modeling quantities that do not grow indefinitely.
Quant Interview Questions
- Derive the solution to the OU process.
See the derivation above.
- What is the long-term distribution of the OU process?
The OU process converges to a stationary Gaussian distribution with mean \(\mu\) and variance \(\frac{\sigma^2}{2\theta}\).
- How would you estimate the parameters of an OU process from data?
Use maximum likelihood estimation (MLE). The log-likelihood function for discrete observations \(X_0, X_1, ..., X_n\) is:
\[ \ell(\theta, \mu, \sigma) = -\frac{n}{2} \log(2\pi) - \frac{n}{2} \log\left(\frac{\sigma^2}{2\theta}(1 - e^{-2\theta \Delta t})\right) - \sum_{i=1}^n \frac{(X_i - X_{i-1} e^{-\theta \Delta t} - \mu (1 - e^{-\theta \Delta t}))^2}{\frac{\sigma^2}{\theta}(1 - e^{-2\theta \Delta t})} \]Numerical optimization (e.g., gradient descent) is used to maximize \(\ell\).
- Explain the difference between the OU process and geometric Brownian motion (GBM).
The OU process is mean-reverting with a stationary distribution, while GBM has a drift term that causes exponential growth (non-stationary). The OU process is used for quantities that revert to a mean, while GBM is used for quantities that grow over time (e.g., stock prices).
- How would you simulate an OU process?
Use the Euler-Maruyama method for discretization (see example above). For higher accuracy, use the exact solution:
\[ X_{t + \Delta t} = X_t e^{-\theta \Delta t} + \mu (1 - e^{-\theta \Delta t}) + \sigma \sqrt{\frac{1 - e^{-2\theta \Delta t}}{2\theta}} Z_t, \quad Z_t \sim \mathcal{N}(0, 1) \]
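The MLE described above can be sketched as follows (assuming Python with NumPy/SciPy; `neg_log_lik` and the optimizer settings are illustrative choices, and the data are simulated from the exact Gaussian transition):

```python
import numpy as np
from scipy.optimize import minimize

# Simulate an OU path with the exact transition, then recover the parameters by MLE.
rng = np.random.default_rng(1)
theta_true, mu_true, sigma_true, dt = 2.0, 0.5, 0.3, 0.01
n = 5000
x = np.empty(n + 1)
x[0] = mu_true
e_true = np.exp(-theta_true * dt)
sd_true = sigma_true * np.sqrt((1 - np.exp(-2 * theta_true * dt)) / (2 * theta_true))
for i in range(n):
    x[i + 1] = mu_true + (x[i] - mu_true) * e_true + sd_true * rng.standard_normal()

def neg_log_lik(params, x, dt):
    """Exact Gaussian transition negative log-likelihood of the OU process."""
    theta, mu, sigma = params
    if theta <= 0 or sigma <= 0:
        return np.inf
    e = np.exp(-theta * dt)
    v = sigma**2 * (1 - np.exp(-2 * theta * dt)) / (2 * theta)  # conditional variance
    resid = x[1:] - x[:-1] * e - mu * (1 - e)
    return 0.5 * np.sum(np.log(2 * np.pi * v) + resid**2 / v)

fit = minimize(neg_log_lik, x0=[1.0, 0.0, 0.1], args=(x, dt), method="Nelder-Mead")
theta_hat, mu_hat, sigma_hat = fit.x
```

Note that \(\theta\) is estimated far less precisely than \(\sigma\): its standard error scales with the total observation window, not the number of observations.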
Topic 13: Solution to the Ornstein-Uhlenbeck (OU) Process (Explicit Form)
Ornstein-Uhlenbeck (OU) Process: A stochastic process \( X_t \) that satisfies the following stochastic differential equation (SDE): \[ dX_t = \theta (\mu - X_t) dt + \sigma dW_t \] where:
- \( \theta > 0 \) is the mean reversion speed,
- \( \mu \) is the long-term mean,
- \( \sigma > 0 \) is the volatility,
- \( W_t \) is a standard Wiener process (Brownian motion).
Explicit Solution to the OU Process: The solution to the OU SDE with initial condition \( X_0 = x \) is given by: \[ X_t = \mu + (x - \mu) e^{-\theta t} + \sigma \int_0^t e^{-\theta (t-s)} dW_s \] This is a Gaussian process with:
- Mean: \( \mathbb{E}[X_t] = \mu + (x - \mu) e^{-\theta t} \)
- Variance: \( \text{Var}(X_t) = \frac{\sigma^2}{2\theta} (1 - e^{-2\theta t}) \)
Derivation of the Explicit Solution:
The OU SDE is linear and can be solved using an integrating factor. Rewrite the SDE: \[ dX_t + \theta X_t dt = \theta \mu dt + \sigma dW_t \] The integrating factor is \( e^{\theta t} \). Multiply both sides by this factor: \[ e^{\theta t} dX_t + \theta e^{\theta t} X_t dt = \theta \mu e^{\theta t} dt + \sigma e^{\theta t} dW_t \] The left-hand side is the differential of \( e^{\theta t} X_t \): \[ d(e^{\theta t} X_t) = \theta \mu e^{\theta t} dt + \sigma e^{\theta t} dW_t \] Integrate both sides from \( 0 \) to \( t \): \[ e^{\theta t} X_t - X_0 = \theta \mu \int_0^t e^{\theta s} ds + \sigma \int_0^t e^{\theta s} dW_s \] Compute the deterministic integral: \[ \int_0^t e^{\theta s} ds = \frac{1}{\theta} (e^{\theta t} - 1) \] Substitute back: \[ e^{\theta t} X_t = X_0 + \mu (e^{\theta t} - 1) + \sigma \int_0^t e^{\theta s} dW_s \] Divide by \( e^{\theta t} \): \[ X_t = X_0 e^{-\theta t} + \mu (1 - e^{-\theta t}) + \sigma \int_0^t e^{-\theta (t-s)} dW_s \] Rearrange to obtain the explicit solution: \[ X_t = \mu + (X_0 - \mu) e^{-\theta t} + \sigma \int_0^t e^{-\theta (t-s)} dW_s \]
Stationary Distribution: As \( t \to \infty \), the OU process converges to a stationary Gaussian distribution with: \[ \mathbb{E}[X_t] \to \mu, \quad \text{Var}(X_t) \to \frac{\sigma^2}{2\theta} \] The stationary distribution is \( \mathcal{N}\left(\mu, \frac{\sigma^2}{2\theta}\right) \).
Example: Simulating the OU Process
Suppose \( X_0 = 0 \), \( \theta = 1 \), \( \mu = 2 \), and \( \sigma = 0.5 \). The explicit solution at time \( t \) is: \[ X_t = 2 + (0 - 2) e^{-t} + 0.5 \int_0^t e^{-(t-s)} dW_s \] The mean and variance at \( t = 1 \) are: \[ \mathbb{E}[X_1] = 2 + (0 - 2) e^{-1} \approx 2 - 0.7358 = 1.2642 \] \[ \text{Var}(X_1) = \frac{0.5^2}{2 \cdot 1} (1 - e^{-2}) \approx 0.125 \cdot 0.8647 = 0.1081 \]
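The analytic values above can be checked by Monte Carlo (a sketch assuming NumPy; the step and path counts are illustrative choices):

```python
import numpy as np

# Vectorized Euler-Maruyama over many paths; the sample mean and variance at t=1
# should match E[X_1] ≈ 1.2642 and Var(X_1) ≈ 0.1081 up to discretization and
# sampling error.
rng = np.random.default_rng(7)
theta, mu, sigma, x0 = 1.0, 2.0, 0.5, 0.0
dt, n_steps, n_paths = 0.002, 500, 50_000   # horizon t = 1

x = np.full(n_paths, x0)
for _ in range(n_steps):
    z = rng.standard_normal(n_paths)
    x += theta * (mu - x) * dt + sigma * np.sqrt(dt) * z
```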
Practical Applications:
- Interest Rate Modeling: The OU process is used in the Vasicek model to describe the evolution of interest rates, where rates are pulled toward a long-term mean.
- Commodity Prices: Mean-reverting behavior in commodity prices (e.g., oil, electricity) can be modeled using the OU process.
- Ornstein-Uhlenbeck Bridge: Used in statistical mechanics and finance to model paths conditioned to start and end at specific values.
- Stochastic Control: The OU process appears in optimal control problems, such as portfolio optimization with mean-reverting assets.
Common Pitfalls and Important Notes:
- Mean Reversion: The OU process is mean-reverting only if \( \theta > 0 \). If \( \theta \leq 0 \), the process does not revert to \( \mu \).
- Initial Condition: The explicit solution depends on the initial condition \( X_0 \). For a stationary process, \( X_0 \) is drawn from \( \mathcal{N}\left(\mu, \frac{\sigma^2}{2\theta}\right) \).
- Discretization: When simulating the OU process numerically, use the exact discretization (Euler-Maruyama is biased for mean-reverting processes): \[ X_{t+\Delta t} = \mu + (X_t - \mu) e^{-\theta \Delta t} + \sigma \sqrt{\frac{1 - e^{-2\theta \Delta t}}{2\theta}} Z, \quad Z \sim \mathcal{N}(0,1) \]
- Ito vs. Stratonovich: The OU process is typically defined in the Ito sense. If interpreted in the Stratonovich sense, the SDE would have a different form.
- Long-Term Behavior: The stationary distribution exists only if \( \theta > 0 \). The variance \( \frac{\sigma^2}{2\theta} \) increases as \( \theta \) decreases, reflecting slower mean reversion.
Connection to Ito's Lemma: The explicit solution can also be obtained directly from Ito's lemma via a change of variables. Let \( Y_t = e^{\theta t} (X_t - \mu) \). Applying Ito's lemma (the second-order term vanishes because the map is linear in \( X_t \)): \[ dY_t = \theta e^{\theta t} (X_t - \mu) dt + e^{\theta t} dX_t \] Substitute \( dX_t \): \[ dY_t = \theta e^{\theta t} (X_t - \mu) dt + e^{\theta t} [\theta (\mu - X_t) dt + \sigma dW_t] = \sigma e^{\theta t} dW_t \] Thus, \( Y_t = Y_0 + \sigma \int_0^t e^{\theta s} dW_s \), and solving for \( X_t \) recovers the explicit solution.
Topic 14: Cox-Ingersoll-Ross (CIR) Model for Interest Rates
Cox-Ingersoll-Ross (CIR) Model: A mathematical model for describing the evolution of interest rates, introduced by John C. Cox, Jonathan E. Ingersoll, and Stephen A. Ross in 1985. The CIR model is a type of one-factor short-rate model that captures the mean-reverting behavior of interest rates while ensuring they remain non-negative.
Short-Rate Models: Models that describe the evolution of the instantaneous interest rate (short rate), denoted \( r_t \), over time. The CIR model is one such model, where the dynamics of \( r_t \) are governed by a stochastic differential equation (SDE).
Mean Reversion: A property of stochastic processes where the variable tends to drift toward a long-term average level over time. In the CIR model, the interest rate reverts to a mean level \( \theta \).
The CIR model is defined by the following stochastic differential equation (SDE):
\[ dr_t = \kappa (\theta - r_t) dt + \sigma \sqrt{r_t} dW_t \]where:
- \( r_t \): Instantaneous interest rate at time \( t \).
- \( \kappa \): Speed of mean reversion (positive constant).
- \( \theta \): Long-term mean level of the interest rate (positive constant).
- \( \sigma \): Volatility of the interest rate (positive constant).
- \( W_t \): Standard Wiener process (Brownian motion) under the risk-neutral measure.
Feller Condition: A condition that ensures the interest rate \( r_t \) remains strictly positive. The Feller condition for the CIR model is:
\[ 2 \kappa \theta \geq \sigma^2 \]If this condition holds, the process \( r_t \) is strictly positive almost surely. If not, the process may hit zero but will be reflected back into the positive domain.
Derivation of the CIR Model Solution
The CIR model is a special case of a square-root diffusion process. Applying the integrating factor \( e^{\kappa t} \) and integrating gives the solution for \( r_t \) given \( r_0 \):
\[ r_t = r_0 e^{-\kappa t} + \theta (1 - e^{-\kappa t}) + \sigma \int_0^t e^{-\kappa (t-s)} \sqrt{r_s} \, dW_s \]However, this form is not immediately useful for computation, since the stochastic integral depends on the unknown path \( \sqrt{r_s} \). Instead, we use the fact that the CIR process has a known transition density.
The conditional distribution of \( r_t \) given \( r_s \) (for \( t > s \)) is a non-central chi-squared distribution:
\[ r_t \mid r_s \sim \frac{\sigma^2 (1 - e^{-\kappa (t-s)})}{4 \kappa} \chi^2 \left( \frac{4 \kappa e^{-\kappa (t-s)}}{\sigma^2 (1 - e^{-\kappa (t-s)})} r_s, \frac{4 \kappa \theta}{\sigma^2} \right) \]where \( \chi^2(\lambda, \nu) \) denotes a non-central chi-squared distribution with non-centrality parameter \( \lambda \) and \( \nu \) degrees of freedom.
Example: Expected Value and Variance of \( r_t \)
The expected value and variance of \( r_t \) given \( r_0 \) can be derived from the SDE or the transition density:
Expected Value:
\[ \mathbb{E}[r_t \mid r_0] = r_0 e^{-\kappa t} + \theta (1 - e^{-\kappa t}) \]This shows the mean-reverting behavior: as \( t \to \infty \), \( \mathbb{E}[r_t] \to \theta \).
Variance:
\[ \text{Var}(r_t \mid r_0) = r_0 \frac{\sigma^2}{\kappa} (e^{-\kappa t} - e^{-2 \kappa t}) + \theta \frac{\sigma^2}{2 \kappa} (1 - e^{-\kappa t})^2 \]As \( t \to \infty \), the variance approaches \( \theta \frac{\sigma^2}{2 \kappa} \).
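These moments can be checked against exact transition sampling via the non-central chi-squared distribution (a sketch assuming SciPy's `scipy.stats.ncx2`; the parameters are illustrative and satisfy the Feller condition \(2\kappa\theta \geq \sigma^2\)):

```python
import numpy as np
from scipy import stats

# Exact CIR sampling: r_t | r_0 is a scaled non-central chi-squared variate.
rng = np.random.default_rng(3)
kappa, theta, sigma, r0, t = 1.2, 0.04, 0.2, 0.03, 1.0

c = sigma**2 * (1 - np.exp(-kappa * t)) / (4 * kappa)   # scale factor
df = 4 * kappa * theta / sigma**2                       # degrees of freedom
nc = r0 * np.exp(-kappa * t) / c                        # non-centrality
r_t = c * stats.ncx2.rvs(df, nc, size=200_000, random_state=rng)

# Conditional mean from the formula above.
mean_formula = r0 * np.exp(-kappa * t) + theta * (1 - np.exp(-kappa * t))
```

Note that \( \mathbb{E}[\chi^2(\lambda, \nu)] = \nu + \lambda \), so \( c(\nu + \lambda) \) reproduces the conditional mean formula exactly.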
Bond Pricing in the CIR Model
The CIR model is widely used for pricing zero-coupon bonds. The price of a zero-coupon bond \( P(t, T) \) at time \( t \) maturing at time \( T \) is given by:
\[ P(t, T) = A(t, T) e^{-B(t, T) r_t} \]where:
\[ A(t, T) = \left( \frac{2 h e^{(\kappa + h)(T-t)/2}}{(\kappa + h)(e^{h(T-t)} - 1) + 2h} \right)^{2 \kappa \theta / \sigma^2} \] \[ B(t, T) = \frac{2 (e^{h(T-t)} - 1)}{(\kappa + h)(e^{h(T-t)} - 1) + 2h} \]and \( h = \sqrt{\kappa^2 + 2 \sigma^2} \).
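The closed-form bond price can be coded directly from \(A\), \(B\), and \(h\) above (a sketch in Python with NumPy; the function name `cir_bond_price` is illustrative):

```python
import numpy as np

def cir_bond_price(r, tau, kappa, theta, sigma):
    """CIR zero-coupon bond price P(t, T) with tau = T - t and short rate r."""
    h = np.sqrt(kappa**2 + 2 * sigma**2)
    denom = (kappa + h) * np.expm1(h * tau) + 2 * h    # expm1(x) = e^x - 1
    A = (2 * h * np.exp((kappa + h) * tau / 2) / denom) ** (2 * kappa * theta / sigma**2)
    B = 2 * np.expm1(h * tau) / denom
    return A * np.exp(-B * r)
```

As sanity checks, \(P \to 1\) as \(\tau \to 0\) and \(P\) is decreasing in the short rate \(r\).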
Example: Deriving the Bond Price Formula
The bond price formula can be derived by solving the partial differential equation (PDE) associated with the CIR model. The steps are as follows:
- Assume the bond price is of the form \( P(t, T) = A(t, T) e^{-B(t, T) r_t} \).
- Apply Itô's Lemma to \( P(t, T) \) to find its dynamics.
- Use the fact that the discounted bond price is a martingale under the risk-neutral measure, leading to a PDE for \( A \) and \( B \).
- Solve the resulting system of ODEs for \( A(t, T) \) and \( B(t, T) \).
The PDE for \( P(t, T) \) is:
\[ \frac{\partial P}{\partial t} + \kappa (\theta - r_t) \frac{\partial P}{\partial r} + \frac{1}{2} \sigma^2 r_t \frac{\partial^2 P}{\partial r^2} - r_t P = 0 \]with terminal condition \( P(T, T) = 1 \).
Practical Applications
1. Interest Rate Derivatives Pricing:
The CIR model is used to price interest rate derivatives such as caps, floors, swaptions, and bond options. The closed-form bond price formula makes it computationally efficient for these applications.
2. Risk Management:
The model's mean-reverting property and ability to generate realistic interest rate paths make it useful for Value-at-Risk (VaR) calculations and stress testing.
3. Monetary Policy Analysis:
Central banks and economists use the CIR model to analyze the impact of monetary policy changes on the yield curve and to forecast future interest rates.
4. Credit Risk Modeling:
The CIR model can be extended to model credit spreads, where the default intensity is assumed to follow a CIR process.
Common Pitfalls and Important Notes
1. Feller Condition:
Always check the Feller condition \( 2 \kappa \theta \geq \sigma^2 \). If this condition is violated, the interest rate can hit zero, which may lead to numerical instability in simulations or unintuitive behavior in pricing formulas.
2. Parameter Estimation:
Estimating the parameters \( \kappa \), \( \theta \), and \( \sigma \) from market data can be challenging. Common methods include maximum likelihood estimation (MLE) or calibration to bond prices or interest rate derivatives.
3. Negative Rates:
The CIR model does not naturally allow for negative interest rates. In environments where negative rates are observed (e.g., post-2008 financial crisis), the model may need to be adjusted or replaced with alternatives like the shifted CIR model.
4. Affine Term Structure:
The CIR model belongs to the class of affine term structure models, where bond prices and yields are exponential-affine functions of the state variable \( r_t \). This property simplifies the derivation of analytical formulas for bond prices and derivatives.
5. Extensions and Generalizations:
The CIR model can be extended to multi-factor models or combined with other processes (e.g., Heston model for stochastic volatility) to capture more complex dynamics. However, these extensions often sacrifice analytical tractability.
Example: Simulating CIR Paths
To simulate paths of the CIR process, use the following discretization scheme (Euler-Maruyama with truncation at zero):
\[ r_{t + \Delta t} = r_t + \kappa (\theta - r_t) \Delta t + \sigma \sqrt{r_t \, \Delta t} \, Z \]where \( Z \sim \mathcal{N}(0, 1) \). If \( r_{t + \Delta t} < 0 \), set \( r_{t + \Delta t} = 0 \). For better accuracy, use the Milstein scheme or exact simulation methods based on the non-central chi-squared distribution.
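The truncated Euler scheme can be sketched as follows (assuming NumPy; parameter values are illustrative):

```python
import numpy as np

# Euler-Maruyama for the CIR SDE with truncation at zero: because each step is
# truncated, r[i] >= 0 always holds and the square root is well defined.
rng = np.random.default_rng(5)
kappa, theta, sigma, r0 = 1.2, 0.04, 0.2, 0.03
dt, n_steps = 0.01, 500

r = np.empty(n_steps + 1)
r[0] = r0
for i in range(n_steps):
    z = rng.standard_normal()
    r[i + 1] = r[i] + kappa * (theta - r[i]) * dt + sigma * np.sqrt(r[i] * dt) * z
    r[i + 1] = max(r[i + 1], 0.0)   # truncate negative excursions at zero
```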
Topic 15: Existence and Uniqueness of SDE Solutions (Lipschitz Conditions)
Stochastic Differential Equation (SDE): A stochastic differential equation is an equation of the form: \[ dX_t = \mu(t, X_t) dt + \sigma(t, X_t) dW_t, \quad X_0 = x_0, \] where:
- \(X_t\) is the stochastic process being modeled,
- \(\mu(t, X_t)\) is the drift coefficient,
- \(\sigma(t, X_t)\) is the diffusion coefficient,
- \(W_t\) is a Wiener process (Brownian motion),
- \(x_0\) is the initial condition.
Strong Solution: A strong solution to the SDE is a stochastic process \(X_t\) that is adapted to the filtration generated by the Brownian motion \(W_t\) and satisfies the integral equation almost surely for all \(t \geq 0\).
Weak Solution: A weak solution consists of a probability space, a Brownian motion on that space, and a process \(X_t\) that satisfies the integral equation. Weak solutions are less restrictive than strong solutions and do not require \(X_t\) to be adapted to the filtration of a specific Brownian motion.
Lipschitz Condition: A function \(f(t, x)\) is said to satisfy a global Lipschitz condition in \(x\) if there exists a constant \(K > 0\) such that for all \(t \geq 0\) and all \(x, y \in \mathbb{R}\), \[ |f(t, x) - f(t, y)| \leq K |x - y|. \] The function \(f\) is said to satisfy a local Lipschitz condition if for every \(R > 0\), there exists a constant \(K_R > 0\) such that the above inequality holds for all \(|x|, |y| \leq R\).
Linear Growth Condition: A function \(f(t, x)\) satisfies a linear growth condition if there exists a constant \(C > 0\) such that for all \(t \geq 0\) and all \(x \in \mathbb{R}\), \[ |f(t, x)| \leq C (1 + |x|). \]
Theorem (Existence and Uniqueness of Strong Solutions): Suppose the coefficients \(\mu(t, x)\) and \(\sigma(t, x)\) satisfy the following conditions:
- Global Lipschitz condition: There exists a constant \(K > 0\) such that for all \(t \geq 0\) and all \(x, y \in \mathbb{R}\), \[ |\mu(t, x) - \mu(t, y)| + |\sigma(t, x) - \sigma(t, y)| \leq K |x - y|. \]
- Linear growth condition: There exists a constant \(C > 0\) such that for all \(t \geq 0\) and all \(x \in \mathbb{R}\), \[ |\mu(t, x)| + |\sigma(t, x)| \leq C (1 + |x|). \]
Example (Verification of Lipschitz and Growth Conditions): Consider the geometric Brownian motion SDE: \[ dX_t = \mu X_t dt + \sigma X_t dW_t, \quad X_0 = x_0 > 0. \] Here, \(\mu(t, x) = \mu x\) and \(\sigma(t, x) = \sigma x\).
- Lipschitz Condition: For any \(x, y \in \mathbb{R}\), \[ |\mu x - \mu y| + |\sigma x - \sigma y| = |\mu| |x - y| + |\sigma| |x - y| = (|\mu| + |\sigma|) |x - y|. \] Thus, the global Lipschitz condition is satisfied with \(K = |\mu| + |\sigma|\).
- Linear Growth Condition: For any \(x \in \mathbb{R}\), \[ |\mu x| + |\sigma x| = (|\mu| + |\sigma|) |x| \leq (|\mu| + |\sigma|) (1 + |x|). \] Thus, the linear growth condition is satisfied with \(C = |\mu| + |\sigma|\).
Picard Iteration for SDEs: The proof of existence and uniqueness often relies on Picard iteration (also known as the method of successive approximations). Define the sequence of processes \(\{X_t^n\}_{n=0}^\infty\) as follows: \[ X_t^0 = x_0, \] \[ X_t^{n+1} = x_0 + \int_0^t \mu(s, X_s^n) ds + \int_0^t \sigma(s, X_s^n) dW_s. \] Under the Lipschitz and linear growth conditions, this sequence converges uniformly on compact time intervals to the unique strong solution \(X_t\).
Example (Picard Iteration for Ornstein-Uhlenbeck Process): Consider the Ornstein-Uhlenbeck SDE: \[ dX_t = -\theta X_t dt + \sigma dW_t, \quad X_0 = x_0, \] where \(\theta > 0\) and \(\sigma > 0\) are constants. The coefficients are \(\mu(t, x) = -\theta x\) and \(\sigma(t, x) = \sigma\).
- Verify Lipschitz and growth conditions: \[ |\mu(t, x) - \mu(t, y)| = |-\theta x + \theta y| = \theta |x - y|, \] \[ |\sigma(t, x) - \sigma(t, y)| = 0 \leq K |x - y| \quad \text{(for any \(K > 0\))}. \] Thus, the global Lipschitz condition holds with \(K = \theta\). The linear growth condition is also satisfied since: \[ |\mu(t, x)| + |\sigma(t, x)| = \theta |x| + \sigma \leq \max(\theta, \sigma) (1 + |x|). \]
- Apply Picard iteration: \[ X_t^0 = x_0, \] \[ X_t^1 = x_0 + \int_0^t (-\theta x_0) ds + \int_0^t \sigma dW_s = x_0 - \theta x_0 t + \sigma W_t, \] \[ X_t^2 = x_0 + \int_0^t (-\theta X_s^1) ds + \int_0^t \sigma dW_s = x_0 - \theta \int_0^t (x_0 - \theta x_0 s + \sigma W_s) ds + \sigma W_t. \] Simplifying: \[ X_t^2 = x_0 - \theta x_0 t + \theta^2 x_0 \frac{t^2}{2} - \theta \sigma \int_0^t W_s ds + \sigma W_t. \] As \(n \to \infty\), \(X_t^n\) converges to the unique strong solution: \[ X_t = x_0 e^{-\theta t} + \sigma \int_0^t e^{-\theta (t-s)} dW_s. \]
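The contraction behind Picard iteration can be seen numerically on a fixed discretized Brownian path (a sketch assuming NumPy; left-point Riemann sums approximate the integrals, and the iteration count is an illustrative choice):

```python
import numpy as np

# Picard iteration for dX = -theta*X dt + sigma dW on a grid t_k = k*dt.
rng = np.random.default_rng(11)
theta, sigma, x0 = 1.0, 0.5, 1.0
n, dt = 1000, 0.001
dW = np.sqrt(dt) * rng.standard_normal(n)
W = np.concatenate(([0.0], np.cumsum(dW)))     # Brownian path on the grid

X = np.full(n + 1, x0)                          # X^0 = x0
gaps = []                                       # sup-norm gap between iterates
for _ in range(10):
    # X^{n+1}_t = x0 + int_0^t (-theta X^n_s) ds + sigma W_t  (left-point sums)
    drift = np.concatenate(([0.0], np.cumsum(-theta * X[:-1] * dt)))
    X_new = x0 + drift + sigma * W
    gaps.append(np.max(np.abs(X_new - X)))
    X = X_new
```

The gaps shrink roughly like \(\theta^n t^n / n!\), matching the factorial bound used in the convergence proof.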
Important Notes and Common Pitfalls:
- Local vs. Global Lipschitz: The existence and uniqueness theorem stated above requires global Lipschitz and linear growth conditions. If the coefficients only satisfy local Lipschitz conditions, a unique solution may still exist up to an explosion time (i.e., the solution may blow up in finite time). For example, the SDE \(dX_t = X_t^2 dt + dW_t\) has coefficients that are locally Lipschitz but not globally Lipschitz, and its solution may explode in finite time.
- Weak vs. Strong Solutions: The theorem guarantees a strong solution, which is adapted to the filtration of the given Brownian motion. Weak solutions may exist even when strong solutions do not, but they are not unique in the same sense.
- Dependence on Initial Condition: The solution \(X_t\) depends continuously on the initial condition \(x_0\) under the Lipschitz and growth conditions. This is important for stability and sensitivity analysis.
- Time-Dependent Coefficients: The theorem extends to time-dependent coefficients \(\mu(t, x)\) and \(\sigma(t, x)\), provided the Lipschitz and growth conditions hold uniformly in \(t\).
- Multidimensional SDEs: For SDEs in \(\mathbb{R}^n\), the Lipschitz condition becomes: \[ \|\mu(t, x) - \mu(t, y)\| + \|\sigma(t, x) - \sigma(t, y)\| \leq K \|x - y\|, \] where \(\|\cdot\|\) denotes the Euclidean norm. The linear growth condition is similarly generalized.
- Numerical Methods: The Lipschitz condition is also crucial for the convergence of numerical methods (e.g., Euler-Maruyama) for SDEs. Without it, numerical approximations may fail to converge to the true solution.
Local Lipschitz and Explosion Time: If the coefficients \(\mu(t, x)\) and \(\sigma(t, x)\) satisfy a local Lipschitz condition and a linear growth condition, then the SDE has a unique strong solution up to an explosion time \(\tau\). The explosion time is defined as: \[ \tau = \inf \{ t \geq 0 : \lim_{s \to t} |X_s| = \infty \}. \] If \(\tau = \infty\) almost surely, the solution exists globally.
Example (Explosion in Finite Time): Consider the SDE: \[ dX_t = X_t^2 dt, \quad X_0 = x_0 > 0. \] The drift coefficient \(\mu(x) = x^2\) is locally Lipschitz but does not satisfy the linear growth condition. The solution is: \[ X_t = \frac{x_0}{1 - x_0 t}. \] The explosion time is \(\tau = 1/x_0\), since \(X_t \to \infty\) as \(t \to \tau\).
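The blow-up can be illustrated numerically (a deterministic sketch; the step size is an illustrative choice):

```python
# Forward Euler for dX = X^2 dt up to t = 0.9, safely before tau = 1/x0 = 1,
# compared with the closed form X_t = x0 / (1 - x0*t).
x0 = 1.0
dt, n_steps = 1e-5, 90_000      # integrate to t = n_steps*dt = 0.9
x = x0
for _ in range(n_steps):
    x += x * x * dt
exact = x0 / (1 - x0 * 0.9)     # closed form at t = 0.9, equal to 10
```

Pushing the integration toward \(t = 1\) makes the numerical solution grow without bound, mirroring the explosion of the closed form.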
Quant Interview Questions: Here are some common quant interview questions related to existence and uniqueness of SDE solutions:
- What are the Lipschitz and linear growth conditions, and why are they important for the existence and uniqueness of SDE solutions?
- Explain the difference between strong and weak solutions to an SDE. Provide an example where a weak solution exists but a strong solution does not.
- Consider the SDE \(dX_t = X_t^\alpha dt + dW_t\). For which values of \(\alpha\) does a unique strong solution exist? Does the solution explode in finite time?
- How would you verify the Lipschitz and growth conditions for the coefficients of the Heston model SDE?
- What is Picard iteration, and how is it used to prove the existence of solutions to SDEs?
- Explain why the Euler-Maruyama method for numerically solving SDEs requires the coefficients to satisfy the Lipschitz condition.
- Consider the SDE \(dX_t = \mu(t, X_t) dt + \sigma(t, X_t) dW_t\) with \(\mu(t, x) = -x^3\) and \(\sigma(t, x) = 1\). Does a unique strong solution exist? If so, does it explode in finite time?
Topic 16: Strong vs. Weak Solutions of SDEs
Stochastic Differential Equation (SDE): An SDE is an equation of the form: \[ dX_t = \mu(t, X_t) dt + \sigma(t, X_t) dW_t, \quad X_0 = x_0, \] where \(W_t\) is a standard Brownian motion, \(\mu(t, X_t)\) is the drift term, and \(\sigma(t, X_t)\) is the diffusion term.
Strong Solution: A strong solution to an SDE is a stochastic process \(X_t\) that is adapted to the filtration generated by the Brownian motion \(W_t\) and satisfies the integral form of the SDE almost surely: \[ X_t = X_0 + \int_0^t \mu(s, X_s) ds + \int_0^t \sigma(s, X_s) dW_s. \] The solution is pathwise unique, meaning that if \(X_t\) and \(Y_t\) are two strong solutions, then \(P(X_t = Y_t \text{ for all } t \geq 0) = 1\).
Weak Solution: A weak solution to an SDE consists of a probability space \((\Omega, \mathcal{F}, P)\), a filtration \(\{\mathcal{F}_t\}\), a Brownian motion \(W_t\) on that space, and a process \(X_t\) adapted to \(\{\mathcal{F}_t\}\) such that the integral form of the SDE holds. Weak solutions do not require \(X_t\) to be adapted to the filtration generated by a specific Brownian motion \(W_t\), and uniqueness is in the sense of law (distribution).
Existence and Uniqueness Conditions:
For the SDE \(dX_t = \mu(t, X_t) dt + \sigma(t, X_t) dW_t\), the following conditions ensure the existence and uniqueness of a strong solution:
- Lipschitz Condition: \[ |\mu(t, x) - \mu(t, y)| + |\sigma(t, x) - \sigma(t, y)| \leq K |x - y| \quad \text{for all } x, y \in \mathbb{R}, t \geq 0, \] where \(K\) is a constant.
- Linear Growth Condition: \[ |\mu(t, x)| + |\sigma(t, x)| \leq K(1 + |x|) \quad \text{for all } x \in \mathbb{R}, t \geq 0. \]
If these conditions are satisfied, there exists a unique strong solution to the SDE.
Weak Uniqueness: A weak solution is unique in law if any two weak solutions \(X_t\) and \(Y_t\) have the same finite-dimensional distributions. That is, for any \(t_1, t_2, \dots, t_n \geq 0\), \[ (X_{t_1}, X_{t_2}, \dots, X_{t_n}) \stackrel{d}{=} (Y_{t_1}, Y_{t_2}, \dots, Y_{t_n}). \]
Example: Strong Solution (Geometric Brownian Motion)
Consider the SDE for geometric Brownian motion: \[ dS_t = \mu S_t dt + \sigma S_t dW_t, \quad S_0 = s_0. \] This SDE satisfies the Lipschitz and linear growth conditions, so it has a unique strong solution: \[ S_t = s_0 \exp\left( \left(\mu - \frac{\sigma^2}{2}\right)t + \sigma W_t \right). \]
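The pathwise nature of a strong solution can be illustrated by driving Euler-Maruyama and the closed form with the same Brownian increments (a sketch assuming NumPy; parameters are illustrative):

```python
import numpy as np

# One Brownian path drives both the Euler scheme and the closed-form GBM
# solution; a strong solution is determined path by path by the same noise.
rng = np.random.default_rng(21)
mu, sigma, s0 = 0.05, 0.2, 1.0
dt, n = 1e-4, 10_000             # horizon T = 1

dW = np.sqrt(dt) * rng.standard_normal(n)
W_T = dW.sum()

s_euler = s0
for dw in dW:
    s_euler += mu * s_euler * dt + sigma * s_euler * dw

s_exact = s0 * np.exp((mu - 0.5 * sigma**2) * 1.0 + sigma * W_T)
```

The two values agree path by path up to the strong (order-1/2) discretization error of the Euler scheme.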
Example: Weak Solution (Tanaka's SDE)
Consider the SDE: \[ dX_t = \text{sgn}(X_t) dW_t, \quad X_0 = 0, \] where \(\text{sgn}(x) = 1\) if \(x \geq 0\) and \(\text{sgn}(x) = -1\) if \(x < 0\). The coefficient is discontinuous (in particular, not Lipschitz), and it can be shown that no strong solution exists: a solution cannot be adapted to the filtration generated by the driving Brownian motion. However, a weak solution exists and is unique in law: take any Brownian motion \(X_t\) and define the driving noise by \(W_t = \int_0^t \text{sgn}(X_s) dX_s\), which is itself a Brownian motion by Lévy's characterization.
Key Differences Between Strong and Weak Solutions:
- Adaptability: Strong solutions are adapted to the filtration generated by the driving Brownian motion \(W_t\), while weak solutions are adapted to some filtration \(\{\mathcal{F}_t\}\) on a probability space where \(W_t\) is a Brownian motion.
- Uniqueness: Strong solutions are pathwise unique, while weak solutions are unique in law.
- Existence Conditions: Strong solutions require stronger conditions (e.g., Lipschitz continuity) compared to weak solutions.
- Applications: Strong solutions are often used in modeling where pathwise properties are important (e.g., hedging in finance). Weak solutions are useful when only distributional properties are needed (e.g., pricing derivatives).
Practical Applications:
- Finance: Strong solutions are used in models where the exact path of the underlying asset is important, such as in delta hedging. Weak solutions are sufficient for pricing derivatives where only the distribution of the underlying asset matters.
- Filtering Theory: Strong solutions are often required in filtering problems where the state process must be adapted to the observation filtration.
- Numerical Methods: Weak solutions are often easier to simulate numerically, as they only require matching the distribution of the process rather than its exact path.
Common Pitfalls:
- Assuming Strong Solutions Exist: Not all SDEs have strong solutions. Always check the Lipschitz and linear growth conditions before assuming a strong solution exists.
- Confusing Uniqueness: Pathwise uniqueness (strong solutions) is stronger than uniqueness in law (weak solutions). Do not assume pathwise uniqueness unless the conditions are met.
- Adaptability: In weak solutions, the process \(X_t\) must be adapted to the filtration \(\{\mathcal{F}_t\}\), but this filtration may be larger than the one generated by \(W_t\). Failing to account for this can lead to incorrect conclusions about the solution.
Yamada-Watanabe Theorem:
The Yamada-Watanabe theorem provides a connection between strong and weak solutions. It states that:
- If pathwise uniqueness holds for an SDE, then uniqueness in law holds.
- If a weak solution exists and pathwise uniqueness holds, then a strong solution exists and is pathwise unique.
This theorem is useful for showing the existence of strong solutions when weak solutions are known to exist and pathwise uniqueness can be established.
Example: Applying Yamada-Watanabe Theorem
Consider the SDE: \[ dX_t = \mu(t, X_t) dt + \sigma(t, X_t) dW_t, \] where \(\mu\) and \(\sigma\) satisfy the Lipschitz and linear growth conditions. By the Yamada-Watanabe theorem, since pathwise uniqueness holds, the existence of a weak solution implies the existence of a strong solution. This is often used to establish the existence of strong solutions in practice.
Topic 17: Girsanov's Theorem and Change of Measure
Girsanov’s Theorem: A fundamental result in stochastic calculus that describes how the dynamics of stochastic processes change when the underlying probability measure is altered. It is particularly useful for removing drifts from stochastic differential equations (SDEs) or introducing them, facilitating the pricing of financial derivatives under different measures (e.g., risk-neutral measure).
Equivalent Probability Measures: Two probability measures \( \mathbb{P} \) and \( \mathbb{Q} \) on a measurable space \( (\Omega, \mathcal{F}) \) are equivalent if they agree on which events have probability zero. That is, \( \mathbb{P}(A) = 0 \) if and only if \( \mathbb{Q}(A) = 0 \) for all \( A \in \mathcal{F} \).
Radon-Nikodym Derivative: If \( \mathbb{Q} \) is absolutely continuous with respect to \( \mathbb{P} \), there exists a non-negative random variable \( \xi \) such that for any event \( A \in \mathcal{F} \), \[ \mathbb{Q}(A) = \int_A \xi \, d\mathbb{P}. \] The random variable \( \xi \) is called the Radon-Nikodym derivative of \( \mathbb{Q} \) with respect to \( \mathbb{P} \), denoted \( \xi = \frac{d\mathbb{Q}}{d\mathbb{P}} \).
Novikov’s Condition: A sufficient condition for the process \( \exp\left( \int_0^T \theta_s dW_s - \frac{1}{2} \int_0^T \theta_s^2 ds \right) \) to be a martingale. Novikov’s condition states that if \[ \mathbb{E}\left[ \exp\left( \frac{1}{2} \int_0^T \theta_s^2 ds \right) \right] < \infty, \] then the process is a martingale.
Key Formulas and Theorems
Girsanov’s Theorem (One-Dimensional Case):
Let \( W_t \) be a \( \mathbb{P} \)-Brownian motion on \( (\Omega, \mathcal{F}, \mathbb{P}) \), and let \( \theta_t \) be an adapted process such that the process \[ Z_t = \exp\left( -\int_0^t \theta_s dW_s - \frac{1}{2} \int_0^t \theta_s^2 ds \right) \] is a martingale (e.g., Novikov’s condition holds). Define a new probability measure \( \mathbb{Q} \) by \[ \frac{d\mathbb{Q}}{d\mathbb{P}} \bigg|_{\mathcal{F}_t} = Z_t. \] Then the process \[ \tilde{W}_t = W_t + \int_0^t \theta_s ds \] is a \( \mathbb{Q} \)-Brownian motion.
Multi-Dimensional Girsanov’s Theorem:
Let \( W_t = (W_t^1, \dots, W_t^d) \) be a \( d \)-dimensional \( \mathbb{P} \)-Brownian motion, and let \( \theta_t = (\theta_t^1, \dots, \theta_t^d) \) be an adapted process such that the process \[ Z_t = \exp\left( -\int_0^t \theta_s \cdot dW_s - \frac{1}{2} \int_0^t \|\theta_s\|^2 ds \right) \] is a martingale. Define \( \mathbb{Q} \) by \( \frac{d\mathbb{Q}}{d\mathbb{P}} \big|_{\mathcal{F}_t} = Z_t \). Then the process \[ \tilde{W}_t = W_t + \int_0^t \theta_s ds \] is a \( d \)-dimensional \( \mathbb{Q} \)-Brownian motion.
Change of Measure for SDEs:
Consider the SDE under \( \mathbb{P} \): \[ dX_t = \mu(t, X_t) dt + \sigma(t, X_t) dW_t. \] Under the measure \( \mathbb{Q} \) defined by Girsanov’s theorem with \( \theta_t = \frac{\mu(t, X_t)}{\sigma(t, X_t)} \), the SDE becomes \[ dX_t = \sigma(t, X_t) d\tilde{W}_t, \] where \( \tilde{W}_t \) is a \( \mathbb{Q} \)-Brownian motion.
Derivations and Proof Sketch
Derivation of Girsanov’s Theorem (One-Dimensional Case):
- Define the Radon-Nikodym Derivative:
Let \( Z_t = \exp\left( -\int_0^t \theta_s dW_s - \frac{1}{2} \int_0^t \theta_s^2 ds \right) \). By Itô’s formula, \[ dZ_t = -\theta_t Z_t dW_t. \] Thus, \( Z_t \) is a local martingale. If \( Z_t \) is a true martingale (e.g., Novikov’s condition holds), then \( \mathbb{E}[Z_t] = 1 \), and we can define \( \mathbb{Q} \) by \( \frac{d\mathbb{Q}}{d\mathbb{P}} \big|_{\mathcal{F}_t} = Z_t \).
- Show \( \tilde{W}_t \) is a \( \mathbb{Q} \)-Brownian Motion:
We verify that \( \tilde{W}_t \) satisfies the conditions of Lévy’s characterization of Brownian motion:
- \( \tilde{W}_t \) is a \( \mathbb{Q} \)-martingale:
For \( s \leq t \), the abstract Bayes rule for conditional expectations gives \[ \mathbb{E}^\mathbb{Q}[\tilde{W}_t | \mathcal{F}_s] = \frac{1}{Z_s} \mathbb{E}^\mathbb{P}[Z_t \tilde{W}_t | \mathcal{F}_s]. \] By Itô’s product rule, with \( dZ_t = -\theta_t Z_t dW_t \) and \( d\tilde{W}_t = dW_t + \theta_t dt \), \[ d(Z_t \tilde{W}_t) = Z_t \, d\tilde{W}_t + \tilde{W}_t \, dZ_t + dZ_t \, d\tilde{W}_t = Z_t (1 - \theta_t \tilde{W}_t) \, dW_t, \] so \( Z_t \tilde{W}_t \) is a \( \mathbb{P} \)-(local) martingale. Hence \( \mathbb{E}^\mathbb{P}[Z_t \tilde{W}_t | \mathcal{F}_s] = Z_s \tilde{W}_s \), and therefore \[ \mathbb{E}^\mathbb{Q}[\tilde{W}_t | \mathcal{F}_s] = \tilde{W}_s. \]
- Quadratic Variation:
The quadratic variation of \( \tilde{W}_t \) under \( \mathbb{Q} \) is the same as under \( \mathbb{P} \), since the additional drift term does not contribute to the quadratic variation: \[ [\tilde{W}, \tilde{W}]_t = [W, W]_t = t. \]
- Conclusion: \( \tilde{W}_t \) is a continuous \( \mathbb{Q} \)-martingale with quadratic variation \( [\tilde{W}, \tilde{W}]_t = t \), so by Lévy’s characterization it is a \( \mathbb{Q} \)-Brownian motion.
Practical Applications
1. Risk-Neutral Pricing in the Black-Scholes Model:
In the Black-Scholes model, the stock price \( S_t \) follows the SDE under the real-world measure \( \mathbb{P} \): \[ dS_t = \mu S_t dt + \sigma S_t dW_t. \] To price derivatives, we switch to the risk-neutral measure \( \mathbb{Q} \), where the drift \( \mu \) is replaced by the risk-free rate \( r \). Define \( \theta_t = \frac{\mu - r}{\sigma} \), and let \( Z_t \) be the Radon-Nikodym derivative as in Girsanov’s theorem. Then under \( \mathbb{Q} \), \[ dS_t = r S_t dt + \sigma S_t d\tilde{W}_t, \] where \( \tilde{W}_t \) is a \( \mathbb{Q} \)-Brownian motion. This allows us to price derivatives using the risk-neutral expectation.
2. Change of Numéraire:
Girsanov’s theorem is used to change the numéraire (unit of account) in derivative pricing. For example, to price a quanto option, we may switch from the domestic risk-neutral measure to the foreign risk-neutral measure. The change of measure introduces a drift adjustment to the Brownian motion, reflecting the correlation between the underlying asset and the exchange rate.
3. Removing Drifts for Simplification:
In some cases, it is easier to work with driftless processes. Girsanov’s theorem allows us to "remove" the drift by changing the measure. For example, consider the SDE: \[ dX_t = \mu(t) dt + \sigma(t) dW_t. \] Under the measure \( \mathbb{Q} \) defined by \( \theta_t = \frac{\mu(t)}{\sigma(t)} \), the SDE becomes \[ dX_t = \sigma(t) d\tilde{W}_t, \] which is simpler to analyze (e.g., for computing hitting times or barrier probabilities).
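A minimal Monte Carlo sketch of application 1 (and of the reweighting idea in application 3): the same call is priced once by simulating under \( \mathbb{P} \) and reweighting discounted payoffs with \( Z_T \), and once by simulating under \( \mathbb{Q} \) directly. All parameters are illustrative.

```python
import numpy as np

# Sketch: price a European call two ways and check they agree.
# (1) Simulate S_T under P (drift mu) and reweight the discounted payoff
#     by Z_T = exp(-theta*W_T - 0.5*theta^2*T), theta = (mu - r)/sigma.
# (2) Simulate S_T under Q (drift r) directly.
rng = np.random.default_rng(0)
S0, K, r, mu, sigma, T = 100.0, 100.0, 0.05, 0.12, 0.2, 1.0
theta = (mu - r) / sigma
G = rng.standard_normal(400_000)
W_T = G * np.sqrt(T)                        # terminal Brownian values

S_P = S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * W_T)   # real-world S_T
Z_T = np.exp(-theta * W_T - 0.5 * theta**2 * T)              # Radon-Nikodym weights
price_reweighted = np.exp(-r * T) * np.mean(Z_T * np.maximum(S_P - K, 0.0))

S_Q = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * W_T)    # risk-neutral S_T
price_direct = np.exp(-r * T) * np.mean(np.maximum(S_Q - K, 0.0))
```

Both estimates converge to the Black-Scholes price (about 10.45 for these parameters); the reweighted estimator typically has higher variance because of the fluctuating weights.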
Common Pitfalls and Important Notes
1. Martingale Condition for \( Z_t \):
Girsanov’s theorem requires that \( Z_t \) is a martingale, not just a local martingale. If \( Z_t \) is not a true martingale, the measure \( \mathbb{Q} \) may not be well-defined or equivalent to \( \mathbb{P} \). Novikov’s condition is a sufficient (but not necessary) condition for \( Z_t \) to be a martingale. In practice, verifying Novikov’s condition can be challenging, and alternative conditions (e.g., Kazamaki’s condition) may be used.
2. Absolute Continuity and Equivalence:
The measures \( \mathbb{P} \) and \( \mathbb{Q} \) must be equivalent (i.e., they agree on null sets). If \( Z_t \) is not strictly positive, \( \mathbb{Q} \) may not be equivalent to \( \mathbb{P} \), and Girsanov’s theorem does not apply. For example, if \( \theta_t \) is unbounded, \( Z_t \) may hit zero, leading to singular measures.
3. Adaptedness of \( \theta_t \):
The process \( \theta_t \) must be adapted to the filtration \( \mathcal{F}_t \). If \( \theta_t \) depends on future information, the change of measure is not valid. For example, in the Black-Scholes model, \( \theta_t = \frac{\mu - r}{\sigma} \) is a constant (or deterministic function of time), so it is adapted.
4. Dimensionality in Multi-Asset Models:
In multi-asset models, the drift adjustment \( \theta_t \) is a vector, and the change of measure affects all Brownian motions driving the assets. Care must be taken to ensure that the resulting \( \mathbb{Q} \)-Brownian motions are consistent with the model’s correlation structure. For example, in a multi-asset Black-Scholes model, the market price of risk \( \theta_t \) must be chosen to preserve the correlation between assets under \( \mathbb{Q} \).
5. Practical Computation of \( Z_t \):
In numerical implementations (e.g., Monte Carlo simulations), computing \( Z_t \) can be challenging due to the stochastic integral \( \int_0^t \theta_s dW_s \). For constant \( \theta \), this integral simplifies to \( \theta W_t \), but for time-dependent or stochastic \( \theta_t \), discretization schemes (e.g., Euler-Maruyama) must be used carefully to avoid bias.
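A sketch of the log-space accumulation suggested above, for an illustrative deterministic \( \theta(t) = 0.3 + 0.1t \): summing increments of \( \log Z_t \) avoids multiplying many exponential factors, and checking that the sample mean of \( Z_T \) is close to 1 is a quick test for discretization bias.

```python
import numpy as np

# Sketch: accumulate log Z_T on a time grid for a deterministic,
# time-dependent theta(t) = 0.3 + 0.1*t (an illustrative choice).
rng = np.random.default_rng(1)
n_steps, n_paths, T = 200, 100_000, 1.0
dt = T / n_steps
t_left = np.linspace(0.0, T, n_steps + 1)[:-1]   # left endpoint of each step
theta = 0.3 + 0.1 * t_left                        # shape (n_steps,)

dW = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
# log Z_T = -int theta dW - 0.5 * int theta^2 dt, both via Euler sums
logZ_T = -(dW * theta).sum(axis=1) - 0.5 * np.sum(theta**2) * dt
Z_T = np.exp(logZ_T)
mean_Z = Z_T.mean()    # should be close to 1 (Z is a positive martingale)
```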
Interview Questions and Answers
Q1: What is the purpose of Girsanov’s theorem in quantitative finance?
A1: Girsanov’s theorem is used to change the probability measure under which stochastic processes are defined, typically to remove or introduce drifts. In quantitative finance, its primary purpose is to switch from the real-world measure \( \mathbb{P} \) to the risk-neutral measure \( \mathbb{Q} \), under which the discounted asset prices are martingales. This facilitates the pricing of derivatives by allowing us to compute expectations under \( \mathbb{Q} \).
Q2: Explain how Girsanov’s theorem is used in the Black-Scholes model.
A2: In the Black-Scholes model, the stock price \( S_t \) follows the SDE \( dS_t = \mu S_t dt + \sigma S_t dW_t \) under the real-world measure \( \mathbb{P} \). To price derivatives, we switch to the risk-neutral measure \( \mathbb{Q} \), where the drift \( \mu \) is replaced by the risk-free rate \( r \). This is achieved by defining the Radon-Nikodym derivative: \[ Z_t = \exp\left( -\frac{\mu - r}{\sigma} W_t - \frac{1}{2} \left( \frac{\mu - r}{\sigma} \right)^2 t \right), \] and setting \( \frac{d\mathbb{Q}}{d\mathbb{P}} \big|_{\mathcal{F}_t} = Z_t \). Under \( \mathbb{Q} \), the SDE becomes \( dS_t = r S_t dt + \sigma S_t d\tilde{W}_t \), where \( \tilde{W}_t \) is a \( \mathbb{Q} \)-Brownian motion. This allows us to price derivatives using the risk-neutral expectation \( \mathbb{E}^\mathbb{Q}[\cdot] \).
Q3: What is Novikov’s condition, and why is it important?
A3: Novikov’s condition is a sufficient condition for the process \( Z_t = \exp\left( -\int_0^t \theta_s dW_s - \frac{1}{2} \int_0^t \theta_s^2 ds \right) \) to be a martingale. It states that if \[ \mathbb{E}\left[ \exp\left( \frac{1}{2} \int_0^T \theta_s^2 ds \right) \right] < \infty, \] then \( Z_t \) is a martingale. This is important because Girsanov’s theorem requires \( Z_t \) to be a martingale to define a valid equivalent measure \( \mathbb{Q} \). Without Novikov’s condition (or a similar condition), \( Z_t \) may only be a local martingale, and the measure \( \mathbb{Q} \) may not be equivalent to \( \mathbb{P} \).
Q4: Consider the SDE \( dX_t = \mu dt + \sigma dW_t \). How can Girsanov’s theorem be used to remove the drift \( \mu \)?
A4: To remove the drift \( \mu \), we define the process \( \theta_t = \frac{\mu}{\sigma} \) and the Radon-Nikodym derivative: \[ Z_t = \exp\left( -\frac{\mu}{\sigma} W_t - \frac{1}{2} \left( \frac{\mu}{\sigma} \right)^2 t \right). \] Assuming Novikov’s condition holds, \( Z_t \) is a martingale, and we can define a new measure \( \mathbb{Q} \) by \( \frac{d\mathbb{Q}}{d\mathbb{P}} \big|_{\mathcal{F}_t} = Z_t \). Under \( \mathbb{Q} \), the process \( \tilde{W}_t = W_t + \frac{\mu}{\sigma} t \) is a Brownian motion, and the SDE for \( X_t \) becomes: \[ dX_t = \sigma d\tilde{W}_t. \] Thus, the drift \( \mu \) is removed under the new measure \( \mathbb{Q} \).
Q5: What happens if Novikov’s condition is not satisfied?
A5: If Novikov’s condition is not satisfied, the process \( Z_t \) may not be a true martingale, and the measure \( \mathbb{Q} \) defined by \( \frac{d\mathbb{Q}}{d\mathbb{P}} = Z_T \) may not be equivalent to \( \mathbb{P} \). Specifically:
- Local Martingale but Not Martingale: \( Z_t \) may be a local martingale but fail to be a true martingale. In this case, \( \mathbb{E}[Z_T] < 1 \), so the "measure" \( \mathbb{Q} \) has total mass less than 1 and is not a probability measure.
- Singular Measures: If \( Z_t \) hits zero with positive probability, \( \mathbb{Q} \) may be singular with respect to \( \mathbb{P} \), meaning there exist events with \( \mathbb{P} \)-probability zero but \( \mathbb{Q} \)-probability one. Girsanov’s theorem does not apply in this case.
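The drift-removal recipe of Q4 can be checked numerically (a sketch with illustrative parameters): reweighting \( X_T \) by \( Z_T \) recovers the driftless mean \( X_0 \).

```python
import numpy as np

# Sketch: for dX = mu dt + sigma dW with constant coefficients,
# reweighting by Z_T (theta = mu/sigma) removes the drift:
# E^Q[X_T] = E^P[Z_T X_T] = X_0. Parameters are illustrative.
rng = np.random.default_rng(2)
X0, mu, sigma, T, n = 1.0, 0.4, 0.5, 1.0, 1_000_000
theta = mu / sigma
W_T = rng.standard_normal(n) * np.sqrt(T)
X_T = X0 + mu * T + sigma * W_T                  # under P: mean X0 + mu*T = 1.4
Z_T = np.exp(-theta * W_T - 0.5 * theta**2 * T)  # Radon-Nikodym weights
mean_P = X_T.mean()                              # ~ 1.4
mean_Q = (Z_T * X_T).mean()                      # ~ X0 = 1.0: drift removed
```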
Topic 18: Radon-Nikodym Derivative and Its Role in Measure Change
Radon-Nikodym Derivative: Let \((\Omega, \mathcal{F})\) be a measurable space, and let \(\mathbb{P}\) and \(\mathbb{Q}\) be two probability measures on \((\Omega, \mathcal{F})\). If \(\mathbb{Q}\) is absolutely continuous with respect to \(\mathbb{P}\) (denoted \(\mathbb{Q} \ll \mathbb{P}\)), then there exists a non-negative, \(\mathcal{F}\)-measurable function \(Z: \Omega \to \mathbb{R}\) such that for any \(A \in \mathcal{F}\),
\[ \mathbb{Q}(A) = \int_A Z \, d\mathbb{P}. \]The function \(Z\) is called the Radon-Nikodym derivative of \(\mathbb{Q}\) with respect to \(\mathbb{P}\), and is denoted by:
\[ Z = \frac{d\mathbb{Q}}{d\mathbb{P}}. \]Absolute Continuity (\(\mathbb{Q} \ll \mathbb{P}\)): A measure \(\mathbb{Q}\) is absolutely continuous with respect to \(\mathbb{P}\) if for every \(A \in \mathcal{F}\), \(\mathbb{P}(A) = 0\) implies \(\mathbb{Q}(A) = 0\).
Equivalent Measures (\(\mathbb{Q} \sim \mathbb{P}\)): Two measures \(\mathbb{P}\) and \(\mathbb{Q}\) are equivalent if \(\mathbb{Q} \ll \mathbb{P}\) and \(\mathbb{P} \ll \mathbb{Q}\). In this case, the Radon-Nikodym derivative \(\frac{d\mathbb{Q}}{d\mathbb{P}}\) is strictly positive \(\mathbb{P}\)-a.s.
Key Properties of the Radon-Nikodym Derivative:
- Expectation under Change of Measure: \[ \mathbb{E}^\mathbb{Q}[X] = \mathbb{E}^\mathbb{P}\left[X \frac{d\mathbb{Q}}{d\mathbb{P}}\right], \] where \(X\) is a \(\mathbb{Q}\)-integrable random variable.
- Chain Rule: If \(\mathbb{R} \ll \mathbb{Q} \ll \mathbb{P}\), then: \[ \frac{d\mathbb{R}}{d\mathbb{P}} = \frac{d\mathbb{R}}{d\mathbb{Q}} \cdot \frac{d\mathbb{Q}}{d\mathbb{P}}. \]
- Inverse Relationship: If \(\mathbb{Q} \sim \mathbb{P}\), then: \[ \frac{d\mathbb{P}}{d\mathbb{Q}} = \left(\frac{d\mathbb{Q}}{d\mathbb{P}}\right)^{-1}. \]
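These three properties can be verified exactly on a finite sample space (a toy sketch with made-up measures \( \mathbb{P}, \mathbb{Q}, \mathbb{R} \), all mutually equivalent):

```python
from fractions import Fraction as F

# Sketch: expectation change, chain rule, and inverse relationship,
# checked with exact rational arithmetic on a four-point sample space.
P = [F(1, 4)] * 4
Q = [F(1, 2), F(1, 4), F(1, 8), F(1, 8)]
R = [F(1, 8), F(3, 8), F(1, 4), F(1, 4)]

dQdP = [q / p for q, p in zip(Q, P)]        # pointwise density ratio
X = [F(3), F(1), F(4), F(2)]                # a random variable on the space

EQ_X = sum(q * x for q, x in zip(Q, X))
EP_reweighted = sum(p * z * x for p, z, x in zip(P, dQdP, X))
assert EQ_X == EP_reweighted                # E^Q[X] = E^P[X dQ/dP]

dRdQ = [r / q for r, q in zip(R, Q)]
dRdP = [r / p for r, p in zip(R, P)]
assert all(u * v == w for u, v, w in zip(dRdQ, dQdP, dRdP))   # chain rule
assert all(1 / z == p / q for z, p, q in zip(dQdP, P, Q))     # inverse relation
```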
Example: Change of Measure for a Normal Random Variable
Let \(X \sim \mathcal{N}(\mu, \sigma^2)\) under \(\mathbb{P}\). Define a new measure \(\mathbb{Q}\) such that \(X \sim \mathcal{N}(\nu, \sigma^2)\) under \(\mathbb{Q}\). The Radon-Nikodym derivative is given by:
\[ \frac{d\mathbb{Q}}{d\mathbb{P}} = \exp\left(\frac{\nu - \mu}{\sigma^2} X - \frac{\nu^2 - \mu^2}{2\sigma^2}\right). \]
Derivation:
The density of \(X\) under \(\mathbb{P}\) is:
\[ f_\mathbb{P}(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right). \]The density of \(X\) under \(\mathbb{Q}\) is:
\[ f_\mathbb{Q}(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \nu)^2}{2\sigma^2}\right). \]The Radon-Nikodym derivative is the ratio of the densities:
\[ \frac{d\mathbb{Q}}{d\mathbb{P}} = \frac{f_\mathbb{Q}(X)}{f_\mathbb{P}(X)} = \exp\left(\frac{(X - \mu)^2 - (X - \nu)^2}{2\sigma^2}\right). \]Simplifying the exponent:
\[ (X - \mu)^2 - (X - \nu)^2 = (X^2 - 2\mu X + \mu^2) - (X^2 - 2\nu X + \nu^2) = 2(\nu - \mu)X + (\mu^2 - \nu^2). \]Thus:
\[ \frac{d\mathbb{Q}}{d\mathbb{P}} = \exp\left(\frac{2(\nu - \mu)X + (\mu^2 - \nu^2)}{2\sigma^2}\right) = \exp\left(\frac{\nu - \mu}{\sigma^2} X - \frac{\nu^2 - \mu^2}{2\sigma^2}\right). \]
Girsanov's Theorem (Measure Change for Brownian Motion):
Let \(W_t\) be a \(\mathbb{P}\)-Brownian motion, and let \(\theta_t\) be an adapted process such that the Novikov condition holds:
\[ \mathbb{E}^\mathbb{P}\left[\exp\left(\frac{1}{2} \int_0^T \theta_s^2 \, ds\right)\right] < \infty. \]Define the process:
\[ Z_t = \exp\left(-\int_0^t \theta_s \, dW_s - \frac{1}{2} \int_0^t \theta_s^2 \, ds\right). \]Then \(Z_t\) is a \(\mathbb{P}\)-martingale, and the measure \(\mathbb{Q}\) defined by:
\[ \frac{d\mathbb{Q}}{d\mathbb{P}}\bigg|_{\mathcal{F}_t} = Z_t \]is equivalent to \(\mathbb{P}\) on \(\mathcal{F}_t\). Under \(\mathbb{Q}\), the process:
\[ \tilde{W}_t = W_t + \int_0^t \theta_s \, ds \]is a \(\mathbb{Q}\)-Brownian motion.
Example: Risk-Neutral Measure in the Black-Scholes Model
In the Black-Scholes model, the stock price \(S_t\) follows:
\[ dS_t = \mu S_t \, dt + \sigma S_t \, dW_t, \]where \(W_t\) is a \(\mathbb{P}\)-Brownian motion. The risk-neutral measure \(\mathbb{Q}\) is defined such that the discounted stock price \(\tilde{S}_t = e^{-rt} S_t\) is a \(\mathbb{Q}\)-martingale. The Radon-Nikodym derivative is:
\[ \frac{d\mathbb{Q}}{d\mathbb{P}}\bigg|_{\mathcal{F}_t} = \exp\left(-\frac{\mu - r}{\sigma} W_t - \frac{1}{2} \left(\frac{\mu - r}{\sigma}\right)^2 t\right). \]Under \(\mathbb{Q}\), the stock price follows:
\[ dS_t = r S_t \, dt + \sigma S_t \, d\tilde{W}_t, \]where \(\tilde{W}_t = W_t + \frac{\mu - r}{\sigma} t\) is a \(\mathbb{Q}\)-Brownian motion.
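A quick numerical sanity check of this example (illustrative parameters): simulating \( S_T \) under \( \mathbb{P} \) and reweighting by \( Z_T \) should recover the \( \mathbb{Q} \)-martingale property of the discounted price, \( \mathbb{E}^\mathbb{Q}[e^{-rT} S_T] = S_0 \).

```python
import numpy as np

# Sketch: check numerically that E^P[Z_T * e^{-rT} * S_T] = S_0, i.e. that
# the discounted stock price is a Q-martingale. Parameters are illustrative.
rng = np.random.default_rng(4)
S0, mu, r, sigma, T, n = 100.0, 0.10, 0.03, 0.25, 1.0, 500_000
theta = (mu - r) / sigma
W_T = rng.standard_normal(n) * np.sqrt(T)
S_T = S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * W_T)   # P-dynamics
Z_T = np.exp(-theta * W_T - 0.5 * theta**2 * T)
discounted_mean = np.exp(-r * T) * (Z_T * S_T).mean()        # ~ S0 = 100
raw_mean = S_T.mean()                                        # ~ S0*exp(mu*T)
```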
Important Notes and Common Pitfalls:
- Absolute Continuity is Required: The Radon-Nikodym derivative exists only if \(\mathbb{Q} \ll \mathbb{P}\). If \(\mathbb{Q}\) and \(\mathbb{P}\) are not absolutely continuous (e.g., they have disjoint support), the derivative does not exist.
- Novikov Condition: In Girsanov's theorem, the Novikov condition ensures that \(Z_t\) is a martingale. Without it, \(Z_t\) may not be a valid Radon-Nikodym derivative (e.g., it could explode).
- Strict Positivity: If \(\mathbb{Q} \sim \mathbb{P}\), then \(\frac{d\mathbb{Q}}{d\mathbb{P}} > 0\) \(\mathbb{P}\)-a.s. This is crucial for ensuring that the change of measure is invertible.
- Expectation Calculations: When computing expectations under a new measure, ensure that the Radon-Nikodym derivative is correctly applied. A common mistake is to forget to multiply by \(\frac{d\mathbb{Q}}{d\mathbb{P}}\) when changing measures.
- Martingale Property: The Radon-Nikodym derivative \(Z_t = \frac{d\mathbb{Q}}{d\mathbb{P}}\big|_{\mathcal{F}_t}\) is always a \(\mathbb{P}\)-martingale. This is a key property used in proving Girsanov's theorem.
- Finite-Dimensional Distributions: When changing measures for stochastic processes, the Radon-Nikodym derivative is typically defined for finite-dimensional distributions (e.g., \(\mathcal{F}_t\)-measurable sets). Extending this to infinite-dimensional spaces requires additional care (e.g., Kolmogorov's extension theorem).
Practical Applications:
- Derivatives Pricing (Risk-Neutral Measure):
The Radon-Nikodym derivative is central to the change of measure from the real-world (physical) measure \(\mathbb{P}\) to the risk-neutral measure \(\mathbb{Q}\). Under \(\mathbb{Q}\), the discounted asset prices are martingales, simplifying the pricing of derivatives. For example, in the Black-Scholes model, the Radon-Nikodym derivative adjusts the drift of the stock price from \(\mu\) to the risk-free rate \(r\).
- Importance Sampling:
In Monte Carlo simulations, the Radon-Nikodym derivative is used to change the sampling measure to reduce variance. For example, if simulating under \(\mathbb{P}\) is inefficient, one can simulate under \(\mathbb{Q}\) and reweight the samples using \(\frac{d\mathbb{P}}{d\mathbb{Q}}\).
- Stochastic Control and Filtering:
In stochastic control problems (e.g., portfolio optimization), the Radon-Nikodym derivative is used to express the value function under different measures. In filtering theory (e.g., Kalman filter), it appears in the change of measure between the signal and observation processes.
- Credit Risk Modeling:
In reduced-form credit risk models, the Radon-Nikodym derivative is used to change the measure from the risk-neutral measure to the real-world measure when calibrating default intensities or hazard rates.
- Interest Rate Models:
In the Heath-Jarrow-Morton (HJM) framework, the Radon-Nikodym derivative is used to ensure the absence of arbitrage by changing the measure to the forward risk-neutral measure. This is critical for deriving the drift restrictions on the forward rate process.
- Bayesian Statistics:
The Radon-Nikodym derivative appears in Bayesian statistics as the likelihood ratio, where the prior and posterior measures are related via the likelihood function. This is analogous to the change of measure in probability theory.
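The importance-sampling application above can be sketched as follows: estimate the rare-event probability \( \mathbb{P}(X > 4) \) for \( X \sim \mathcal{N}(0,1) \) by sampling under a mean-shifted measure and reweighting with the likelihood ratio. The shift equal to the threshold is an illustrative (near-optimal) choice.

```python
import numpy as np
from math import erfc, sqrt

# Sketch: importance sampling for the rare event {X > 4}, X ~ N(0,1).
# Sample under Q = N(4, 1) and reweight by dP/dQ = phi(y)/phi(y - 4)
# = exp(-4*y + 8).
rng = np.random.default_rng(7)
n, shift = 100_000, 4.0
Y = rng.standard_normal(n) + shift               # draws from Q
dPdQ = np.exp(-shift * Y + 0.5 * shift**2)       # likelihood ratio
est = np.mean((Y > 4.0) * dPdQ)                  # estimate of P(X > 4)
exact = 0.5 * erfc(4.0 / sqrt(2.0))              # ~ 3.17e-5
```

A naive estimator under \( \mathbb{P} \) would see roughly one exceedance per 32,000 samples; the shifted sampler hits the event about half the time and has a far smaller relative error.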
Radon-Nikodym Derivative for Exponential Martingales:
Let \(M_t\) be a \(\mathbb{P}\)-martingale, and define the exponential martingale:
\[ Z_t = \mathcal{E}(M)_t = \exp\left(M_t - \frac{1}{2} \langle M \rangle_t\right), \]where \(\langle M \rangle_t\) is the quadratic variation of \(M_t\). If \(Z_t\) is a true martingale (e.g., under the Novikov condition), then it can be used as a Radon-Nikodym derivative to define a new measure \(\mathbb{Q}\) via:
\[ \frac{d\mathbb{Q}}{d\mathbb{P}}\bigg|_{\mathcal{F}_t} = Z_t. \]Under \(\mathbb{Q}\), the process \(M_t - \langle M \rangle_t\) is a martingale.
Example: Change of Measure for a Poisson Process
Let \(N_t\) be a Poisson process with intensity \(\lambda\) under \(\mathbb{P}\). Define a new measure \(\mathbb{Q}\) such that \(N_t\) has intensity \(\nu\) under \(\mathbb{Q}\). The Radon-Nikodym derivative is:
\[ \frac{d\mathbb{Q}}{d\mathbb{P}}\bigg|_{\mathcal{F}_t} = \left(\frac{\nu}{\lambda}\right)^{N_t} e^{-(\nu - \lambda)t}. \]
Derivation:
The likelihood ratio for a Poisson process over \([0, t]\) is the ratio of the probabilities of observing \(N_t = k\) under \(\mathbb{Q}\) and \(\mathbb{P}\):
\[ \frac{\mathbb{Q}(N_t = k)}{\mathbb{P}(N_t = k)} = \frac{e^{-\nu t} (\nu t)^k / k!}{e^{-\lambda t} (\lambda t)^k / k!} = \left(\frac{\nu}{\lambda}\right)^k e^{-(\nu - \lambda)t}. \]Thus, the Radon-Nikodym derivative is:
\[ \frac{d\mathbb{Q}}{d\mathbb{P}}\bigg|_{\mathcal{F}_t} = \left(\frac{\nu}{\lambda}\right)^{N_t} e^{-(\nu - \lambda)t}. \]
Topic 19: Novikov's Condition for Girsanov's Theorem
Girsanov’s Theorem: A fundamental result in stochastic calculus that describes how the dynamics of stochastic processes change under an equivalent probability measure. Specifically, it provides a way to transform a Brownian motion under one measure into a Brownian motion with drift under another equivalent measure.
Novikov’s Condition: A sufficient condition ensuring that the exponential local martingale associated with a change of measure (via Girsanov’s Theorem) is a true martingale. This guarantees that the new measure is well-defined and equivalent to the original measure.
Girsanov’s Theorem (Statement):
Let \( W_t \) be a \( d \)-dimensional Brownian motion under the probability measure \( \mathbb{P} \). Let \( \theta_t \) be an \( \mathbb{R}^d \)-valued, \( \mathcal{F}_t \)-adapted process such that the process
\[ Z_t = \exp\left( \int_0^t \theta_s \cdot dW_s - \frac{1}{2} \int_0^t \|\theta_s\|^2 ds \right) \]is a martingale. Define a new probability measure \( \mathbb{Q} \) by
\[ \frac{d\mathbb{Q}}{d\mathbb{P}} \bigg|_{\mathcal{F}_t} = Z_t. \]Then, the process
\[ \tilde{W}_t = W_t - \int_0^t \theta_s ds \]is a \( d \)-dimensional Brownian motion under \( \mathbb{Q} \).
Novikov’s Condition:
Let \( \theta_t \) be an \( \mathbb{R}^d \)-valued, \( \mathcal{F}_t \)-adapted process. If
\[ \mathbb{E}\left[ \exp\left( \frac{1}{2} \int_0^T \|\theta_s\|^2 ds \right) \right] < \infty, \]then the process \( Z_t \) defined above is a martingale for \( t \in [0, T] \).
Key Concepts:
- Equivalent Measures: Two probability measures \( \mathbb{P} \) and \( \mathbb{Q} \) are equivalent if they agree on which events have probability zero. Girsanov’s Theorem constructs such a \( \mathbb{Q} \) from \( \mathbb{P} \).
- Exponential Local Martingale: The process \( Z_t \) is a local martingale. Novikov’s Condition ensures it is a true martingale, which is necessary for \( \mathbb{Q} \) to be a valid probability measure.
- Change of Drift: Under \( \mathbb{Q} \), the Brownian motion \( W_t \) gains a drift \( \theta_t \), transforming into \( \tilde{W}_t \).
Example: Verifying Novikov’s Condition
Let \( \theta_t = \mu \) (a constant) for \( t \in [0, T] \). Check Novikov’s Condition:
\[ \mathbb{E}\left[ \exp\left( \frac{1}{2} \int_0^T \|\mu\|^2 ds \right) \right] = \exp\left( \frac{1}{2} \|\mu\|^2 T \right) < \infty. \]Since the expectation is finite, Novikov’s Condition is satisfied, and \( Z_t \) is a martingale. Thus, \( \tilde{W}_t = W_t - \mu t \) is a Brownian motion under \( \mathbb{Q} \).
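This constant-\( \theta \) example can be checked by simulation (a sketch using this topic's sign convention): reweighting by \( Z_T \) should make \( \tilde{W}_T = W_T - \mu T \) look like an \( \mathcal{N}(0, T) \) draw.

```python
import numpy as np

# Sketch (Topic-19 sign convention, constant theta = mu):
# Z_T = exp(mu*W_T - 0.5*mu^2*T) reweights P-samples so that
# W_tilde = W_T - mu*T has the N(0, T) moments of a Q-Brownian motion.
rng = np.random.default_rng(3)
mu, T, n = 0.7, 2.0, 1_000_000
W_T = rng.standard_normal(n) * np.sqrt(T)
Z_T = np.exp(mu * W_T - 0.5 * mu**2 * T)
W_tilde = W_T - mu * T
mean_Q = (Z_T * W_tilde).mean()        # ~ 0 under Q
second_Q = (Z_T * W_tilde**2).mean()   # ~ T = 2 under Q
```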
Derivation of Novikov’s Condition (Sketch):
- Exponential Local Martingale: The process \( Z_t \) is a local martingale by Itô’s formula. To show it is a true martingale, we need to verify that \( \mathbb{E}[Z_t] = 1 \) for all \( t \).
- Sufficient Condition: The standard proof first shows \( \mathbb{E}[\mathcal{E}(\lambda M)_T] = 1 \) for each \( \lambda \in (0, 1) \), where \( M_t = \int_0^t \theta_s \cdot dW_s \) and \( \mathcal{E} \) denotes the stochastic exponential, by combining Hölder's inequality with the bound that Novikov's Condition places on \( \mathbb{E}[\exp(\frac{1}{2} \langle M \rangle_T)] \); letting \( \lambda \to 1 \) then yields \( \mathbb{E}[Z_T] = 1 \).
- Martingale Property: A non-negative local martingale is a supermartingale, so \( \mathbb{E}[Z_t] \leq 1 \) always holds, and \( Z_t \) is a true martingale on \( [0, T] \) precisely when \( \mathbb{E}[Z_T] = 1 \). Novikov's Condition guarantees this equality.
Practical Applications:
- Risk-Neutral Pricing: In quantitative finance, Girsanov’s Theorem is used to change from the real-world measure \( \mathbb{P} \) to the risk-neutral measure \( \mathbb{Q} \). Novikov’s Condition ensures the existence of \( \mathbb{Q} \). For example, in the Black-Scholes model, the market price of risk \( \theta_t = \frac{\mu - r}{\sigma} \) must satisfy Novikov’s Condition to define the risk-neutral measure.
- Stochastic Control: In optimal control problems, Girsanov’s Theorem is used to transform controlled processes into uncontrolled ones under a new measure. Novikov’s Condition ensures the measure is valid.
- Filtering Theory: In nonlinear filtering, Girsanov’s Theorem is used to change measures to simplify the observation process. Novikov’s Condition ensures the new measure is well-defined.
Example: Black-Scholes Model
In the Black-Scholes model, the stock price \( S_t \) follows:
\[ dS_t = \mu S_t dt + \sigma S_t dW_t, \]where \( W_t \) is a Brownian motion under \( \mathbb{P} \). To price options, we switch to the risk-neutral measure \( \mathbb{Q} \), where \( S_t \) has drift \( r \) (the risk-free rate). The market price of risk is \( \theta = \frac{\mu - r}{\sigma} \). Novikov’s Condition is satisfied because \( \theta \) is constant:
\[ \mathbb{E}\left[ \exp\left( \frac{1}{2} \int_0^T \theta^2 ds \right) \right] = \exp\left( \frac{1}{2} \theta^2 T \right) < \infty. \]Thus, applying the theorem with the drift process \( -\theta \) (note the sign convention in the statement above), \( \tilde{W}_t = W_t + \theta t \) is a Brownian motion under \( \mathbb{Q} \), and \( S_t \) follows:
\[ dS_t = r S_t dt + \sigma S_t d\tilde{W}_t. \]
Common Pitfalls and Important Notes:
- Local vs. True Martingale: Novikov’s Condition is sufficient but not necessary. There are processes \( \theta_t \) for which \( Z_t \) is a martingale but Novikov’s Condition fails. However, it is the most commonly used condition in practice.
- Boundedness: If \( \theta_t \) is bounded, Novikov’s Condition is automatically satisfied. This is often the case in simple models like Black-Scholes.
- Time Horizon: Novikov’s Condition must hold for the entire time horizon \( [0, T] \). If \( \theta_t \) is unbounded as \( t \to \infty \), the condition may fail for large \( T \).
- Higher Dimensions: In multi-dimensional settings, \( \|\theta_s\|^2 \) is the squared Euclidean norm of the vector \( \theta_s \). The condition must account for all components.
- Alternative Conditions: Other conditions, such as the Kazamaki condition, can also ensure \( Z_t \) is a martingale. The Kazamaki condition is weaker than Novikov’s but harder to verify in practice.
Kazamaki’s Condition (for comparison):
Suppose the submartingale \( \exp\left( \frac{1}{2} \int_0^t \theta_s \cdot dW_s \right) \) is uniformly integrable on \( [0, T] \); in particular, it suffices that
\[ \sup_{\tau} \mathbb{E}\left[ \exp\left( \frac{1}{2} \int_0^\tau \theta_s \cdot dW_s \right) \right] < \infty, \]where the supremum runs over stopping times \( \tau \leq T \). Then \( Z_t \) is a martingale. This is weaker than Novikov's Condition but often more difficult to check.
Topic 20: Black-Scholes SDE and Its Derivation from GBM
Geometric Brownian Motion (GBM): A continuous-time stochastic process where the logarithm of the variable follows a Brownian motion with drift. It is commonly used to model stock prices in financial mathematics.
Black-Scholes Stochastic Differential Equation (SDE): The SDE that describes the evolution of a stock price under the assumptions of the Black-Scholes model. It is derived from GBM and forms the foundation for option pricing.
Ito's Lemma: A fundamental result in stochastic calculus that provides a way to compute the differential of a function of a stochastic process. It is essential for deriving the Black-Scholes PDE from the SDE.
Geometric Brownian Motion (GBM):
\[ dS_t = \mu S_t \, dt + \sigma S_t \, dW_t \] where:
- \( S_t \) is the stock price at time \( t \),
- \( \mu \) is the drift (expected return),
- \( \sigma \) is the volatility,
- \( W_t \) is a Wiener process (Brownian motion).
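The SDE above can be simulated directly with the Euler-Maruyama scheme (a sketch with illustrative parameters); the sample mean of \( S_T \) should approximate \( S_0 e^{\mu T} \).

```python
import numpy as np

# Sketch: Euler-Maruyama discretization of dS = mu*S*dt + sigma*S*dW.
# With these illustrative parameters, E[S_T] should be near S0*exp(mu*T).
rng = np.random.default_rng(6)
S0, mu, sigma, T = 50.0, 0.08, 0.3, 1.0
n_steps, n_paths = 250, 200_000
dt = T / n_steps
S = np.full(n_paths, S0)
for _ in range(n_steps):
    dW = rng.standard_normal(n_paths) * np.sqrt(dt)
    S += mu * S * dt + sigma * S * dW
mean_S_T = S.mean()     # ~ 50 * exp(0.08) ~ 54.16
```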
Solution to GBM:
The solution to the GBM SDE is given by: \[ S_t = S_0 \exp\left( \left(\mu - \frac{\sigma^2}{2}\right)t + \sigma W_t \right) \] where \( S_0 \) is the initial stock price.
Black-Scholes SDE (Risk-Neutral Measure):
Under the risk-neutral measure \( \mathbb{Q} \), the drift \( \mu \) is replaced by the risk-free rate \( r \): \[ dS_t = r S_t \, dt + \sigma S_t \, dW_t^\mathbb{Q} \] where \( W_t^\mathbb{Q} \) is a Wiener process under \( \mathbb{Q} \).
Derivation of the Black-Scholes SDE from GBM
Step 1: Start with GBM
The GBM for a stock price \( S_t \) is given by:
\[ dS_t = \mu S_t \, dt + \sigma S_t \, dW_t \]
Step 2: Apply Ito's Lemma to \( \ln(S_t) \)
Let \( f(S_t) = \ln(S_t) \). By Ito's Lemma: \[ df(S_t) = \left( \frac{\partial f}{\partial t} + \mu S_t \frac{\partial f}{\partial S} + \frac{1}{2} \sigma^2 S_t^2 \frac{\partial^2 f}{\partial S^2} \right) dt + \sigma S_t \frac{\partial f}{\partial S} dW_t \] Compute the partial derivatives: \[ \frac{\partial f}{\partial t} = 0, \quad \frac{\partial f}{\partial S} = \frac{1}{S_t}, \quad \frac{\partial^2 f}{\partial S^2} = -\frac{1}{S_t^2} \] Substitute into Ito's Lemma: \[ d(\ln S_t) = \left( 0 + \mu S_t \cdot \frac{1}{S_t} + \frac{1}{2} \sigma^2 S_t^2 \cdot \left(-\frac{1}{S_t^2}\right) \right) dt + \sigma S_t \cdot \frac{1}{S_t} dW_t \] Simplify: \[ d(\ln S_t) = \left( \mu - \frac{\sigma^2}{2} \right) dt + \sigma dW_t \]
Step 3: Integrate to find \( S_t \)
Integrate both sides from \( 0 \) to \( t \): \[ \ln S_t - \ln S_0 = \left( \mu - \frac{\sigma^2}{2} \right) t + \sigma (W_t - W_0) \] Since \( W_0 = 0 \), exponentiate both sides to solve for \( S_t \): \[ S_t = S_0 \exp\left( \left( \mu - \frac{\sigma^2}{2} \right) t + \sigma W_t \right) \]
Step 4: Risk-Neutral Measure
Under the risk-neutral measure \( \mathbb{Q} \), the drift \( \mu \) is replaced by the risk-free rate \( r \): \[ dS_t = r S_t \, dt + \sigma S_t \, dW_t^\mathbb{Q} \] The solution under \( \mathbb{Q} \) is: \[ S_t = S_0 \exp\left( \left(r - \frac{\sigma^2}{2}\right) t + \sigma W_t^\mathbb{Q} \right) \]
Black-Scholes PDE:
The Black-Scholes PDE for the price \( V(S_t, t) \) of a derivative is derived from the Black-Scholes SDE using Ito's Lemma and no-arbitrage arguments: \[ \frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + r S \frac{\partial V}{\partial S} - r V = 0 \]
Derivation of the Black-Scholes PDE
Step 1: Construct a Portfolio
Consider a portfolio \( \Pi \) consisting of one option \( V(S_t, t) \) and \( -\Delta \) shares of the stock \( S_t \): \[ \Pi = V - \Delta S_t \]
Step 2: Compute the Change in Portfolio Value
Using Ito's Lemma, the change in \( V \) is: \[ dV = \left( \frac{\partial V}{\partial t} + \mu S_t \frac{\partial V}{\partial S} + \frac{1}{2} \sigma^2 S_t^2 \frac{\partial^2 V}{\partial S^2} \right) dt + \sigma S_t \frac{\partial V}{\partial S} dW_t \] The change in \( \Pi \) is: \[ d\Pi = dV - \Delta dS_t \] Substitute \( dS_t \) from the GBM SDE: \[ d\Pi = \left( \frac{\partial V}{\partial t} + \mu S_t \frac{\partial V}{\partial S} + \frac{1}{2} \sigma^2 S_t^2 \frac{\partial^2 V}{\partial S^2} - \Delta \mu S_t \right) dt + \left( \sigma S_t \frac{\partial V}{\partial S} - \Delta \sigma S_t \right) dW_t \]Step 3: Eliminate Randomness
Choose \( \Delta = \frac{\partial V}{\partial S} \) to eliminate the \( dW_t \) term: \[ d\Pi = \left( \frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S_t^2 \frac{\partial^2 V}{\partial S^2} \right) dt \]
Step 4: No-Arbitrage Argument
In the absence of arbitrage, the portfolio must earn the risk-free rate \( r \): \[ d\Pi = r \Pi \, dt \] Substitute \( \Pi = V - \Delta S_t \): \[ \frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S_t^2 \frac{\partial^2 V}{\partial S^2} = r \left( V - \frac{\partial V}{\partial S} S_t \right) \] Rearrange to obtain the Black-Scholes PDE: \[ \frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + r S \frac{\partial V}{\partial S} - r V = 0 \]
Key Notes and Common Pitfalls
- Risk-Neutral Measure: The Black-Scholes SDE is derived under the risk-neutral measure, where the drift \( \mu \) is replaced by the risk-free rate \( r \). This is crucial for option pricing.
- Ito's Lemma: Misapplying Ito's Lemma is a common mistake. Remember that the second-order term \( \frac{\partial^2 f}{\partial S^2} \) is essential for stochastic processes.
- Log-Normal Distribution: The solution to GBM shows that \( S_t \) is log-normally distributed. This property is key for deriving the Black-Scholes option pricing formula.
- Volatility: The Black-Scholes model assumes constant volatility \( \sigma \). In practice, volatility is stochastic and may vary with time and price (e.g., in local volatility or stochastic volatility models).
- Dividends: The Black-Scholes SDE does not account for dividends. For dividend-paying stocks, the drift term is adjusted to \( (r - q) \), where \( q \) is the dividend yield.
- Numerical Instability: When implementing the Black-Scholes PDE numerically, ensure stability by choosing appropriate time and space discretizations (e.g., using implicit methods).
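As an illustration of the numerical-stability note, here is a minimal implicit (backward Euler) finite-difference solver for the Black-Scholes PDE, checked against the closed-form call price at \( S = K \). Grid sizes, the \( S_{\max} \) truncation, and all parameters are illustrative, untuned choices.

```python
import numpy as np
from math import erf, exp, log, sqrt

# Sketch: backward Euler in time-to-maturity tau = T - t for
# V_tau = 0.5*sigma^2*S^2*V_SS + r*S*V_S - r*V, on S_i = i*dS.
K, r, sigma, T = 100.0, 0.05, 0.2, 1.0
S_max, M, N = 300.0, 300, 200
dS, dt = S_max / M, T / N
S = np.linspace(0.0, S_max, M + 1)
V = np.maximum(S - K, 0.0)                   # terminal call payoff

i = np.arange(1, M)                          # interior nodes
a = 0.5 * dt * (sigma**2 * i**2 - r * i)     # couples V_{i-1}
b = -dt * (sigma**2 * i**2 + r)              # diagonal contribution
c = 0.5 * dt * (sigma**2 * i**2 + r * i)     # couples V_{i+1}
A = np.diag(1.0 - b) - np.diag(a[1:], -1) - np.diag(c[:-1], 1)   # I - dt*L

for n in range(N):                           # march forward in tau
    tau = (n + 1) * dt
    upper = S_max - K * np.exp(-r * tau)     # deep in-the-money boundary
    rhs = V[1:M].copy()
    rhs[-1] += c[-1] * upper                 # V(0, tau) = 0 adds nothing
    V[1:M] = np.linalg.solve(A, rhs)
    V[0], V[M] = 0.0, upper

def bs_call(S0):
    # Closed-form Black-Scholes call price for comparison.
    d1 = (log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
    return S0 * Phi(d1) - K * exp(-r * T) * Phi(d2)

fd_price = V[100]        # grid node at S = 100 (since dS = 1)
```

An implicit step requires a tridiagonal solve but is unconditionally stable, so the time step is not constrained by the fine spatial grid as it would be for an explicit scheme.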
Practical Applications
- Option Pricing: The Black-Scholes SDE is the foundation for the Black-Scholes-Merton option pricing formula, which gives the theoretical price of European call and put options.
- Risk Management: The Black-Scholes framework is used to compute Greeks (Delta, Gamma, Vega, Theta, Rho), which are essential for hedging and risk management.
- Implied Volatility: The Black-Scholes model is used to back out implied volatility from market option prices, which is a key input for trading strategies.
- Real Options: The Black-Scholes SDE is applied in corporate finance to value real options, such as the option to invest, abandon, or expand a project.
- Monte Carlo Simulations: The solution to the Black-Scholes SDE is used to simulate stock price paths for Monte Carlo valuation of exotic options or path-dependent derivatives.
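A sketch of the Monte Carlo application (illustrative parameters): simulating terminal prices with the exact GBM solution under \( \mathbb{Q} \) and pricing a call and a put on the same paths recovers put-call parity \( C - P = S_0 - K e^{-rT} \) up to sampling error.

```python
import numpy as np

# Sketch: Monte Carlo pricing with the exact GBM solution under Q.
# Pricing a call and a put on the same paths checks put-call parity.
rng = np.random.default_rng(5)
S0, K, r, sigma, T, n = 100.0, 110.0, 0.03, 0.25, 0.5, 500_000
W_Q = rng.standard_normal(n) * np.sqrt(T)            # terminal Q-Brownian values
S_T = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * W_Q)
disc = np.exp(-r * T)
call = disc * np.maximum(S_T - K, 0.0).mean()
put = disc * np.maximum(K - S_T, 0.0).mean()
parity_gap = (call - put) - (S0 - K * disc)          # ~ 0
```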
For a European call option with strike \( K \) and maturity \( T \), the price \( C(S_t, t) \) is:
\[ C(S_t, t) = S_t N(d_1) - K e^{-r(T-t)} N(d_2) \] where: \[ d_1 = \frac{\ln(S_t / K) + (r + \sigma^2 / 2)(T - t)}{\sigma \sqrt{T - t}}, \quad d_2 = d_1 - \sigma \sqrt{T - t} \] and \( N(\cdot) \) is the cumulative distribution function of the standard normal distribution.
Topic 21: Black-Scholes PDE via Itô's Lemma and Hedging
Stochastic Differential Equation (SDE): An equation of the form
\[ dX_t = \mu(X_t, t) dt + \sigma(X_t, t) dW_t \] where \(X_t\) is a stochastic process, \(\mu\) is the drift term, \(\sigma\) is the diffusion term, and \(W_t\) is a Wiener process (Brownian motion).
Itô’s Lemma: A fundamental result in stochastic calculus that provides a way to compute the differential of a function of a stochastic process. For a function \(f(t, X_t)\) where \(X_t\) follows an SDE, Itô’s Lemma states:
\[ df(t, X_t) = \left( \frac{\partial f}{\partial t} + \mu \frac{\partial f}{\partial x} + \frac{1}{2} \sigma^2 \frac{\partial^2 f}{\partial x^2} \right) dt + \sigma \frac{\partial f}{\partial x} dW_t \]
Black-Scholes PDE: A partial differential equation that describes the price of a European option over time. It is derived using Itô’s Lemma and the principle of no-arbitrage.
Black-Scholes SDE for Stock Price:
\[ dS_t = \mu S_t dt + \sigma S_t dW_t \] where \(S_t\) is the stock price, \(\mu\) is the expected return, \(\sigma\) is the volatility, and \(W_t\) is a Wiener process.
Black-Scholes PDE:
\[ \frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + rS \frac{\partial V}{\partial S} - rV = 0 \] where \(V(S,t)\) is the price of the option, \(S\) is the stock price, \(t\) is time, \(r\) is the risk-free interest rate, and \(\sigma\) is the volatility.
Derivation of the Black-Scholes PDE
Step 1: Assume the Stock Price Follows Geometric Brownian Motion (GBM)
The stock price \(S_t\) is modeled as:
\[ dS_t = \mu S_t dt + \sigma S_t dW_t \]
Step 2: Construct a Portfolio
Consider a portfolio \(\Pi\) consisting of one option \(V(S,t)\) and \(-\Delta\) shares of the stock, where \(\Delta = \frac{\partial V}{\partial S}\). The value of the portfolio is:
\[ \Pi = V - \Delta S \]
Step 3: Compute the Change in Portfolio Value
Using Itô’s Lemma, the change in the option price \(dV\) is:
\[ dV = \left( \frac{\partial V}{\partial t} + \mu S \frac{\partial V}{\partial S} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} \right) dt + \sigma S \frac{\partial V}{\partial S} dW_t \]
The change in the portfolio value is:
\[ d\Pi = dV - \Delta dS \]
Substituting \(dV\) and \(dS\), the \(dW_t\) terms cancel because \(\Delta = \frac{\partial V}{\partial S}\):
\[ d\Pi = \left( \frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} \right) dt \]
Step 4: Apply No-Arbitrage Principle
For the portfolio to be risk-free, its return must equal the risk-free rate \(r\):
\[ d\Pi = r \Pi dt \]
Substituting \(\Pi\) and \(d\Pi\):
\[ \left( \frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} \right) dt = r (V - \Delta S) dt \]
Step 5: Simplify and Obtain the Black-Scholes PDE
Substitute \(\Delta = \frac{\partial V}{\partial S}\) and simplify:
\[ \frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + rS \frac{\partial V}{\partial S} - rV = 0 \]
Practical Applications
1. Pricing European Options:
The Black-Scholes PDE is used to derive closed-form solutions for the prices of European call and put options. The solutions are given by the Black-Scholes formula:
For a European call option:
\[ C(S,t) = S N(d_1) - K e^{-r(T-t)} N(d_2) \]
For a European put option:
\[ P(S,t) = K e^{-r(T-t)} N(-d_2) - S N(-d_1) \]
where:
\[ d_1 = \frac{\ln(S/K) + (r + \sigma^2/2)(T-t)}{\sigma \sqrt{T-t}} \] \[ d_2 = d_1 - \sigma \sqrt{T-t} \] and \(N(\cdot)\) is the cumulative distribution function of the standard normal distribution.
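The formulas above translate directly into code. A minimal Python sketch (stdlib only; `math.erf` supplies the normal CDF, and the function names are illustrative):

```python
import math

def _norm_cdf(x):
    # Standard normal CDF via the error function (avoids external dependencies).
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes(S, K, T, r, sigma, kind="call"):
    """Black-Scholes price of a European call or put on a non-dividend-paying
    stock; T is the time to maturity."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    if kind == "call":
        return S * _norm_cdf(d1) - K * math.exp(-r * T) * _norm_cdf(d2)
    return K * math.exp(-r * T) * _norm_cdf(-d2) - S * _norm_cdf(-d1)
```

A quick consistency check is put-call parity: \( C - P = S - K e^{-r(T-t)} \) should hold to machine precision.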
2. Hedging:
The Black-Scholes PDE provides the theoretical foundation for delta hedging. The delta of an option, \(\Delta = \frac{\partial V}{\partial S}\), indicates how much of the underlying asset to hold to hedge the option position.
3. Implied Volatility:
The Black-Scholes model is used to compute implied volatility, which is the volatility that makes the Black-Scholes option price equal to the market price. This is a key input for trading and risk management.
Common Pitfalls and Important Notes
1. Assumptions of the Black-Scholes Model:
- The stock price follows geometric Brownian motion with constant drift \(\mu\) and volatility \(\sigma\).
- Markets are frictionless (no transaction costs, no taxes, and assets are infinitely divisible).
- There are no arbitrage opportunities.
- The risk-free interest rate \(r\) and volatility \(\sigma\) are constant and known.
- The stock does not pay dividends (though the model can be extended to include dividends).
These assumptions are often violated in real markets, leading to limitations in the model's applicability.
2. Misinterpretation of \(\Delta\):
The delta \(\Delta = \frac{\partial V}{\partial S}\) represents the sensitivity of the option price to changes in the underlying asset price. However, it is not constant and changes with the stock price and time. Continuous rebalancing is required for effective delta hedging.
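For a non-dividend call, \(\Delta = N(d_1)\), and this analytic delta can be sanity-checked against a numerical derivative of the price. A small illustrative sketch (stdlib only; the helper `bs_call` is defined inline):

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def call_delta_analytic(S, K, T, r, sigma):
    # For a non-dividend-paying stock, Delta = N(d1).
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    return norm_cdf(d1)

def call_delta_fd(S, K, T, r, sigma, h=1e-4):
    # Central finite difference approximation of dV/dS.
    return (bs_call(S + h, K, T, r, sigma) - bs_call(S - h, K, T, r, sigma)) / (2 * h)
```

The two values should agree to roughly the order of the finite-difference error \(O(h^2)\).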
3. Volatility Smile and Skew:
The Black-Scholes model assumes constant volatility, but in practice, implied volatility varies with strike price and maturity, leading to the "volatility smile" or "skew." This indicates that the model does not fully capture market dynamics.
4. Numerical Solutions:
While the Black-Scholes PDE has closed-form solutions for European options, more complex derivatives (e.g., American options, exotic options) often require numerical methods: finite difference schemes for the PDE, Monte Carlo simulation, or binomial trees.
5. Importance of Itô’s Lemma:
Itô’s Lemma is the cornerstone of the derivation of the Black-Scholes PDE. A solid understanding of Itô’s Lemma is essential for working with stochastic processes and derivatives pricing.
Quant Interview Questions
1. Derive the Black-Scholes PDE using Itô’s Lemma.
Answer: Follow the step-by-step derivation provided above, starting from the SDE of the stock price and constructing a risk-free portfolio.
2. What are the assumptions of the Black-Scholes model, and how do they limit its applicability?
Answer: The key assumptions are constant volatility, no arbitrage, frictionless markets, and geometric Brownian motion for the stock price. Limitations include the inability to capture volatility smile/skew, transaction costs, and discrete trading.
3. Explain delta hedging in the context of the Black-Scholes model.
Answer: Delta hedging involves holding \(\Delta = \frac{\partial V}{\partial S}\) shares of the underlying asset to offset the risk of an option position. The Black-Scholes PDE ensures that a continuously rebalanced delta-hedged portfolio earns the risk-free rate.
4. How would you modify the Black-Scholes PDE to account for dividends?
Answer: For a continuous dividend yield \(q\), the Black-Scholes PDE becomes:
\[ \frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + (r - q)S \frac{\partial V}{\partial S} - rV = 0 \] The dividend yield reduces the effective drift of the stock price from \(r\) to \(r - q\).
5. What is the relationship between the Black-Scholes PDE and the risk-neutral measure?
Answer: The Black-Scholes PDE is derived under the risk-neutral measure, where the expected return of the stock is the risk-free rate \(r\). This measure simplifies the pricing of derivatives by eliminating the need to estimate the market price of risk.
Topic 22: Feynman-Kac Theorem and Its Connection to PDEs
Feynman-Kac Theorem: The Feynman-Kac theorem establishes a profound connection between stochastic differential equations (SDEs) and partial differential equations (PDEs). It provides a probabilistic representation for the solutions of certain linear parabolic PDEs using expectations of functionals of diffusion processes.
Formally, it states that under suitable conditions, the solution \( u(t, x) \) to a PDE of the form:
\[ \frac{\partial u}{\partial t} + \mu(t, x) \frac{\partial u}{\partial x} + \frac{1}{2} \sigma^2(t, x) \frac{\partial^2 u}{\partial x^2} - V(t, x) u + f(t, x) = 0, \] with terminal condition \( u(T, x) = g(x) \), can be represented as: \[ u(t, x) = \mathbb{E}\left[ \int_t^T e^{-\int_t^s V(r, X_r) dr} f(s, X_s) ds + e^{-\int_t^T V(r, X_r) dr} g(X_T) \mid X_t = x \right], \] where \( X_t \) is a diffusion process solving the SDE \( dX_t = \mu(t, X_t) dt + \sigma(t, X_t) dW_t \).
Key Formula: Feynman-Kac Representation
Consider the PDE: \[ \frac{\partial u}{\partial t} + \mathcal{L} u - V u + f = 0, \quad u(T, x) = g(x), \] where \( \mathcal{L} \) is the infinitesimal generator of the diffusion \( X_t \): \[ \mathcal{L} = \mu(t, x) \frac{\partial}{\partial x} + \frac{1}{2} \sigma^2(t, x) \frac{\partial^2}{\partial x^2}. \] The Feynman-Kac formula gives the solution as: \[ u(t, x) = \mathbb{E}\left[ \int_t^T e^{-\int_t^s V(r, X_r) dr} f(s, X_s) ds + e^{-\int_t^T V(r, X_r) dr} g(X_T) \mid X_t = x \right]. \]
For the special case where \( V = 0 \) and \( f = 0 \), this simplifies to:
\[ u(t, x) = \mathbb{E}\left[ g(X_T) \mid X_t = x \right]. \]
Infinitesimal Generator \( \mathcal{L} \): The operator \( \mathcal{L} \) is defined for a function \( \phi(x) \) as:
\[ \mathcal{L} \phi(x) = \lim_{t \downarrow 0} \frac{\mathbb{E}[\phi(X_t) \mid X_0 = x] - \phi(x)}{t}. \] For the diffusion \( dX_t = \mu(t, X_t) dt + \sigma(t, X_t) dW_t \), the generator is: \[ \mathcal{L} = \mu(t, x) \frac{\partial}{\partial x} + \frac{1}{2} \sigma^2(t, x) \frac{\partial^2}{\partial x^2}. \]
Example: Pricing a Zero-Coupon Bond
Consider a short-rate model where the interest rate \( r_t \) follows the SDE:
\[ dr_t = \mu(r_t) dt + \sigma(r_t) dW_t. \] The price \( P(t, T) \) of a zero-coupon bond with maturity \( T \) is given by the PDE: \[ \frac{\partial P}{\partial t} + \mu(r) \frac{\partial P}{\partial r} + \frac{1}{2} \sigma^2(r) \frac{\partial^2 P}{\partial r^2} - r P = 0, \quad P(T, r) = 1. \]
Using the Feynman-Kac theorem, the solution is:
\[ P(t, r) = \mathbb{E}\left[ e^{-\int_t^T r_s ds} \mid r_t = r \right]. \] This is the expected discounted payoff under the risk-neutral measure.
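For a concrete short-rate example, take the Vasicek model \( dr_t = a(b - r_t)\,dt + \sigma\,dW_t \), which also has a closed-form bond price, so the Feynman-Kac expectation can be checked by simulation. A minimal sketch (stdlib only; parameters are illustrative) using an Euler scheme with a left-point approximation of the integral:

```python
import math
import random

def vasicek_bond_mc(r0, a, b, sigma, T, n_steps=100, n_paths=5000, seed=42):
    """Monte Carlo estimate of P(0,T) = E[exp(-integral of r_s ds)] under
    dr = a(b - r) dt + sigma dW (Euler scheme, left-point rule)."""
    rng = random.Random(seed)
    dt = T / n_steps
    sqdt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        r, integral = r0, 0.0
        for _ in range(n_steps):
            integral += r * dt
            r += a * (b - r) * dt + sigma * sqdt * rng.gauss(0.0, 1.0)
        total += math.exp(-integral)
    return total / n_paths

def vasicek_bond_analytic(r0, a, b, sigma, T):
    # Closed-form Vasicek zero-coupon bond price P(0,T) = A(T) exp(-B(T) r0).
    B = (1.0 - math.exp(-a * T)) / a
    A = math.exp((b - sigma**2 / (2 * a**2)) * (B - T) - sigma**2 * B**2 / (4 * a))
    return A * math.exp(-B * r0)
```

With a modest number of paths the two values should agree to a few basis points.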
Example: European Option Pricing
The Black-Scholes PDE for a European call option with payoff \( (S_T - K)^+ \) is:
\[ \frac{\partial V}{\partial t} + r S \frac{\partial V}{\partial S} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} - r V = 0, \quad V(T, S) = (S - K)^+. \]
The Feynman-Kac representation gives the solution as:
\[ V(t, S) = \mathbb{E}\left[ e^{-r(T-t)} (S_T - K)^+ \mid S_t = S \right], \] where \( S_t \) follows geometric Brownian motion: \[ dS_t = r S_t dt + \sigma S_t dW_t. \]
Key Observations and Pitfalls:
- Linearity: The Feynman-Kac theorem applies to linear PDEs. Nonlinear PDEs (e.g., those arising in portfolio optimization) require extensions like the nonlinear Feynman-Kac theorem.
- Terminal vs. Initial Conditions: The theorem is typically stated for terminal conditions (e.g., \( u(T, x) = g(x) \)). For initial conditions, reverse time or adjust the PDE accordingly.
- Regularity Conditions: The theorem requires that the coefficients \( \mu, \sigma, V, f \) and the terminal condition \( g \) satisfy certain regularity and growth conditions (e.g., Lipschitz continuity, polynomial growth) to ensure the existence and uniqueness of solutions.
- Numerical Methods: The Feynman-Kac representation is the foundation for Monte Carlo methods in option pricing. However, simulating \( e^{-\int_t^T V(r, X_r) dr} \) can be challenging if \( V \) is not constant.
- Connection to Ito's Lemma: The Feynman-Kac theorem can be derived using Ito's lemma. Apply Ito's lemma to \( Y_t = e^{-\int_0^t V(r, X_r) dr} u(t, X_t) \) and take expectations to recover the PDE.
- Risk-Neutral Pricing: In quantitative finance, the Feynman-Kac theorem underpins the risk-neutral pricing framework. The expectation is taken under the risk-neutral measure \( \mathbb{Q} \), not the physical measure \( \mathbb{P} \).
Derivation Sketch (Using Ito's Lemma):
Let \( X_t \) solve \( dX_t = \mu(t, X_t) dt + \sigma(t, X_t) dW_t \), and define:
\[ Y_t = e^{-\int_0^t V(r, X_r) dr} u(t, X_t). \]
Apply Ito's lemma to \( Y_t \):
\[ dY_t = e^{-\int_0^t V(r, X_r) dr} \left( \frac{\partial u}{\partial t} + \mu \frac{\partial u}{\partial x} + \frac{1}{2} \sigma^2 \frac{\partial^2 u}{\partial x^2} - V u \right) dt + e^{-\int_0^t V(r, X_r) dr} \sigma \frac{\partial u}{\partial x} dW_t. \]
Substitute the PDE \( \frac{\partial u}{\partial t} + \mathcal{L} u - V u + f = 0 \):
\[ dY_t = -e^{-\int_0^t V(r, X_r) dr} f(t, X_t) dt + e^{-\int_0^t V(r, X_r) dr} \sigma \frac{\partial u}{\partial x} dW_t. \]
Integrate from \( t \) to \( T \) and take expectations:
\[ \mathbb{E}[Y_T \mid \mathcal{F}_t] - Y_t = -\mathbb{E}\left[ \int_t^T e^{-\int_0^s V(r, X_r) dr} f(s, X_s) ds \mid \mathcal{F}_t \right]. \]
Rearrange and use \( Y_T = e^{-\int_0^T V(r, X_r) dr} g(X_T) \):
\[ u(t, X_t) = \mathbb{E}\left[ \int_t^T e^{-\int_t^s V(r, X_r) dr} f(s, X_s) ds + e^{-\int_t^T V(r, X_r) dr} g(X_T) \mid \mathcal{F}_t \right]. \] This is the Feynman-Kac representation.
Practical Applications:
- Option Pricing: The Feynman-Kac theorem provides the theoretical foundation for pricing derivatives by expressing the price as an expectation under the risk-neutral measure. This is the basis for Monte Carlo simulation in finance.
- Interest Rate Modeling: In fixed income, the theorem connects short-rate models (e.g., Vasicek, CIR) to bond prices via expectations of discounted payoffs.
- Stochastic Control: The theorem is used in stochastic control problems to derive the Hamilton-Jacobi-Bellman (HJB) equation and its probabilistic representation.
- Physics: In quantum mechanics, the Feynman-Kac formula relates the Schrödinger equation to path integrals, providing a probabilistic interpretation of quantum mechanics.
- Credit Risk: The theorem is used to model default probabilities and credit spreads by representing survival probabilities as expectations of indicator functions.
Interview Question: Feynman-Kac for a Barrier Option
Question: Consider a down-and-out call option with barrier \( H \), strike \( K \), and maturity \( T \). The underlying asset \( S_t \) follows geometric Brownian motion under the risk-neutral measure:
\[ dS_t = r S_t dt + \sigma S_t dW_t. \] Write down the PDE for the option price \( V(t, S) \) and its Feynman-Kac representation.
Solution:
The PDE is the Black-Scholes PDE with a boundary condition at \( S = H \):
\[ \frac{\partial V}{\partial t} + r S \frac{\partial V}{\partial S} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} - r V = 0, \quad V(T, S) = (S - K)^+ \mathbf{1}_{\{S > H\}}, \] with boundary condition \( V(t, H) = 0 \) for all \( t \leq T \).
The Feynman-Kac representation is:
\[ V(t, S) = \mathbb{E}\left[ e^{-r(T-t)} (S_T - K)^+ \mathbf{1}_{\{\tau_H > T\}} \mid S_t = S \right], \] where \( \tau_H = \inf\{t \geq 0 : S_t \leq H\} \) is the first hitting time of the barrier.
Common Interview Questions:
- Explain the Feynman-Kac theorem and its connection to the Black-Scholes PDE.
- Derive the Feynman-Kac representation for a European call option.
- How does the Feynman-Kac theorem relate to Monte Carlo simulation in option pricing?
- What are the regularity conditions required for the Feynman-Kac theorem to hold?
- Consider a PDE with a source term \( f(t, x) \). How does this appear in the Feynman-Kac representation?
- Explain how the Feynman-Kac theorem can be used to price a zero-coupon bond in the Vasicek model.
- What is the role of the infinitesimal generator \( \mathcal{L} \) in the Feynman-Kac theorem?
- How would you use the Feynman-Kac theorem to compute the price of an Asian option?
- Discuss the differences between the Feynman-Kac theorem and the Girsanov theorem in the context of derivative pricing.
- Explain how the Feynman-Kac theorem can be extended to handle jump-diffusion processes.
Topic 23: Risk-Neutral Pricing and Martingale Measures
Risk-Neutral Probability Measure (ℚ): A probability measure under which the discounted price processes of all traded assets are martingales. In this measure, all assets grow at the risk-free rate, and investors are indifferent to risk (hence "risk-neutral").
Equivalent Martingale Measure (EMM): A probability measure ℚ that is equivalent to the real-world measure ℙ (i.e., they agree on which events have zero probability) and under which the discounted asset price process is a martingale.
Fundamental Theorem of Asset Pricing (FTAP): A market model is arbitrage-free if and only if there exists at least one equivalent martingale measure. If the EMM is unique, the market is also complete (i.e., every contingent claim can be hedged).
Numéraire: A traded asset used as a unit of account for pricing other assets. The choice of numéraire affects the risk-neutral measure. Common numéraires include the money-market account (risk-free asset) and the stock price itself.
Change of Numéraire Formula: Let \( N(t) \) be the numéraire, and \( B(t) = e^{rt} \) be the money-market account. The Radon-Nikodym derivative for the change of measure from the risk-neutral measure ℚ (associated with \( B(t) \)) to the measure ℚN (associated with \( N(t) \)) is: \[ \frac{d\mathbb{Q}^N}{d\mathbb{Q}} \bigg|_t = \frac{N(t) B(0)}{N(0) B(t)}. \]
Risk-Neutral Pricing Formula: The price \( V_t \) of a contingent claim with payoff \( \Phi(S_T) \) at time \( T \) is given by the discounted expectation under the risk-neutral measure ℚ: \[ V_t = \mathbb{E}^\mathbb{Q} \left[ \frac{B(t)}{B(T)} \Phi(S_T) \bigg| \mathcal{F}_t \right] = e^{-r(T-t)} \mathbb{E}^\mathbb{Q} \left[ \Phi(S_T) \bigg| \mathcal{F}_t \right], \] where \( B(t) = e^{rt} \) is the money-market account, \( r \) is the risk-free rate, and \( \mathcal{F}_t \) is the filtration up to time \( t \).
Girsanov's Theorem (Change of Measure): Let \( W_t^\mathbb{P} \) be a Brownian motion under the real-world measure ℙ. Define the Radon-Nikodym derivative: \[ \frac{d\mathbb{Q}}{d\mathbb{P}} \bigg|_t = \exp \left( -\int_0^t \theta_s dW_s^\mathbb{P} - \frac{1}{2} \int_0^t \theta_s^2 ds \right), \] where \( \theta_t \) is the market price of risk. Then \( W_t^\mathbb{Q} = W_t^\mathbb{P} + \int_0^t \theta_s ds \) is a Brownian motion under ℚ.
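Girsanov's theorem can be checked numerically: simulate \( W_T \) under ℙ, reweight each path's payoff by the Radon-Nikodym derivative, and compare with a direct simulation under ℚ and with the closed-form price. A minimal sketch (illustrative parameters; stdlib only):

```python
import math
import random

def girsanov_reweight_demo(S0=100.0, K=100.0, T=1.0, r=0.05, mu=0.10,
                           sigma=0.2, n_paths=200000, seed=7):
    """Price a call two ways: (i) simulate under P and weight each path by
    dQ/dP = exp(-theta*W_T - theta^2*T/2) with theta = (mu - r)/sigma;
    (ii) simulate directly under Q. Both estimate e^{-rT} E^Q[(S_T - K)^+]."""
    rng = random.Random(seed)
    theta = (mu - r) / sigma
    sq = math.sqrt(T)
    disc = math.exp(-r * T)
    sum_p, sum_q = 0.0, 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        w = sq * z  # W_T sampled as a Brownian increment
        s_p = S0 * math.exp((mu - 0.5 * sigma**2) * T + sigma * w)  # P dynamics
        s_q = S0 * math.exp((r - 0.5 * sigma**2) * T + sigma * w)   # Q dynamics
        weight = math.exp(-theta * w - 0.5 * theta**2 * T)
        sum_p += max(s_p - K, 0.0) * weight
        sum_q += max(s_q - K, 0.0)
    return disc * sum_p / n_paths, disc * sum_q / n_paths
```

Both estimates should converge to the Black-Scholes price (about 10.45 for these parameters), up to Monte Carlo noise.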
Black-Scholes PDE under Risk-Neutral Measure: The price \( V(t, S_t) \) of a derivative in the Black-Scholes framework satisfies: \[ \frac{\partial V}{\partial t} + r S_t \frac{\partial V}{\partial S} + \frac{1}{2} \sigma^2 S_t^2 \frac{\partial^2 V}{\partial S^2} = r V, \] where \( S_t \) follows \( dS_t = r S_t dt + \sigma S_t dW_t^\mathbb{Q} \) under ℚ.
Example: Pricing a European Call Option
Consider a European call option with strike \( K \) and maturity \( T \) on a stock \( S_t \) following geometric Brownian motion under ℚ: \[ dS_t = r S_t dt + \sigma S_t dW_t^\mathbb{Q}. \] The risk-neutral price at time \( t \) is: \[ C(t, S_t) = \mathbb{E}^\mathbb{Q} \left[ e^{-r(T-t)} (S_T - K)^+ \bigg| \mathcal{F}_t \right]. \] Solving this expectation (e.g., via the Black-Scholes formula) gives: \[ C(t, S_t) = S_t N(d_1) - K e^{-r(T-t)} N(d_2), \] where \( N(\cdot) \) is the standard normal CDF, and: \[ d_1 = \frac{\ln(S_t / K) + (r + \sigma^2/2)(T-t)}{\sigma \sqrt{T-t}}, \quad d_2 = d_1 - \sigma \sqrt{T-t}. \]
Example: Change of Numéraire (Stock Measure)
Let \( N(t) = S_t \) (the stock price) be the numéraire. The stock measure ℚS is defined by: \[ \frac{d\mathbb{Q}^S}{d\mathbb{Q}} \bigg|_t = \frac{S_t B(0)}{S_0 B(t)} = \frac{S_t}{S_0 e^{rt}}. \] Under ℚS, the price of any traded asset divided by the numéraire is a martingale; for example, \( B(t)/S_t \) is a ℚS-martingale. This measure is useful for evaluating the \( S_t N(d_1) \) term in the Black-Scholes formula; for options on futures or forward contracts, the analogous tool is the \( T \)-forward measure, whose numéraire is the zero-coupon bond \( P(t,T) \) and under which the forward price \( F(t, T) \) is a martingale.
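The stock-numéraire change of measure gives the first Black-Scholes term its probabilistic meaning: \( S_t N(d_1) = e^{-r(T-t)} \mathbb{E}^{\mathbb{Q}}[S_T \mathbf{1}_{\{S_T > K\}}] \), i.e. \( N(d_1) = \mathbb{Q}^S(S_T > K) \). A quick numerical check of this identity (illustrative parameters; stdlib only):

```python
import math
import random

def _norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def stock_numeraire_check(S0=100.0, K=100.0, T=1.0, r=0.05, sigma=0.2,
                          n_paths=200000, seed=3):
    """Compare e^{-rT} E^Q[S_T 1{S_T > K}] (Monte Carlo under Q) with the
    closed-form value S0 * N(d1), which equals S0 * Q^S(S_T > K)."""
    rng = random.Random(seed)
    disc = math.exp(-r * T)
    acc = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        sT = S0 * math.exp((r - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * z)
        if sT > K:
            acc += sT
    mc = disc * acc / n_paths
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    return mc, S0 * _norm_cdf(d1)
```

The Monte Carlo estimate and \( S_0 N(d_1) \) (about 63.7 here) should agree up to sampling noise.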
Key Notes and Pitfalls:
- No Arbitrage vs. Risk-Neutral Pricing: Risk-neutral pricing relies on the absence of arbitrage, but it does not assume investors are actually risk-neutral. The risk-neutral measure is a mathematical tool to simplify pricing by removing the need to estimate risk premia.
- Market Completeness: In incomplete markets (e.g., with stochastic volatility or jumps), there are infinitely many equivalent martingale measures. Additional criteria (e.g., utility maximization) are needed to select a unique pricing measure.
- Numéraire Invariance: The choice of numéraire does not affect the price of a claim, but it can simplify calculations. For example, using the stock price as numéraire eliminates the drift in the stock process under the forward measure.
- Girsanov's Theorem: The market price of risk \( \theta_t \) must satisfy the Novikov condition \( \mathbb{E}^\mathbb{P} \left[ \exp \left( \frac{1}{2} \int_0^T \theta_s^2 ds \right) \right] < \infty \) for the change of measure to be valid.
- Dividends: If the stock pays dividends at rate \( q \), the risk-neutral dynamics become \( dS_t = (r - q) S_t dt + \sigma S_t dW_t^\mathbb{Q} \). The risk-neutral pricing formula remains the same, but the drift is adjusted.
- Common Mistake: Confusing the real-world measure ℙ (used for forecasting or statistical estimation) with the risk-neutral measure ℚ (used for pricing). For example, the expected return of a stock under ℚ is the risk-free rate \( r \), not its real-world expected return \( \mu \).
Practical Applications:
- Derivatives Pricing: Risk-neutral pricing is the foundation for pricing options, swaps, and other derivatives in the Black-Scholes, Heston, and other models.
- Interest Rate Models: The Heath-Jarrow-Morton (HJM) framework uses the risk-neutral measure to model the evolution of the entire yield curve.
- Credit Risk: Reduced-form credit models (e.g., Jarrow-Turnbull) use risk-neutral measures to price credit derivatives like CDS.
- Foreign Exchange: The Garman-Kohlhagen model extends Black-Scholes to FX options by using the risk-neutral measure for the domestic and foreign risk-free rates.
- Energy Markets: Risk-neutral pricing is used for commodities derivatives, where the convenience yield plays a role analogous to dividends.
- Algorithmic Trading: Risk-neutral measures are used to derive no-arbitrage bounds for pricing and hedging strategies.
Interview Questions (Common):
- Explain the difference between the real-world measure ℙ and the risk-neutral measure ℚ. Why do we use ℚ for pricing?
- Derive the Black-Scholes formula using risk-neutral pricing. What assumptions are made?
- What is the market price of risk? How does it relate to the change of measure via Girsanov's theorem?
- Explain the Fundamental Theorem of Asset Pricing. What does it imply about arbitrage and completeness?
- How does the choice of numéraire affect the risk-neutral measure? Give an example using the forward measure.
- A stock pays dividends at rate \( q \). How does this affect its risk-neutral dynamics?
- What is the risk-neutral density, and how can it be extracted from option prices?
- Explain why the discounted stock price is a martingale under the risk-neutral measure. What happens if it is not?
- How would you price a derivative in an incomplete market? What additional information is needed?
- What is the Radon-Nikodym derivative, and how is it used in the change of measure?
Topic 24: Derivation of the Black-Scholes Formula via SDEs
Black-Scholes Model: A mathematical model for pricing an options contract, particularly European options. It assumes that the price of the underlying asset follows a geometric Brownian motion with constant drift and volatility.
Geometric Brownian Motion (GBM): A continuous-time stochastic process in which the logarithm of the randomly varying quantity follows a Brownian motion (Wiener process). The SDE for GBM is given by:
\[ dS_t = \mu S_t dt + \sigma S_t dW_t \] where:
- \(S_t\) is the price of the underlying asset at time \(t\),
- \(\mu\) is the drift (expected return),
- \(\sigma\) is the volatility,
- \(dW_t\) is the increment of a Wiener process (standard Brownian motion).
Ito's Lemma: A fundamental result in stochastic calculus that provides a way to compute the differential of a function of a stochastic process. For a function \(f(t, S_t)\), Ito's Lemma states:
\[ df(t, S_t) = \left( \frac{\partial f}{\partial t} + \mu S_t \frac{\partial f}{\partial S} + \frac{1}{2} \sigma^2 S_t^2 \frac{\partial^2 f}{\partial S^2} \right) dt + \sigma S_t \frac{\partial f}{\partial S} dW_t \]
Black-Scholes PDE: The partial differential equation that the price \(V(t, S_t)\) of a derivative must satisfy under the Black-Scholes model:
\[ \frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + r S \frac{\partial V}{\partial S} - r V = 0 \] where \(r\) is the risk-free interest rate.
Black-Scholes Formula for a European Call Option:
\[ C(S_t, t) = S_t N(d_1) - K e^{-r(T-t)} N(d_2) \] where: \[ d_1 = \frac{\ln(S_t / K) + (r + \sigma^2 / 2)(T - t)}{\sigma \sqrt{T - t}} \] \[ d_2 = d_1 - \sigma \sqrt{T - t} \]
- \(C(S_t, t)\) is the price of the call option,
- \(S_t\) is the current price of the underlying asset,
- \(K\) is the strike price,
- \(T\) is the maturity time,
- \(r\) is the risk-free interest rate,
- \(\sigma\) is the volatility of the underlying asset,
- \(N(\cdot)\) is the cumulative distribution function of the standard normal distribution.
Black-Scholes Formula for a European Put Option:
\[ P(S_t, t) = K e^{-r(T-t)} N(-d_2) - S_t N(-d_1) \] where \(d_1\) and \(d_2\) are defined as above.
Derivation of the Black-Scholes Formula
The derivation of the Black-Scholes formula via SDEs involves the following steps:
- Model the Underlying Asset Price:
Assume the price of the underlying asset \(S_t\) follows a geometric Brownian motion:
\[ dS_t = \mu S_t dt + \sigma S_t dW_t \]
- Construct a Riskless Portfolio:
Consider a portfolio consisting of one option and \(-\Delta\) shares of the underlying asset. The value of this portfolio is:
\[ \Pi_t = V(S_t, t) - \Delta S_t \]
Using Ito's Lemma, the change in the portfolio value is:
\[ d\Pi_t = dV - \Delta dS_t = \left( \frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S_t^2 \frac{\partial^2 V}{\partial S^2} \right) dt + \left( \frac{\partial V}{\partial S} - \Delta \right) dS_t \]
To make the portfolio riskless, choose \(\Delta = \frac{\partial V}{\partial S}\), eliminating the stochastic term \(dW_t\). The portfolio then evolves as:
\[ d\Pi_t = \left( \frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S_t^2 \frac{\partial^2 V}{\partial S^2} \right) dt \]
- Apply No-Arbitrage Principle:
Since the portfolio is riskless, it must earn the risk-free rate \(r\):
\[ d\Pi_t = r \Pi_t dt \]
Substituting \(\Pi_t\) and \(d\Pi_t\) gives the Black-Scholes PDE:
\[ \frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + r S \frac{\partial V}{\partial S} - r V = 0 \]
- Solve the Black-Scholes PDE:
The Black-Scholes PDE can be solved using a change of variables and recognizing it as the heat equation. The solution for a European call option is:
\[ C(S_t, t) = S_t N(d_1) - K e^{-r(T-t)} N(d_2) \] where \(d_1\) and \(d_2\) are as defined above. The put option price can be derived using put-call parity.
Example: Pricing a European Call Option
Consider a European call option with the following parameters:
- Current stock price \(S_0 = 100\),
- Strike price \(K = 100\),
- Time to maturity \(T = 1\) year,
- Risk-free rate \(r = 0.05\),
- Volatility \(\sigma = 0.2\).
Compute \(d_1\) and \(d_2\):
\[ d_1 = \frac{\ln(100 / 100) + (0.05 + 0.2^2 / 2) \cdot 1}{0.2 \sqrt{1}} = \frac{0 + 0.07}{0.2} = 0.35 \] \[ d_2 = 0.35 - 0.2 \sqrt{1} = 0.15 \]
Using standard normal distribution tables or a calculator, find \(N(d_1)\) and \(N(d_2)\):
\[ N(0.35) \approx 0.6368, \quad N(0.15) \approx 0.5596 \]
The call option price is:
\[ C = 100 \cdot 0.6368 - 100 e^{-0.05 \cdot 1} \cdot 0.5596 \approx 63.68 - 53.23 = 10.45 \]
Important Notes and Common Pitfalls
- Assumptions of the Black-Scholes Model:
The Black-Scholes model relies on several key assumptions, including:
- Markets are efficient (no arbitrage opportunities).
- The underlying asset price follows a geometric Brownian motion with constant drift and volatility.
- No dividends are paid during the life of the option.
- There are no transaction costs or taxes.
- The risk-free interest rate and volatility are constant and known.
- Trading is continuous, and the asset is infinitely divisible.
Violations of these assumptions can lead to significant pricing errors.
- Volatility Smile/Skew:
Empirical observations show that implied volatility (the volatility parameter that makes the Black-Scholes price match the market price) varies with strike price and maturity, leading to the "volatility smile" or "volatility skew." This suggests that the Black-Scholes model may not fully capture the dynamics of real markets.
- Dividends:
The Black-Scholes formula can be adjusted to account for dividends. For continuous dividends at rate \(q\), the formulas for \(d_1\) and \(d_2\) become:
\[ d_1 = \frac{\ln(S_t / K) + (r - q + \sigma^2 / 2)(T - t)}{\sigma \sqrt{T - t}} \] \[ d_2 = d_1 - \sigma \sqrt{T - t} \]
The call option price is then:
\[ C(S_t, t) = S_t e^{-q(T-t)} N(d_1) - K e^{-r(T-t)} N(d_2) \]
- Numerical Instability:
When computing \(d_1\) and \(d_2\), numerical instability can occur if \(S_t\) is very close to zero or if \(T - t\) is very small. Care should be taken to handle these edge cases, often by using approximations or limits.
- Risk-Neutral Valuation:
The Black-Scholes formula is derived under the risk-neutral measure, where the expected return of the underlying asset is the risk-free rate. This is a fundamental concept in derivative pricing, ensuring that the price does not depend on the investor's risk preferences.
Practical Applications
- Option Pricing:
The primary application of the Black-Scholes formula is to price European call and put options. It provides a closed-form solution that is widely used in practice, despite its simplifying assumptions.
- Implied Volatility:
Given the market price of an option, the Black-Scholes formula can be inverted to solve for the implied volatility. This is a key input for trading and risk management, as it reflects the market's view of future volatility.
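Inverting the formula for \(\sigma\) has no closed form, but the call price is strictly increasing in volatility (vega is positive), so a bracketing root-finder suffices. A minimal bisection sketch (stdlib only; function names are illustrative):

```python
import math

def _norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * _norm_cdf(d1) - K * math.exp(-r * T) * _norm_cdf(d2)

def implied_vol(price, S, K, T, r, lo=1e-6, hi=5.0, tol=1e-8):
    """Invert the Black-Scholes formula for sigma by bisection; valid because
    the call price is strictly increasing in volatility."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

Pricing with a known \(\sigma\) and then inverting should recover that \(\sigma\); in production, Newton's method with vega is faster but needs safeguarding.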
- Hedging:
The Black-Scholes model provides a framework for dynamic hedging. The delta (\(\Delta = \frac{\partial V}{\partial S}\)) gives the number of shares of the underlying asset needed to hedge the option, allowing traders to manage risk.
- Exotic Options:
While the Black-Scholes formula is derived for European options, the underlying PDE and concepts are foundational for pricing more complex derivatives, such as American options, barrier options, and Asian options, often using numerical methods.
- Risk Management:
The Greeks (delta, gamma, vega, theta, rho) derived from the Black-Scholes model are essential tools for managing the risk of options portfolios. They quantify the sensitivity of the option price to various factors.
Topic 25: Local Volatility Models (Dupire's Equation)
Local Volatility Model: A deterministic function \(\sigma_{LV}(S_t, t)\) that describes the volatility of an asset's price \(S_t\) as a function of both the current asset price and time. Unlike constant volatility models (e.g., Black-Scholes), local volatility models allow for a more accurate fit to the observed market prices of vanilla options.
Dupire’s Equation: A partial differential equation (PDE) derived by Bruno Dupire in 1994 that provides a way to compute the local volatility surface \(\sigma_{LV}(K, T)\) from the market prices of European options. The equation ensures that the local volatility model reproduces the market prices of all European options for all strikes \(K\) and maturities \(T\).
Dupire’s forward PDE, written in terms of the call price surface \(C(K, T)\) (whose second strike derivative encodes the risk-neutral density of \(S_T\)), is given by:
\[ \frac{\partial C}{\partial T} = \frac{1}{2} \sigma_{LV}^2(K, T) K^2 \frac{\partial^2 C}{\partial K^2} - (r - q) K \frac{\partial C}{\partial K} - q C, \] where:
- \(C(K, T)\) is the price of a European call option with strike \(K\) and maturity \(T\),
- \(r\) is the risk-free interest rate,
- \(q\) is the dividend yield,
- \(\sigma_{LV}(K, T)\) is the local volatility function.
Solving Dupire’s equation for \(\sigma_{LV}(K, T)\) yields Dupire’s formula for local volatility:
\[ \sigma_{LV}^2(K, T) = \frac{\frac{\partial C}{\partial T} + (r - q) K \frac{\partial C}{\partial K} + q C}{\frac{1}{2} K^2 \frac{\partial^2 C}{\partial K^2}}. \]
Derivation of Dupire’s Equation
The standard derivation works with the risk-neutral transition density of \(S_T\) rather than with the backward Black-Scholes PDE. Under the local volatility dynamics
\[ dS_t = (r - q) S_t \, dt + \sigma_{LV}(S_t, t) S_t \, dW_t, \]
the risk-neutral density \(f(K, T)\) of \(S_T\) satisfies the forward Kolmogorov (Fokker-Planck) equation:
\[ \frac{\partial f}{\partial T} = \frac{1}{2} \frac{\partial^2}{\partial K^2} \left( \sigma_{LV}^2(K, T) K^2 f \right) - (r - q) \frac{\partial}{\partial K} \left( K f \right). \]
By the Breeden-Litzenberger relation, \(f(K, T) = e^{rT} \frac{\partial^2 C}{\partial K^2}\). Substituting this into the forward equation and integrating twice in \(K\) (using the decay of \(C\) and its derivatives as \(K \to \infty\)) gives:
\[ \frac{\partial C}{\partial T} = \frac{1}{2} \sigma_{LV}^2(K, T) K^2 \frac{\partial^2 C}{\partial K^2} - (r - q) K \frac{\partial C}{\partial K} - q C. \]
This is Dupire’s forward PDE. To solve for \(\sigma_{LV}(K, T)\), we rearrange the equation:
\[ \sigma_{LV}^2(K, T) = \frac{\frac{\partial C}{\partial T} + (r - q) K \frac{\partial C}{\partial K} + q C}{\frac{1}{2} K^2 \frac{\partial^2 C}{\partial K^2}}. \]
This formula allows us to compute the local volatility surface from the market prices of European options.
Example: Computing Local Volatility from Market Data
Suppose we observe the following market prices for European call options on an asset with \(S_0 = 100\), \(r = 0.05\), and \(q = 0.02\):
| Strike \(K\) | Maturity \(T\) (years) | Call Price \(C(K, T)\) |
|---|---|---|
| 90 | 1.0 | 15.0 |
| 100 | 1.0 | 10.0 |
| 110 | 1.0 | 6.0 |
| 100 | 0.5 | 8.0 |
To compute \(\sigma_{LV}(100, 1.0)\), we approximate the partial derivatives numerically:
- \(\frac{\partial C}{\partial T} \approx \frac{C(100, 1.0) - C(100, 0.5)}{1.0 - 0.5} = \frac{10.0 - 8.0}{0.5} = 4.0\),
- \(\frac{\partial C}{\partial K} \approx \frac{C(110, 1.0) - C(90, 1.0)}{110 - 90} = \frac{6.0 - 15.0}{20} = -0.45\),
- \(\frac{\partial^2 C}{\partial K^2} \approx \frac{C(110, 1.0) - 2\,C(100, 1.0) + C(90, 1.0)}{(10)^2} = \frac{6.0 - 20.0 + 15.0}{100} = 0.01\).
Plugging these into Dupire’s formula:
\[ \sigma_{LV}^2(100, 1.0) = \frac{4.0 + (0.05 - 0.02) \cdot 100 \cdot (-0.45) + 0.02 \cdot 10.0}{\frac{1}{2} \cdot 100^2 \cdot 0.01} = \frac{4.0 - 1.35 + 0.2}{50} = \frac{2.85}{50} = 0.057. \]
Thus, \(\sigma_{LV}(100, 1.0) \approx \sqrt{0.057} \approx 0.239\), or 23.9%.
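The finite-difference computation above can be reproduced in a few lines of Python (using the illustrative table values; a real surface needs many more quotes plus smoothing):

```python
import math

# Illustrative market data from the table above
r, q = 0.05, 0.02
C_90_1, C_100_1, C_110_1 = 15.0, 10.0, 6.0   # calls at T = 1.0
C_100_05 = 8.0                                # call at T = 0.5
K, dK, dT = 100.0, 10.0, 0.5

# Finite-difference approximations of the partial derivatives
dC_dT = (C_100_1 - C_100_05) / dT                     # forward difference in maturity
dC_dK = (C_110_1 - C_90_1) / (2 * dK)                 # central difference in strike
d2C_dK2 = (C_110_1 - 2 * C_100_1 + C_90_1) / dK**2    # second central difference

# Dupire's formula
local_var = (dC_dT + (r - q) * K * dC_dK + q * C_100_1) / (0.5 * K**2 * d2C_dK2)
local_vol = math.sqrt(local_var)
print(local_vol)  # about 0.239, i.e. 23.9%
```

In practice the derivatives would be taken on an interpolated, arbitrage-free price surface rather than on four raw quotes.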
Dupire’s Local Volatility in Terms of Implied Volatility: In practice, market option prices are often quoted in terms of implied volatility \(\sigma_{imp}(K, T)\). Dupire’s formula can be rewritten in terms of \(\sigma_{imp}(K, T)\) and its derivatives:
\[ \sigma_{LV}^2(K, T) = \frac{\sigma_{imp}^2 + 2 \sigma_{imp} T \left( \frac{\partial \sigma_{imp}}{\partial T} + (r - q) K \frac{\partial \sigma_{imp}}{\partial K} \right)}{\left(1 + K d_1 \sqrt{T} \frac{\partial \sigma_{imp}}{\partial K}\right)^2 + K^2 T \sigma_{imp} \left( \frac{\partial^2 \sigma_{imp}}{\partial K^2} - d_1 \sqrt{T} \left( \frac{\partial \sigma_{imp}}{\partial K} \right)^2 \right)}, \quad d_1 = \frac{\ln(S_0/K) + \left(r - q + \frac{1}{2}\sigma_{imp}^2\right) T}{\sigma_{imp} \sqrt{T}}. \]
This form is useful when working directly with implied volatility surfaces.
Practical Applications
- Exotic Option Pricing: Local volatility models are widely used to price exotic options (e.g., barriers, Asians) because they can fit the market prices of vanilla options exactly. This ensures consistency with the market.
- Volatility Surface Calibration: Dupire’s equation provides a method to construct a local volatility surface that reproduces the observed implied volatility surface from market data.
- Risk Management: Local volatility models allow for more accurate hedging of options portfolios, as they account for the dependence of volatility on the underlying asset price and time.
- Model Validation: By comparing the local volatility surface derived from Dupire’s equation with other volatility models (e.g., stochastic volatility models), practitioners can assess the validity of their assumptions.
Common Pitfalls and Important Notes
- Numerical Instability: Dupire’s formula involves second derivatives of option prices with respect to strike, which can be noisy and lead to numerical instability. Smoothing techniques (e.g., splines, Tikhonov regularization) are often required to stabilize the computation.
- Arbitrage-Free Inputs: The market option prices used as inputs to Dupire’s equation must be arbitrage-free. This means ensuring no calendar spread arbitrage (monotonicity in time) and no butterfly arbitrage (convexity in strike).
- Extrapolation Issues: Dupire’s equation relies on the availability of option prices for a continuum of strikes and maturities. In practice, market data is discrete and sparse, requiring interpolation and extrapolation, which can introduce errors.
- Time-Dependent Parameters: Dupire’s formula assumes that the risk-free rate \(r\) and dividend yield \(q\) are constant. If these parameters are time-dependent, the formula must be adjusted accordingly.
- Forward vs. Spot Volatility: Dupire’s local volatility is a function of strike \(K\) and maturity \(T\), not the spot price \(S_t\). This can lead to confusion when interpreting the local volatility surface. The forward local volatility \(\sigma_{LV}(F_{t,T}, T)\) (where \(F_{t,T}\) is the forward price) is often more intuitive.
- Comparison with Stochastic Volatility Models: Local volatility models assume that volatility is a deterministic function of \(S_t\) and \(t\). This can lead to unrealistic dynamics (e.g., volatility smile flattening as time to maturity increases). Stochastic volatility models (e.g., Heston) address this by introducing randomness in volatility.
Quant Interview: Most Asked Questions on Local Volatility Models
- Explain the difference between local volatility and implied volatility.
Answer: Implied volatility \(\sigma_{imp}(K, T)\) is the constant volatility parameter in the Black-Scholes model that reproduces the market price of a European option with strike \(K\) and maturity \(T\). It is derived from market prices and is a function of \(K\) and \(T\). Local volatility \(\sigma_{LV}(S_t, t)\) is a deterministic function that describes how the instantaneous volatility of the asset price \(S_t\) depends on \(S_t\) and \(t\). While implied volatility is a "summary" of the market's view of future volatility, local volatility is a model that attempts to explain the dynamics of the underlying asset to fit the implied volatility surface.
- Derive Dupire’s formula for local volatility.
Answer: See the derivation section above. Start with the Black-Scholes PDE under local volatility, change variables to strike \(K\) and maturity \(T\), and solve for \(\sigma_{LV}(K, T)\).
- Why is Dupire’s local volatility model arbitrage-free?
Answer: Dupire’s model is arbitrage-free by construction because it is derived from the market prices of European options, which are assumed to be arbitrage-free. The local volatility surface \(\sigma_{LV}(K, T)\) is chosen such that the model reproduces the market prices of all European options for all strikes and maturities. However, this assumes that the input market prices are themselves arbitrage-free (e.g., no calendar spread or butterfly arbitrage).
- What are the limitations of local volatility models?
Answer:
- Deterministic Volatility: Local volatility models assume that volatility is a deterministic function of \(S_t\) and \(t\). This fails to capture the randomness in volatility observed in markets (e.g., volatility clustering).
- Volatility Smile Dynamics: Local volatility models predict that the implied volatility smile flattens as time to maturity increases, which is often not observed in markets. Stochastic volatility models better capture the dynamics of the smile.
- Numerical Challenges: Computing local volatility from Dupire’s formula requires numerical differentiation of option prices, which can be unstable and sensitive to noise in the data.
- Extrapolation: Local volatility models may perform poorly for strikes or maturities where market data is sparse or unavailable.
- How would you compute the local volatility surface from market data?
Answer:
- Collect market prices of European options for a range of strikes \(K\) and maturities \(T\).
- Ensure the data is arbitrage-free (e.g., no calendar spread or butterfly arbitrage).
- Interpolate the option prices (or implied volatilities) to create a smooth surface \(C(K, T)\) or \(\sigma_{imp}(K, T)\).
- Compute the partial derivatives \(\frac{\partial C}{\partial T}\), \(\frac{\partial C}{\partial K}\), and \(\frac{\partial^2 C}{\partial K^2}\) numerically (e.g., using finite differences or splines).
- Apply Dupire’s formula to compute \(\sigma_{LV}(K, T)\).
- Smooth the resulting local volatility surface to avoid numerical instabilities.
- Explain the concept of "sticky delta" and "sticky strike" in the context of local volatility models.
Answer:
- Sticky Strike: In a sticky strike model, the implied volatility for a given strike \(K\) remains constant as the underlying asset price \(S_t\) changes. This is the behavior predicted by local volatility models, where \(\sigma_{imp}(K, T)\) is a function of \(K\) and \(T\) only. However, this can lead to unrealistic dynamics (e.g., the volatility smile moving with the underlying).
- Sticky Delta: In a sticky delta model, the implied volatility for a given moneyness (e.g., delta) remains constant as \(S_t\) changes. This is more consistent with market observations, where the volatility smile tends to "stick" to the moneyness of the option rather than the strike. Stochastic volatility models (e.g., Heston) often exhibit sticky delta behavior.
Topic 26: Stochastic Volatility Models (Heston Model)
Stochastic Volatility Models: A class of financial models where the volatility of an asset's price is not constant but follows a stochastic (random) process. These models address the limitations of constant volatility models (e.g., Black-Scholes) by capturing phenomena such as volatility clustering and the leverage effect.
Heston Model (1993): A widely used stochastic volatility model where the asset price \( S_t \) and its variance \( v_t \) (volatility squared) follow correlated stochastic processes. The model is named after Steven Heston, who introduced it in his 1993 paper.
The Heston model is defined by the following system of stochastic differential equations (SDEs):
\[ \begin{aligned} dS_t &= \mu S_t dt + \sqrt{v_t} S_t dW_t^1, \\ dv_t &= \kappa (\theta - v_t) dt + \xi \sqrt{v_t} dW_t^2, \end{aligned} \]
where:
- \( S_t \): Price of the underlying asset at time \( t \),
- \( v_t \): Variance of the asset's returns at time \( t \),
- \( \mu \): Drift (expected return) of the asset,
- \( \kappa \): Rate of mean reversion of the variance process,
- \( \theta \): Long-term mean of the variance,
- \( \xi \): Volatility of volatility (vol of vol),
- \( dW_t^1 \) and \( dW_t^2 \): Wiener processes with correlation \( \rho \), i.e., \( dW_t^1 dW_t^2 = \rho dt \).
Key Features of the Heston Model:
- Mean Reversion: The variance \( v_t \) reverts to its long-term mean \( \theta \) at rate \( \kappa \).
- Stochastic Volatility: The volatility \( \sqrt{v_t} \) is not constant but evolves randomly.
- Correlation: The Wiener processes driving the asset price and variance are correlated, allowing the model to capture the leverage effect (negative correlation between asset returns and volatility).
- Closed-Form Solution: The Heston model admits a semi-closed-form solution for European option prices, making it computationally efficient.
Important Formulas
Characteristic Function of the Heston Model:
The characteristic function \( \phi(u, t) = \mathbb{E}[e^{iu \ln S_t} | S_0, v_0] \) is given by:
\[ \phi(u, t) = \exp \left( C(u, t) + D(u, t) v_0 + iu \ln S_0 \right), \]
where:
\[ \begin{aligned} C(u, t) &= iu r t + \kappa \theta \left( r_- t - \frac{2}{\xi^2} \ln \left( \frac{1 - g e^{-dt}}{1 - g} \right) \right), \\ D(u, t) &= r_- \frac{1 - e^{-dt}}{1 - g e^{-dt}}, \\ r_\pm &= \frac{\beta \pm d}{\xi^2}, \quad g = \frac{r_-}{r_+}, \quad d = \sqrt{\beta^2 - \alpha \xi^2}, \\ \alpha &= -u^2 - iu, \quad \beta = \kappa - \rho \xi iu. \end{aligned} \]
Here \(r\) is the risk-free rate; the drift term \(iurt\) appears because the expectation is taken under the risk-neutral measure. The characteristic function is used in the Lewis-Lipton formula or Fourier inversion methods to price European options.
European Call Option Price (Heston Formula):
The price \( C(S_0, v_0, K, T) \) of a European call option with strike \( K \) and maturity \( T \) is:
\[ C(S_0, v_0, K, T) = S_0 P_1 - K e^{-rT} P_2, \]
where \( P_1 \) and \( P_2 \) are exercise probabilities computed from the characteristic function:
\[ \begin{aligned} P_j &= \frac{1}{2} + \frac{1}{\pi} \int_0^\infty \text{Re} \left( \frac{e^{-iu \ln K} \phi_j(u, T)}{iu} \right) du, \quad j = 1, 2, \\ \phi_1(u, T) &= \frac{\phi(u - i, T)}{\phi(-i, T)}, \quad \phi_2(u, T) = \phi(u, T). \end{aligned} \]
Here, \( \phi(u, T) \) is the characteristic function of \( \ln S_T \) under the risk-neutral measure; the normalization \( \phi(-i, T) = \mathbb{E}[S_T] = S_0 e^{rT} \) makes \( \phi_1 \) a proper characteristic function (it corresponds to changing to the stock numéraire).
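The two-step recipe — evaluate the characteristic function, then recover \(P_1\) and \(P_2\) by Fourier inversion — can be sketched in Python. This is a minimal illustration, not production code: the integral truncation at \(u = 100\), the trapezoidal rule, and the parameter values are arbitrary choices; the risk-neutral drift term \(iurT\) is included in \(C\); and \(\phi(u - i)\) is divided by \(\phi(-i) = S_0 e^{rT}\) so that \(P_1\) is a genuine probability:

```python
import numpy as np

def heston_cf(u, T, S0, v0, r, kappa, theta, xi, rho):
    """Risk-neutral characteristic function E[exp(iu ln S_T)] for the Heston model."""
    beta = kappa - 1j * rho * xi * u
    d = np.sqrt(beta**2 + xi**2 * (u**2 + 1j * u))   # d = sqrt(beta^2 - alpha xi^2)
    r_minus = (beta - d) / xi**2
    g = r_minus / ((beta + d) / xi**2)               # g = r_- / r_+
    e = np.exp(-d * T)
    D = r_minus * (1 - e) / (1 - g * e)
    C = 1j * u * r * T + kappa * theta * (r_minus * T - (2 / xi**2) * np.log((1 - g * e) / (1 - g)))
    return np.exp(C + D * v0 + 1j * u * np.log(S0))

def heston_call(S0, K, T, r, v0, kappa, theta, xi, rho, u_max=100.0, n=20001):
    """European call via the two-probability formula, with trapezoidal integration."""
    u = np.linspace(1e-6, u_max, n)
    cf = lambda x: heston_cf(x, T, S0, v0, r, kappa, theta, xi, rho)
    F = cf(-1j).real                                 # = S0 * exp(r T), martingale check
    f1 = (np.exp(-1j * u * np.log(K)) * cf(u - 1j) / (1j * u * F)).real
    f2 = (np.exp(-1j * u * np.log(K)) * cf(u) / (1j * u)).real
    trap = lambda y: float(np.sum((y[1:] + y[:-1]) * (u[1] - u[0])) / 2.0)
    P1 = 0.5 + trap(f1) / np.pi
    P2 = 0.5 + trap(f2) / np.pi
    return S0 * P1 - K * np.exp(-r * T) * P2

price = heston_call(S0=100.0, K=100.0, T=1.0, r=0.05, v0=0.04,
                    kappa=2.0, theta=0.04, xi=0.3, rho=-0.7)
print(price)   # ATM one-year call; same ballpark as Black-Scholes at sigma = 0.2
```

The \(r_-\) (rather than \(r_+\)) root is used in \(D\), which keeps \(|g| < 1\) and avoids the complex-logarithm discontinuities known as the "little Heston trap."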
Feller Condition:
The variance process \( v_t \) is guaranteed to remain positive if the Feller condition is satisfied:
\[ 2 \kappa \theta > \xi^2. \]
If this condition is violated, the variance process can reach zero, and the model may require reflection or absorption at zero to remain well-defined.
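In code the condition is a one-line check (the two parameter sets below are illustrative; the first satisfies the condition, the second does not):

```python
def feller_satisfied(kappa, theta, xi):
    """True if 2*kappa*theta > xi**2, i.e. the CIR variance process stays strictly positive."""
    return 2 * kappa * theta > xi**2

print(feller_satisfied(kappa=2.0, theta=0.04, xi=0.3))   # 0.16 > 0.09 -> True
print(feller_satisfied(kappa=1.0, theta=0.02, xi=0.3))   # 0.04 > 0.09 -> False
```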
Derivations
Derivation of the Heston Characteristic Function:
The characteristic function \( \phi(u, t) = \mathbb{E}[e^{iu \ln S_t}] \) is derived by solving a partial differential equation (PDE) using the Feynman-Kac theorem. The steps are as follows:
- Define the Process:
Under the risk-neutral measure \( \mathbb{Q} \), the Heston model is:
\[ \begin{aligned} dS_t &= r S_t dt + \sqrt{v_t} S_t dW_t^1, \\ dv_t &= \kappa (\theta - v_t) dt + \xi \sqrt{v_t} dW_t^2, \quad dW_t^1 dW_t^2 = \rho dt. \end{aligned} \] - Apply Itô's Lemma:
Let \( x_t = \ln S_t \). By Itô's lemma:
\[ dx_t = \left( r - \frac{v_t}{2} \right) dt + \sqrt{v_t} dW_t^1. \] - Formulate the PDE:
The characteristic function \( \phi(u, t) = \mathbb{E}[e^{iu x_t}] \) satisfies the PDE:
\[ \frac{\partial \phi}{\partial t} + \left( r - \frac{v}{2} \right) iu \phi + \kappa (\theta - v) \frac{\partial \phi}{\partial v} - \frac{1}{2} v u^2 \phi + \frac{1}{2} \xi^2 v \frac{\partial^2 \phi}{\partial v^2} + \rho \xi v iu \frac{\partial \phi}{\partial v} = 0, \]
with terminal condition \( \phi(u, T) = e^{iu x} \). (Note the minus sign on the \( \frac{1}{2} v u^2 \phi \) term: it comes from \( \partial_x^2 e^{iux} = -u^2 e^{iux} \).)
- Ansatz Solution:
Assume a solution of the form:
\[ \phi(u, t) = \exp \left( C(u, \tau) + D(u, \tau) v + iu x \right), \quad \tau = T - t. \] - Solve ODEs for \( C \) and \( D \):
Substitute the ansatz into the PDE to obtain two ordinary differential equations (ODEs):
\[ \begin{aligned} \frac{dD}{d\tau} &= \frac{1}{2} \xi^2 D^2 + (\rho \xi iu - \kappa) D - \frac{1}{2} (u^2 + iu), \\ \frac{dC}{d\tau} &= \kappa \theta D + r iu. \end{aligned} \]
The first equation is a Riccati equation in \( D \); solving it and then integrating for \( C \) yields the expressions for \( C(u, t) \) and \( D(u, t) \) given in the characteristic function formula.
Practical Applications
1. Pricing European Options:
The Heston model is widely used to price European options, especially when the Black-Scholes model fails to fit market data (e.g., for options with different strikes and maturities). The semi-closed-form solution allows for efficient computation of option prices.
Steps:
- Calibrate the Heston model parameters \( (\kappa, \theta, \xi, \rho, v_0) \) to market option prices.
- Compute the characteristic function \( \phi(u, T) \).
- Use Fourier inversion (e.g., Lewis-Lipton formula or Carr-Madan formula) to compute the option price.
2. Volatility Surface Modeling:
The Heston model can generate a volatility surface that matches market observations, including the volatility smile and skew. This is useful for risk management and trading strategies.
3. Hedging and Risk Management:
The model's ability to capture stochastic volatility and correlation between asset returns and volatility makes it useful for dynamic hedging strategies. Traders can hedge not only the underlying asset but also volatility risk.
4. Exotic Options Pricing:
While the Heston model has a closed-form solution for European options, it can also be used to price exotic options (e.g., barriers, Asians) via Monte Carlo simulation or PDE methods.
Common Pitfalls and Important Notes
1. Feller Condition:
If the Feller condition \( 2 \kappa \theta > \xi^2 \) is not satisfied, the variance process \( v_t \) can hit zero. This may lead to numerical instability in simulations or pricing. Some implementations use reflection or absorption at zero to handle this case.
2. Correlation \( \rho \):
The correlation \( \rho \) between the asset price and variance processes is crucial for capturing the leverage effect. A negative \( \rho \) (typically observed in equity markets) implies that volatility tends to increase when the asset price decreases. Misestimating \( \rho \) can lead to poor hedging performance.
3. Parameter Estimation:
Calibrating the Heston model to market data is non-trivial due to the non-linearity of the model. Common methods include:
- Least Squares Calibration: Minimize the difference between model and market option prices.
- Characteristic Function Methods: Use the characteristic function to fit the model to the implied volatility surface.
Poor calibration can lead to mispricing and ineffective hedging.
4. Numerical Integration:
The integral in the Heston option pricing formula (for \( P_1 \) and \( P_2 \)) must be computed numerically. Care must be taken to handle the integrand's oscillatory behavior and singularities. Common techniques include:
- Simpson's rule or adaptive quadrature.
- Fast Fourier Transform (FFT) methods (e.g., Carr-Madan formula).
5. Volatility of Volatility \( \xi \):
The parameter \( \xi \) (vol of vol) controls the randomness of the volatility process. High \( \xi \) leads to more extreme volatility movements, which can affect the tails of the return distribution. This parameter is critical for pricing options with long maturities or extreme strikes.
6. Mean Reversion Speed \( \kappa \):
The parameter \( \kappa \) determines how quickly the variance reverts to its long-term mean \( \theta \). A high \( \kappa \) implies rapid mean reversion, while a low \( \kappa \) allows for prolonged deviations from \( \theta \). This affects the term structure of volatility.
7. Monte Carlo Simulation:
When using Monte Carlo simulation for the Heston model:
- Use the Euler-Maruyama scheme with care, as it can produce negative variances. Full-truncation Euler (clipping \( v_t \) at zero inside the drift and diffusion terms), the Milstein scheme, or a reflection scheme may be more appropriate.
- Antithetic variates or control variates can improve convergence.
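The cautions above can be illustrated with a minimal full-truncation Euler sketch, which clips the variance at zero inside the drift and diffusion coefficients (parameter values are illustrative; as a sanity check, the discounted mean of \(S_T\) should be close to \(S_0\), since the scheme preserves the martingale property in expectation):

```python
import numpy as np

def heston_terminal_prices(S0, v0, r, kappa, theta, xi, rho, T, n_steps, n_paths, seed=0):
    """Simulate S_T under Heston with a full-truncation Euler scheme."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.full(n_paths, np.log(S0))          # log-price paths
    v = np.full(n_paths, v0)                  # variance paths (may dip below 0; clipped below)
    for _ in range(n_steps):
        z1 = rng.standard_normal(n_paths)
        z2 = rho * z1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n_paths)  # correlated shocks
        vp = np.maximum(v, 0.0)               # full truncation: clip only inside the coefficients
        x += (r - 0.5 * vp) * dt + np.sqrt(vp * dt) * z1
        v += kappa * (theta - vp) * dt + xi * np.sqrt(vp * dt) * z2
    return np.exp(x)

ST = heston_terminal_prices(100.0, 0.04, 0.05, 2.0, 0.04, 0.3, -0.7,
                            T=1.0, n_steps=200, n_paths=50_000)
print(np.exp(-0.05) * ST.mean())   # discounted mean should be close to S0 = 100
```

Simulating the log-price \(x_t = \ln S_t\) rather than \(S_t\) itself keeps the asset price positive by construction.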
Topic 27: Heston Model SDE and Its Characteristic Function
Heston Model: A stochastic volatility model where both the asset price \( S_t \) and its variance \( v_t \) follow stochastic processes. The model captures the volatility smile observed in markets by allowing the variance to be mean-reverting and correlated with the asset price.
Stochastic Differential Equations (SDEs) in the Heston Model:
- The asset price \( S_t \) follows a geometric Brownian motion with stochastic variance \( \sqrt{v_t} \).
- The variance \( v_t \) follows a Cox-Ingersoll-Ross (CIR) process, which is mean-reverting and ensures non-negativity.
The Heston model is defined by the following system of SDEs:
\[ \begin{aligned} dS_t &= \mu S_t dt + \sqrt{v_t} S_t dW_t^S, \\ dv_t &= \kappa (\theta - v_t) dt + \xi \sqrt{v_t} dW_t^v, \end{aligned} \]
where:
- \( S_t \): Asset price at time \( t \),
- \( v_t \): Instantaneous variance at time \( t \),
- \( \mu \): Drift (expected return) of the asset,
- \( \kappa \): Mean reversion speed of the variance,
- \( \theta \): Long-term mean of the variance,
- \( \xi \): Volatility of volatility (vol-of-vol),
- \( dW_t^S \) and \( dW_t^v \): Wiener processes with correlation \( \rho \), i.e., \( dW_t^S dW_t^v = \rho dt \).
Characteristic Function: A complex-valued function that uniquely defines the probability distribution of a random variable. For the Heston model, the characteristic function of the log-asset price \( \ln(S_t) \) is derived to price options using Fourier transform methods.
The characteristic function \( \phi(u; t) \) of \( \ln(S_t) \) under the Heston model is given by:
\[ \phi(u; t) = \mathbb{E}\left[ e^{iu \ln(S_t)} \mid \mathcal{F}_0 \right] = e^{C(u, t) + D(u, t) v_0 + iu \ln(S_0)}, \]
where \( C(u, t) \) and \( D(u, t) \) satisfy the following system of ordinary differential equations (ODEs):
\[ \begin{aligned} \frac{dD}{dt} &= -\frac{1}{2} (u^2 + iu) + iu \rho \xi D + \frac{1}{2} \xi^2 D^2 - \kappa D, \\ \frac{dC}{dt} &= iu \mu + \kappa \theta D. \end{aligned} \]
The solutions to these ODEs are:
\[ \begin{aligned} D(u, t) &= \frac{a - b}{\xi^2} \cdot \frac{1 - e^{-bt}}{1 - g e^{-bt}}, \\ C(u, t) &= iu \mu t + \frac{\kappa \theta}{\xi^2} \left[ (a - b)t - 2 \ln\left( \frac{1 - g e^{-bt}}{1 - g} \right) \right], \end{aligned} \]
where:
\[ \begin{aligned} a &= \kappa - iu \rho \xi, \\ b &= \sqrt{a^2 + \xi^2 (u^2 + iu)}, \\ g &= \frac{a - b}{a + b}. \end{aligned} \]
(The constant term in the \( D \) equation is \( -\frac{1}{2}(u^2 + iu) \), consistent with \( b = \sqrt{a^2 + \xi^2 (u^2 + iu)} \).)
Derivation of the Heston Characteristic Function (Sketch):
- Apply Itô's Lemma: Start with the SDEs for \( S_t \) and \( v_t \). Define \( x_t = \ln(S_t) \). By Itô's Lemma: \[ dx_t = \left( \mu - \frac{1}{2} v_t \right) dt + \sqrt{v_t} dW_t^S. \]
- Formulate the PDE for the Characteristic Function: The characteristic function \( \phi(u; t) = \mathbb{E}[e^{iu x_t}] \) satisfies the Kolmogorov backward equation: \[ \frac{\partial \phi}{\partial t} = \left( \mu - \frac{1}{2} v \right) iu \phi + \kappa (\theta - v) \frac{\partial \phi}{\partial v} - \frac{1}{2} v u^2 \phi + \frac{1}{2} \xi^2 v \frac{\partial^2 \phi}{\partial v^2} + \rho \xi v iu \frac{\partial \phi}{\partial v}. \]
- Ansatz for the Solution: Assume a solution of the form: \[ \phi(u; t) = e^{C(u, t) + D(u, t) v + iu x}. \] Substitute this into the PDE to derive the ODEs for \( C(u, t) \) and \( D(u, t) \).
- Solve the ODEs: The ODE for \( D(u, t) \) is a Riccati equation. Solve it using standard techniques for Riccati equations, and then solve for \( C(u, t) \) by integration.
Option Pricing with the Heston Model: The price of a European call option with strike \( K \) and maturity \( T \) can be computed using the characteristic function via the Lewis-Lipton formula:
\[ C(S_0, K, T) = S_0 - \frac{\sqrt{K}}{\pi} e^{-rT} \int_0^\infty \text{Re}\left[ \frac{e^{-iu \ln(K)} \phi(u - i/2; T)}{u^2 + 1/4} \right] du, \]
where \( \phi(u; T) \) is the characteristic function of \( \ln(S_T) \) under the risk-neutral measure (drift \( r \) in place of \( \mu \)), and \( r \) is the risk-free rate.
Example: Computing the Characteristic Function
Given the Heston model parameters:
- \( S_0 = 100 \), \( v_0 = 0.04 \), \( \mu = 0.05 \), \( \kappa = 2 \), \( \theta = 0.04 \), \( \xi = 0.3 \), \( \rho = -0.7 \), \( t = 1 \).
Compute \( \phi(u; t) \) for \( u = 1 \):
- Compute \( a = \kappa - iu \rho \xi = 2 - i \cdot 1 \cdot (-0.7) \cdot 0.3 = 2 + 0.21i \).
- Compute \( b = \sqrt{a^2 + \xi^2 (u^2 + iu)} = \sqrt{(2 + 0.21i)^2 + 0.09 (1 + i)} \).
- Compute \( g = \frac{a - b}{a + b} \).
- Compute \( D(u, t) \) and \( C(u, t) \) using the formulas above.
- Finally, compute \( \phi(u; t) = e^{C(u, t) + D(u, t) v_0 + iu \ln(S_0)} \).
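The five steps above can be coded directly with cmath (a sketch using the example's parameters; two built-in consistency checks are \(\phi(0; t) = 1\) and, because the drift here is \(\mu\) rather than the risk-free rate, \(\phi(-i; t) = \mathbb{E}[S_t] = S_0 e^{\mu t}\)):

```python
import cmath

def heston_cf(u, t, S0=100.0, v0=0.04, mu=0.05, kappa=2.0, theta=0.04, xi=0.3, rho=-0.7):
    """Heston characteristic function of ln(S_t) via the a, b, g closed form."""
    a = kappa - 1j * u * rho * xi
    b = cmath.sqrt(a * a + xi**2 * (u * u + 1j * u))
    g = (a - b) / (a + b)
    e = cmath.exp(-b * t)
    D = (a - b) / xi**2 * (1 - e) / (1 - g * e)
    C = 1j * u * mu * t + kappa * theta / xi**2 * ((a - b) * t - 2 * cmath.log((1 - g * e) / (1 - g)))
    return cmath.exp(C + D * v0 + 1j * u * cmath.log(S0))

print(heston_cf(1, 1))          # phi(u=1; t=1); |phi(u)| <= 1 for real u
print(abs(heston_cf(-1j, 1)))   # = S0 * exp(mu * t) ~ 105.13 (martingale-type check)
```

Note the complex square root and logarithm: branch-cut handling is exactly the numerical pitfall flagged below.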
Key Notes and Common Pitfalls:
- Feller Condition: The variance process \( v_t \) remains strictly positive if \( 2 \kappa \theta > \xi^2 \). If this condition is violated, the process can hit zero, leading to numerical instabilities.
- Correlation \( \rho \): The correlation between the asset price and variance processes is crucial for capturing the skew in implied volatility. A negative \( \rho \) is typically observed in equity markets.
- Characteristic Function Branch Cuts: The characteristic function involves complex logarithms and square roots. Care must be taken to handle branch cuts correctly, especially when implementing numerically.
- Numerical Integration: When pricing options, the integral in the Lewis-Lipton formula must be computed numerically. The integrand is oscillatory, so adaptive quadrature or rules tailored to semi-infinite domains (e.g., Gauss-Laguerre quadrature) are often used.
- Affine Structure: The Heston model is an affine model (a jump-free special case of the affine jump-diffusion class), meaning the characteristic function has an exponential-affine form. This property is key to deriving closed-form solutions for the characteristic function.
Practical Applications:
- Option Pricing: The Heston model is widely used to price European options, especially when the volatility smile is pronounced. The characteristic function allows for efficient computation using Fourier transform methods.
- Volatility Surface Calibration: The model's parameters (\( \kappa, \theta, \xi, \rho \)) can be calibrated to market data to fit the observed volatility surface. This is essential for accurate hedging and risk management.
- Exotic Options: The Heston model can be extended to price exotic options (e.g., barriers, Asians) using Monte Carlo simulation or PDE methods.
- Risk Management: The model provides insights into the dynamics of volatility risk, which is critical for managing portfolios with volatility-sensitive instruments (e.g., variance swaps).
Topic 28: Affine Processes and Their Applications in Finance
Affine Process: A stochastic process \( X = \{X_t\}_{t \geq 0} \) taking values in \( \mathbb{R}^d \) is called affine if its characteristic function has the exponential-affine form:
\[ \mathbb{E}\left[e^{i \mathbf{u}^\top X_t} \mid \mathcal{F}_s\right] = e^{\phi(t-s, \mathbf{u}) + \psi(t-s, \mathbf{u})^\top X_s}, \quad \mathbf{u} \in \mathbb{R}^d, \]
where \( \phi: \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{C} \) and \( \psi: \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{C}^d \) are deterministic functions.
Affine Diffusion: An affine process \( X \) is called an affine diffusion if it is the strong solution to a stochastic differential equation (SDE) of the form:
\[ dX_t = b(X_t) dt + \sigma(X_t) dW_t, \]
where \( W \) is a standard Brownian motion, and the drift \( b(x) \) and diffusion matrix \( \sigma(x) \sigma(x)^\top \) are affine functions of \( x \).
General Form of Affine Diffusion:
\[ dX_t = (K_0 + K_1 X_t) dt + \sqrt{H_0 + H_1 X_t} \, dW_t, \]
where:
- \( K_0 \in \mathbb{R}^d \) and \( H_0 \in \mathbb{S}^d_+ \) (the set of symmetric positive semi-definite \( d \times d \) matrices),
- \( K_1 \in \mathbb{R}^{d \times d} \) and \( H_1 \in \mathbb{R}^{d \times d \times d} \) (a tensor such that \( H_1 x \) is symmetric for all \( x \in \mathbb{R}^d \)).
Characteristic Exponent (Riccati Equations):
The functions \( \phi(\tau, \mathbf{u}) \) and \( \psi(\tau, \mathbf{u}) \) satisfy the following system of ordinary differential equations (ODEs), known as Riccati equations:
\[ \begin{aligned} \partial_\tau \phi(\tau, \mathbf{u}) &= \frac{1}{2} \psi(\tau, \mathbf{u})^\top H_0 \psi(\tau, \mathbf{u}) + K_0^\top \psi(\tau, \mathbf{u}), \quad \phi(0, \mathbf{u}) = 0, \\ \partial_\tau \psi(\tau, \mathbf{u}) &= \frac{1}{2} \psi(\tau, \mathbf{u})^\top H_1 \psi(\tau, \mathbf{u}) + K_1^\top \psi(\tau, \mathbf{u}), \quad \psi(0, \mathbf{u}) = i \mathbf{u}. \end{aligned} \]
Example: Vasicek Model (Affine Diffusion in 1D)
The Vasicek model for interest rates is given by:
\[ dr_t = \kappa (\theta - r_t) dt + \sigma dW_t, \]
where \( \kappa, \theta, \sigma > 0 \). This is an affine diffusion with:
\[ K_0 = \kappa \theta, \quad K_1 = -\kappa, \quad H_0 = \sigma^2, \quad H_1 = 0. \]
The Riccati equations become:
\[ \begin{aligned} \partial_\tau \phi(\tau, u) &= \frac{1}{2} \sigma^2 \psi(\tau, u)^2 + \kappa \theta \psi(\tau, u), \quad \phi(0, u) = 0, \\ \partial_\tau \psi(\tau, u) &= -\kappa \psi(\tau, u), \quad \psi(0, u) = i u. \end{aligned} \]
Solving these ODEs yields:
\[ \psi(\tau, u) = i u e^{-\kappa \tau}, \qquad \phi(\tau, u) = i u \theta \left(1 - e^{-\kappa \tau}\right) - \frac{\sigma^2}{4 \kappa} \left(1 - e^{-2 \kappa \tau}\right) u^2, \]
consistent with the Gaussian conditional law \( r_t \mid r_s \sim \mathcal{N}\!\left(\theta + (r_s - \theta) e^{-\kappa \tau},\; \frac{\sigma^2}{2\kappa}\left(1 - e^{-2\kappa\tau}\right)\right) \).
The characteristic function is then:
\[ \mathbb{E}\left[e^{i u r_t} \mid r_s\right] = \exp\left(\phi(t-s, u) + \psi(t-s, u) r_s\right). \]
Zero-Coupon Bond Price in Affine Term Structure Models (ATSMs):
In an ATSM, the price of a zero-coupon bond with maturity \( T \) at time \( t \) is given by:
\[ P(t, T) = \mathbb{E}\left[e^{-\int_t^T r_s \, ds} \mid \mathcal{F}_t\right] = e^{A(T-t) + B(T-t)^\top X_t}, \]
where \( A(\tau) \) and \( B(\tau) \) satisfy the following Riccati equations (with \( \tau = T - t \)):
\[ \begin{aligned} \partial_\tau A(\tau) &= \frac{1}{2} B(\tau)^\top H_0 B(\tau) + K_0^\top B(\tau), \quad A(0) = 0, \\ \partial_\tau B(\tau) &= \frac{1}{2} B(\tau)^\top H_1 B(\tau) + K_1^\top B(\tau) - \mathbf{1}, \quad B(0) = \mathbf{0}, \end{aligned} \]
and \( \mathbf{1} \) is a vector of ones (assuming \( r_t = \mathbf{1}^\top X_t \)).
Example: CIR Model (Affine Term Structure)
The Cox-Ingersoll-Ross (CIR) model for interest rates is given by:
\[ dr_t = \kappa (\theta - r_t) dt + \sigma \sqrt{r_t} dW_t, \]
where \( \kappa, \theta, \sigma > 0 \) and \( 2 \kappa \theta \geq \sigma^2 \) (Feller condition). This is an affine diffusion with:
\[ K_0 = \kappa \theta, \quad K_1 = -\kappa, \quad H_0 = 0, \quad H_1 = \sigma^2. \]
The Riccati equations for bond pricing are:
\[ \begin{aligned} \partial_\tau A(\tau) &= \kappa \theta B(\tau), \quad A(0) = 0, \\ \partial_\tau B(\tau) &= \frac{1}{2} \sigma^2 B(\tau)^2 - \kappa B(\tau) - 1, \quad B(0) = 0. \end{aligned} \]
Solving these ODEs yields:
\[ B(\tau) = -\frac{2 \left(e^{\gamma \tau} - 1\right)}{(\gamma + \kappa) \left(e^{\gamma \tau} - 1\right) + 2 \gamma}, \qquad A(\tau) = \frac{2 \kappa \theta}{\sigma^2} \log \left(\frac{2 \gamma e^{(\gamma + \kappa) \tau / 2}}{(\gamma + \kappa) \left(e^{\gamma \tau} - 1\right) + 2 \gamma}\right), \]
where \( \gamma = \sqrt{\kappa^2 + 2 \sigma^2} \). Note the minus sign in \( B(\tau) \): in the convention \( P = e^{A + B r_t} \) higher rates must lower bond prices, so \( B(\tau) \leq 0 \) (indeed \( \partial_\tau B(0) = -1 \)).
The bond price is then:
\[ P(t, T) = e^{A(T-t) + B(T-t) r_t}. \]
Affine Jump-Diffusions:
An affine process can also include jumps. The general form of an affine jump-diffusion is:
\[ dX_t = (K_0 + K_1 X_t) dt + \sqrt{H_0 + H_1 X_t} \, dW_t + dJ_t, \]
where \( J_t \) is a pure jump process with affine jump intensity \( \ell_0 + \ell_1 X_t \) and jump size distribution independent of \( X \). The characteristic function is still exponential-affine, but the Riccati equations are modified to include jump terms.
Option Pricing in Affine Models:
For an affine process \( X_t \), the price of a European call option with payoff \( (e^{X_T} - K)^+ \) at time \( t \) is given by:
\[ C(t, X_t) = \frac{1}{2 \pi} \int_{\mathbb{R}} \hat{f}(u) e^{\phi(T-t, -i u) + \psi(T-t, -i u)^\top X_t} du, \]
where \( \hat{f}(u) \) is the (generalized) Fourier transform of the payoff function \( f(x) = (e^x - K)^+ \), taken along a contour in the complex plane on which \( f \) is integrable, and \( \phi \) and \( \psi \) are solutions to the Riccati equations with \( \mathbf{u} = -i u \).
Practical Applications:
- Interest Rate Modeling: Affine processes are widely used in interest rate modeling (e.g., Vasicek, CIR, and multi-factor Gaussian models) due to their tractability and ability to fit the yield curve.
- Credit Risk: Affine jump-diffusions are used to model default intensities in reduced-form credit risk models (e.g., Duffie-Singleton framework). The affine structure allows for closed-form solutions for credit default swap (CDS) spreads and bond prices.
- Stochastic Volatility Modeling: Affine processes are used in stochastic volatility models (e.g., Heston model) to capture the dynamics of volatility. The affine structure enables semi-closed-form solutions for option prices.
- Portfolio Optimization: Affine processes are used in dynamic portfolio optimization problems, where the state variables (e.g., wealth, factor exposures) follow affine dynamics. The affine structure simplifies the Hamilton-Jacobi-Bellman (HJB) equation.
- Derivative Pricing: The affine structure allows for efficient computation of derivative prices using Fourier transform methods, which is particularly useful for path-dependent options and exotic derivatives.
Common Pitfalls and Important Notes:
- Feller Condition: For the CIR model, the Feller condition \( 2 \kappa \theta \geq \sigma^2 \) ensures that the process \( r_t \) remains strictly positive. If this condition is violated, the process stays nonnegative but can hit zero, which complicates simulation schemes and may be undesirable when zero short rates are implausible.
- Existence and Uniqueness: Not all choices of \( K_0, K_1, H_0, H_1 \) lead to a well-defined affine process. The parameters must satisfy certain admissibility conditions to ensure the existence and uniqueness of the solution to the SDE.
- Numerical Stability: Solving the Riccati equations numerically can be challenging, especially for high-dimensional affine processes. Care must be taken to ensure numerical stability, particularly for long time horizons.
- Dimensionality: While affine processes are tractable, their complexity grows rapidly with the dimension \( d \). High-dimensional affine models may require sophisticated numerical methods for calibration and simulation.
- Non-Affine Extensions: Affine processes are a subset of more general processes. In cases where the affine assumption is too restrictive, consider extensions such as quadratic term structure models or non-affine stochastic volatility models.
- Calibration: Calibrating affine models to market data can be computationally intensive. The choice of calibration method (e.g., least squares, maximum likelihood) and the handling of numerical optimization are critical for accurate results.
Quant Interview Questions on Affine Processes:
- Derive the bond price formula in the Vasicek model. Hint: Start with the SDE for \( r_t \), write down the Riccati equations for \( A(\tau) \) and \( B(\tau) \), and solve them.
- Explain the Feller condition in the CIR model and its implications. Hint: Discuss the boundary behavior of \( r_t \) and the role of the Feller condition in ensuring positivity.
- How would you calibrate an affine term structure model to market data? Hint: Discuss the choice of objective function, optimization methods, and potential challenges (e.g., overfitting, numerical instability).
- What are the advantages and disadvantages of affine processes in finance? Hint: Discuss tractability, flexibility, and limitations compared to non-affine models.
- Derive the characteristic function of an affine jump-diffusion. Hint: Extend the derivation for affine diffusions to include jump terms, and discuss how the Riccati equations are modified.
- Explain how the Heston model uses affine processes for option pricing. Hint: Discuss the joint dynamics of the asset price and volatility, and how the affine structure enables semi-closed-form solutions.
- What is the role of the Riccati equations in affine processes? Hint: Explain how they arise from the exponential-affine form of the characteristic function and their role in derivative pricing.
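As a concrete instance of the first question above, the sketch below integrates the Vasicek Riccati ODEs \( B'(\tau) = 1 - \kappa B \), \( A'(\tau) = -\kappa\theta B + \tfrac{1}{2}\sigma^2 B^2 \) numerically and checks the result against the closed-form bond price \( P(\tau) = e^{A(\tau) - B(\tau) r} \). Parameter values are illustrative, and the function names are placeholders:

```python
import math

# Vasicek: dr = kappa*(theta - r) dt + sigma dW.
# Bond price P(tau) = exp(A(tau) - B(tau)*r) with Riccati ODEs
#   B'(tau) = 1 - kappa*B(tau),                  B(0) = 0
#   A'(tau) = -kappa*theta*B + 0.5*sigma^2*B^2,  A(0) = 0
kappa, theta, sigma = 0.8, 0.04, 0.02

def riccati_numeric(tau, n_steps=20000):
    """Euler-integrate the Riccati ODEs from 0 to tau."""
    h = tau / n_steps
    A, B = 0.0, 0.0
    for _ in range(n_steps):
        A += h * (-kappa * theta * B + 0.5 * sigma**2 * B**2)
        B += h * (1.0 - kappa * B)
    return A, B

def riccati_closed_form(tau):
    """Known closed-form solution of the same ODEs."""
    B = (1.0 - math.exp(-kappa * tau)) / kappa
    A = (theta - sigma**2 / (2 * kappa**2)) * (B - tau) - sigma**2 * B**2 / (4 * kappa)
    return A, B

tau, r0 = 5.0, 0.03
A_num, B_num = riccati_numeric(tau)
A_cf, B_cf = riccati_closed_form(tau)
P_num = math.exp(A_num - B_num * r0)
P_cf = math.exp(A_cf - B_cf * r0)
print(P_num, P_cf)  # numerical and closed-form bond prices agree closely
```

The same Euler-vs-closed-form pattern is a quick sanity check for Riccati systems in any affine model where an analytic solution is available.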
Topic 29: Jump-Diffusion Processes (Merton Model)
Jump-Diffusion Process: A stochastic process that combines a continuous diffusion component (modeled by a Brownian motion) with a discontinuous jump component (modeled by a compound Poisson process). This hybrid model captures both small, continuous price movements and sudden, large jumps in asset prices.
Merton Jump-Diffusion Model (1976): A specific jump-diffusion model proposed by Robert Merton to describe asset price dynamics, particularly in equity markets. It extends the Black-Scholes framework by incorporating Poisson-driven jumps, addressing the "volatility smile" observed in option pricing.
The Merton jump-diffusion process for an asset price \( S_t \) is given by the following stochastic differential equation (SDE):
\[ \frac{dS_t}{S_t} = (\mu - \lambda \kappa) dt + \sigma dW_t + dJ_t \]
where:
- \( \mu \): Drift rate of the asset (expected return)
- \( \sigma \): Volatility of the diffusion component
- \( W_t \): Standard Brownian motion
- \( J_t \): Jump process, defined as \( J_t = \sum_{i=1}^{N_t} (Y_i - 1) \)
- \( N_t \): Poisson process with intensity \( \lambda \) (average number of jumps per unit time)
- \( Y_i \): Sequence of independent and identically distributed (i.i.d.) random variables representing the jump sizes (typically log-normal)
- \( \kappa = \mathbb{E}[Y_i - 1] \): Expected relative jump size; the drift compensation \( -\lambda \kappa \, dt \) makes the compensated jump process \( J_t - \lambda \kappa t \) a martingale, so that \( \mu \) remains the asset's total expected return
Under the risk-neutral measure \( \mathbb{Q} \), the SDE becomes:
\[ \frac{dS_t}{S_t} = (r - \lambda \kappa) dt + \sigma dW_t^{\mathbb{Q}} + dJ_t^{\mathbb{Q}} \]where \( r \) is the risk-free rate, and \( W_t^{\mathbb{Q}} \) is a Brownian motion under \( \mathbb{Q} \). The jump process \( J_t^{\mathbb{Q}} \) has intensity \( \lambda^{\mathbb{Q}} \) and jump size distribution \( Y_i^{\mathbb{Q}} \).
Log-Normal Jump Sizes: In the Merton model, jump sizes \( Y_i \) are typically assumed to be log-normally distributed:
\[ \log(Y_i) \sim \mathcal{N}(\gamma, \delta^2) \]
Thus, \( Y_i \) has mean \( e^{\gamma + \frac{\delta^2}{2}} \) and variance \( e^{2\gamma + \delta^2}(e^{\delta^2} - 1) \). The compensator \( \kappa \) is:
\[ \kappa = \mathbb{E}[Y_i - 1] = e^{\gamma + \frac{\delta^2}{2}} - 1 \]
Solution to the SDE: The solution to the Merton jump-diffusion SDE is:
\[ S_t = S_0 \exp \left( \left( \mu - \frac{\sigma^2}{2} - \lambda \kappa \right) t + \sigma W_t + \sum_{i=1}^{N_t} \log(Y_i) \right) \]
Under the risk-neutral measure \( \mathbb{Q} \), this becomes:
\[ S_t = S_0 \exp \left( \left( r - \frac{\sigma^2}{2} - \lambda^{\mathbb{Q}} \kappa^{\mathbb{Q}} \right) t + \sigma W_t^{\mathbb{Q}} + \sum_{i=1}^{N_t^{\mathbb{Q}}} \log(Y_i^{\mathbb{Q}}) \right) \]
Characteristic Function: The characteristic function of \( \log(S_t) \) under the risk-neutral measure is crucial for option pricing. For the Merton model, it is given by:
\[ \phi_{\log(S_t)}(u) = \exp \left( i u \left( \log(S_0) + \left( r - \frac{\sigma^2}{2} - \lambda^{\mathbb{Q}} \kappa^{\mathbb{Q}} \right) t \right) - \frac{u^2 \sigma^2 t}{2} + \lambda^{\mathbb{Q}} t \left( \phi_{\log(Y)}(u) - 1 \right) \right) \]
where \( \phi_{\log(Y)}(u) \) is the characteristic function of \( \log(Y_i^{\mathbb{Q}}) \):
\[ \phi_{\log(Y)}(u) = \exp \left( i u \gamma^{\mathbb{Q}} - \frac{u^2 \delta^2}{2} \right) \]
Option Pricing Formula: The price of a European call option with strike \( K \) and maturity \( T \) in the Merton model is:
\[ C(S_0, K, T) = \sum_{n=0}^{\infty} \frac{e^{-\lambda' T} (\lambda' T)^n}{n!} C_{BS}(S_0, K, T; r_n, \sigma_n) \]
where:
- \( \lambda' = \lambda^{\mathbb{Q}} (1 + \kappa^{\mathbb{Q}}) \): Adjusted Poisson intensity appearing in the weights
- \( C_{BS}(S_0, K, T; r_n, \sigma_n) \): Black-Scholes call option price with adjusted parameters:
- \( r_n = r - \lambda^{\mathbb{Q}} \kappa^{\mathbb{Q}} + \frac{n \log(1 + \kappa^{\mathbb{Q}})}{T} = r - \lambda^{\mathbb{Q}} \kappa^{\mathbb{Q}} + \frac{n (\gamma^{\mathbb{Q}} + \delta^2 / 2)}{T} \)
- \( \sigma_n^2 = \sigma^2 + \frac{n \delta^2}{T} \)
The put option price can be obtained via put-call parity.
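This series is straightforward to implement. Below is a minimal pure-Python sketch (truncated at \( n = 40 \); the parameter values and the helper names `bs_call` / `merton_call` are illustrative, not from the text). With \( \lambda^{\mathbb{Q}} = 0 \) the sum collapses to Black-Scholes, which makes a convenient sanity check:

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S0, K, T, r, sigma):
    """Black-Scholes European call price."""
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S0 * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def merton_call(S0, K, T, r, sigma, lam, gamma, delta, n_max=40):
    """Merton jump-diffusion call via the conditional Black-Scholes series."""
    kappa = math.exp(gamma + 0.5 * delta**2) - 1.0   # expected relative jump size
    lam_p = lam * (1.0 + kappa)                      # adjusted Poisson intensity
    price = 0.0
    for n in range(n_max + 1):
        weight = math.exp(-lam_p * T) * (lam_p * T) ** n / math.factorial(n)
        r_n = r - lam * kappa + n * (gamma + 0.5 * delta**2) / T
        sigma_n = math.sqrt(sigma**2 + n * delta**2 / T)
        price += weight * bs_call(S0, K, T, r_n, sigma_n)
    return price

p_bs = bs_call(100.0, 100.0, 0.5, 0.05, 0.2)
p_jump = merton_call(100.0, 100.0, 0.5, 0.05, 0.2, lam=0.5, gamma=-0.1, delta=0.15)
p_nojump = merton_call(100.0, 100.0, 0.5, 0.05, 0.2, lam=0.0, gamma=-0.1, delta=0.15)
print(p_bs, p_jump)  # jumps add value to the at-the-money call relative to Black-Scholes
```

The strict ordering \( p_{\text{jump}} > p_{\text{BS}} \) follows from Jensen's inequality: the compensated jump factor is an independent, mean-one multiplicative perturbation of the lognormal terminal price, and the call payoff is convex.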
Example: Deriving the Option Pricing Formula
The Merton model option pricing formula is derived using the following steps:
- Risk-Neutral Valuation: The price of a European call option is the discounted expected payoff under the risk-neutral measure: \[ C(S_0, K, T) = e^{-rT} \mathbb{E}^{\mathbb{Q}} \left[ \max(S_T - K, 0) \right] \]
- Condition on the Number of Jumps: Let \( N_T \) be the number of jumps by time \( T \). The expectation can be written as: \[ \mathbb{E}^{\mathbb{Q}} \left[ \max(S_T - K, 0) \right] = \sum_{n=0}^{\infty} \mathbb{E}^{\mathbb{Q}} \left[ \max(S_T - K, 0) \mid N_T = n \right] \mathbb{P}(N_T = n) \] where \( \mathbb{P}(N_T = n) = \frac{e^{-\lambda^{\mathbb{Q}} T} (\lambda^{\mathbb{Q}} T)^n}{n!} \).
- Log-Normal Distribution: Given \( N_T = n \), \( \log(S_T) \) is normally distributed with mean and variance: \[ \log(S_T) \mid N_T = n \sim \mathcal{N} \left( \log(S_0) + \left( r - \frac{\sigma^2}{2} - \lambda^{\mathbb{Q}} \kappa^{\mathbb{Q}} \right) T + n \gamma^{\mathbb{Q}}, \sigma^2 T + n \delta^2 \right) \] This matches a Black-Scholes lognormal with volatility \( \sigma_n \) and rate \( r_n \), where \( r_n = r - \lambda^{\mathbb{Q}} \kappa^{\mathbb{Q}} + \frac{n (\gamma^{\mathbb{Q}} + \delta^2 / 2)}{T} \) and \( \sigma_n^2 = \sigma^2 + \frac{n \delta^2}{T} \).
- Black-Scholes Formula: The discounted conditional expectation is \( e^{-rT} \mathbb{E}^{\mathbb{Q}} \left[ \max(S_T - K, 0) \mid N_T = n \right] = e^{(r_n - r)T} C_{BS}(S_0, K, T; r_n, \sigma_n) \), since discounting is at \( r \) while the conditional drift is \( r_n \). The factor \( e^{(r_n - r)T} = e^{-\lambda^{\mathbb{Q}} \kappa^{\mathbb{Q}} T} (1 + \kappa^{\mathbb{Q}})^n \) folds into the Poisson weights, giving: \[ C(S_0, K, T) = \sum_{n=0}^{\infty} \frac{e^{-\lambda' T} (\lambda' T)^n}{n!} C_{BS}(S_0, K, T; r_n, \sigma_n), \qquad \lambda' = \lambda^{\mathbb{Q}} (1 + \kappa^{\mathbb{Q}}) \]
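The derivation above can be cross-checked by Monte Carlo: simulate the exact risk-neutral solution \( S_T = S_0 \exp\left( (r - \sigma^2/2 - \lambda\kappa) T + \sigma W_T + \sum_{i \le N_T} \log Y_i \right) \) and compare the discounted mean payoff with the conditional Black-Scholes series. The sketch below uses illustrative parameters and a pure-Python Poisson sampler (Knuth's method); the tolerance in the final comparison reflects Monte Carlo noise:

```python
import math
import random

S0, K, T, r = 100.0, 100.0, 1.0, 0.05
sigma, lam, gamma, delta = 0.2, 0.5, -0.1, 0.15
kappa = math.exp(gamma + 0.5 * delta**2) - 1.0

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S0, K, T, r, sig):
    d1 = (math.log(S0 / K) + (r + 0.5 * sig**2) * T) / (sig * math.sqrt(T))
    return S0 * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d1 - sig * math.sqrt(T))

def series_price(n_max=40):
    """Conditional Black-Scholes series with re-weighted Poisson intensity."""
    lam_p = lam * (1.0 + kappa)
    total = 0.0
    for n in range(n_max + 1):
        w = math.exp(-lam_p * T) * (lam_p * T) ** n / math.factorial(n)
        r_n = r - lam * kappa + n * (gamma + 0.5 * delta**2) / T
        s_n = math.sqrt(sigma**2 + n * delta**2 / T)
        total += w * bs_call(S0, K, T, r_n, s_n)
    return total

def sample_poisson(mean, rng):
    """Knuth's method (fine for small mean)."""
    L, k, p = math.exp(-mean), 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

def mc_price(n_paths=200_000, seed=42):
    rng = random.Random(seed)
    payoff_sum = 0.0
    for _ in range(n_paths):
        n_jumps = sample_poisson(lam * T, rng)
        jump_sum = sum(rng.gauss(gamma, delta) for _ in range(n_jumps))
        logS = (math.log(S0) + (r - 0.5 * sigma**2 - lam * kappa) * T
                + sigma * math.sqrt(T) * rng.gauss(0.0, 1.0) + jump_sum)
        payoff_sum += max(math.exp(logS) - K, 0.0)
    return math.exp(-r * T) * payoff_sum / n_paths

p_series = series_price()
p_mc = mc_price()
print(p_series, p_mc)  # the two estimates should agree to within MC error
```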
Implied Volatility and the Volatility Smile: The Merton model generates a volatility smile because the presence of jumps introduces skewness and excess kurtosis in the return distribution. The implied volatility \( \sigma_{\text{imp}}(K, T) \) for a given strike \( K \) and maturity \( T \) can be approximated as:
\[ \sigma_{\text{imp}}^2(K, T) \approx \sigma^2 + \lambda^{\mathbb{Q}} \left( \delta^2 + (\gamma^{\mathbb{Q}})^2 \right) \frac{T}{2} + \lambda^{\mathbb{Q}} \gamma^{\mathbb{Q}} \left( \log\left(\frac{K}{S_0}\right) - \left( r - \frac{\sigma^2}{2} - \lambda^{\mathbb{Q}} \kappa^{\mathbb{Q}} \right) T \right) \]
For a negative mean jump size \( \gamma^{\mathbb{Q}} < 0 \), implied volatility is a decreasing function of the log-moneyness \( \log(K/S_0) \), creating a downward-sloping skew for short maturities.
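The skew can be made visible numerically: price calls at several strikes under the Merton series with a negative mean jump size, then invert Black-Scholes by bisection. The parameters and helper names below are illustrative; the low-strike implied volatility should exceed the high-strike one:

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S0, K, T, r, sig):
    d1 = (math.log(S0 / K) + (r + 0.5 * sig**2) * T) / (sig * math.sqrt(T))
    return S0 * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d1 - sig * math.sqrt(T))

def merton_call(S0, K, T, r, sigma, lam, gamma, delta, n_max=40):
    """Conditional Black-Scholes series with re-weighted Poisson intensity."""
    kappa = math.exp(gamma + 0.5 * delta**2) - 1.0
    lam_p = lam * (1.0 + kappa)
    out = 0.0
    for n in range(n_max + 1):
        w = math.exp(-lam_p * T) * (lam_p * T) ** n / math.factorial(n)
        r_n = r - lam * kappa + n * (gamma + 0.5 * delta**2) / T
        s_n = math.sqrt(sigma**2 + n * delta**2 / T)
        out += w * bs_call(S0, K, T, r_n, s_n)
    return out

def implied_vol(price, S0, K, T, r, lo=1e-4, hi=3.0, tol=1e-10):
    """Bisection on Black-Scholes vol (the call price is monotone in vol)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(S0, K, T, r, mid) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

S0, T, r = 100.0, 0.25, 0.05
params = dict(sigma=0.2, lam=0.5, gamma=-0.15, delta=0.1)
ivs = {K: implied_vol(merton_call(S0, K, T, r, **params), S0, K, T, r)
       for K in (80.0, 100.0, 120.0)}
print(ivs)  # implied vol at K=80 exceeds that at K=120: a downward-sloping skew
```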
Key Parameters in the Merton Model:
- Jump Intensity (\( \lambda \)): Average number of jumps per unit time. Higher \( \lambda \) increases the likelihood of jumps.
- Jump Size Mean (\( \gamma \)): Mean of \( \log(Y_i) \). Positive \( \gamma \) implies upward jumps on average, while negative \( \gamma \) implies downward jumps.
- Jump Size Volatility (\( \delta \)): Volatility of \( \log(Y_i) \). Higher \( \delta \) increases the variability of jump sizes.
- Diffusion Volatility (\( \sigma \)): Volatility of the continuous Brownian motion component.
Practical Applications:
- Equity Option Pricing: The Merton model is widely used to price equity options, especially for stocks with significant jump risk (e.g., tech stocks, biotech). It explains the volatility smile better than the Black-Scholes model.
- Credit Risk Modeling: Jump-diffusion processes are used in structural credit risk models to capture sudden jumps in asset values (e.g., due to default events). The Merton model can be adapted to model default as a jump to zero.
- Commodity and Energy Markets: Commodity prices often exhibit jumps due to supply shocks (e.g., geopolitical events, natural disasters). The Merton model is used to price options on commodities like oil, gas, and electricity.
- Risk Management: The model helps in measuring and managing jump risk, which is not captured by traditional Value-at-Risk (VaR) models that assume continuous price paths.
- Real Options: In corporate finance, the Merton model is used to value real options (e.g., investment timing options) where the underlying project value may experience jumps due to technological breakthroughs or regulatory changes.
Important Notes and Common Pitfalls:
- Parameter Estimation:
- Estimating the parameters (\( \lambda, \gamma, \delta, \sigma \)) is challenging. Common methods include:
- Maximum Likelihood Estimation (MLE): Use historical return data to estimate parameters by maximizing the likelihood function.
- Method of Moments: Match the moments of the model to the empirical moments of the data.
- Filtering Techniques: Use Kalman filters or particle filters to estimate latent variables (e.g., jumps).
- Pitfall: Overfitting to historical data. The jump parameters may not be stable over time, especially during crises.
- Infinite Summation in Option Pricing:
- The option pricing formula involves an infinite sum over the number of jumps. In practice, the sum is truncated at a finite \( n \) (e.g., \( n = 5 \) to \( n = 10 \)) for computational efficiency.
- Pitfall: Truncating too early can lead to significant pricing errors, especially for out-of-the-money options where the jump component is more important.
- Risk-Neutral vs. Real-World Measures:
- The jump intensity and size distribution may differ under the risk-neutral measure \( \mathbb{Q} \) and the real-world measure \( \mathbb{P} \). This is due to the market price of jump risk.
- Pitfall: Assuming \( \lambda^{\mathbb{Q}} = \lambda \) or \( Y_i^{\mathbb{Q}} = Y_i \) without justification. These parameters must be calibrated to market option prices.
- Correlation Between Jumps and Diffusion:
- The Merton model assumes independence between the jump component and the diffusion component. This may not hold in practice (e.g., jumps in stock prices may be accompanied by increased volatility).
- Pitfall: Ignoring potential correlation can lead to mispricing of options, especially those sensitive to volatility (e.g., variance swaps).
- Negative Prices:
- The Merton model can produce negative asset prices if the jump size distribution allows for large downward jumps. This is unrealistic for most assets.
- Pitfall: Ensure the jump size distribution is chosen such that \( S_t > 0 \) (e.g., log-normal jumps guarantee positivity).
- Computational Complexity:
- Pricing options in the Merton model is computationally intensive due to the infinite sum and the need to evaluate the Black-Scholes formula for each term.
- Pitfall: Using slow numerical methods (e.g., brute-force summation) can make the model impractical for real-time applications. Fast Fourier transform (FFT) methods are often used to speed up computations.
- Model Extensions:
- The Merton model can be extended in several ways:
- Stochastic Volatility: Combine with Heston or SABR models to capture volatility dynamics.
- Time-Varying Jump Intensity: Allow \( \lambda \) to be stochastic (e.g., self-exciting jumps).
- Multiple Jump Types: Introduce different jump processes for upward and downward jumps.
- Pitfall: Adding too many parameters can lead to overfitting and poor out-of-sample performance.
Example: Calibrating the Merton Model to Market Data
Suppose you are given the following market data for European call options on a stock:
| Strike \( K \) | Maturity \( T \) | Market Price |
|---|---|---|
| 90 | 0.5 | 12.00 |
| 100 | 0.5 | 5.00 |
| 110 | 0.5 | 1.50 |
Additional information: \( S_0 = 100 \), \( r = 0.05 \). Calibrate the Merton model parameters \( \sigma, \lambda^{\mathbb{Q}}, \gamma^{\mathbb{Q}}, \delta \) to fit these prices.
- Initial Guess: Start with initial guesses for the parameters, e.g., \( \sigma = 0.2 \), \( \lambda^{\mathbb{Q}} = 0.5 \), \( \gamma^{\mathbb{Q}} = -0.1 \), \( \delta = 0.1 \).
- Compute Model Prices: For each option, compute the model price using the Merton option pricing formula (truncating the sum at \( n = 5 \)).
- Objective Function: Define an objective function as the sum of squared differences between model prices and market prices: \[ \text{Error} = \sum_{i=1}^{3} \left( C_{\text{model}}(K_i, T_i) - C_{\text{market}}(K_i, T_i) \right)^2 \]
- Optimization: Use a numerical optimization algorithm (e.g., Levenberg-Marquardt, Nelder-Mead) to minimize the error by adjusting the parameters.
- Result: After optimization, you might obtain parameters such as \( \sigma = 0.18 \), \( \lambda^{\mathbb{Q}} = 0.4 \), \( \gamma^{\mathbb{Q}} = -0.08 \), \( \delta = 0.12 \).
Note: In practice, you would use more market data (e.g., options with different strikes and maturities) and more sophisticated optimization techniques to ensure robust calibration.
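Steps 2–4 can be sketched in a few lines. The toy example below uses the three quoted prices and a crude seeded random search in place of Levenberg-Marquardt (illustrative only; real calibrations use proper optimizers, constraints, and many more quotes). All helper names are placeholders:

```python
import math
import random

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S0, K, T, r, sig):
    d1 = (math.log(S0 / K) + (r + 0.5 * sig**2) * T) / (sig * math.sqrt(T))
    return S0 * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d1 - sig * math.sqrt(T))

def merton_call(S0, K, T, r, sigma, lam, gamma, delta, n_max=20):
    kappa = math.exp(gamma + 0.5 * delta**2) - 1.0
    lam_p = lam * (1.0 + kappa)
    out = 0.0
    for n in range(n_max + 1):
        w = math.exp(-lam_p * T) * (lam_p * T) ** n / math.factorial(n)
        r_n = r - lam * kappa + n * (gamma + 0.5 * delta**2) / T
        s_n = math.sqrt(sigma**2 + n * delta**2 / T)
        out += w * bs_call(S0, K, T, r_n, s_n)
    return out

S0, r = 100.0, 0.05
quotes = [(90.0, 0.5, 12.00), (100.0, 0.5, 5.00), (110.0, 0.5, 1.50)]

def sse(p):
    """Sum of squared pricing errors for parameters (sigma, lam, gamma, delta)."""
    sigma, lam, gamma, delta = p
    return sum((merton_call(S0, K, T, r, sigma, lam, gamma, delta) - mkt) ** 2
               for K, T, mkt in quotes)

rng = random.Random(0)
best = (0.2, 0.5, -0.1, 0.1)          # initial guess from step 1
best_err = sse(best)
init_err = best_err
for _ in range(400):                   # crude random search around the current best
    cand = (max(0.01, best[0] + rng.gauss(0, 0.02)),
            max(0.0,  best[1] + rng.gauss(0, 0.1)),
            best[2] + rng.gauss(0, 0.02),
            max(0.01, best[3] + rng.gauss(0, 0.02)))
    err = sse(cand)
    if err < best_err:
        best, best_err = cand, err
print(best, best_err)
```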
Quant Interview Questions on Jump-Diffusion Processes:
- Explain the intuition behind the Merton jump-diffusion model. How does it differ from the Black-Scholes model?
Answer: The Merton model extends the Black-Scholes model by incorporating a jump component to capture sudden, large price movements. In Black-Scholes, asset prices follow a geometric Brownian motion (GBM), which assumes continuous paths and normally distributed returns. The Merton model adds a compound Poisson process to GBM, allowing for discontinuous jumps. This addresses the "volatility smile" observed in option prices, as jumps introduce skewness and excess kurtosis in the return distribution.
- Derive the solution to the Merton jump-diffusion SDE.
Answer: Start with the SDE: \[ \frac{dS_t}{S_t} = (\mu - \lambda \kappa) dt + \sigma dW_t + dJ_t \] where \( J_t = \sum_{i=1}^{N_t} (Y_i - 1) \). Rewrite as: \[ d \log(S_t) = \left( \mu - \frac{\sigma^2}{2} - \lambda \kappa \right) dt + \sigma dW_t + d \left( \sum_{i=1}^{N_t} \log(Y_i) \right) \] Integrate both sides from 0 to \( t \): \[ \log(S_t) - \log(S_0) = \left( \mu - \frac{\sigma^2}{2} - \lambda \kappa \right) t + \sigma W_t + \sum_{i=1}^{N_t} \log(Y_i) \] Exponentiate to obtain: \[ S_t = S_0 \exp \left( \left( \mu - \frac{\sigma^2}{2} - \lambda \kappa \right) t + \sigma W_t + \sum_{i=1}^{N_t} \log(Y_i) \right) \]
- Why does the Merton model produce a volatility smile? Explain intuitively and mathematically.
Answer: Intuition: The volatility smile arises because the Merton model introduces jumps, which create fat tails and skewness in the return distribution. Out-of-the-money (OTM) options are more sensitive to these tails, so their prices (and thus implied volatilities) are higher than in the Black-Scholes model.
Mathematics: The implied volatility \( \sigma_{\text{imp}}(K, T) \) is derived by equating the Merton model price to the Black-Scholes price. For OTM puts (low \( K \)), the probability of a large downward jump increases the option price, leading to higher implied volatility. Similarly, OTM calls (high \( K \)) are affected by upward jumps. The smile shape depends on the jump parameters \( \lambda, \gamma, \delta \). For example, negative \( \gamma \) (downward jumps) creates a downward-sloping smile for short maturities.
- How would you estimate the parameters of the Merton model from historical data?
Answer: Common methods include:
- Maximum Likelihood Estimation (MLE):
- Assume a parametric form for the jump size distribution (e.g., log-normal).
- Write the likelihood function for the observed returns, accounting for both the diffusion and jump components.
- Maximize the likelihood function numerically to estimate \( \mu, \sigma, \lambda, \gamma, \delta \).
- Method of Moments:
- Compute the empirical moments (mean, variance, skewness, kurtosis) of the return data.
- Match these to the theoretical moments of the Merton model, which depend on the parameters.
- Solve the resulting system of equations for the parameters.
- Filtering Techniques:
- Use a Kalman filter or particle filter to estimate the latent variables (e.g., jumps) and parameters simultaneously.
- This is useful when jumps are not directly observable.
- Explain the role of the compensator \( \kappa \) in the Merton model. What happens if it is omitted?
Answer: The compensator \( \kappa = \mathbb{E}[Y_i - 1] \) enters through the drift adjustment \( -\lambda \kappa \, dt \), which makes the compensated jump process \( J_t - \lambda \kappa t \) a martingale under the real-world measure \( \mathbb{P} \). This is necessary for \( \mu \) to represent the total expected return of the asset, jumps included.
If \( \kappa \) is omitted, the drift term becomes \( \mu dt \), but the expected return would actually be \( (\mu + \lambda \mathbb{E}[Y_i - 1]) dt \). This would lead to:
- Incorrect pricing of assets and derivatives, as the expected return is misspecified.
- Biased parameter estimates when calibrating the model to data.
- How does the Merton model address the "volatility skew" observed in equity markets?
Answer: The volatility skew refers to the asymmetric shape of the implied volatility curve, where OTM puts have higher implied volatilities than OTM calls. The Merton model can generate a skew by:
- Allowing for negative mean jump sizes (\( \gamma < 0 \)), which increases the probability of large downward jumps. This makes OTM puts more expensive, leading to higher implied volatilities.
- Adjusting the jump intensity \( \lambda \) and jump size volatility \( \delta \) to control the steepness of the skew.
- What are the limitations of the Merton jump-diffusion model?
Answer: Key limitations include:
- Independence Assumption: The model assumes jumps are independent of the diffusion component and of each other. In reality, jumps may be accompanied by increased volatility or clustering.
- Constant Parameters: The jump intensity \( \lambda \) and size distribution are assumed constant, but they may vary over time (e.g., higher jump intensity during crises).
- Single Jump Type: The model assumes a single type of jump, but in practice, there may be multiple jump types (e.g., upward and downward jumps with different distributions).
- Computational Complexity: Option pricing requires evaluating an infinite sum, which is computationally intensive. Truncation can introduce errors.
- Negative Prices: If the jump size distribution is not carefully chosen, the model can produce negative asset prices, which is unrealistic.
- How would you extend the Merton model to include stochastic volatility?
Answer: To combine the Merton model with stochastic volatility (e.g., Heston model), replace the constant volatility \( \sigma \) with a stochastic process \( \sqrt{v_t} \), where \( v_t \) follows a mean-reverting square-root process: \[ \frac{dS_t}{S_t} = (\mu - \lambda \kappa) dt + \sqrt{v_t} dW_t^S + dJ_t \] \[ dv_t = \kappa_v (\theta_v - v_t) dt + \xi \sqrt{v_t} dW_t^v \] where \( \kappa_v \) is the mean-reversion speed, \( \theta_v \) is the long-term variance, and \( \xi \) is the volatility of volatility. The Brownian motions \( W_t^S \) and \( W_t^v \) may be correlated. This extension captures both jump risk and volatility dynamics, providing a better fit to the volatility surface.
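These combined (Bates-type) dynamics have no exact simulation scheme in general, but an Euler–Maruyama sketch is short. The version below uses full truncation to keep the variance nonnegative and approximates the jump indicator per step by a Bernoulli draw with probability \( \lambda \, dt \); all parameter values, including the correlation, are illustrative:

```python
import math
import random

# Illustrative Bates-type parameters (not calibrated)
S0, v0, r = 100.0, 0.04, 0.03
kappa_v, theta_v, xi, rho = 2.0, 0.04, 0.3, -0.7
lam, gamma_j, delta_j = 0.3, -0.1, 0.15
kappa_jump = math.exp(gamma_j + 0.5 * delta_j**2) - 1.0  # E[Y - 1]

def simulate_path(T=1.0, n_steps=252, seed=1):
    """One Euler path of the jump-diffusion with stochastic variance."""
    rng = random.Random(seed)
    dt = T / n_steps
    S, v = S0, v0
    for _ in range(n_steps):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)  # correlated
        v_pos = max(v, 0.0)                       # full truncation
        jump = 0.0                                # Bernoulli(lam*dt) jump indicator
        if rng.random() < lam * dt:
            jump = math.exp(rng.gauss(gamma_j, delta_j)) - 1.0
        S += S * ((r - lam * kappa_jump) * dt
                  + math.sqrt(v_pos * dt) * z1 + jump)
        v += kappa_v * (theta_v - v_pos) * dt + xi * math.sqrt(v_pos * dt) * z2
    return S, v

S_T, v_T = simulate_path()
print(S_T, v_T)
```

Since the lognormal jump size satisfies \( e^{Y} - 1 > -1 \), the relative increment of \( S \) stays above \( -1 \) in practice and the simulated price remains positive.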
Topic 30: Itô's Lemma for Jump-Diffusion Processes
Jump-Diffusion Process: A stochastic process that combines continuous diffusion (modeled by a Brownian motion) with discontinuous jumps (modeled by a Poisson process). Mathematically, a jump-diffusion process \( X_t \) satisfies the following stochastic differential equation (SDE):
\[ dX_t = \mu(t, X_t) dt + \sigma(t, X_t) dW_t + J_t dN_t, \]
where:
- \( \mu(t, X_t) \) is the drift term,
- \( \sigma(t, X_t) \) is the diffusion term,
- \( W_t \) is a standard Brownian motion,
- \( N_t \) is a Poisson process with intensity \( \lambda \),
- \( J_t \) is the jump size at time \( t \), often modeled as a random variable (e.g., \( J_t \sim \mathcal{N}(\mu_J, \sigma_J^2) \)).
Itô’s Lemma for Jump-Diffusion Processes: An extension of Itô’s Lemma to processes that include jumps. It provides a way to compute the differential of a function \( f(t, X_t) \) where \( X_t \) follows a jump-diffusion process. The lemma accounts for both the continuous and discontinuous components of \( X_t \).
Itô’s Lemma for Jump-Diffusion Processes:
Let \( X_t \) be a jump-diffusion process defined by:
\[ dX_t = \mu(t, X_t) dt + \sigma(t, X_t) dW_t + J_t dN_t, \]
and let \( f(t, x) \) be a twice continuously differentiable function in \( x \) and once in \( t \). Then, the differential of \( f(t, X_t) \) is given by:
\[ df(t, X_t) = \left( \frac{\partial f}{\partial t} + \mu \frac{\partial f}{\partial x} + \frac{1}{2} \sigma^2 \frac{\partial^2 f}{\partial x^2} \right) dt + \sigma \frac{\partial f}{\partial x} dW_t + \left[ f(t, X_t + J_t) - f(t, X_t) \right] dN_t. \]
Alternatively, in integral form:
\[ f(t, X_t) = f(0, X_0) + \int_0^t \left( \frac{\partial f}{\partial s} + \mu \frac{\partial f}{\partial x} + \frac{1}{2} \sigma^2 \frac{\partial^2 f}{\partial x^2} \right) ds + \int_0^t \sigma \frac{\partial f}{\partial x} dW_s + \int_0^t \left[ f(s, X_s + J_s) - f(s, X_s) \right] dN_s. \]
Example: Geometric Jump-Diffusion Process
Consider a stock price \( S_t \) following a geometric jump-diffusion process:
\[ dS_t = \mu S_t dt + \sigma S_t dW_t + J_t S_t dN_t, \]
where \( J_t \) is the percentage jump size (e.g., \( J_t = e^{Y_t} - 1 \) with \( Y_t \sim \mathcal{N}(\mu_J, \sigma_J^2) \)). Let \( f(t, S_t) = \ln(S_t) \). We apply Itô’s Lemma to find \( d(\ln S_t) \).
Compute the partial derivatives:
\[ \frac{\partial f}{\partial t} = 0, \quad \frac{\partial f}{\partial S} = \frac{1}{S}, \quad \frac{\partial^2 f}{\partial S^2} = -\frac{1}{S^2}. \]
Substitute into Itô’s Lemma:
\[ d(\ln S_t) = \left( 0 + \mu S_t \cdot \frac{1}{S_t} + \frac{1}{2} \sigma^2 S_t^2 \cdot \left(-\frac{1}{S_t^2}\right) \right) dt + \sigma S_t \cdot \frac{1}{S_t} dW_t + \left[ \ln(S_t + J_t S_t) - \ln(S_t) \right] dN_t. \]
Simplify:
\[ d(\ln S_t) = \left( \mu - \frac{1}{2} \sigma^2 \right) dt + \sigma dW_t + \ln(1 + J_t) dN_t. \]
This result is useful for pricing options under jump-diffusion models (e.g., Merton’s jump-diffusion model).
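The log-dynamics just derived imply \( \mathbb{E}[\ln S_T] = \ln S_0 + (\mu - \sigma^2/2)T + \lambda T \, \mathbb{E}[\ln(1 + J)] \); with \( \ln(1 + J) = Y \sim \mathcal{N}(\mu_J, \sigma_J^2) \) the last expectation is simply \( \mu_J \). A Monte Carlo sketch (illustrative parameters, pure-Python Poisson sampler) confirms this:

```python
import math
import random

S0, mu, sigma, T = 100.0, 0.08, 0.2, 1.0
lam, mu_J, sigma_J = 1.0, -0.05, 0.1   # log-jump Y ~ N(mu_J, sigma_J^2)

def sample_poisson(mean, rng):
    """Knuth's method (fine for small mean)."""
    L, k, p = math.exp(-mean), 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

rng = random.Random(7)
n_paths = 100_000
total = 0.0
for _ in range(n_paths):
    n_jumps = sample_poisson(lam * T, rng)
    log_S = (math.log(S0) + (mu - 0.5 * sigma**2) * T
             + sigma * math.sqrt(T) * rng.gauss(0.0, 1.0)
             + sum(rng.gauss(mu_J, sigma_J) for _ in range(n_jumps)))
    total += log_S
mc_mean = total / n_paths
theory = math.log(S0) + (mu - 0.5 * sigma**2) * T + lam * T * mu_J
print(mc_mean, theory)  # sample mean of ln(S_T) vs the Ito-formula prediction
```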
Generalization to Multivariate Jump-Diffusion Processes:
Let \( \mathbf{X}_t = (X_{1,t}, \dots, X_{n,t})^\top \) be an \( n \)-dimensional jump-diffusion process with dynamics:
\[ dX_{i,t} = \mu_i(t, \mathbf{X}_t) dt + \sum_{j=1}^m \sigma_{ij}(t, \mathbf{X}_t) dW_{j,t} + J_{i,t} dN_{i,t}, \quad i = 1, \dots, n, \]where \( \mathbf{W}_t = (W_{1,t}, \dots, W_{m,t})^\top \) is an \( m \)-dimensional Brownian motion, and \( N_{i,t} \) are Poisson processes with intensities \( \lambda_i \). Let \( f(t, \mathbf{x}) \) be a sufficiently smooth function. Then:
\[ df(t, \mathbf{X}_t) = \left( \frac{\partial f}{\partial t} + \sum_{i=1}^n \mu_i \frac{\partial f}{\partial x_i} + \frac{1}{2} \sum_{i=1}^n \sum_{j=1}^n \left( \sum_{k=1}^m \sigma_{ik} \sigma_{jk} \right) \frac{\partial^2 f}{\partial x_i \partial x_j} \right) dt \] \[ + \sum_{i=1}^n \sum_{j=1}^m \sigma_{ij} \frac{\partial f}{\partial x_i} dW_{j,t} + \sum_{i=1}^n \left[ f(t, \mathbf{X}_t + \mathbf{J}_{i,t}) - f(t, \mathbf{X}_t) \right] dN_{i,t}, \]where \( \mathbf{J}_{i,t} \) is the jump vector for the \( i \)-th component (e.g., \( \mathbf{J}_{i,t} = (0, \dots, J_{i,t}, \dots, 0)^\top \)).
Derivation of Itô’s Lemma for Jump-Diffusion Processes:
The derivation extends the standard Itô’s Lemma by incorporating jumps. We use a Taylor expansion of \( f(t, X_t) \) and account for the discontinuous changes due to jumps.
Taylor Expansion: For small \( \Delta t \), expand \( f(t + \Delta t, X_{t + \Delta t}) \) around \( (t, X_t) \):
\[ f(t + \Delta t, X_{t + \Delta t}) \approx f(t, X_t) + \frac{\partial f}{\partial t} \Delta t + \frac{\partial f}{\partial x} \Delta X_t + \frac{1}{2} \frac{\partial^2 f}{\partial x^2} (\Delta X_t)^2. \]
Substitute \( \Delta X_t \): From the SDE, \( \Delta X_t = \mu \Delta t + \sigma \Delta W_t + J_t \Delta N_t \), where \( \Delta W_t \sim \mathcal{N}(0, \Delta t) \) and \( \Delta N_t \sim \text{Poisson}(\lambda \Delta t) \).
Compute \( (\Delta X_t)^2 \):
\[ (\Delta X_t)^2 = (\mu \Delta t + \sigma \Delta W_t + J_t \Delta N_t)^2. \]
Expanding and taking expectations, the cross terms \( \Delta t \Delta W_t \) and \( \Delta W_t \Delta N_t \) vanish, and \( (\Delta W_t)^2 \approx \Delta t \). The term \( (J_t \Delta N_t)^2 \) is non-zero only if \( \Delta N_t = 1 \), in which case it is \( J_t^2 \). Thus:
\[ (\Delta X_t)^2 \approx \sigma^2 \Delta t + J_t^2 \Delta N_t. \]
Combine Terms: Substitute \( \Delta X_t \) and \( (\Delta X_t)^2 \) into the Taylor expansion:
\[ \Delta f \approx \frac{\partial f}{\partial t} \Delta t + \frac{\partial f}{\partial x} (\mu \Delta t + \sigma \Delta W_t + J_t \Delta N_t) + \frac{1}{2} \frac{\partial^2 f}{\partial x^2} (\sigma^2 \Delta t + J_t^2 \Delta N_t). \]
Separate the continuous and jump terms:
\[ \Delta f \approx \left( \frac{\partial f}{\partial t} + \mu \frac{\partial f}{\partial x} + \frac{1}{2} \sigma^2 \frac{\partial^2 f}{\partial x^2} \right) \Delta t + \sigma \frac{\partial f}{\partial x} \Delta W_t + \left[ \frac{\partial f}{\partial x} J_t + \frac{1}{2} \frac{\partial^2 f}{\partial x^2} J_t^2 \right] \Delta N_t. \]
Jump Term Simplification: The bracketed jump term is the second-order Taylor expansion of \( f(t, X_t + J_t) \) around \( X_t \):
\[ \frac{\partial f}{\partial x} J_t + \frac{1}{2} \frac{\partial^2 f}{\partial x^2} J_t^2 \approx f(t, X_t + J_t) - f(t, X_t); \]
in the limit, the exact jump increment \( f(t, X_t + J_t) - f(t, X_t) \) replaces the expansion.
Limit as \( \Delta t \to 0 \): Taking the limit, we obtain the differential form of Itô’s Lemma for jump-diffusion processes:
\[ df(t, X_t) = \left( \frac{\partial f}{\partial t} + \mu \frac{\partial f}{\partial x} + \frac{1}{2} \sigma^2 \frac{\partial^2 f}{\partial x^2} \right) dt + \sigma \frac{\partial f}{\partial x} dW_t + \left[ f(t, X_t + J_t) - f(t, X_t) \right] dN_t. \]
Practical Applications:
Option Pricing: Itô’s Lemma is used to derive the partial differential equation (PDE) for option prices under jump-diffusion models. For example, in Merton’s jump-diffusion model, the Black-Scholes PDE is modified to include an integral term accounting for jumps.
Risk Management: Jump-diffusion processes are used to model sudden market movements (e.g., crashes or rallies). Itô’s Lemma helps in computing Greeks (sensitivities) for hedging purposes.
Interest Rate Modeling: Short-rate models with jumps (e.g., jump-extended Vasicek or CIR models) use Itô’s Lemma to derive the dynamics of bond prices or interest rate derivatives.
Credit Risk: Default events can be modeled as jumps in a firm’s asset value process. Itô’s Lemma is used to derive the dynamics of credit spreads or default probabilities.
Stochastic Control: In optimal control problems with jump-diffusion state processes, Itô’s Lemma is used to derive the Hamilton-Jacobi-Bellman (HJB) equation.
Example: Merton’s Jump-Diffusion Model
In Merton’s model, the stock price \( S_t \) follows:
\[ dS_t = \mu S_t dt + \sigma S_t dW_t + J_t S_t dN_t, \]
where \( J_t = e^{Y_t} - 1 \) and \( Y_t \sim \mathcal{N}(\mu_J, \sigma_J^2) \). The price of a European call option \( C(t, S_t) \) with strike \( K \) and maturity \( T \) satisfies the PDE:
\[ \frac{\partial C}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 C}{\partial S^2} + (r - \lambda \kappa) S \frac{\partial C}{\partial S} - rC + \lambda \mathbb{E}[C(t, S_t (1 + J_t)) - C(t, S_t)] = 0, \]
where \( \kappa = \mathbb{E}[e^{Y_t} - 1] \) and \( r \) is the risk-free rate. The expectation term arises from the jump component in Itô’s Lemma.
Common Pitfalls and Important Notes:
Jump Size Distribution: The distribution of \( J_t \) must be carefully specified. Common choices include log-normal (for percentage jumps) or normal (for additive jumps). The choice affects the expectation term in the PDE.
Independence Assumptions: Itô’s Lemma for jump-diffusion processes typically assumes that \( W_t \), \( N_t \), and \( J_t \) are independent. Violations of this assumption require adjustments (e.g., using compensators for dependent jumps).
Higher-Order Terms: In the derivation, higher-order terms in the Taylor expansion (e.g., \( (\Delta N_t)^2 \)) are often negligible, but this may not hold if jumps are large or frequent.
Compensated Poisson Process: Sometimes, the Poisson process is compensated (i.e., \( d\tilde{N}_t = dN_t - \lambda dt \)) to make it a martingale. Writing the jump term of Itô’s Lemma against \( \tilde{N}_t \), the drift must be adjusted to include \( +\lambda \mathbb{E}[f(t, X_t + J_t) - f(t, X_t)] \), since \( dN_t = d\tilde{N}_t + \lambda dt \).
Numerical Methods: Solving PDEs or SDEs with jumps often requires numerical methods (e.g., finite difference methods for PDEs or Monte Carlo simulation for SDEs). The jump component adds complexity to these methods.
Dimensionality: For multivariate jump-diffusion processes, the cross-partial derivatives \( \frac{\partial^2 f}{\partial x_i \partial x_j} \) must be carefully computed, especially if jumps are correlated across dimensions.
Martingale Property: Unlike pure diffusion processes, jump-diffusion processes may not be martingales even if the drift is zero, because the uncompensated jump component contributes \( \lambda \mathbb{E}[J_t] \, dt \) to the mean. This affects risk-neutral pricing and hedging.
Quant Interview Tip:
In quant interviews, you may be asked to derive Itô’s Lemma for a jump-diffusion process or apply it to a specific model (e.g., Merton’s model). Key points to remember:
- Start with the Taylor expansion and carefully account for the jump term.
- Be comfortable with computing expectations of the jump term (e.g., \( \mathbb{E}[f(t, X_t + J_t) - f(t, X_t)] \)).
- For option pricing, know how the jump component modifies the Black-Scholes PDE.
- Practice deriving the dynamics of \( \ln(S_t) \) or \( S_t^\beta \) for jump-diffusion processes.
Topic 31: Lévy Processes and Their Role in Finance
Definition (Lévy Process): A Lévy process \( \{X_t\}_{t \geq 0} \) is a stochastic process with the following properties:
- Independent increments: For any \( 0 \leq t_1 < t_2 < \dots < t_n \), the random variables \( X_{t_2} - X_{t_1}, X_{t_3} - X_{t_2}, \dots, X_{t_n} - X_{t_{n-1}} \) are independent.
- Stationary increments: The distribution of \( X_{t+s} - X_t \) depends only on \( s \) (not on \( t \)).
- Stochastic continuity: For all \( \epsilon > 0 \), \( \lim_{h \to 0} \mathbb{P}(|X_{t+h} - X_t| > \epsilon) = 0 \).
- Càdlàg paths: The process has right-continuous paths with left limits (almost surely).
Lévy processes generalize Brownian motion by allowing jumps and non-normal distributions while preserving independent and stationary increments.
Lévy-Khintchine Representation: The characteristic function \( \phi_X(u) = \mathbb{E}[e^{iuX_t}] \) of a Lévy process \( X_t \) is given by:
\[ \phi_X(u) = \exp\left( t \left[ iu\gamma - \frac{1}{2}u^2\sigma^2 + \int_{\mathbb{R}} \left( e^{iux} - 1 - iux \mathbb{I}_{|x|<1} \right) \nu(dx) \right] \right), \]
where:
- \( \gamma \in \mathbb{R} \) is the drift term,
- \( \sigma \geq 0 \) is the diffusion coefficient,
- \( \nu \) is the Lévy measure satisfying \( \int_{\mathbb{R}} \min(1, x^2) \nu(dx) < \infty \).
The triplet \( (\gamma, \sigma^2, \nu) \) uniquely characterizes the Lévy process.
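As a numerical sanity check of the Lévy-Khintchine representation: for a finite-activity process whose jump part is compound Poisson (\( \nu(dx) = \lambda f(dx) \), with \( f \) a normal density here) and whose jumps are integrable, the truncation term can be absorbed into the drift \( b \), giving \( \phi_{X_t}(u) = \exp(t[iub - \tfrac{1}{2}u^2\sigma^2 + \lambda(\phi_J(u) - 1)]) \). A minimal sketch (all parameters illustrative) comparing this against a Monte Carlo estimate:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative finite-activity Levy process: X_T = b*T + sigma*W_T + compound Poisson jumps
b, sigma, lam, m, s, T = 0.1, 0.3, 2.0, -0.1, 0.2, 1.0
n_paths = 200_000

N = rng.poisson(lam * T, n_paths)                  # number of jumps per path
J = rng.normal(N * m, np.sqrt(N) * s)              # sum of N i.i.d. N(m, s^2) jumps
X = b * T + sigma * np.sqrt(T) * rng.standard_normal(n_paths) + J

for u in (0.5, 1.0, 2.0):
    emp = np.mean(np.exp(1j * u * X))              # Monte Carlo characteristic function
    theo = np.exp(T * (1j * u * b - 0.5 * sigma**2 * u**2
                       + lam * (np.exp(1j * u * m - 0.5 * s**2 * u**2) - 1)))
    assert abs(emp - theo) < 0.02
```

The empirical and closed-form characteristic functions agree to within Monte Carlo error, which is the essential content of the Lévy-Khintchine formula for this triplet.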
Definition (Lévy Measure): The Lévy measure \( \nu \) describes the intensity and size of jumps in the process. For a Borel set \( A \), \( \nu(A) \) represents the expected number of jumps per unit time whose size belongs to \( A \).
Itô's Formula for Lévy Processes: Let \( X_t \) be a Lévy process with characteristic triplet \( (\gamma, \sigma^2, \nu) \), and let \( f(t, x) \) be a twice continuously differentiable function. Then:
\[ df(t, X_t) = \left( \frac{\partial f}{\partial t} + \gamma \frac{\partial f}{\partial x} + \frac{1}{2} \sigma^2 \frac{\partial^2 f}{\partial x^2} \right) dt + \sigma \frac{\partial f}{\partial x} dW_t + \int_{\mathbb{R}} \left( f(t, X_{t-} + x) - f(t, X_{t-}) \right) \tilde{N}(dt, dx), \]
where:
- \( W_t \) is a standard Brownian motion,
- \( \tilde{N}(dt, dx) = N(dt, dx) - \nu(dx)dt \) is the compensated Poisson random measure,
- \( N(dt, dx) \) is the Poisson random measure counting jumps of size \( x \) in time \( dt \).
Example (Geometric Lévy Process): Consider a stock price model \( S_t = S_0 e^{X_t} \), where \( X_t \) is a Lévy process with triplet \( (\gamma, \sigma^2, \nu) \). Using Itô's formula for Lévy processes, the dynamics of \( S_t \) are:
\[ \frac{dS_t}{S_{t-}} = \left( \gamma + \frac{1}{2} \sigma^2 + \int_{\mathbb{R}} (e^x - 1 - x \mathbb{I}_{|x|<1}) \nu(dx) \right) dt + \sigma dW_t + \int_{\mathbb{R}} (e^x - 1) \tilde{N}(dt, dx). \]
This generalizes the Black-Scholes model by incorporating jumps.
Exponential Lévy Model for Asset Prices: Under the risk-neutral measure \( \mathbb{Q} \), the price of an asset \( S_t \) is given by:
\[ S_t = S_0 e^{(r - \delta)t + X_t}, \]
where:
- \( r \) is the risk-free rate,
- \( \delta \) is the dividend yield,
- \( X_t \) is a Lévy process with risk-neutral triplet \( (\gamma_{\mathbb{Q}}, \sigma^2, \nu_{\mathbb{Q}}) \), where \( \gamma_{\mathbb{Q}} \) is chosen to ensure \( e^{-rt} S_t \) is a martingale.
The martingale condition imposes:
\[ \gamma_{\mathbb{Q}} + \frac{1}{2} \sigma^2 + \int_{\mathbb{R}} (e^x - 1 - x \mathbb{I}_{|x|<1}) \nu_{\mathbb{Q}}(dx) = 0. \]
Definition (Infinite Divisibility): A random variable \( X \) is infinitely divisible if for every \( n \in \mathbb{N} \), there exist i.i.d. random variables \( Y_1, Y_2, \dots, Y_n \) such that \( X \stackrel{d}{=} Y_1 + Y_2 + \dots + Y_n \). Lévy processes are intimately connected to infinitely divisible distributions: the distribution of \( X_t \) is infinitely divisible for all \( t \geq 0 \).
Example (Variance Gamma Process): The Variance Gamma (VG) process is a pure-jump Lévy process with no diffusion component (\( \sigma = 0 \)) and Lévy measure:
\[ \nu(dx) = \frac{e^{-\lambda_+ x}}{x} \mathbb{I}_{x > 0} dx + \frac{e^{-\lambda_- |x|}}{|x|} \mathbb{I}_{x < 0} dx, \] where \( \lambda_+, \lambda_- > 0 \). The VG process is popular in finance for modeling asset returns due to its ability to capture skewness and excess kurtosis.
The characteristic function of the VG process is:
\[ \phi_{X_t}(u) = \left( \frac{\lambda_+ \lambda_-}{(\lambda_+ - iu)(\lambda_- + iu)} \right)^t. \]
Change of Measure for Lévy Processes: Let \( X_t \) be a Lévy process under measure \( \mathbb{P} \) with triplet \( (\gamma, \sigma^2, \nu) \). Under an equivalent measure \( \mathbb{Q} \) obtained by a structure-preserving (Esscher-type) change of measure, the process remains a Lévy process with triplet \( (\gamma_{\mathbb{Q}}, \sigma^2, \nu_{\mathbb{Q}}) \), where:
- The diffusion coefficient \( \sigma \) is unchanged.
- The Lévy measure transforms as \( \nu_{\mathbb{Q}}(dx) = e^{\theta(x)} \nu(dx) \) for some function \( \theta(x) \).
- The drift term adjusts to: \[ \gamma_{\mathbb{Q}} = \gamma + \sigma^2 \eta + \int_{|x|<1} x \left( \nu_{\mathbb{Q}} - \nu \right)(dx), \] where \( \eta \) is the Girsanov shift applied to the Brownian component.
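The VG example above can be checked numerically: with the Lévy measure written above (unit scale), \( X_t \) has the same law as the difference \( G_t^+ - G_t^- \) of two independent gamma processes with shape parameter \( t \) and rates \( \lambda_+, \lambda_- \). A minimal sketch (parameters illustrative) comparing the empirical characteristic function against \( \left( \frac{\lambda_+ \lambda_-}{(\lambda_+ - iu)(\lambda_- + iu)} \right)^t \):

```python
import numpy as np

rng = np.random.default_rng(1)
lam_p, lam_m, t = 5.0, 7.0, 2.0
n = 300_000

# VG increment over horizon t: difference of two independent gamma variables
Xp = rng.gamma(shape=t, scale=1.0 / lam_p, size=n)
Xm = rng.gamma(shape=t, scale=1.0 / lam_m, size=n)
X = Xp - Xm

for u in (0.5, 1.5, 3.0):
    emp = np.mean(np.exp(1j * u * X))
    theo = (lam_p * lam_m / ((lam_p - 1j * u) * (lam_m + 1j * u)))**t
    assert abs(emp - theo) < 0.01
```

The product of the two gamma characteristic functions \( (\lambda_+/(\lambda_+ - iu))^t (\lambda_-/(\lambda_- + iu))^t \) reproduces the VG characteristic function exactly, which is why this decomposition works.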
Important Notes and Pitfalls:
- Martingale Condition: When modeling asset prices under the risk-neutral measure, always ensure the martingale condition is satisfied. For exponential Lévy models, this imposes a constraint on the drift term \( \gamma_{\mathbb{Q}} \). Failure to enforce this can lead to arbitrage opportunities.
- Jump Activity: The Lévy measure \( \nu \) determines the jump activity. If \( \nu(\mathbb{R}) = \infty \), the process has infinitely many small jumps (infinite activity). If \( \int_{|x|<1} |x| \nu(dx) = \infty \), the process has infinite variation. These properties significantly impact the behavior of the process and the complexity of numerical methods.
- Numerical Challenges: Simulating Lévy processes, especially those with infinite activity or infinite variation, can be computationally intensive. Approximation methods (e.g., truncating small jumps or using series representations) are often employed.
- Heavy Tails: Many Lévy processes (e.g., VG, CGMY) exhibit heavy tails, which can lead to extreme events (jumps) more frequently than in the Black-Scholes model. This has important implications for risk management and option pricing.
- Dependence on Parameters: The parameters of a Lévy process (e.g., \( \lambda_+, \lambda_- \) in the VG process) can be highly sensitive to calibration data. Small changes in parameters can lead to significant differences in option prices, especially for out-of-the-money options.
Example (Calibrating a Lévy Process to Market Data): To calibrate a Lévy process to market option prices:
- Choose a parametric form for the Lévy process (e.g., VG, Merton jump-diffusion, CGMY).
- Express the risk-neutral characteristic function \( \phi_{X_t}(u) \) in terms of the model parameters.
- Use the Carr-Madan formula (or similar Fourier-based methods) to compute model option prices: \[ C(K, T) = \frac{e^{-\alpha \log K}}{\pi} \int_0^\infty \text{Re}\left[ e^{-iu \log K} \frac{e^{-rT} \phi_{\log S_T}(u - i(\alpha + 1))}{\alpha^2 + \alpha - u^2 + i(2\alpha + 1)u} \right] du, \] where \( \phi_{\log S_T} \) is the risk-neutral characteristic function of \( \log S_T \) and \( \alpha > 0 \) is a damping factor.
- Minimize the difference between model and market prices (e.g., using least squares or relative entropy) to find optimal parameters.
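A minimal sanity check of the Carr-Madan step, assuming Black-Scholes dynamics (\( \nu = 0 \)) so that the characteristic function of \( \log S_T \) is known in closed form and the result can be compared against the Black-Scholes formula. The damping factor, integration cutoff, and grid size are illustrative choices:

```python
import numpy as np
from math import erf, exp, log, pi, sqrt

def bs_call(S0, K, r, sigma, T):
    # Closed-form Black-Scholes call for comparison
    d1 = (log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
    return S0 * Phi(d1) - K * exp(-r * T) * Phi(d2)

def cf_logST(u, S0, r, sigma, T):
    # Risk-neutral characteristic function of log S_T under Black-Scholes
    mu = log(S0) + (r - 0.5 * sigma**2) * T
    return np.exp(1j * u * mu - 0.5 * sigma**2 * T * u**2)

def carr_madan_call(S0, K, r, sigma, T, alpha=1.5, umax=200.0, n=4000):
    k = log(K)
    u = np.linspace(1e-8, umax, n)
    psi = (np.exp(-r * T) * cf_logST(u - 1j * (alpha + 1), S0, r, sigma, T)
           / (alpha**2 + alpha - u**2 + 1j * (2 * alpha + 1) * u))
    integrand = np.real(np.exp(-1j * u * k) * psi)
    du = u[1] - u[0]
    integral = np.sum((integrand[1:] + integrand[:-1]) * 0.5) * du  # trapezoid rule
    return exp(-alpha * k) / pi * integral

fft_price = carr_madan_call(100.0, 100.0, 0.05, 0.2, 1.0)
bs_price = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
assert abs(fft_price - bs_price) < 1e-2
```

In a real calibration the same `carr_madan_call` machinery would be driven by the model's characteristic function (e.g., VG or Merton) instead of the Gaussian one used in this check.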
Merton Jump-Diffusion Model: A popular Lévy process in finance is the Merton jump-diffusion model, where the log-returns follow:
\[ d \log S_t = \left( \mu - \frac{1}{2} \sigma^2 - \lambda \kappa \right) dt + \sigma dW_t + dJ_t, \]
where:
- \( J_t \) is a compound Poisson process with intensity \( \lambda \) and jump sizes \( Y_i \sim \mathcal{N}(\mu_J, \sigma_J^2) \),
- \( \kappa = \mathbb{E}[e^{Y_i} - 1] = e^{\mu_J + \frac{1}{2} \sigma_J^2} - 1 \).
The risk-neutral dynamics are obtained by setting \( \mu = r \); in Merton's original model jump risk is assumed diversifiable, so the intensity \( \lambda \) and the jump-size distribution are unchanged under \( \mathbb{Q} \). (The closed-form series price is often written with the adjusted intensity \( \lambda' = \lambda (1 + \kappa) \), which arises from the conditioning argument, not from a change of measure.)
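Conditioning on the number of jumps turns the Merton call price into a Poisson-weighted sum of Black-Scholes prices: given \( N_T = n \), \( \log S_T \) is Gaussian, with conditional spot \( S_0 e^{-\lambda \kappa T} (1+\kappa)^n \) and total variance \( \sigma^2 T + n \sigma_J^2 \). A sketch with illustrative parameters, cross-checked against Monte Carlo:

```python
import numpy as np
from math import erf, exp, log, sqrt, factorial

def bs_call(S, K, r, sigma, T):
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
    return S * Phi(d1) - K * exp(-r * T) * Phi(d2)

# Illustrative parameters
S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0
lam, mu_J, sig_J = 0.5, -0.1, 0.15
kappa = exp(mu_J + 0.5 * sig_J**2) - 1.0

# Series price: condition on the number of jumps N = n
price_series = 0.0
for n in range(40):
    w = exp(-lam * T) * (lam * T)**n / factorial(n)
    S_n = S0 * exp(-lam * kappa * T) * (1.0 + kappa)**n   # conditional forward-adjusted spot
    sig_n = sqrt(sigma**2 + n * sig_J**2 / T)             # conditional volatility
    price_series += w * bs_call(S_n, K, r, sig_n, T)

# Monte Carlo check: log S_T = log S0 + (r - sigma^2/2 - lam*kappa) T + sigma W_T + sum Y_i
rng = np.random.default_rng(2)
n_paths = 400_000
N = rng.poisson(lam * T, n_paths)
J = rng.normal(N * mu_J, np.sqrt(N) * sig_J)
logST = (log(S0) + (r - 0.5 * sigma**2 - lam * kappa) * T
         + sigma * sqrt(T) * rng.standard_normal(n_paths) + J)
price_mc = exp(-r * T) * np.mean(np.maximum(np.exp(logST) - K, 0.0))
assert abs(price_series - price_mc) < 0.15
```

Agreement between the series and the Monte Carlo estimate confirms the conditioning argument; the same series is what interviews usually expect for "Merton's formula."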
Practical Applications in Finance:
- Option Pricing: Lévy processes provide more realistic models for asset prices than geometric Brownian motion, especially for markets exhibiting jumps, skewness, and excess kurtosis. They are widely used for pricing exotic options, barrier options, and options on assets with sudden price movements (e.g., commodities, cryptocurrencies).
- Risk Management: The heavy-tailed nature of many Lévy processes makes them suitable for modeling extreme events (e.g., market crashes). Value-at-Risk (VaR) and Expected Shortfall (ES) calculations benefit from the flexibility of Lévy processes to capture tail risk.
- Credit Risk Modeling: Lévy processes are used to model default intensities and credit spreads, where jumps represent sudden changes in credit quality (e.g., downgrades or defaults).
- High-Frequency Data: Lévy processes can model the microstructure of financial markets, including order book dynamics and price impact. Their ability to generate realistic paths with jumps aligns with empirical observations in high-frequency data.
- Interest Rate Modeling: Lévy processes are employed in short-rate models and forward rate models to capture the non-normal behavior of interest rates, including jumps during monetary policy announcements.
Common Interview Questions:
- Explain the difference between a Lévy process and a Brownian motion.
Answer: A Brownian motion is a continuous Lévy process with no jumps (\( \nu = 0 \)) and normally distributed increments. A general Lévy process can have jumps (described by the Lévy measure \( \nu \)) and non-normal distributions, while still preserving independent and stationary increments.
- What is the Lévy-Khintchine representation, and why is it important?
Answer: The Lévy-Khintchine representation provides the characteristic function of a Lévy process in terms of its triplet \( (\gamma, \sigma^2, \nu) \). It is important because it uniquely characterizes the process and is used in Fourier-based option pricing methods (e.g., Carr-Madan, Lewis).
- How do you ensure that an exponential Lévy model is arbitrage-free?
Answer: The model must satisfy the martingale condition under the risk-neutral measure. For \( S_t = S_0 e^{(r - \delta)t + X_t} \), this requires:
\[ \gamma_{\mathbb{Q}} + \frac{1}{2} \sigma^2 + \int_{\mathbb{R}} (e^x - 1 - x \mathbb{I}_{|x|<1}) \nu_{\mathbb{Q}}(dx) = 0. \] This ensures \( e^{-rt} S_t \) is a martingale.
- What are the advantages of using a Variance Gamma process over a Black-Scholes model?
Answer: The VG process can capture skewness and excess kurtosis in asset returns, which the Black-Scholes model cannot. It also allows for jumps, making it more suitable for modeling assets with sudden price movements. The VG process is still analytically tractable, with a known characteristic function.
- Explain how you would calibrate a Lévy process to market option prices.
Answer: Calibration involves:
- Choosing a parametric Lévy process (e.g., VG, Merton).
- Computing model option prices using Fourier methods (e.g., Carr-Madan formula).
- Minimizing the difference between model and market prices (e.g., using least squares) to find optimal parameters.
- What is the role of the Lévy measure in a Lévy process?
Answer: The Lévy measure \( \nu \) describes the intensity and size of jumps in the process. For a Borel set \( A \), \( \nu(A) \) is the expected number of jumps per unit time whose size belongs to \( A \). It determines the jump activity and variation of the process.
Topic 32: Poisson Processes and Compound Poisson Processes
Poisson Process: A Poisson process \( \{N(t), t \geq 0\} \) is a counting process that models the number of events occurring in a fixed interval of time or space. It satisfies the following properties:
- Zero Initial Count: \( N(0) = 0 \).
- Independent Increments: The number of events occurring in disjoint intervals are independent.
- Stationary Increments: The number of events in an interval depends only on the length of the interval, not its starting point.
- Poisson Distribution: For any \( t \geq 0 \), \( N(t) \) follows a Poisson distribution with parameter \( \lambda t \), where \( \lambda > 0 \) is the intensity (or rate) of the process: \[ \mathbb{P}(N(t) = k) = \frac{e^{-\lambda t} (\lambda t)^k}{k!}, \quad k = 0, 1, 2, \dots \]
Compound Poisson Process: A compound Poisson process \( \{X(t), t \geq 0\} \) is defined as: \[ X(t) = \sum_{i=1}^{N(t)} Y_i, \] where \( \{N(t), t \geq 0\} \) is a Poisson process with intensity \( \lambda \), and \( \{Y_i, i \geq 1\} \) is a sequence of independent and identically distributed (i.i.d.) random variables, independent of \( N(t) \). The \( Y_i \) represent the sizes or magnitudes of the events.
Key Formulas for Poisson Processes:
- Mean and Variance: \[ \mathbb{E}[N(t)] = \lambda t, \quad \text{Var}(N(t)) = \lambda t. \]
- Interarrival Times: The time between consecutive events (interarrival times) \( T_1, T_2, \dots \) are i.i.d. exponential random variables with parameter \( \lambda \): \[ \mathbb{P}(T_i \leq t) = 1 - e^{-\lambda t}, \quad t \geq 0. \]
- Arrival Times: The time of the \( n \)-th event, \( S_n = T_1 + T_2 + \dots + T_n \), follows a Gamma distribution with shape \( n \) and rate \( \lambda \): \[ f_{S_n}(t) = \frac{\lambda^n t^{n-1} e^{-\lambda t}}{(n-1)!}, \quad t \geq 0. \]
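The interarrival characterization can be verified by simulation: summing \( \text{Exp}(\lambda) \) gaps and counting arrivals in \( [0, t] \) reproduces the \( \text{Poisson}(\lambda t) \) count (parameters illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
lam, t, n_paths = 0.5, 3.0, 200_000

# Exp(lam) interarrival gaps; 30 draws per path is far beyond any plausible count here
gaps = rng.exponential(1.0 / lam, size=(n_paths, 30))
arrivals = np.cumsum(gaps, axis=1)
counts = (arrivals <= t).sum(axis=1)               # N(t) per path

assert abs(counts.mean() - lam * t) < 0.02         # E[N(t)] = lambda*t = 1.5
assert abs(counts.var() - lam * t) < 0.05          # Var(N(t)) = lambda*t
assert abs(np.mean(counts == 0) - np.exp(-lam * t)) < 0.01  # P(N(t)=0) = e^{-1.5}
```

The same simulation also verifies the Gamma law of the \( n \)-th arrival time, since `arrivals[:, n-1]` is exactly \( S_n \).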
Key Formulas for Compound Poisson Processes:
- Mean and Variance: Let \( \mu = \mathbb{E}[Y_i] \) and \( \sigma^2 = \text{Var}(Y_i) \). Then: \[ \mathbb{E}[X(t)] = \lambda t \mu, \quad \text{Var}(X(t)) = \lambda t (\sigma^2 + \mu^2). \]
- Moment Generating Function (MGF): The MGF of \( X(t) \) is given by: \[ M_{X(t)}(u) = \mathbb{E}[e^{uX(t)}] = \exp\left(\lambda t (M_Y(u) - 1)\right), \] where \( M_Y(u) \) is the MGF of \( Y_i \).
- Characteristic Function: The characteristic function of \( X(t) \) is: \[ \phi_{X(t)}(u) = \mathbb{E}[e^{iuX(t)}] = \exp\left(\lambda t (\phi_Y(u) - 1)\right), \] where \( \phi_Y(u) \) is the characteristic function of \( Y_i \).
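The mean, variance, and characteristic-function formulas above can be checked by Monte Carlo for an illustrative choice of normal jump sizes (a sum of \( n \) i.i.d. \( \mathcal{N}(\mu, \sigma^2) \) variables is \( \mathcal{N}(n\mu, n\sigma^2) \), which allows vectorized sampling):

```python
import numpy as np

rng = np.random.default_rng(4)
lam, t = 2.0, 1.0
mu_Y, sig_Y = 3.0, 2.0        # E[Y] = 3, Var(Y) = 4
n_paths = 300_000

N = rng.poisson(lam * t, n_paths)
X = rng.normal(N * mu_Y, np.sqrt(N) * sig_Y)       # X(t) given N(t)=n is N(n*mu_Y, n*sig_Y^2)

assert abs(X.mean() - lam * t * mu_Y) < 0.05                 # lambda*t*mu = 6
assert abs(X.var() - lam * t * (sig_Y**2 + mu_Y**2)) < 0.3   # lambda*t*(sigma^2 + mu^2) = 26

u = 0.3
emp_cf = np.mean(np.exp(1j * u * X))
theo_cf = np.exp(lam * t * (np.exp(1j * u * mu_Y - 0.5 * sig_Y**2 * u**2) - 1))
assert abs(emp_cf - theo_cf) < 0.01
```

Note the variance is \( \lambda t \, \mathbb{E}[Y^2] = \lambda t (\sigma^2 + \mu^2) \), not \( \lambda t \sigma^2 \), a common slip in interviews.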
Example 1: Mean and Variance of a Compound Poisson Process
Suppose \( N(t) \) is a Poisson process with intensity \( \lambda = 2 \), and the jump sizes \( Y_i \) are i.i.d. with \( \mathbb{E}[Y_i] = 3 \) and \( \text{Var}(Y_i) = 4 \). Compute \( \mathbb{E}[X(t)] \) and \( \text{Var}(X(t)) \).
Solution:
Using the formulas for mean and variance:
\[ \mathbb{E}[X(t)] = \lambda t \mu = 2t \cdot 3 = 6t, \] \[ \text{Var}(X(t)) = \lambda t (\sigma^2 + \mu^2) = 2t (4 + 3^2) = 2t \cdot 13 = 26t. \]
Example 2: Probability of No Events in a Poisson Process
Let \( N(t) \) be a Poisson process with rate \( \lambda = 0.5 \). Compute the probability that no events occur in the interval \( [0, 3] \).
Solution:
The number of events in \( [0, 3] \) follows a Poisson distribution with parameter \( \lambda t = 0.5 \cdot 3 = 1.5 \). The probability of no events is:
\[ \mathbb{P}(N(3) = 0) = \frac{e^{-1.5} (1.5)^0}{0!} = e^{-1.5} \approx 0.2231. \]
Derivation: Mean of a Compound Poisson Process
Let \( X(t) = \sum_{i=1}^{N(t)} Y_i \). Using the law of total expectation:
\[ \mathbb{E}[X(t)] = \mathbb{E}\left[\mathbb{E}\left[\sum_{i=1}^{N(t)} Y_i \mid N(t)\right]\right] = \mathbb{E}\left[N(t) \mathbb{E}[Y_1]\right] = \mathbb{E}[N(t)] \mathbb{E}[Y_1] = \lambda t \mu. \]
Here, we used the independence of \( N(t) \) and \( \{Y_i\} \), and the fact that \( \mathbb{E}[N(t)] = \lambda t \).
Derivation: Variance of a Compound Poisson Process
Using the law of total variance:
\[ \text{Var}(X(t)) = \mathbb{E}[\text{Var}(X(t) \mid N(t))] + \text{Var}(\mathbb{E}[X(t) \mid N(t)]). \]
Compute each term separately:
- \( \text{Var}(X(t) \mid N(t)) = \text{Var}\left(\sum_{i=1}^{N(t)} Y_i \mid N(t)\right) = N(t) \text{Var}(Y_1) = N(t) \sigma^2 \).
- \( \mathbb{E}[X(t) \mid N(t)] = N(t) \mathbb{E}[Y_1] = N(t) \mu \).
Thus:
\[ \text{Var}(X(t)) = \mathbb{E}[N(t) \sigma^2] + \text{Var}(N(t) \mu) = \sigma^2 \mathbb{E}[N(t)] + \mu^2 \text{Var}(N(t)) = \lambda t \sigma^2 + \mu^2 \lambda t = \lambda t (\sigma^2 + \mu^2). \]
Practical Applications:
- Finance:
- Modeling jumps in asset prices (e.g., Merton's jump-diffusion model).
- Credit risk modeling (e.g., default times as Poisson events).
- Operational risk (e.g., modeling rare but severe losses).
- Insurance:
- Modeling claim arrivals and claim sizes in non-life insurance.
- Ruin theory (e.g., Cramér-Lundberg model).
- Queueing Theory:
- Modeling customer arrivals in service systems.
- Telecommunications:
- Modeling packet arrivals in networks.
Common Pitfalls and Important Notes:
- Confusing Poisson Process with Poisson Distribution:
A Poisson process is a stochastic process, while the Poisson distribution is a discrete probability distribution. The Poisson process counts events over time, and \( N(t) \) follows a Poisson distribution for each fixed \( t \).
- Memoryless Property:
The exponential distribution (interarrival times) is memoryless: \( \mathbb{P}(T_i > s + t \mid T_i > s) = \mathbb{P}(T_i > t) \). This is a key property of Poisson processes.
- Independence Assumption:
The independence of increments is crucial. Real-world processes may not always satisfy this, especially in finance where events can be clustered (e.g., market crashes).
- Stationarity:
The stationary increments property implies that the rate \( \lambda \) is constant. For time-varying rates, a non-homogeneous Poisson process is used.
- Compound Poisson Process:
Ensure that the jump sizes \( Y_i \) are independent of the Poisson process \( N(t) \). If they are dependent, the process is no longer a standard compound Poisson process.
- Thinning of Poisson Processes:
If events of a Poisson process are independently classified into types with probabilities \( p \) and \( 1-p \), the resulting processes are independent Poisson processes with rates \( p\lambda \) and \( (1-p)\lambda \). This is useful for modeling multiple event types.
- Superposition of Poisson Processes:
The superposition (sum) of independent Poisson processes is a Poisson process with rate equal to the sum of the individual rates.
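Thinning and superposition are easy to verify by simulation (illustrative rates and classification probability):

```python
import numpy as np

rng = np.random.default_rng(5)
lam, p, t, n_paths = 4.0, 0.3, 2.0, 200_000

N = rng.poisson(lam * t, n_paths)
N1 = rng.binomial(N, p)        # classify each event independently with probability p
N2 = N - N1

# Thinned streams are Poisson with rates p*lam and (1-p)*lam ...
assert abs(N1.mean() - p * lam * t) < 0.03
assert abs(N1.var() - p * lam * t) < 0.1
# ... and independent of each other:
assert abs(np.corrcoef(N1, N2)[0, 1]) < 0.01

# Superposition: sum of independent Poisson counts is Poisson with summed rate
M = rng.poisson(1.0 * t, n_paths) + rng.poisson(2.5 * t, n_paths)
assert abs(M.mean() - 3.5 * t) < 0.05
assert abs(M.var() - 3.5 * t) < 0.15
```

The near-zero correlation between the two thinned streams is the surprising part of the thinning theorem, since both are built from the same underlying count.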
Common Quant Interview Questions:
- Question: Let \( N(t) \) be a Poisson process with rate \( \lambda \). What is the distribution of the time until the first event?
Answer: The time until the first event \( T_1 \) is exponentially distributed with parameter \( \lambda \): \[ \mathbb{P}(T_1 \leq t) = 1 - e^{-\lambda t}, \quad t \geq 0. \]
- Question: Suppose \( N(t) \) is a Poisson process with rate \( \lambda \). What is \( \mathbb{E}[N(s) \mid N(t)] \) for \( s < t \)?
Answer: Given \( N(t) = n \), the \( n \) arrival times are distributed as i.i.d. uniforms on \( [0, t] \), so \( N(s) \mid N(t) \sim \text{Binomial}(N(t), s/t) \). Thus: \[ \mathbb{E}[N(s) \mid N(t)] = \frac{s}{t} N(t). \] (Note that \( N(s) \) is not independent of \( N(t) \); only the increment \( N(t) - N(s) \) is independent of \( N(s) \).) If instead \( s > t \), then: \[ \mathbb{E}[N(s) \mid N(t)] = N(t) + \mathbb{E}[N(s) - N(t)] = N(t) + \lambda (s - t). \]
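Since, conditional on \( N(t) = n \), the \( n \) arrival times are i.i.d. uniform on \( [0, t] \), we have \( N(s) \mid N(t) = n \sim \text{Binomial}(n, s/t) \). A simulation confirms \( \mathbb{E}[N(s) \mid N(t) = n] = ns/t \) (parameters illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
lam, s, t = 2.0, 1.0, 3.0
n_paths = 400_000

# Simulate arrivals on [0, t] via exponential gaps; record N(s) and N(t)
gaps = rng.exponential(1.0 / lam, size=(n_paths, 40))
arr = np.cumsum(gaps, axis=1)
Ns = (arr <= s).sum(axis=1)
Nt = (arr <= t).sum(axis=1)

for n in (3, 6, 9):
    cond = Ns[Nt == n].mean()           # estimate E[N(s) | N(t) = n]
    assert abs(cond - n * s / t) < 0.05  # equals n*s/t, not lambda*s
```

The conditional mean depends on the realized count \( N(t) \), which is why the answer is \( \frac{s}{t} N(t) \) rather than the unconditional value \( \lambda s \).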
- Question: Let \( X(t) \) be a compound Poisson process with rate \( \lambda \) and jump size distribution \( Y \). What is the probability that \( X(t) = 0 \)?
Answer: \( X(t) = 0 \) if and only if \( N(t) = 0 \). Thus: \[ \mathbb{P}(X(t) = 0) = \mathbb{P}(N(t) = 0) = e^{-\lambda t}. \]
- Question: Suppose \( N(t) \) is a Poisson process with rate \( \lambda \). What is the probability that the second event occurs before time \( t \)?
Answer: The second event occurs before time \( t \) if \( N(t) \geq 2 \). Thus: \[ \mathbb{P}(N(t) \geq 2) = 1 - \mathbb{P}(N(t) = 0) - \mathbb{P}(N(t) = 1) = 1 - e^{-\lambda t} - \lambda t e^{-\lambda t}. \]
- Question: Let \( X(t) \) be a compound Poisson process with rate \( \lambda \) and jump sizes \( Y_i \sim \text{Exp}(\mu) \). What is the distribution of \( X(t) \)?
Answer: The MGF of \( X(t) \) is: \[ M_{X(t)}(u) = \exp\left(\lambda t \left(\frac{\mu}{\mu - u} - 1\right)\right) = \exp\left(\frac{\lambda t u}{\mu - u}\right), \quad u < \mu. \] Note this is not the MGF of a Gamma distribution (which would be \( (\mu/(\mu - u))^{\lambda t} \)). Instead, \( X(t) \) has an atom at zero with \( \mathbb{P}(X(t) = 0) = e^{-\lambda t} \), and conditional on \( N(t) = n \geq 1 \), \( X(t) \sim \text{Gamma}(n, \mu) \). So \( X(t) \) is a Poisson-weighted mixture of Gamma (Erlang) distributions rather than a single Gamma.
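A simulation makes the mixture structure concrete: \( X(t) \) has an atom at zero of mass \( e^{-\lambda t} \), and its empirical MGF matches \( \exp(\lambda t u / (\mu - u)) \) but not the MGF \( (\mu/(\mu - u))^{\lambda t} \) of a single Gamma distribution (parameters illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
lam, mu, t = 1.0, 2.0, 1.5
n_paths = 400_000

N = rng.poisson(lam * t, n_paths)
# Sum of n Exp(mu) jumps is Gamma(n, mu); X(t) = 0 on {N(t) = 0}
X = rng.gamma(shape=np.maximum(N, 1), scale=1.0 / mu) * (N > 0)

assert abs(np.mean(X == 0) - np.exp(-lam * t)) < 0.01   # atom at 0: e^{-1.5} ~ 0.223

u = 0.5                                                  # u < mu
emp_mgf = np.mean(np.exp(u * X))
cp_mgf = np.exp(lam * t * u / (mu - u))                  # compound Poisson MGF
gamma_mgf = (mu / (mu - u))**(lam * t)                   # Gamma(lam*t, mu) MGF, for contrast
assert abs(emp_mgf - cp_mgf) < 0.02
assert abs(emp_mgf - gamma_mgf) > 0.05                   # X(t) is not Gamma-distributed
```

The second assertion is the point: matching the compound Poisson MGF while missing the Gamma MGF rules out a single-Gamma law for \( X(t) \).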
Topic 33: Change of Time in SDEs (Dambis-Dubins-Schwarz Theorem)
Definition (Change of Time): A change of time is a strictly increasing, continuous stochastic process \(\tau = \{\tau_t, t \geq 0\}\) such that \(\tau_t\) is a stopping time for each \(t \geq 0\) and \(\tau_t \to \infty\) almost surely as \(t \to \infty\). The process \(\tau\) is often called a time-change.
Theorem (Dambis-Dubins-Schwarz (DDS)): Let \(M = \{M_t, t \geq 0\}\) be a continuous local martingale with quadratic variation \(\langle M \rangle_t \to \infty\) almost surely as \(t \to \infty\). Then, there exists a Brownian motion \(B = \{B_t, t \geq 0\}\) (possibly on an extended probability space) such that:
\[ M_t = B_{\langle M \rangle_t}, \quad t \geq 0. \]
The process \(B\) is called the DDS Brownian motion associated with \(M\).
Key Formula (Time-Changed SDE): Consider the SDE:
\[ dX_t = \mu(t, X_t) dt + \sigma(t, X_t) dW_t, \]
where \(W_t\) is a standard Brownian motion. Let \(\tau_t\) be a time-change process. Define \(Y_t = X_{\tau_t}\). Then, under appropriate regularity conditions, \(Y_t\) satisfies the SDE:
\[ dY_t = \mu(\tau_t, Y_t) \tau'_t dt + \sigma(\tau_t, Y_t) \sqrt{\tau'_t} dB_t, \]
where \(B_t\) is a Brownian motion (the DDS Brownian motion associated with the time-changed process).
Quadratic Variation of Time-Changed Martingale: For a continuous local martingale \(M_t\) and a time-change \(\tau_t\), the quadratic variation of the time-changed process \(M_{\tau_t}\) is:
\[ \langle M_{\tau} \rangle_t = \langle M \rangle_{\tau_t}. \]
Derivation of the Dambis-Dubins-Schwarz Theorem (Sketch):
- Inverse Time-Change: Define the inverse time-change \(\rho_t = \inf \{s \geq 0 : \langle M \rangle_s > t\}\). Since \(\langle M \rangle_t \to \infty\), \(\rho_t\) is well-defined and finite for all \(t \geq 0\).
- Define the DDS Brownian Motion: Let \(B_t = M_{\rho_t}\). By the optional stopping theorem, \(B_t\) is a continuous local martingale with \(\langle B \rangle_t = t\), so by Lévy’s characterization, \(B_t\) is a Brownian motion.
- Reconstruct \(M_t\): Since \(\rho_{\langle M \rangle_t} = t\) (because \(\langle M \rangle\) is continuous and strictly increasing), we have: \[ M_t = M_{\rho_{\langle M \rangle_t}} = B_{\langle M \rangle_t}. \]
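The DDS construction can be illustrated numerically with a martingale whose quadratic variation is deterministic: for \( M_t = \int_0^t s \, dW_s \), \( \langle M \rangle_t = t^3/3 \), so \( B_u = M_{(3u)^{1/3}} \) should be a standard Brownian motion. A sketch (grid sizes illustrative) checking that the time-changed increments have mean zero and variance \( \Delta u \):

```python
import numpy as np

rng = np.random.default_rng(8)

# M_t = int_0^t s dW_s has <M>_t = t^3/3, so rho_u = (3u)^(1/3) and B_u = M_{rho_u}
n_paths, n_steps, T = 50_000, 1500, 1.5
dt = T / n_steps

u_grid = np.array([0.2, 0.4, 0.6, 0.8, 1.0])       # "business time" points
targets = np.cbrt(3.0 * u_grid)                     # rho_u, all <= T

M = np.zeros(n_paths)
samples = []
j = 0
for i in range(n_steps):
    M += (i * dt) * np.sqrt(dt) * rng.standard_normal(n_paths)  # Euler for int s dW_s
    while j < len(targets) and (i + 1) * dt >= targets[j]:
        samples.append(M.copy())                    # record B_{u_j} = M_{rho_{u_j}}
        j += 1

B = np.column_stack(samples)
incr = np.diff(B, axis=1)                           # increments over du = 0.2
assert np.all(np.abs(incr.mean(axis=0)) < 0.01)     # mean zero
assert np.all(np.abs(incr.var(axis=0) - 0.2) < 0.015)  # variance = du, as for BM
```

Sampling \( M \) on the deterministic clock \( \rho_u = (3u)^{1/3} \) "unwinds" the accelerating variance of \( M \), exactly as the inverse time-change does in the proof sketch above.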
Example: Time-Change for Geometric Brownian Motion
Consider the SDE for geometric Brownian motion:
\[ dS_t = \mu S_t dt + \sigma S_t dW_t. \]
The solution is:
\[ S_t = S_0 \exp\left(\left(\mu - \frac{\sigma^2}{2}\right)t + \sigma W_t\right). \]
Let \(\tau_t = \int_0^t \sigma^2 S_u^2 du\) be a time-change. Define \(Y_t = S_{\tau_t^{-1}}\), where \(\tau_t^{-1}\) is the inverse of \(\tau_t\). Then, \(Y_t\) satisfies:
\[ dY_t = \frac{\mu}{\sigma^2 Y_t} dt + dB_t, \]
where \(B_t\) is a Brownian motion (the DDS Brownian motion associated with \(W_{\tau_t^{-1}}\)).
Practical Applications:
- Reduction to Brownian Motion: The DDS theorem allows us to represent a continuous local martingale as a time-changed Brownian motion. This is useful for simplifying problems involving martingales, as Brownian motion has well-understood properties.
- Option Pricing: In finance, time-changes can be used to model stochastic volatility. For example, the Heston model can be viewed as a time-changed Black-Scholes model, where the time-change captures the volatility dynamics.
- Stochastic Control: Time-changes are used in optimal stopping problems and stochastic control to simplify the dynamics of the state process.
- Credit Risk Modeling: Time-changes can model the random arrival of default events, where the default time is represented as a hitting time of a time-changed process.
Important Notes and Common Pitfalls:
- Continuity of \(\langle M \rangle_t\): The DDS theorem requires \(\langle M \rangle_t\) to be continuous. If \(\langle M \rangle_t\) has jumps, the theorem does not apply directly.
- Uniqueness of DDS Brownian Motion: The DDS Brownian motion \(B_t\) is unique in law, but not pathwise unique. Different extensions of the probability space may yield different Brownian motions.
- Time-Change Invertibility: The time-change \(\tau_t\) must be strictly increasing and continuous for the inverse \(\tau_t^{-1}\) to exist. If \(\tau_t\) is not strictly increasing, the inverse may not be well-defined.
- Regularity Conditions: When applying time-changes to SDEs, ensure that the coefficients \(\mu\) and \(\sigma\) satisfy the necessary regularity conditions (e.g., Lipschitz continuity) for the time-changed SDE to be well-posed.
- Martingale Property: The DDS theorem applies to local martingales. If \(M_t\) is a true martingale, additional conditions (e.g., uniform integrability) are needed to ensure \(B_t\) is a true Brownian motion.
- Quadratic Variation Growth: The condition \(\langle M \rangle_t \to \infty\) is crucial. If \(\langle M \rangle_t\) is bounded, the DDS theorem does not apply, and the martingale may not converge to a Brownian motion.
Quant Interview Most Asked Questions:
- Explain the Dambis-Dubins-Schwarz theorem and its significance.
Answer: The DDS theorem states that any continuous local martingale \(M_t\) with \(\langle M \rangle_t \to \infty\) can be represented as a time-changed Brownian motion \(B_{\langle M \rangle_t}\). This is significant because it allows us to reduce problems involving complex martingales to problems involving Brownian motion, which is easier to analyze. Applications include option pricing in stochastic volatility models and simplifying stochastic control problems.
- How does the DDS theorem apply to the Black-Scholes model?
Answer: In the Black-Scholes model, the stock price \(S_t\) is a geometric Brownian motion. The log-price \(X_t = \log S_t\) is a Brownian motion with drift. If we introduce a stochastic volatility \(\sigma_t\), the quadratic variation of \(X_t\) becomes \(\langle X \rangle_t = \int_0^t \sigma_u^2 du\). The DDS theorem implies that \(X_t = B_{\langle X \rangle_t}\), where \(B_t\) is a Brownian motion. This shows that the Black-Scholes model with stochastic volatility can be viewed as a time-changed Black-Scholes model with constant volatility.
- Derive the SDE for a time-changed process \(Y_t = X_{\tau_t}\), where \(X_t\) satisfies \(dX_t = \mu dt + \sigma dW_t\) and \(\tau_t\) is a time-change.
Answer: By the DDS theorem, \(W_t = B_{\langle W \rangle_t}\), where \(B_t\) is a Brownian motion. For the time-change \(\tau_t\), we have: \[ X_{\tau_t} = X_0 + \int_0^{\tau_t} \mu ds + \int_0^{\tau_t} \sigma dW_s. \] The second integral can be written as \(\int_0^t \sigma \sqrt{\tau'_s} dB_s\) by the DDS theorem, where \(B_t\) is a Brownian motion. Thus: \[ dY_t = \mu \tau'_t dt + \sigma \sqrt{\tau'_t} dB_t. \]
- What is the role of the inverse time-change in the DDS theorem?
Answer: The inverse time-change \(\rho_t = \inf \{s \geq 0 : \langle M \rangle_s > t\}\) is used to define the DDS Brownian motion \(B_t = M_{\rho_t}\). The inverse ensures that the quadratic variation of \(B_t\) is \(t\), which is necessary for \(B_t\) to be a Brownian motion by Lévy’s characterization. The inverse time-change "unwinds" the time distortion introduced by \(\langle M \rangle_t\).
- Can the DDS theorem be applied to a martingale with bounded quadratic variation? Why or why not?
Answer: No, the DDS theorem requires \(\langle M \rangle_t \to \infty\) as \(t \to \infty\). If \(\langle M \rangle_t\) is bounded, the inverse time-change \(\rho_t\) may not be well-defined for all \(t \geq 0\), and the DDS Brownian motion \(B_t\) would not be defined for all \(t\). In this case, \(M_t\) may converge to a limit rather than behaving like a Brownian motion.
Topic 34: Stochastic Differential Equations with Jumps (Kunita's Lemma)
Stochastic Differential Equations (SDEs) with Jumps: These are SDEs that incorporate discontinuous jumps in the process, typically modeled using a Poisson random measure to describe the jump arrivals and magnitudes. They generalize the standard SDE framework to account for sudden, discrete changes in the state variable.
Lévy Process: A stochastic process \( L_t \) is called a Lévy process if it has stationary and independent increments, is continuous in probability, and \( L_0 = 0 \) almost surely. Lévy processes can include both continuous (Brownian motion) and discontinuous (jump) components.
Poisson Random Measure: Let \( (\Omega, \mathcal{F}, \mathbb{P}) \) be a probability space, and let \( (E, \mathcal{E}) \) be a measurable space. A Poisson random measure \( N \) on \( E \times [0, \infty) \) with intensity measure \( \nu \) is a random measure such that:
- For each \( A \in \mathcal{E} \), \( N(A \times [0, t]) \) is a Poisson process with intensity \( \nu(A) \).
- For disjoint sets \( A_1, \dots, A_n \in \mathcal{E} \), the processes \( N(A_1 \times [0, t]), \dots, N(A_n \times [0, t]) \) are independent.
The compensated Poisson random measure is defined as \( \tilde{N}(dt, dz) = N(dt, dz) - \nu(dz) dt \).
Kunita’s Lemma: Kunita’s Lemma provides a change of variable formula (Itô’s formula) for processes driven by Lévy processes or, more generally, semimartingales with jumps. It extends the classical Itô formula to account for the jump component of the process.
Key Formulas
General Form of an SDE with Jumps:
\[ dX_t = \mu(t, X_t) dt + \sigma(t, X_t) dW_t + \int_E \gamma(t, X_{t-}, z) \tilde{N}(dt, dz), \]
where:
- \( X_t \) is the state process,
- \( \mu(t, X_t) \) is the drift coefficient,
- \( \sigma(t, X_t) \) is the diffusion coefficient,
- \( W_t \) is a standard Brownian motion,
- \( \tilde{N}(dt, dz) \) is the compensated Poisson random measure,
- \( \gamma(t, X_{t-}, z) \) is the jump coefficient, and
- \( X_{t-} \) denotes the left limit of \( X \) at time \( t \).
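A minimal Euler-Maruyama sketch for a finite-activity special case of the SDE above, where the jump part is compound Poisson and each jump multiplies the state by \( e^J \); the simulated mean is checked against \( \mathbb{E}[X_T] = X_0 e^{(\mu + \lambda \kappa)T} \) with \( \kappa = \mathbb{E}[e^J - 1] \) (all parameters and the model itself are illustrative):

```python
import numpy as np

rng = np.random.default_rng(9)

# Illustrative jump SDE: dX_t = mu*X_t dt + sigma*X_t dW_t + X_{t-}(e^J - 1) dN_t
X0, mu, sigma, T = 1.0, 0.05, 0.2, 1.0
lam, mu_J, sig_J = 1.0, -0.1, 0.2
kappa = np.exp(mu_J + 0.5 * sig_J**2) - 1.0             # E[e^J - 1]

n_paths, n_steps = 100_000, 200
dt = T / n_steps
X = np.full(n_paths, X0)
for _ in range(n_steps):
    dN = rng.poisson(lam * dt, n_paths)                 # jumps arriving in this step
    J = rng.normal(dN * mu_J, np.sqrt(dN) * sig_J)      # total log jump size (0 if dN = 0)
    dW = np.sqrt(dt) * rng.standard_normal(n_paths)
    X = X * (1.0 + mu * dt + sigma * dW) * np.exp(J)    # diffusion Euler step, then jumps

# Uncompensated jumps contribute lam*kappa to the exponential growth rate
assert abs(X.mean() - X0 * np.exp((mu + lam * kappa) * T)) < 0.005
```

Writing the jump integral against \( \tilde{N} \) instead of \( N \) would remove the \( \lambda \kappa \) term from the mean growth rate, which is exactly the compensator adjustment discussed above.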
Itô’s Formula for SDEs with Jumps (Kunita’s Lemma):
Let \( X_t \) be a semimartingale with jumps, given by:
\[ dX_t = \mu_t dt + \sigma_t dW_t + \int_E \gamma(t, z) \tilde{N}(dt, dz). \]
For a twice continuously differentiable function \( f(t, x) \), the change of variable formula is:
\[ \begin{aligned} f(t, X_t) = & f(0, X_0) + \int_0^t \frac{\partial f}{\partial s}(s, X_s) ds + \int_0^t \frac{\partial f}{\partial x}(s, X_s) \mu_s ds \\ & + \int_0^t \frac{\partial f}{\partial x}(s, X_s) \sigma_s dW_s + \frac{1}{2} \int_0^t \frac{\partial^2 f}{\partial x^2}(s, X_s) \sigma_s^2 ds \\ & + \int_0^t \int_E \left[ f(s, X_{s-} + \gamma(s, z)) - f(s, X_{s-}) - \frac{\partial f}{\partial x}(s, X_{s-}) \gamma(s, z) \right] \nu(dz) ds \\ & + \int_0^t \int_E \left[ f(s, X_{s-} + \gamma(s, z)) - f(s, X_{s-}) \right] \tilde{N}(ds, dz). \end{aligned} \]
Characteristic Function of a Lévy Process:
The characteristic function \( \phi_{L_t}(u) = \mathbb{E}[e^{i u L_t}] \) of a Lévy process \( L_t \) is given by the Lévy-Khintchine formula:
\[ \phi_{L_t}(u) = \exp \left( t \left[ i u b - \frac{1}{2} u^2 \sigma^2 + \int_{\mathbb{R}} \left( e^{i u z} - 1 - i u z \mathbf{1}_{|z| \leq 1} \right) \nu(dz) \right] \right), \] where \( b \in \mathbb{R} \), \( \sigma \geq 0 \), and \( \nu \) is the Lévy measure satisfying \( \int_{\mathbb{R}} (1 \wedge z^2) \nu(dz) < \infty \).
Derivations
Derivation of Kunita’s Lemma (Sketch):
The proof of Kunita’s Lemma relies on the following steps:
- Decomposition of the Process: Write \( X_t \) as the sum of a continuous semimartingale \( X_t^c \) and a pure jump process \( X_t^d \): \[ X_t = X_t^c + X_t^d, \] where \( X_t^c = \int_0^t \mu_s ds + \int_0^t \sigma_s dW_s \) and \( X_t^d = \int_0^t \int_E \gamma(s, z) \tilde{N}(ds, dz) \).
- Apply Itô’s Formula to the Continuous Part: For the continuous part \( X_t^c \), the classical Itô formula applies: \[ f(t, X_t^c) = f(0, X_0) + \int_0^t \frac{\partial f}{\partial s}(s, X_s^c) ds + \int_0^t \frac{\partial f}{\partial x}(s, X_s^c) dX_s^c + \frac{1}{2} \int_0^t \frac{\partial^2 f}{\partial x^2}(s, X_s^c) d\langle X^c \rangle_s. \]
- Account for the Jump Part: The jump part \( X_t^d \) contributes two additional terms:
- A compensator term (integral with respect to \( \nu(dz) ds \)) accounting for the "average" effect of jumps.
- A martingale term (integral with respect to \( \tilde{N}(ds, dz) \)) accounting for the randomness of the jumps.
- Combine the Terms: The final formula is obtained by combining the contributions from the continuous and jump parts, and noting that \( X_s = X_s^c + X_s^d \). The cross-variation terms between \( X^c \) and \( X^d \) vanish because the jumps are orthogonal to the continuous martingale part.
Practical Applications
1. Modeling Asset Prices with Jumps:
In financial mathematics, SDEs with jumps are used to model asset prices that exhibit sudden, large movements (e.g., due to earnings announcements or market crashes). A common model is the Merton jump-diffusion model:
\[ \frac{dS_t}{S_{t-}} = \mu dt + \sigma dW_t + dJ_t, \] where \( J_t = \sum_{i=1}^{N_t} (Y_i - 1) \), \( N_t \) is a Poisson process with intensity \( \lambda \), and \( Y_i \) are i.i.d. random variables representing the jump sizes. Kunita’s Lemma can be used to derive the dynamics of option prices or other derivatives in this model.
2. Risk Management and Hedging:
In the presence of jumps, delta-hedging strategies must account for the possibility of discontinuous price movements. Kunita’s Lemma helps derive the correct hedging ratios by providing the dynamics of the hedging portfolio under the jump-diffusion model. For example, the change in the value of a portfolio \( \Pi_t \) can be written as:
\[ d\Pi_t = \frac{\partial \Pi}{\partial t} dt + \frac{\partial \Pi}{\partial S} dS_t^c + \frac{1}{2} \frac{\partial^2 \Pi}{\partial S^2} \sigma^2 S_t^2 dt + \int_E \left[ \Pi(t, S_{t-} + \gamma(t, z)) - \Pi(t, S_{t-}) \right] N(dt, dz), \] where \( S_t^c \) denotes the continuous part of \( S_t \), so that jumps are not double-counted between the \( dS_t \) term and the jump integral. This formula is essential for computing the Greeks (e.g., delta, gamma) in jump-diffusion models.
3. Credit Risk Modeling:
In credit risk, the default of a firm can be modeled as a jump process. The firm’s asset value \( V_t \) may follow a jump-diffusion SDE:
\[ dV_t = \mu V_t dt + \sigma V_t dW_t + V_{t-} dJ_t, \] where \( J_t \) is a compound Poisson process representing sudden drops in asset value (e.g., due to default). Kunita’s Lemma is used to derive the dynamics of credit derivatives (e.g., credit default swaps) written on \( V_t \).
Common Pitfalls and Important Notes
1. Left Limits and Predictability:
In SDEs with jumps, it is crucial to use the left limit \( X_{t-} \) in the integrands (e.g., in \( \gamma(t, X_{t-}, z) \)) to ensure that the integrals are well-defined and predictable. Using \( X_t \) instead of \( X_{t-} \) can lead to anticipative integrals and incorrect results.
2. Compensated vs. Uncompensated Poisson Measure:
The compensated Poisson random measure \( \tilde{N}(dt, dz) = N(dt, dz) - \nu(dz) dt \) is a martingale, while the uncompensated measure \( N(dt, dz) \) is not. When applying Kunita’s Lemma, the compensator term (integral with respect to \( \nu(dz) dt \)) must be included to account for the "average" effect of jumps. Omitting this term leads to incorrect drift calculations.
3. Differentiability Assumptions:
Kunita’s Lemma requires the function \( f(t, x) \) to be twice continuously differentiable in \( x \) and once in \( t \). If \( f \) is not smooth (e.g., \( f(x) = \max(x - K, 0) \) for a call option), the formula must be applied with care, often using smoothing techniques or localization arguments.
4. Quadratic Variation and Jumps:
The quadratic variation of a process with jumps includes contributions from both the continuous and jump parts. For a semimartingale \( X_t \), the quadratic variation is:
\[ [X]_t = \int_0^t \sigma_s^2 ds + \int_0^t \int_E \gamma(s, z)^2 N(ds, dz). \] This is important for computing the variance of stochastic integrals and for applications in stochastic control.
5. Lévy Measure and Moment Conditions:
The Lévy measure \( \nu \) must satisfy \( \int_{\mathbb{R}} (1 \wedge z^2) \nu(dz) < \infty \) for the process to be well-defined. If \( \int_{|z| > 1} |z| \nu(dz) = \infty \), the process has infinite variation, and additional care is needed in defining the SDE. For example, the Merton model assumes \( \mathbb{E}[Y_i] < \infty \), which implies \( \int_{|z| > 1} |z| \nu(dz) < \infty \).
6. Connection to Itô’s Formula:
Kunita's Lemma reduces to the classical Itô formula when the jump component is absent (i.e., \( \gamma(t, z) = 0 \)). This highlights the generality of Kunita's result as an extension of Itô's formula to processes with jumps.
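A minimal Monte Carlo sketch of the Merton jump-diffusion model above. The parameter values and the lognormal jump law are illustrative assumptions; the check is that the compensated drift reproduces the closed-form mean \( \mathbb{E}[S_T] = S_0 e^{\mu T + \lambda T(\mathbb{E}[Y]-1)} \):

```python
import numpy as np

rng = np.random.default_rng(0)

# Merton jump-diffusion, simulated directly at maturity:
# S_T = S_0 exp((mu - sigma^2/2) T + sigma W_T) * prod_{i <= N_T} Y_i,
# with N_T ~ Poisson(lambda T) and log Y_i ~ N(a, b^2) (illustrative law).
S0, mu, sigma, lam, T = 1.0, 0.05, 0.2, 0.5, 1.0
a, b = -0.1, 0.15                     # log-jump mean and std (assumed)
n_paths = 200_000

W_T = rng.standard_normal(n_paths) * np.sqrt(T)
N_T = rng.poisson(lam * T, n_paths)
# sum of N_T i.i.d. normal log-jumps has law N(N_T * a, N_T * b^2)
log_J = a * N_T + b * np.sqrt(N_T) * rng.standard_normal(n_paths)
S_T = S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * W_T + log_J)

# Closed-form mean: E[S_T] = S0 exp(mu T + lambda T (E[Y] - 1)),
# with E[Y] = exp(a + b^2 / 2) for lognormal jumps.
EY = np.exp(a + 0.5 * b**2)
exact = S0 * np.exp(mu * T + lam * T * (EY - 1.0))
print(S_T.mean(), exact)
```

The Monte Carlo mean should match the closed form up to sampling error of order \(10^{-3}\).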
Topic 35: Forward SDEs (FSDEs) and Backward SDEs (BSDEs)
Stochastic Differential Equations (SDEs): SDEs are differential equations in which one or more of the terms is a stochastic process, resulting in a solution that is also a stochastic process. They are used to model systems that evolve over time with inherent randomness.
Forward Stochastic Differential Equations (FSDEs): FSDEs are SDEs of the form:
\[ dX_t = \mu(t, X_t) dt + \sigma(t, X_t) dW_t, \quad X_0 = x_0, \]where \(X_t\) is the state variable, \(\mu(t, X_t)\) is the drift term, \(\sigma(t, X_t)\) is the diffusion term, and \(W_t\) is a Wiener process (Brownian motion). FSDEs describe the evolution of a process forward in time from a known initial condition.
Backward Stochastic Differential Equations (BSDEs): BSDEs are SDEs of the form:
\[ -dY_t = f(t, Y_t, Z_t) dt - Z_t dW_t, \quad Y_T = \xi, \]where \(Y_t\) is the state variable, \(f(t, Y_t, Z_t)\) is the driver function, \(Z_t\) is the control process, and \(\xi\) is a random variable representing the terminal condition at time \(T\). BSDEs describe the evolution of a process backward in time from a known terminal condition.
General Form of FSDE:
\[ dX_t = \mu(t, X_t) dt + \sigma(t, X_t) dW_t, \quad t \in [0, T], \quad X_0 = x_0. \]Here, \(X_t\) is an \(\mathbb{R}^n\)-valued stochastic process, \(\mu: [0, T] \times \mathbb{R}^n \to \mathbb{R}^n\) is the drift coefficient, \(\sigma: [0, T] \times \mathbb{R}^n \to \mathbb{R}^{n \times d}\) is the diffusion coefficient, and \(W_t\) is a \(d\)-dimensional Brownian motion.
General Form of BSDE:
\[ -dY_t = f(t, Y_t, Z_t) dt - Z_t dW_t, \quad t \in [0, T], \quad Y_T = \xi. \]Here, \(Y_t\) is an \(\mathbb{R}^m\)-valued stochastic process, \(f: [0, T] \times \mathbb{R}^m \times \mathbb{R}^{m \times d} \to \mathbb{R}^m\) is the driver function, \(Z_t\) is an \(\mathbb{R}^{m \times d}\)-valued process, and \(\xi\) is an \(\mathcal{F}_T\)-measurable \(\mathbb{R}^m\)-valued random variable.
Existence and Uniqueness of FSDE Solutions (Lipschitz Conditions):
The FSDE \(dX_t = \mu(t, X_t) dt + \sigma(t, X_t) dW_t\) has a unique strong solution if the following conditions hold:
- Lipschitz condition: There exists a constant \(K > 0\) such that for all \(t \in [0, T]\), \(x, y \in \mathbb{R}^n\), \[ \|\mu(t, x) - \mu(t, y)\| + \|\sigma(t, x) - \sigma(t, y)\| \leq K \|x - y\|. \]
- Linear growth condition: There exists a constant \(C > 0\) such that for all \(t \in [0, T]\), \(x \in \mathbb{R}^n\), \[ \|\mu(t, x)\| + \|\sigma(t, x)\| \leq C (1 + \|x\|). \]
Existence and Uniqueness of BSDE Solutions (Pardoux-Peng Theorem):
The BSDE \(-dY_t = f(t, Y_t, Z_t) dt - Z_t dW_t\) with terminal condition \(Y_T = \xi\) has a unique adapted solution \((Y_t, Z_t)\) if:
- The terminal condition \(\xi\) is square-integrable: \(\mathbb{E}[\|\xi\|^2] < \infty\).
- The driver \(f(t, y, z)\) is Lipschitz in \(y\) and \(z\): There exists a constant \(K > 0\) such that for all \(t \in [0, T]\), \(y, y' \in \mathbb{R}^m\), \(z, z' \in \mathbb{R}^{m \times d}\), \[ \|f(t, y, z) - f(t, y', z')\| \leq K (\|y - y'\| + \|z - z'\|). \]
- The driver \(f(t, 0, 0)\) is square-integrable: \(\mathbb{E}\left[\int_0^T \|f(t, 0, 0)\|^2 dt\right] < \infty\).
Example: Geometric Brownian Motion (FSDE)
The Geometric Brownian Motion (GBM) is a classic example of an FSDE, given by:
\[ dS_t = \mu S_t dt + \sigma S_t dW_t, \quad S_0 = s_0, \]where \(\mu\) is the drift, \(\sigma\) is the volatility, and \(W_t\) is a Brownian motion. The solution to this FSDE is:
\[ S_t = s_0 \exp\left(\left(\mu - \frac{\sigma^2}{2}\right)t + \sigma W_t\right). \]
Derivation:
- Apply Itô's formula to \(f(S_t) = \log(S_t)\): \[ d(\log S_t) = \frac{1}{S_t} dS_t - \frac{1}{2 S_t^2} (dS_t)^2. \]
- Substitute \(dS_t = \mu S_t dt + \sigma S_t dW_t\) and \((dS_t)^2 = \sigma^2 S_t^2 dt\): \[ d(\log S_t) = \left(\mu - \frac{\sigma^2}{2}\right) dt + \sigma dW_t. \]
- Integrate both sides from \(0\) to \(t\): \[ \log S_t - \log s_0 = \left(\mu - \frac{\sigma^2}{2}\right) t + \sigma W_t. \]
- Exponentiate both sides to obtain the solution: \[ S_t = s_0 \exp\left(\left(\mu - \frac{\sigma^2}{2}\right) t + \sigma W_t\right). \]
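The closed-form solution can be checked against an Euler-Maruyama discretization driven by the same Brownian increments; a sketch with illustrative parameter values:

```python
import numpy as np

rng = np.random.default_rng(1)

# Compare the closed-form GBM solution with Euler-Maruyama on one path
# built from the same increments (parameters are illustrative).
s0, mu, sigma, T, n = 1.0, 0.1, 0.3, 1.0, 10_000
dt = T / n
dW = rng.standard_normal(n) * np.sqrt(dt)
W = np.cumsum(dW)

# Closed form along the path: S_t = s0 exp((mu - sigma^2/2) t + sigma W_t)
t = dt * np.arange(1, n + 1)
S_exact = s0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * W)

# Euler-Maruyama: S_{k+1} = S_k (1 + mu dt + sigma dW_k)
S = s0
for k in range(n):
    S = S * (1.0 + mu * dt + sigma * dW[k])

rel_err = abs(S - S_exact[-1]) / S_exact[-1]
print(S, S_exact[-1], rel_err)   # small strong-discretization error
```

The two terminal values agree up to the Euler-Maruyama strong error, which vanishes as \( dt \to 0 \).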
Example: Linear BSDE
Consider the linear BSDE:
\[ -dY_t = (\alpha_t Y_t + \beta_t Z_t + \gamma_t) dt - Z_t dW_t, \quad Y_T = \xi, \]where \(\alpha_t\), \(\beta_t\), and \(\gamma_t\) are adapted processes. The solution to this BSDE is given by:
\[ Y_t = \mathbb{E}\left[\xi \Gamma_t^T + \int_t^T \gamma_s \Gamma_t^s ds \mid \mathcal{F}_t\right], \]where \(\Gamma_t^s\) (for \(s \geq t\)) is the adjoint process, evolving in \(s\) according to:
\[ d_s\Gamma_t^s = \Gamma_t^s (\alpha_s ds + \beta_s dW_s), \quad \Gamma_t^t = 1. \]
Derivation Sketch:
- Define the process \(\Gamma_t^s\) as the solution to the FSDE above, started from \(\Gamma_t^t = 1\).
- Apply Itô's product rule to \(Y_t \Gamma_t^0\) and take expectations to derive the solution for \(Y_t\).
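A Monte Carlo sanity check of the adjoint representation, specialized to constant coefficients \(\alpha, \beta\), \(\gamma_t = 0\), and the illustrative terminal condition \(\xi = W_T\); a Girsanov shift gives the closed form \(Y_0 = \beta T e^{\alpha T}\):

```python
import numpy as np

rng = np.random.default_rng(2)

# Adjoint representation Y_0 = E[xi * Gamma_0^T] for the linear BSDE with
# constant alpha, beta, gamma = 0, xi = W_T (illustrative choices).
# Gamma_0^T = exp((alpha - beta^2/2) T + beta W_T) solves the adjoint FSDE.
alpha, beta, T = 0.3, 0.4, 1.0
n_paths = 500_000

W_T = rng.standard_normal(n_paths) * np.sqrt(T)
Gamma = np.exp((alpha - 0.5 * beta**2) * T + beta * W_T)
Y0_mc = np.mean(W_T * Gamma)

# Girsanov shift: E[W_T exp(beta W_T - beta^2 T / 2)] = beta T,
# so Y_0 = beta * T * exp(alpha * T).
Y0_exact = beta * T * np.exp(alpha * T)
print(Y0_mc, Y0_exact)
```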
Feynman-Kac Theorem (Connection between PDEs and BSDEs):
Let \(u(t, x)\) be the solution to the parabolic PDE:
\[ \frac{\partial u}{\partial t} + \mathcal{L}u + f(t, x, u, \sigma^T \nabla u) = 0, \quad u(T, x) = g(x), \]where \(\mathcal{L}\) is the infinitesimal generator of the diffusion \(X_t\):
\[ \mathcal{L} = \sum_{i=1}^n \mu_i(t, x) \frac{\partial}{\partial x_i} + \frac{1}{2} \sum_{i,j=1}^n (\sigma \sigma^T)_{ij}(t, x) \frac{\partial^2}{\partial x_i \partial x_j}. \]Then, \(u(t, x)\) can be represented as:
\[ u(t, x) = Y_t^{t,x}, \]where \(Y_t^{t,x}\) is the solution to the BSDE:
\[ -dY_s^{t,x} = f(s, X_s^{t,x}, Y_s^{t,x}, Z_s^{t,x}) ds - Z_s^{t,x} dW_s, \quad Y_T^{t,x} = g(X_T^{t,x}), \]and \(X_s^{t,x}\) is the solution to the FSDE starting at \(x\) at time \(t\):
\[ dX_s^{t,x} = \mu(s, X_s^{t,x}) ds + \sigma(s, X_s^{t,x}) dW_s, \quad X_t^{t,x} = x. \]
Practical Applications:
- Finance (Option Pricing):
BSDEs are used to price European and American options, especially in incomplete markets or when the underlying asset follows a complex stochastic process. The terminal condition \(\xi\) represents the payoff of the option, and the solution \(Y_t\) gives the option price at time \(t\).
- Finance (Hedging):
The process \(Z_t\) in a BSDE provides the hedging strategy for replicating the contingent claim \(\xi\). For example, in the Black-Scholes model, \(Z_t\) corresponds to the delta of the option.
- Stochastic Control:
BSDEs arise in stochastic control problems, where the goal is to minimize a cost functional. The solution to the BSDE provides the value function and the optimal control.
- Risk Management:
BSDEs are used to compute risk measures such as Value at Risk (VaR) and Conditional Value at Risk (CVaR) in a dynamic setting.
- Physics and Engineering:
FSDEs model systems with noise, such as particle motion in a fluid (Brownian motion) or signal processing in the presence of noise.
Common Pitfalls and Important Notes:
- Adaptedness:
Solutions to BSDEs must be adapted to the filtration generated by the Brownian motion. This means that \(Y_t\) and \(Z_t\) cannot depend on future information. This is a common point of confusion when first working with BSDEs.
- Terminal vs. Initial Conditions:
FSDEs are solved forward in time from an initial condition, while BSDEs are solved backward in time from a terminal condition. Mixing these up can lead to incorrect interpretations of the solution.
- Lipschitz Conditions:
The existence and uniqueness of solutions to both FSDEs and BSDEs rely on Lipschitz conditions. If these conditions are not satisfied (e.g., in models with jumps or singular coefficients), more advanced techniques are required.
- Numerical Methods:
Solving BSDEs numerically is more challenging than solving FSDEs due to the backward nature of the problem. Common methods include the backward Euler scheme and the four-step scheme (for coupled FSDE-BSDE systems).
- Coupled FSDE-BSDE Systems:
In many applications (e.g., stochastic control), the FSDE and BSDE are coupled, meaning the solution to one affects the other. These systems require careful analysis and often involve fixed-point arguments for existence and uniqueness.
- Driver Dependence on \(Z_t\):
In BSDEs, the driver \(f(t, Y_t, Z_t)\) can depend on \(Z_t\), which introduces additional complexity. This dependence is crucial in applications like option pricing with nonlinear pricing rules (e.g., in the presence of default risk).
- Martingale Representation Theorem:
The existence of \(Z_t\) in BSDEs is guaranteed by the martingale representation theorem, which states that any square-integrable martingale can be represented as a stochastic integral with respect to the Brownian motion.
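A minimal sketch of the explicit backward Euler scheme mentioned above, on a spatial grid with Gauss-Hermite quadrature for the conditional expectations. It is applied to the linear BSDE \(-dY_t = (Y_t + Z_t)dt - Z_t dW_t\), \(Y_T = W_T\), whose exact solution \(Y_t = e^{T-t}(W_t + T - t)\) is derived in the interview question below; the grid bounds, node counts, and step count are illustrative choices:

```python
import numpy as np

# Explicit backward Euler for -dY = (Y + Z) dt - Z dW, Y_T = W_T.
# Exact solution: Y_t = e^{T-t}(W_t + T - t), so Y_0 = T e^T at W_0 = 0.
T, N = 1.0, 100
dt = T / N
x = np.linspace(-10.0, 10.0, 2001)            # grid for the state W_t
nodes, wts = np.polynomial.hermite.hermgauss(10)
z = np.sqrt(2.0) * nodes                      # standard normal quadrature nodes
w = wts / np.sqrt(np.pi)                      # corresponding probabilities

Y = x.copy()                                  # terminal condition g(x) = x
for _ in range(N):
    # E[Y_{t+dt} | W_t = x] and Z_t(x) = E[Y_{t+dt} * dW | W_t = x] / dt
    shifted = np.array([np.interp(x + np.sqrt(dt) * zk, x, Y) for zk in z])
    Ybar = w @ shifted
    Z = (w * np.sqrt(dt) * z) @ shifted / dt
    Y = Ybar + dt * (Ybar + Z)                # one explicit Euler step backward

Y0 = float(np.interp(0.0, x, Y))
print(Y0)   # ≈ 2.68 for N = 100; converges to T e^T ≈ 2.718 as N grows
```

The \(O(dt)\) bias of the explicit step is visible; halving \(dt\) halves the gap to \(T e^T\).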
Quant Interview Question: Solving a Simple BSDE
Question: Consider the BSDE:
\[ -dY_t = (Y_t + Z_t) dt - Z_t dW_t, \quad Y_T = W_T. \]Find the explicit solution \((Y_t, Z_t)\).
Solution:
- Assume a solution of the form \(Y_t = a_t W_t + b_t\), where \(a_t\) and \(b_t\) are deterministic functions of time.
- Apply Itô's formula to \(Y_t\): \[ dY_t = a_t dW_t + W_t da_t + db_t. \]
- Substitute into the BSDE: \[ - (a_t dW_t + W_t da_t + db_t) = (a_t W_t + b_t + Z_t) dt - Z_t dW_t. \]
- Equate the coefficients of \(dW_t\) and \(dt\):
- Coefficient of \(dW_t\): \(-a_t = -Z_t \implies Z_t = a_t\).
- Coefficient of \(dt\): \(-W_t da_t - db_t = a_t W_t + b_t + Z_t = a_t W_t + b_t + a_t\).
- Equate the coefficients of \(W_t\) and the constant terms in the \(dt\) equation:
- Coefficient of \(W_t\): \(-da_t = a_t dt \implies \frac{da_t}{dt} = -a_t \implies a_t = C e^{-t}\).
- Constant term: \(-db_t = (b_t + a_t) dt \implies \frac{db_t}{dt} = -b_t - a_t = -b_t - C e^{-t}\).
- Solve the ODE for \(a_t\) with terminal condition \(Y_T = W_T \implies a_T = 1\): \[ a_T = C e^{-T} = 1 \implies C = e^T \implies a_t = e^{T - t}. \]
- Solve the ODE for \(b_t\) with terminal condition \(Y_T = W_T \implies b_T = 0\): \[ \frac{db_t}{dt} + b_t = -e^{T - t}, \quad b_T = 0. \]
- Solve for \(b_t\) using the integrating factor \(e^t\): \[ \frac{d}{dt}(b_t e^t) = -e^{T} \implies b_t e^t = -e^{T} t + D. \] Using \(b_T = 0\): \(0 = -e^{T} T + D \implies D = e^{T} T\). Thus: \[ b_t = e^{T - t} (T - t). \]
- Combine the results to obtain the solution: \[ Y_t = e^{T - t} W_t + e^{T - t} (T - t), \quad Z_t = e^{T - t}. \]
Quant Interview Question: FSDE and Change of Measure
Question: Consider the FSDE under the real-world measure \(\mathbb{P}\):
\[ dS_t = \mu S_t dt + \sigma S_t dW_t^{\mathbb{P}}. \]Define the risk-neutral measure \(\mathbb{Q}\) via the Radon-Nikodym derivative:
\[ \frac{d\mathbb{Q}}{d\mathbb{P}} \bigg|_{\mathcal{F}_t} = \exp\left(-\frac{\mu - r}{\sigma} W_t^{\mathbb{P}} - \frac{1}{2} \left(\frac{\mu - r}{\sigma}\right)^2 t\right), \]where \(r\) is the risk-free rate. Show that under \(\mathbb{Q}\), the FSDE becomes:
\[ dS_t = r S_t dt + \sigma S_t dW_t^{\mathbb{Q}}, \]where \(W_t^{\mathbb{Q}}\) is a \(\mathbb{Q}\)-Brownian motion.
Solution:
- By Girsanov's theorem, the process \[ W_t^{\mathbb{Q}} = W_t^{\mathbb{P}} + \frac{\mu - r}{\sigma} t \] is a \(\mathbb{Q}\)-Brownian motion.
- Substitute \(W_t^{\mathbb{P}} = W_t^{\mathbb{Q}} - \frac{\mu - r}{\sigma} t\) into the original FSDE: \[ dS_t = \mu S_t dt + \sigma S_t \left(dW_t^{\mathbb{Q}} - \frac{\mu - r}{\sigma} dt\right). \]
- Simplify the drift term: \[ dS_t = \mu S_t dt + \sigma S_t dW_t^{\mathbb{Q}} - (\mu - r) S_t dt = r S_t dt + \sigma S_t dW_t^{\mathbb{Q}}. \]
- Thus, under \(\mathbb{Q}\), the FSDE is: \[ dS_t = r S_t dt + \sigma S_t dW_t^{\mathbb{Q}}, \] as claimed.
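A Monte Carlo sketch of the measure change: simulate \(S_T\) under \(\mathbb{P}\), reweight by the Radon-Nikodym density, and verify the risk-neutral mean \(\mathbb{E}^{\mathbb{Q}}[S_T] = S_0 e^{rT}\) (parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate S_T under P, reweight by dQ/dP, and check E^Q[S_T] = S0 e^{rT}.
S0, mu, r, sigma, T = 1.0, 0.12, 0.03, 0.25, 1.0
theta = (mu - r) / sigma               # market price of risk
n_paths = 1_000_000

W_P = rng.standard_normal(n_paths) * np.sqrt(T)
S_T = S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * W_P)
dQ_dP = np.exp(-theta * W_P - 0.5 * theta**2 * T)   # Radon-Nikodym density

q_mean = np.mean(S_T * dQ_dP)
print(q_mean, S0 * np.exp(r * T))      # both ≈ exp(0.03) ≈ 1.0305
```

The density has \(\mathbb{P}\)-mean 1, and reweighting replaces the drift \(\mu\) by \(r\) in the mean.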
Topic 36: Nonlinear Feynman-Kac Representation for BSDEs
Backward Stochastic Differential Equation (BSDE): A BSDE is an equation of the form
\[ -dY_t = f(t, Y_t, Z_t)dt - Z_t dW_t, \quad Y_T = \xi, \] where:
- \((W_t)_{t \geq 0}\) is a standard Brownian motion on a filtered probability space \((\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})\),
- \(\xi\) is an \(\mathcal{F}_T\)-measurable random variable (terminal condition),
- \(f: [0,T] \times \mathbb{R} \times \mathbb{R}^d \to \mathbb{R}\) is the driver function,
- \((Y_t, Z_t)\) are adapted processes to be determined.
Nonlinear Feynman-Kac Representation: The nonlinear Feynman-Kac formula connects the solution of a BSDE to the solution of a semilinear parabolic PDE. Specifically, if \(u(t,x)\) solves the PDE
\[ \frac{\partial u}{\partial t} + \mathcal{L}u + f(t,x,u,\sigma^T \nabla u) = 0, \quad u(T,x) = g(x), \] where \(\mathcal{L}\) is the infinitesimal generator of a diffusion process \(X_t\), then the solution \((Y_t, Z_t)\) of the BSDE can be represented as: \[ Y_t = u(t, X_t), \quad Z_t = \sigma^T(t, X_t) \nabla u(t, X_t). \]
Key Formulas:
- BSDE Formulation: \[ Y_t = \xi + \int_t^T f(s, Y_s, Z_s) ds - \int_t^T Z_s dW_s. \]
- Forward SDE for \(X_t\): \[ dX_t = b(t, X_t) dt + \sigma(t, X_t) dW_t, \quad X_0 = x. \]
- Infinitesimal Generator \(\mathcal{L}\): \[ \mathcal{L} = \sum_{i=1}^d b_i(t,x) \frac{\partial}{\partial x_i} + \frac{1}{2} \sum_{i,j=1}^d (\sigma \sigma^T)_{ij}(t,x) \frac{\partial^2}{\partial x_i \partial x_j}. \]
- Nonlinear Feynman-Kac Representation: If \(u(t,x)\) solves: \[ \frac{\partial u}{\partial t} + \mathcal{L}u + f(t,x,u,\sigma^T \nabla u) = 0, \quad u(T,x) = g(x), \] then: \[ Y_t = u(t, X_t), \quad Z_t = \sigma^T(t, X_t) \nabla u(t, X_t). \]
Example: Black-Scholes BSDE and PDE
Consider the Black-Scholes model for a European option with payoff \(g(S_T)\):
- Forward SDE for \(S_t\) (stock price): \[ dS_t = r S_t dt + \sigma S_t dW_t. \]
- BSDE for the option price \(Y_t\): \[ -dY_t = -r Y_t dt - Z_t dW_t, \quad Y_T = g(S_T). \] Here, the driver \(f(t,y,z) = -r y\).
- Corresponding PDE (Black-Scholes PDE): \[ \frac{\partial u}{\partial t} + \frac{1}{2} \sigma^2 x^2 \frac{\partial^2 u}{\partial x^2} + r x \frac{\partial u}{\partial x} - r u = 0, \quad u(T,x) = g(x). \]
- Nonlinear Feynman-Kac representation: \[ Y_t = u(t, S_t), \quad Z_t = \sigma S_t \frac{\partial u}{\partial x}(t, S_t). \]
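The representation can be checked numerically: with driver \(f(t,y,z) = -ry\), the BSDE price is \(Y_0 = \mathbb{E}[e^{-rT} g(S_T)]\) under the risk-neutral dynamics, which a Monte Carlo estimate can compare with the Black-Scholes formula (parameters illustrative):

```python
import numpy as np
from math import erf, exp, log, sqrt

rng = np.random.default_rng(4)

def Phi(x):          # standard normal CDF via math.erf
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# BSDE price Y_0 = E[exp(-rT) (S_T - K)^+] under dS = r S dt + sigma S dW,
# compared with the Black-Scholes call formula.
S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0
n_paths = 2_000_000

W_T = rng.standard_normal(n_paths) * sqrt(T)
S_T = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * W_T)
Y0_mc = exp(-r * T) * np.mean(np.maximum(S_T - K, 0.0))

d1 = (log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
d2 = d1 - sigma * sqrt(T)
Y0_bs = S0 * Phi(d1) - K * exp(-r * T) * Phi(d2)
print(Y0_mc, Y0_bs)   # both ≈ 10.45 for these parameters
```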
Derivation Outline (Nonlinear Feynman-Kac):
- Assume a smooth solution \(u(t,x)\) to the PDE exists.
- Apply Itô's formula to \(u(t, X_t)\): \[ du(t, X_t) = \left( \frac{\partial u}{\partial t} + \mathcal{L}u \right) dt + \nabla u \cdot \sigma dW_t. \]
- Substitute the PDE into the drift term: Using the PDE, \(\frac{\partial u}{\partial t} + \mathcal{L}u = -f(t,x,u,\sigma^T \nabla u)\), so: \[ du(t, X_t) = -f(t, X_t, u, \sigma^T \nabla u) dt + \nabla u \cdot \sigma dW_t. \]
- Define \(Y_t = u(t, X_t)\) and \(Z_t = \sigma^T(t, X_t) \nabla u(t, X_t)\): Then: \[ dY_t = -f(t, Y_t, Z_t) dt + Z_t dW_t. \] Integrating from \(t\) to \(T\) and using \(Y_T = u(T, X_T) = g(X_T) = \xi\) recovers the BSDE.
Practical Applications:
- Option Pricing in Incomplete Markets: The nonlinear Feynman-Kac representation allows pricing of options in markets where the hedging strategy is not unique (e.g., defaultable claims, volatility uncertainty).
- Stochastic Control and Differential Games: BSDEs arise naturally in stochastic control problems (e.g., Hamilton-Jacobi-Bellman equations) and zero-sum games (Isaacs equations).
- Risk Measures and Portfolio Optimization: BSDEs are used to compute dynamic risk measures (e.g., conditional Value-at-Risk) and solve portfolio optimization problems with constraints.
- Numerical Methods: The connection to PDEs enables the use of finite difference or finite element methods to solve BSDEs numerically.
Common Pitfalls and Important Notes:
- Existence and Uniqueness: The BSDE may not have a solution if the driver \(f\) is not Lipschitz or the terminal condition \(\xi\) is not square-integrable. Always verify assumptions (e.g., Pardoux-Peng conditions).
- Smoothness of \(u(t,x)\): The nonlinear Feynman-Kac representation assumes \(u(t,x)\) is smooth. In practice, \(u\) may only be a viscosity solution of the PDE.
- Driver Dependence on \(Z_t\): If \(f\) depends on \(Z_t\), the PDE becomes fully nonlinear (e.g., \(f(t,y,z) = \frac{1}{2} |z|^2\) leads to a quadratic BSDE and a Hamilton-Jacobi-Bellman PDE).
- Markovian vs. Non-Markovian: The Feynman-Kac representation applies when the BSDE is Markovian (i.e., \(f\) and \(\xi\) depend only on \(X_t\)). Non-Markovian BSDEs require more general techniques.
- Numerical Stability: When solving the PDE numerically, ensure the scheme is stable and consistent (e.g., Crank-Nicolson for parabolic PDEs). The BSDE approach may offer better stability for high-dimensional problems.
- Connection to Forward-Backward SDEs (FBSDEs): The pair \((X_t, Y_t)\) forms an FBSDE, which is useful for Monte Carlo methods (e.g., regression-based approaches).
Example: Quadratic BSDE and PDE
Consider the BSDE with driver \(f(t,y,z) = \frac{1}{2} |z|^2\) and terminal condition \(\xi = g(X_T)\):
\[ -dY_t = \frac{1}{2} |Z_t|^2 dt - Z_t dW_t, \quad Y_T = g(X_T). \]The corresponding PDE is:
\[ \frac{\partial u}{\partial t} + \mathcal{L}u + \frac{1}{2} |\sigma^T \nabla u|^2 = 0, \quad u(T,x) = g(x). \]This is a Hamilton-Jacobi-Bellman equation arising in stochastic control. The solution \(u(t,x)\) represents the value function of an optimization problem.
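For this quadratic driver, the exponential (Cole-Hopf) transform \(V_t = e^{Y_t}\) turns the BSDE into a linear one, giving \(Y_t = \log \mathbb{E}[e^{g(X_T)} \mid \mathcal{F}_t]\). A quick Monte Carlo check with the illustrative choice \(X_t = W_t\) and \(g(x) = x\), for which \(Y_0 = \log \mathbb{E}[e^{W_T}] = T/2\):

```python
import numpy as np

rng = np.random.default_rng(5)

# Cole-Hopf solution of the quadratic BSDE -dY = |Z|^2/2 dt - Z dW:
# Y_0 = log E[exp(g(X_T))]; with X = W and g(x) = x this equals T/2.
T = 1.0
n_paths = 1_000_000
W_T = rng.standard_normal(n_paths) * np.sqrt(T)
Y0_mc = np.log(np.mean(np.exp(W_T)))
print(Y0_mc, T / 2)
```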
Topic 37: Malliavin Calculus Basics (Skorokhod Integral, Clark-Ocone Formula)
Malliavin Calculus: An infinite-dimensional differential calculus on the Wiener space, extending classical calculus to functionals of Brownian motion. It provides tools for stochastic analysis, particularly for studying the smoothness of laws of random variables and for representing random variables as stochastic integrals.
Wiener Space (\(C_0([0,T])\)): The space of continuous functions \( \omega: [0,T] \to \mathbb{R} \) such that \( \omega(0) = 0 \), equipped with the Wiener measure \( \mathbb{P} \). This is the canonical space for Brownian motion \( W_t(\omega) = \omega(t) \).
Malliavin Derivative (\(D\)): A linear operator that acts on smooth random variables (functionals of Brownian motion). For a smooth random variable \( F \), the Malliavin derivative \( D_t F \) is a stochastic process representing the "derivative" of \( F \) with respect to the Brownian path at time \( t \).
Formally, if \( F = f(W_{t_1}, \dots, W_{t_n}) \) for \( f \in C^\infty(\mathbb{R}^n) \), then:
\[ D_t F = \sum_{i=1}^n \frac{\partial f}{\partial x_i}(W_{t_1}, \dots, W_{t_n}) \mathbf{1}_{[0,t_i]}(t). \]
Domain of \( D \) (\( \mathbb{D}^{1,2} \)): The Hilbert space of random variables \( F \) such that:
\[ \|F\|_{1,2}^2 = \mathbb{E}[F^2] + \mathbb{E}\left[\int_0^T (D_t F)^2 dt\right] < \infty. \]
Skorokhod Integral (\( \delta \)): The adjoint operator of the Malliavin derivative \( D \). For a stochastic process \( u \in \text{Dom}(\delta) \), the Skorokhod integral \( \delta(u) \) is a random variable satisfying:
\[ \mathbb{E}[F \delta(u)] = \mathbb{E}\left[\int_0^T D_t F \cdot u_t dt\right] \quad \text{(Duality relation)}. \]The Skorokhod integral generalizes the Itô integral to non-adapted processes and is often denoted by \( \int_0^T u_t \delta W_t \).
Clark-Ocone Formula: Provides an explicit martingale representation for a random variable \( F \in \mathbb{D}^{1,2} \) in terms of its Malliavin derivative:
\[ F = \mathbb{E}[F] + \int_0^T \mathbb{E}[D_t F | \mathcal{F}_t] \delta W_t. \]Here, \( \mathcal{F}_t \) is the filtration generated by Brownian motion up to time \( t \).
Chaos Expansion: Every square-integrable random variable \( F \in L^2(\Omega) \) can be represented as an infinite sum of multiple Itô integrals:
\[ F = \sum_{n=0}^\infty I_n(f_n), \]where \( I_n(f_n) \) is the \( n \)-th order multiple Itô integral of a symmetric function \( f_n \in L^2([0,T]^n) \). The Malliavin derivative and Skorokhod integral act on these chaos components as:
\[ D_t I_n(f_n) = n I_{n-1}(f_n(\cdot, t)), \quad \delta(I_{n-1}(f_n)) = I_n(\tilde{f}_n), \]where \( \tilde{f}_n \) is the symmetrization of \( f_n \).
Key Formulas and Derivations
Malliavin Derivative of Itô Processes: Let \( X_t \) be an Itô process:
\[ dX_t = u_t dt + v_t dW_t, \]where \( u, v \) are adapted processes in \( \mathbb{D}^{1,2} \). Then:
\[ D_s X_t = v_s \mathbf{1}_{[0,t]}(s) + \int_s^t D_s u_r dr + \int_s^t D_s v_r dW_r. \]
Example: Malliavin Derivative of \( F = W_T^2 \)
Let \( F = W_T^2 \). Using the definition of the Malliavin derivative for smooth functionals:
\[ F = f(W_T), \quad f(x) = x^2 \implies f'(x) = 2x. \]Thus:
\[ D_t F = f'(W_T) \mathbf{1}_{[0,T]}(t) = 2 W_T \mathbf{1}_{[0,T]}(t). \]Alternatively, using Itô's formula:
\[ W_T^2 = \int_0^T 2 W_t dW_t + \int_0^T 1 dt. \]Applying the Malliavin derivative to the Itô integral (using the formula for Itô processes):
\[ D_t \left( \int_0^T 2 W_s dW_s \right) = 2 W_t + \int_t^T 2 D_t W_s dW_s = 2 W_t + 2 (W_T - W_t) = 2 W_T, \]since \( D_t W_s = \mathbf{1}_{[0,s]}(t) = 1 \) for \( s \geq t \). Combining with \( D_t \left( \int_0^T 1 ds \right) = 0 \), we recover \( D_t F = 2 W_T \mathbf{1}_{[0,T]}(t) \).
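The duality relation can be checked numerically for \(F = W_T^2\) (so \(D_t F = 2W_T\)) against the adapted integrand \(u_t = W_t\): both \(\mathbb{E}[F \int_0^T u_t dW_t]\) and \(\mathbb{E}[\int_0^T D_t F \, u_t \, dt]\) equal \(T^2\). A discretized sketch (step and path counts are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)

# Duality check E[F * int u dW] = E[int D_t F * u_t dt] for F = W_T^2
# (D_t F = 2 W_T) and u_t = W_t; both sides equal T^2 = 1.
T, n_steps, n_paths = 1.0, 100, 20_000
dt = T / n_steps
dW = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
W = np.cumsum(dW, axis=1)
W_T = W[:, -1]

# Ito integral int_0^T W_t dW_t with left-point evaluation
W_left = np.hstack([np.zeros((n_paths, 1)), W[:, :-1]])
ito = np.sum(W_left * dW, axis=1)

lhs = np.mean(W_T**2 * ito)                       # E[F * int u dW]
rhs = np.mean(2.0 * W_T * np.sum(W, axis=1) * dt) # E[int D_t F * u_t dt]
print(lhs, rhs)   # both ≈ 1 up to sampling and discretization error
```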
Skorokhod Integral of Elementary Processes: For an elementary process \( u_t = \sum_{i=1}^n F_i \mathbf{1}_{(t_{i-1}, t_i]}(t) \), where \( F_i \) is \( \mathcal{F}_{t_{i-1}} \)-measurable, the Skorokhod integral coincides with the Itô integral:
\[ \delta(u) = \sum_{i=1}^n F_i (W_{t_i} - W_{t_{i-1}}). \]For non-adapted \( F_i \), the Skorokhod integral is defined via the duality relation.
Example: Skorokhod Integral of \( u_t = W_T \mathbf{1}_{[0,T]}(t) \)
Since \( u_t = W_T \) is not adapted, this is a genuine Skorokhod integral. For \( F \in \mathbb{D}^{1,2} \) and a deterministic \( h \in L^2([0,T]) \), the product rule for the Skorokhod integral gives:
\[ \delta(F h) = F \int_0^T h_t dW_t - \int_0^T D_t F \, h_t \, dt. \]Applying this with \( F = W_T \) and \( h \equiv 1 \), and using \( D_t W_T = \mathbf{1}_{[0,T]}(t) \):
\[ \delta(u) = W_T \int_0^T dW_t - \int_0^T D_t W_T \, dt = W_T^2 - T. \]As a consistency check, take \( F = W_T^2 \) (so \( D_t F = 2 W_T \)) in the duality relation:
\[ \mathbb{E}[F \delta(u)] = \mathbb{E}[W_T^2 (W_T^2 - T)] = 3T^2 - T^2 = 2T^2, \quad \mathbb{E}\left[\int_0^T D_t F \cdot W_T dt\right] = \mathbb{E}[2 W_T^2 T] = 2T^2, \]so the two sides agree. Note that \( W_T^2 - T = I_2(\mathbf{1}_{[0,T]^2}) \) lies in the second Wiener chaos, consistent with \( \delta \) raising the chaos order by one.
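A Monte Carlo check of the duality relation for this non-adapted integrand, using the product rule \(\delta(W_T \cdot 1) = W_T \int_0^T dW_t - \int_0^T D_t W_T \, dt = W_T^2 - T\) and the test variable \(F = W_T^2\):

```python
import numpy as np

rng = np.random.default_rng(7)

# Duality for the Skorokhod integral of u_t = W_T: delta(u) = W_T^2 - T.
# With F = W_T^2 (D_t F = 2 W_T), both sides equal 2 T^2 = 2.
T, n_paths = 1.0, 1_000_000
W_T = rng.standard_normal(n_paths) * np.sqrt(T)

lhs = np.mean(W_T**2 * (W_T**2 - T))   # E[F * delta(u)]
rhs = np.mean(2.0 * W_T**2 * T)        # E[int_0^T D_t F * u_t dt]
print(lhs, rhs)                         # both ≈ 2
```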
Derivation of Clark-Ocone Formula:
- Start with the chaos expansion of \( F \in \mathbb{D}^{1,2} \): \[ F = \sum_{n=0}^\infty I_n(f_n). \]
- Apply the Malliavin derivative \( D_t \): \[ D_t F = \sum_{n=1}^\infty n I_{n-1}(f_n(\cdot, t)). \]
- Project onto \( \mathcal{F}_t \): \[ \mathbb{E}[D_t F | \mathcal{F}_t] = \sum_{n=1}^\infty n \mathbb{E}[I_{n-1}(f_n(\cdot, t)) | \mathcal{F}_t]. \]
- Use the martingale property of Itô integrals: \( \mathbb{E}[I_{n-1}(f_n(\cdot, t)) | \mathcal{F}_t] = I_{n-1}(f_n(\cdot, t) \mathbf{1}_{[0,t]^{n-1}}) \).
- Integrate against \( dW_t \): using the identity \( I_n(f_n) = n \int_0^T I_{n-1}\big(f_n(\cdot, t) \mathbf{1}_{[0,t]^{n-1}}\big) dW_t \), valid for symmetric \( f_n \), \[ \int_0^T \mathbb{E}[D_t F | \mathcal{F}_t] dW_t = \sum_{n=1}^\infty n \int_0^T I_{n-1}\big(f_n(\cdot, t) \mathbf{1}_{[0,t]^{n-1}}\big) dW_t = \sum_{n=1}^\infty I_n(f_n) = F - \mathbb{E}[F]. \]
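The formula can be verified pathwise for \(F = W_T^2\): since \(\mathbb{E}[D_t F \mid \mathcal{F}_t] = \mathbb{E}[2 W_T \mid \mathcal{F}_t] = 2 W_t\), the Clark-Ocone representation reads \(W_T^2 = T + \int_0^T 2 W_t \, dW_t\). A discretized check (step and path counts are illustrative):

```python
import numpy as np

rng = np.random.default_rng(8)

# Pathwise Clark-Ocone check for F = W_T^2:
#   W_T^2 = T + int_0^T 2 W_t dW_t,
# with the Ito integral discretized by left-point sums.
T, n_steps, n_paths = 1.0, 1000, 2000
dt = T / n_steps
dW = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
W = np.cumsum(dW, axis=1)
W_left = np.hstack([np.zeros((n_paths, 1)), W[:, :-1]])

F = W[:, -1] ** 2
co = T + np.sum(2.0 * W_left * dW, axis=1)   # Clark-Ocone representation
print(np.max(np.abs(F - co)))                 # small discretization error
```

The residual \(F - co = \sum_k \Delta W_k^2 - T\) shrinks at rate \(\sqrt{dt}\) pathwise.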
Practical Applications
1. Hedging in Incomplete Markets: The Clark-Ocone formula provides the minimal variance hedge for a contingent claim \( F \) in an incomplete market. The hedging strategy is given by \( \mathbb{E}[D_t F | \mathcal{F}_t] \).
2. Sensitivity Analysis (Greeks): Malliavin calculus provides a pathwise method for computing Greeks (e.g., Delta, Vega) without finite differences. For example, for a European call \( F = (S_T - K)^+ \) on a geometric Brownian motion, \( \partial S_T / \partial S_0 = S_T / S_0 \), so the pathwise Delta is:
\[ \Delta = \frac{\partial}{\partial S_0} \mathbb{E}[(S_T - K)^+] = \mathbb{E}\left[\mathbf{1}_{\{S_T > K\}} \frac{S_T}{S_0}\right]. \]Malliavin integration by parts goes further and moves the derivative off the payoff entirely, producing a weight that multiplies the payoff itself (see Topic 38); this is essential when the payoff is discontinuous (e.g., digital options).
3. Stochastic Control and BSDEs: Malliavin calculus is used to prove the existence and uniqueness of solutions to backward stochastic differential equations (BSDEs) and to derive the stochastic maximum principle.
4. Numerical Methods: The Skorokhod integral is used in the development of numerical schemes for stochastic PDEs and in the analysis of Monte Carlo methods for SDEs.
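The pathwise Delta from application 2 above can be sketched by Monte Carlo and compared with the closed form \(e^{rT}\Phi(d_1)\) for the undiscounted call price (parameters illustrative; \(\Phi\) is the standard normal CDF):

```python
import numpy as np
from math import erf, exp, log, sqrt

rng = np.random.default_rng(9)

def Phi(x):          # standard normal CDF
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# Pathwise Delta of a European call under GBM: dS_T/dS_0 = S_T/S_0, so
# Delta = E[1_{S_T > K} S_T / S_0] = exp(rT) Phi(d1) (undiscounted price).
S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0
n_paths = 1_000_000

W_T = rng.standard_normal(n_paths) * sqrt(T)
S_T = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * W_T)
delta_mc = np.mean((S_T > K) * S_T / S0)

d1 = (log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
print(delta_mc, exp(r * T) * Phi(d1))
```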
Common Pitfalls and Important Notes
1. Domain of Operators: The Malliavin derivative \( D \) and Skorokhod integral \( \delta \) are unbounded operators. Their domains \( \mathbb{D}^{1,2} \) and \( \text{Dom}(\delta) \) are dense in \( L^2(\Omega) \), but not all random variables or processes belong to these domains. For example, \( F = \mathbf{1}_{\{W_T > 0\}} \) is not in \( \mathbb{D}^{1,2} \).
2. Adaptedness: The Skorokhod integral coincides with the Itô integral for adapted processes, but for non-adapted processes, it is a generalization. The Skorokhod integral of a non-adapted process may not be a martingale.
3. Chain Rule: The Malliavin derivative satisfies a chain rule for Lipschitz functions. For \( F \in \mathbb{D}^{1,2} \) and \( \phi \in C^1 \) with bounded derivative:
\[ D_t \phi(F) = \phi'(F) D_t F. \]For \( \phi \) not differentiable (e.g., \( \phi(x) = \max(x, 0) \)), the chain rule fails, and \( \phi(F) \) may not be in \( \mathbb{D}^{1,2} \).
4. Clark-Ocone Formula Assumptions: The Clark-Ocone formula requires \( F \in \mathbb{D}^{1,2} \). For \( F \notin \mathbb{D}^{1,2} \), the formula may not hold, or the integrand \( \mathbb{E}[D_t F | \mathcal{F}_t] \) may not be square-integrable.
5. Numerical Implementation: While Malliavin calculus provides elegant theoretical results, numerical implementation (e.g., computing \( D_t F \) or \( \delta(u) \)) can be challenging. Monte Carlo methods combined with finite differences are often used for practical computations.
6. Relationship to Itô's Lemma: The Malliavin derivative is not a direct generalization of the Itô derivative. While Itô's lemma is a chain rule for Itô processes, the Malliavin derivative is a Fréchet derivative on the Wiener space. The two are related but distinct concepts.
Topic 38: Greeks via Malliavin Calculus (Delta, Vega, Gamma)
Malliavin Calculus: A branch of stochastic analysis that extends calculus of variations to functionals of Brownian motion. It provides a framework for differentiating random variables with respect to the underlying Brownian paths, enabling computation of sensitivities (Greeks) in financial models.
Greeks: Quantities representing the sensitivity of the price of a financial derivative to changes in underlying parameters. In this context, we focus on:
- Delta (Δ): Sensitivity to the underlying asset price \( S_t \).
- Vega (ν): Sensitivity to volatility \( \sigma \).
- Gamma (Γ): Second-order sensitivity to the underlying asset price.
Malliavin Derivative: For a random variable \( F \) in the space \( \mathbb{D}^{1,2} \), the Malliavin derivative \( D_t F \) is the stochastic process characterized by the integration-by-parts (duality) relation:
\[ \mathbb{E} \left[ F \int_0^T u_t \, dW_t \right] = \mathbb{E} \left[ \int_0^T D_t F \cdot u_t \, dt \right], \]for all adapted processes \( u_t \) in \( L^2([0,T] \times \Omega) \). Here, \( W_t \) is a standard Brownian motion.
Skorokhod Integral: The adjoint operator of the Malliavin derivative, denoted \( \delta(u) \), satisfies:
\[ \mathbb{E} \left[ F \delta(u) \right] = \mathbb{E} \left[ \int_0^T D_t F \cdot u_t \, dt \right]. \]It generalizes the Itô integral to anticipating integrands.
Clark-Ocone Formula: For \( F \in \mathbb{D}^{1,2} \), the following representation holds:
\[ F = \mathbb{E}[F] + \int_0^T \mathbb{E}[D_t F | \mathcal{F}_t] \, dW_t. \]This formula is pivotal for computing Greeks via Malliavin calculus.
Greeks via Malliavin Calculus: For a payoff \( \Phi(S_T) \), where \( S_T \) follows geometric Brownian motion:
\[ dS_t = r S_t \, dt + \sigma S_t \, dW_t, \]the Greeks are given by:
- Delta: \[ \Delta = \frac{\partial}{\partial S_0} \mathbb{E}[\Phi(S_T)] = \mathbb{E} \left[ \Phi(S_T) \cdot \frac{W_T}{S_0 \sigma T} \right]. \]
- Vega: \[ \nu = \frac{\partial}{\partial \sigma} \mathbb{E}[\Phi(S_T)] = \mathbb{E} \left[ \Phi(S_T) \cdot \left( \frac{W_T^2}{\sigma T} - W_T - \frac{1}{\sigma} \right) \right]. \]
- Gamma: \[ \Gamma = \frac{\partial^2}{\partial S_0^2} \mathbb{E}[\Phi(S_T)] = \mathbb{E} \left[ \Phi(S_T) \cdot \frac{W_T^2 - \sigma T W_T - T}{S_0^2 \sigma^2 T^2} \right]. \]
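The Delta weight above can be checked by Monte Carlo against the closed-form Black-Scholes Delta. A minimal sketch with illustrative parameters (the discount factor \( e^{-rT} \) is applied so the estimate is the Delta of the discounted price):

```python
import math
import numpy as np

# Monte Carlo Delta of a European call via the Malliavin weight
# W_T / (S0 * sigma * T), compared with the Black-Scholes N(d1).
rng = np.random.default_rng(0)
S0, K, r, sigma, T, M = 100.0, 100.0, 0.05, 0.2, 1.0, 400_000

WT = rng.normal(0.0, math.sqrt(T), M)                    # W_T ~ N(0, T)
ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * WT)  # terminal price
payoff = np.maximum(ST - K, 0.0)                         # call payoff
delta_mc = math.exp(-r * T) * np.mean(payoff * WT / (S0 * sigma * T))

d1 = (math.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
delta_bs = 0.5 * (1.0 + math.erf(d1 / math.sqrt(2.0)))   # N(d1)
```

The two values agree to Monte Carlo accuracy. Note that no pathwise differentiation of the discontinuous indicator \( \mathbb{1}_{\{S_T > K\}} \) is needed, which is the practical advantage of the weight representation.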
Example: Computing Delta for a European Call Option
Consider a European call option with payoff \( \Phi(S_T) = (S_T - K)^+ \), where \( S_T = S_0 \exp \left( (r - \frac{1}{2} \sigma^2)T + \sigma W_T \right) \).
Step 1: Compute the Malliavin Derivative of \( S_T \).
\[ D_t S_T = D_t \left( S_0 \exp \left( (r - \frac{1}{2} \sigma^2)T + \sigma W_T \right) \right) = \sigma S_T \cdot \mathbb{1}_{[0,T]}(t). \]
Step 2: Apply the Clark-Ocone Formula.
\[ \Phi(S_T) = \mathbb{E}[\Phi(S_T)] + \int_0^T \mathbb{E}[D_t \Phi(S_T) | \mathcal{F}_t] \, dW_t. \]Since \( D_t \Phi(S_T) = \Phi'(S_T) D_t S_T = \mathbb{1}_{\{S_T > K\}} \cdot \sigma S_T \), we have:
\[ \mathbb{E}[D_t \Phi(S_T) | \mathcal{F}_t] = \sigma \mathbb{E}[S_T \mathbb{1}_{\{S_T > K\}} | \mathcal{F}_t]. \]
Step 3: Compute Delta.
Using the formula for Delta:
\[ \Delta = \mathbb{E} \left[ (S_T - K)^+ \cdot \frac{W_T}{S_0 \sigma T} \right]. \]This expectation can be computed numerically or analytically within the Black-Scholes framework. Applying the same weight to the discounted price \( e^{-rT} \mathbb{E}[(S_T - K)^+] \) recovers the Black-Scholes Delta:
\[ \Delta = N(d_1), \quad \text{where} \quad d_1 = \frac{\ln(S_0 / K) + (r + \frac{1}{2} \sigma^2)T}{\sigma \sqrt{T}}. \]
Example: Computing Vega for a European Call Option
Using the same payoff \( \Phi(S_T) = (S_T - K)^+ \), we compute Vega.
Step 1: Differentiate \( S_T \) with respect to \( \sigma \).
\[ \frac{\partial S_T}{\partial \sigma} = S_T \left( W_T - \sigma T \right). \]
Step 2: Apply the Malliavin Weight for Vega.
The Vega is given by:
\[ \nu = \mathbb{E} \left[ (S_T - K)^+ \cdot \left( \frac{W_T^2}{\sigma T} - W_T - \frac{1}{\sigma} \right) \right]. \]This can be simplified using integration by parts and properties of the log-normal distribution. Applied to the discounted price, the result matches the Black-Scholes Vega:
\[ \nu = S_0 \sqrt{T} N'(d_1), \quad \text{where} \quad N'(x) = \frac{1}{\sqrt{2 \pi}} e^{-x^2 / 2}. \]
Key Notes and Pitfalls:
- Smoothness of Payoff: Malliavin calculus requires the payoff functional \( \Phi(S_T) \) to belong to \( \mathbb{D}^{1,2} \). For discontinuous payoffs (e.g., digital options), regularization techniques or localization may be needed.
- Anticipative Integrands: The Skorohod integral allows for anticipative integrands, but its computation can be non-trivial. In practice, the Clark-Ocone formula often simplifies the representation.
- Numerical Implementation: The expectations in the Greek formulas are typically computed via Monte Carlo simulation. Variance reduction techniques (e.g., importance sampling) are essential for efficiency.
- Model Dependence: The formulas for Greeks depend on the underlying model (e.g., Black-Scholes, Heston). For more complex models, the Malliavin derivative may require additional terms (e.g., for stochastic volatility models).
- Higher-Order Greeks: Malliavin calculus can also compute higher-order Greeks (e.g., Vanna, Volga) by further differentiating the payoff or using iterated Malliavin derivatives.
General Formula for Greeks via Malliavin Calculus
For a payoff \( \Phi(S_T) \) and a parameter \( \theta \) (e.g., \( S_0 \), \( \sigma \)), the Greek \( \mathcal{G} \) is given by:
\[ \mathcal{G} = \frac{\partial}{\partial \theta} \mathbb{E}[\Phi(S_T)] = \mathbb{E} \left[ \Phi(S_T) \cdot \pi_\theta \right], \]where \( \pi_\theta \) is the Malliavin weight for the parameter \( \theta \). The weights for Delta, Vega, and Gamma are:
- Delta Weight (\( \theta = S_0 \)): \[ \pi_{S_0} = \frac{W_T}{S_0 \sigma T}. \]
- Vega Weight (\( \theta = \sigma \)): \[ \pi_\sigma = \frac{W_T^2}{\sigma T} - W_T - \frac{1}{\sigma}. \]
- Gamma Weight (\( \theta = S_0 \), second derivative): \[ \pi_{S_0^2} = \frac{W_T^2 - \sigma T W_T - T}{S_0^2 \sigma^2 T^2}. \]
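These weights can be validated numerically. A minimal sketch with illustrative parameters, checking the Vega weight against the closed-form Black-Scholes Vega of the discounted price:

```python
import math
import numpy as np

# Monte Carlo Vega of a European call via the Malliavin weight
# W_T^2/(sigma*T) - W_T - 1/sigma, vs. Black-Scholes S0*sqrt(T)*N'(d1).
rng = np.random.default_rng(1)
S0, K, r, sigma, T, M = 100.0, 100.0, 0.05, 0.2, 1.0, 1_000_000

WT = rng.normal(0.0, math.sqrt(T), M)
ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * WT)
payoff = np.maximum(ST - K, 0.0)

weight = WT**2 / (sigma * T) - WT - 1.0 / sigma          # Vega weight
vega_mc = math.exp(-r * T) * np.mean(payoff * weight)

d1 = (math.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
vega_bs = S0 * math.sqrt(T) * math.exp(-d1**2 / 2) / math.sqrt(2 * math.pi)
```

The weighted estimator needs a larger sample than the Delta case because the weight has heavier tails; variance reduction (localization) is commonly applied in practice.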
Practical Applications:
- Monte Carlo Simulation: Malliavin calculus enables the computation of Greeks in Monte Carlo frameworks, where finite-difference methods may be inefficient or unstable. This is particularly useful for high-dimensional models (e.g., basket options, path-dependent options).
- Stochastic Volatility Models: In models like Heston, where the volatility is stochastic, Malliavin calculus provides a way to compute vega and other volatility sensitivities without bumping parameters.
- American Options: For American options, Malliavin calculus can be combined with regression-based methods (e.g., Longstaff-Schwartz) to compute Greeks.
- Risk Management: Greeks computed via Malliavin calculus are used for hedging and risk management, especially in portfolios with complex derivatives.
- Model Calibration: Sensitivities to model parameters (e.g., volatility, correlation) are essential for calibrating models to market data. Malliavin calculus provides a unified framework for these computations.
Topic 39: Monte Carlo Simulation of SDEs (Euler-Maruyama Scheme)
Stochastic Differential Equation (SDE): An SDE is a differential equation in which one or more terms are stochastic processes, resulting in a solution that is itself a stochastic process. A general form of an SDE is:
\[ dX_t = \mu(X_t, t) \, dt + \sigma(X_t, t) \, dW_t \]where:
- \(X_t\) is the stochastic process,
- \(\mu(X_t, t)\) is the drift coefficient,
- \(\sigma(X_t, t)\) is the diffusion coefficient,
- \(W_t\) is a Wiener process (standard Brownian motion).
Monte Carlo Simulation: A computational technique that uses repeated random sampling to obtain numerical results, often used to approximate the distribution of an unknown probabilistic entity.
Euler-Maruyama Scheme: A numerical method for approximating solutions to SDEs. It is the stochastic analogue of the Euler method for ordinary differential equations (ODEs).
Euler-Maruyama Discretization: Given an SDE \(dX_t = \mu(X_t, t) \, dt + \sigma(X_t, t) \, dW_t\), the Euler-Maruyama approximation \(X_t\) on a discretized time grid \(0 = t_0 < t_1 < \dots < t_N = T\) with step size \(\Delta t = t_{n+1} - t_n\) is:
\[ X_{t_{n+1}} = X_{t_n} + \mu(X_{t_n}, t_n) \Delta t + \sigma(X_{t_n}, t_n) \Delta W_n \]where \(\Delta W_n = W_{t_{n+1}} - W_{t_n} \sim \mathcal{N}(0, \Delta t)\) are independent and identically distributed (i.i.d.) normal random variables.
Derivation of the Euler-Maruyama Scheme:
The SDE \(dX_t = \mu(X_t, t) \, dt + \sigma(X_t, t) \, dW_t\) can be integrated formally over a small time interval \([t_n, t_{n+1}]\):
\[ X_{t_{n+1}} - X_{t_n} = \int_{t_n}^{t_{n+1}} \mu(X_s, s) \, ds + \int_{t_n}^{t_{n+1}} \sigma(X_s, s) \, dW_s \]The Euler-Maruyama scheme approximates the integrals as follows:
- The drift integral is approximated using the left-endpoint rule: \[ \int_{t_n}^{t_{n+1}} \mu(X_s, s) \, ds \approx \mu(X_{t_n}, t_n) \Delta t. \]
- The diffusion integral is approximated by replacing \(\sigma(X_s, s)\) with its value at the left endpoint and using the property of the Wiener process: \[ \int_{t_n}^{t_{n+1}} \sigma(X_s, s) \, dW_s \approx \sigma(X_{t_n}, t_n) \Delta W_n, \] where \(\Delta W_n \sim \mathcal{N}(0, \Delta t)\).
Combining these approximations yields the Euler-Maruyama scheme.
Strong Convergence: The Euler-Maruyama scheme has strong order of convergence \(0.5\). This means that for a fixed time step \(\Delta t\), the expected absolute error satisfies:
\[ \mathbb{E} \left[ |X_T - X_T^{\Delta t}| \right] \leq C (\Delta t)^{0.5}, \]where \(X_T\) is the exact solution at time \(T\), \(X_T^{\Delta t}\) is the Euler-Maruyama approximation, and \(C\) is a constant independent of \(\Delta t\).
Weak Convergence: The Euler-Maruyama scheme has weak order of convergence \(1\). This means that for a sufficiently smooth function \(g\), the error in the expected value satisfies:
\[ \left| \mathbb{E}[g(X_T)] - \mathbb{E}[g(X_T^{\Delta t})] \right| \leq C \Delta t, \]where \(C\) is a constant independent of \(\Delta t\).
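For GBM the weak error in the mean can be computed without any simulation, because the Euler-Maruyama mean recursion is exact: taking expectations of the update gives \( \mathbb{E}[X^{\Delta t}_N] = X_0 (1 + \mu \Delta t)^N \), versus the true mean \( X_0 e^{\mu T} \). A small sketch (illustrative parameters) showing the error halving as \( \Delta t \) halves, i.e. weak order 1:

```python
import numpy as np

# Weak error of Euler-Maruyama in the mean of GBM, computed exactly:
# E[X_hat_N] = X0 * (1 + mu*dt)^N, while the true mean is X0 * exp(mu*T).
X0, mu, T = 1.0, 0.5, 1.0
errs = []
for N in (10, 20, 40, 80):
    dt = T / N
    errs.append(abs(X0 * (1.0 + mu * dt)**N - X0 * np.exp(mu * T)))

# Successive ratios are close to 2: halving dt halves the error (order 1).
ratios = [errs[i] / errs[i + 1] for i in range(len(errs) - 1)]
```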
Example: Simulating Geometric Brownian Motion (GBM)
Consider the SDE for GBM:
\[ dS_t = \mu S_t \, dt + \sigma S_t \, dW_t, \]where \(\mu\) is the drift and \(\sigma\) is the volatility. The Euler-Maruyama discretization is:
\[ S_{t_{n+1}} = S_{t_n} + \mu S_{t_n} \Delta t + \sigma S_{t_n} \Delta W_n. \]
Python-like Pseudocode:
import numpy as np
def euler_maruyama_gbm(S0, mu, sigma, T, N):
    """Simulate one GBM path with the Euler-Maruyama scheme."""
    dt = T / N
    time_grid = np.linspace(0, T, N + 1)
    S = np.zeros(N + 1)
    S[0] = S0
    for n in range(N):
        dW = np.random.normal(0, np.sqrt(dt))  # Brownian increment ~ N(0, dt)
        S[n + 1] = S[n] + mu * S[n] * dt + sigma * S[n] * dW
    return time_grid, S
This generates a single path of the GBM process.
Example: Pricing a European Call Option
To price a European call option with payoff \(\max(S_T - K, 0)\) using Monte Carlo simulation:
- Simulate \(M\) paths of the underlying asset price \(S_T\) using the Euler-Maruyama scheme.
- Compute the payoff for each path: \(C_i = \max(S_T^{(i)} - K, 0)\).
- Discount the average payoff to present value: \(C_0 = e^{-rT} \frac{1}{M} \sum_{i=1}^M C_i\).
Python-like Pseudocode:
def monte_carlo_option_price(S0, K, T, r, sigma, M, N):
    """Price a European call by Monte Carlo over Euler-Maruyama paths.

    Under the risk-neutral measure the asset drift is r, so r is used
    both to simulate the paths and to discount the average payoff.
    """
    dt = T / N
    payoffs = np.zeros(M)
    for i in range(M):
        S = S0
        for n in range(N):
            dW = np.random.normal(0, np.sqrt(dt))
            S = S + r * S * dt + sigma * S * dW
        payoffs[i] = max(S - K, 0.0)
    return np.exp(-r * T) * np.mean(payoffs)
Practical Applications:
- Option Pricing: Monte Carlo simulation of SDEs is widely used to price complex financial derivatives, especially those with path-dependent features (e.g., Asian, barrier, or lookback options).
- Risk Management: Simulating asset price paths allows for the computation of risk measures such as Value-at-Risk (VaR) and Expected Shortfall (ES).
- Stochastic Control: Used in optimal control problems where the system dynamics are governed by SDEs (e.g., portfolio optimization).
- Interest Rate Modeling: Simulating interest rate paths for models like the Vasicek or Cox-Ingersoll-Ross (CIR) model.
- Stochastic Volatility Models: Simulating paths for models like Heston, where both the asset price and its volatility are stochastic.
Common Pitfalls and Important Notes:
- Discretization Error: The Euler-Maruyama scheme introduces discretization error. Smaller time steps \(\Delta t\) reduce this error but increase computational cost. For weak convergence, a larger \(\Delta t\) may suffice, while strong convergence requires finer discretization.
- Boundary Conditions: For SDEs with boundaries (e.g., reflecting or absorbing), the Euler-Maruyama scheme may not respect these boundaries. Special care (e.g., reflection or truncation) is needed.
- Non-Lipschitz Coefficients: If \(\mu\) or \(\sigma\) are not Lipschitz continuous, the Euler-Maruyama scheme may diverge. Examples include the square-root diffusion in the CIR model, where \(\sigma(X_t) = \sigma \sqrt{X_t}\). In such cases, implicit schemes or truncation may be necessary.
- Variance Reduction: Monte Carlo simulations can have high variance. Techniques like antithetic variates, control variates, or importance sampling can improve efficiency.
- Random Number Generation: Poor-quality random number generators can lead to biased results. Use well-tested libraries (e.g., Mersenne Twister) for generating \(\Delta W_n\).
- Initial Conditions: The initial condition \(X_0\) must be specified. For financial applications, this is often the current spot price of the asset.
- Parallelization: Monte Carlo simulations are embarrassingly parallel. Paths can be simulated independently, making the method highly scalable on multi-core or distributed systems.
- Higher-Order Schemes: For improved accuracy, higher-order schemes like the Milstein scheme (strong order 1) can be used. The Milstein scheme for \(dX_t = \mu(X_t) \, dt + \sigma(X_t) \, dW_t\) is: \[ X_{t_{n+1}} = X_{t_n} + \mu(X_{t_n}) \Delta t + \sigma(X_{t_n}) \Delta W_n + \frac{1}{2} \sigma(X_{t_n}) \sigma'(X_{t_n}) \left( (\Delta W_n)^2 - \Delta t \right). \]
Quant Interview Questions:
- Explain the Euler-Maruyama scheme and its convergence properties.
Answer: The Euler-Maruyama scheme is a numerical method for approximating SDEs. It discretizes the SDE using a time step \(\Delta t\) and approximates the drift and diffusion terms. It has strong order of convergence \(0.5\) and weak order of convergence \(1\).
- How would you simulate a path of geometric Brownian motion using the Euler-Maruyama scheme?
Answer: For \(dS_t = \mu S_t \, dt + \sigma S_t \, dW_t\), the Euler-Maruyama update is \(S_{t_{n+1}} = S_{t_n} + \mu S_{t_n} \Delta t + \sigma S_{t_n} \Delta W_n\), where \(\Delta W_n \sim \mathcal{N}(0, \Delta t)\).
- What are the limitations of the Euler-Maruyama scheme?
Answer: Limitations include discretization error, potential divergence for non-Lipschitz coefficients, and the need for small time steps for strong convergence. It may also violate boundary conditions in some models.
- How can you improve the efficiency of a Monte Carlo simulation for option pricing?
Answer: Techniques include variance reduction methods (e.g., antithetic variates, control variates), using higher-order schemes (e.g., Milstein), and parallelizing the simulation.
- Explain the difference between strong and weak convergence in the context of SDE simulations.
Answer: Strong convergence measures the pathwise error (\(\mathbb{E}[|X_T - X_T^{\Delta t}|]\)), while weak convergence measures the error in expected values of functions (\(\left| \mathbb{E}[g(X_T)] - \mathbb{E}[g(X_T^{\Delta t})] \right|\)). Weak convergence is often sufficient for pricing derivatives.
- How would you simulate a mean-reverting process like the Ornstein-Uhlenbeck process?
Answer: The Ornstein-Uhlenbeck SDE is \(dX_t = \theta (\mu - X_t) \, dt + \sigma \, dW_t\). The Euler-Maruyama update is \(X_{t_{n+1}} = X_{t_n} + \theta (\mu - X_{t_n}) \Delta t + \sigma \Delta W_n\).
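The Ornstein-Uhlenbeck update in the last answer can be checked against the exact mean \( \mathbb{E}[X_t] = \mu + (X_0 - \mu) e^{-\theta t} \). A minimal sketch with illustrative parameters:

```python
import numpy as np

# Euler-Maruyama for dX = theta*(mu - X) dt + sigma dW, checked
# against the exact mean E[X_T] = mu + (X0 - mu) * exp(-theta * T).
rng = np.random.default_rng(3)
theta, mu, sigma, X0 = 2.0, 1.0, 0.3, 0.0
T, N, M = 2.0, 400, 50_000
dt = T / N

X = np.full(M, X0)
for n in range(N):
    dW = rng.normal(0.0, np.sqrt(dt), M)   # i.i.d. N(0, dt) increments
    X = X + theta * (mu - X) * dt + sigma * dW

exact_mean = mu + (X0 - mu) * np.exp(-theta * T)
```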
Topic 40: Strong and Weak Convergence of Numerical Schemes
Numerical Scheme: A numerical scheme for a stochastic differential equation (SDE) is a discrete-time approximation of the continuous-time process described by the SDE. Common examples include the Euler-Maruyama scheme and the Milstein scheme.
Strong Convergence: A numerical scheme \( \hat{X}_N \) (with time step \( \Delta t \)) is said to converge strongly to the solution \( X_T \) of the SDE at time \( T \) if:
\[ \lim_{\Delta t \to 0} \mathbb{E}\left[ |X_T - \hat{X}_N| \right] = 0, \]where \( N = T / \Delta t \). The order of strong convergence is \( \gamma \) if there exists a constant \( C \) such that:
\[ \mathbb{E}\left[ |X_T - \hat{X}_N| \right] \leq C (\Delta t)^\gamma. \]
Weak Convergence: A numerical scheme \( \hat{X}_N \) converges weakly to the solution \( X_T \) of the SDE at time \( T \) if for all sufficiently smooth functions \( g \):
\[ \lim_{\Delta t \to 0} \left| \mathbb{E}[g(X_T)] - \mathbb{E}[g(\hat{X}_N)] \right| = 0. \]The order of weak convergence is \( \beta \) if there exists a constant \( C \) such that:
\[ \left| \mathbb{E}[g(X_T)] - \mathbb{E}[g(\hat{X}_N)] \right| \leq C (\Delta t)^\beta. \]
Euler-Maruyama Scheme: For the SDE \( dX_t = a(X_t) dt + b(X_t) dW_t \), the Euler-Maruyama approximation is:
\[ \hat{X}_{n+1} = \hat{X}_n + a(\hat{X}_n) \Delta t + b(\hat{X}_n) \Delta W_n, \]where \( \Delta W_n \sim \mathcal{N}(0, \Delta t) \). The Euler-Maruyama scheme has strong order \( \gamma = 0.5 \) and weak order \( \beta = 1 \).
Milstein Scheme: For the same SDE, the Milstein scheme is:
\[ \hat{X}_{n+1} = \hat{X}_n + a(\hat{X}_n) \Delta t + b(\hat{X}_n) \Delta W_n + \frac{1}{2} b(\hat{X}_n) b'(\hat{X}_n) \left( (\Delta W_n)^2 - \Delta t \right). \]The Milstein scheme has strong order \( \gamma = 1 \) and weak order \( \beta = 1 \).
Example: Strong Convergence of Euler-Maruyama
Consider the SDE \( dX_t = \mu X_t dt + \sigma X_t dW_t \) (geometric Brownian motion). The Euler-Maruyama approximation is:
\[ \hat{X}_{n+1} = \hat{X}_n (1 + \mu \Delta t + \sigma \Delta W_n). \]To verify strong convergence, compute \( \mathbb{E}[|X_T - \hat{X}_N|] \). For small \( \Delta t \), the error is dominated by the diffusion term, and the strong order is \( 0.5 \).
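This can be checked empirically by simulating the exact solution \( X_T = X_0 \exp((\mu - \tfrac{1}{2}\sigma^2)T + \sigma W_T) \) and the Euler-Maruyama approximation on the same Brownian increments, then regressing \( \log \) error on \( \log \Delta t \). A sketch with illustrative parameters:

```python
import numpy as np

# Empirical strong order of Euler-Maruyama for GBM: compare the scheme
# with the exact solution driven by the SAME Brownian increments.
rng = np.random.default_rng(42)
S0, mu, sigma, T, M = 1.0, 0.05, 0.5, 1.0, 20_000

steps = [2**k for k in range(4, 9)]         # N = 16, 32, ..., 256
errors = []
for N in steps:
    dt = T / N
    dW = rng.normal(0.0, np.sqrt(dt), size=(M, N))
    X = np.full(M, S0)
    for n in range(N):
        X = X * (1.0 + mu * dt + sigma * dW[:, n])
    WT = dW.sum(axis=1)                     # same path, exact solution
    exact = S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * WT)
    errors.append(np.mean(np.abs(exact - X)))

# Slope of log(error) vs log(dt) estimates the strong order (about 0.5)
slope = np.polyfit(np.log([T / N for N in steps]), np.log(errors), 1)[0]
```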
Example: Weak Convergence of Euler-Maruyama
For the same SDE, consider \( g(x) = x \). The weak error is:
\[ \left| \mathbb{E}[X_T] - \mathbb{E}[\hat{X}_N] \right| = \left| X_0 e^{\mu T} - \mathbb{E}[\hat{X}_N] \right|. \]Using the properties of the Euler-Maruyama scheme, the weak error is \( O(\Delta t) \), confirming weak order \( \beta = 1 \).
General Strong Convergence Criterion: For a numerical scheme \( \hat{X}_N \) to converge strongly to \( X_T \), it must satisfy:
\[ \lim_{\Delta t \to 0} \mathbb{E}\left[ \sup_{0 \leq t \leq T} |X_t - \hat{X}_{\lfloor t / \Delta t \rfloor}|^2 \right] = 0. \]
General Weak Convergence Criterion: For a numerical scheme \( \hat{X}_N \) to converge weakly to \( X_T \), it must satisfy:
\[ \lim_{\Delta t \to 0} \left| \mathbb{E}\left[ \prod_{i=1}^k f_i(X_{t_i}) \right] - \mathbb{E}\left[ \prod_{i=1}^k f_i(\hat{X}_{\lfloor t_i / \Delta t \rfloor}) \right] \right| = 0, \]for all \( k \in \mathbb{N} \), \( 0 \leq t_1 < \dots < t_k \leq T \), and sufficiently smooth functions \( f_i \).
Key Differences:
- Strong Convergence: Measures the pathwise approximation error. Important for scenarios where the exact trajectory of the process matters (e.g., hedging in finance).
- Weak Convergence: Measures the approximation error of expectations. Important for scenarios where only the distribution of the process matters (e.g., pricing European options).
In practice, weak convergence is often easier to achieve and requires less computational effort than strong convergence.
Practical Applications:
- Option Pricing: Weak convergence is typically sufficient for pricing European options, as it ensures the convergence of the expected payoff.
- Hedging: Strong convergence is necessary for accurate hedging strategies, as it ensures the convergence of the entire path of the underlying asset.
- Risk Management: Both strong and weak convergence are relevant, depending on whether path-dependent or distribution-dependent quantities are of interest.
Common Pitfalls:
- Misidentifying Convergence Order: Confusing the orders of strong and weak convergence can lead to incorrect error estimates. For example, the Euler-Maruyama scheme has strong order \( 0.5 \) but weak order \( 1 \).
- Non-Smooth Payoffs: Weak convergence results often assume smooth payoff functions. For non-smooth payoffs (e.g., digital options), additional care is needed, and higher-order schemes may be required.
- Discretization Bias: Even with a convergent scheme, discretization bias can persist if \( \Delta t \) is not sufficiently small. Always verify convergence numerically.
- Computational Cost: Higher-order schemes (e.g., Milstein) improve strong convergence but may not significantly improve weak convergence. Balance accuracy with computational effort.
Quant Interview Question: Strong vs. Weak Convergence
Question: You are pricing a European call option using the Euler-Maruyama scheme. Your colleague argues that since the scheme has strong order \( 0.5 \), you need a very small time step to achieve accurate results. Is your colleague correct? Why or why not?
Answer: The colleague is incorrect. For pricing a European call option, weak convergence is sufficient because we only care about the expected payoff, not the exact path of the underlying asset. The Euler-Maruyama scheme has weak order \( 1 \), so a moderate time step is often adequate for accurate results. Strong convergence is unnecessary in this case.
Quant Interview Question: Milstein vs. Euler-Maruyama
Question: When would you choose the Milstein scheme over the Euler-Maruyama scheme for simulating an SDE?
Answer: The Milstein scheme is preferred over the Euler-Maruyama scheme when strong convergence is important, such as in hedging or simulating path-dependent options (e.g., Asian or barrier options). The Milstein scheme has a higher strong order (\( 1 \) vs. \( 0.5 \)), leading to more accurate pathwise approximations. However, for weak convergence (e.g., pricing European options), the Euler-Maruyama scheme is often sufficient and computationally cheaper.
Topic 41: Milstein Scheme for Higher-Order SDE Approximation
Stochastic Differential Equation (SDE): An SDE is a differential equation in which one or more terms are stochastic processes, resulting in a solution that is itself a stochastic process. A general SDE in Itô form is given by:
\[ dX_t = a(X_t, t) \, dt + b(X_t, t) \, dW_t \]where \(X_t\) is the stochastic process, \(a(X_t, t)\) is the drift coefficient, \(b(X_t, t)\) is the diffusion coefficient, and \(W_t\) is a Wiener process (Brownian motion).
Milstein Scheme: The Milstein scheme is a numerical method for approximating the solution of SDEs. It is a higher-order method compared to the Euler-Maruyama scheme, providing better accuracy by incorporating an additional term that accounts for the stochastic Taylor expansion up to the second order.
General SDE:
\[ dX_t = a(X_t, t) \, dt + b(X_t, t) \, dW_t \]
Milstein Scheme Discretization:
For a time discretization \(0 = t_0 < t_1 < \dots < t_N = T\) with step size \(\Delta t = t_{n+1} - t_n\), the Milstein scheme is given by:
\[ X_{n+1} = X_n + a(X_n, t_n) \Delta t + b(X_n, t_n) \Delta W_n + \frac{1}{2} b(X_n, t_n) \frac{\partial b}{\partial x}(X_n, t_n) \left( (\Delta W_n)^2 - \Delta t \right) \]where \(\Delta W_n = W_{t_{n+1}} - W_{t_n} \sim \mathcal{N}(0, \Delta t)\).
Derivation of the Milstein Scheme:
- Itô-Taylor Expansion:
The Milstein scheme is derived using the Itô-Taylor expansion, which extends the deterministic Taylor expansion to stochastic processes. For a function \(f(X_t, t)\), the Itô formula is:
\[ df(X_t, t) = \left( \frac{\partial f}{\partial t} + a \frac{\partial f}{\partial x} + \frac{1}{2} b^2 \frac{\partial^2 f}{\partial x^2} \right) dt + b \frac{\partial f}{\partial x} dW_t \]
- Stochastic Taylor Expansion:
Applying the Itô formula to \(a(X_t, t)\) and \(b(X_t, t)\) and integrating over \([t_n, t_{n+1}]\), we obtain:
\[ X_{t_{n+1}} = X_{t_n} + \int_{t_n}^{t_{n+1}} a(X_s, s) \, ds + \int_{t_n}^{t_{n+1}} b(X_s, s) \, dW_s \]Expanding \(a(X_s, s)\) and \(b(X_s, s)\) around \(X_{t_n}\) using Itô's formula:
\[ a(X_s, s) \approx a(X_{t_n}, t_n) + \frac{\partial a}{\partial x}(X_{t_n}, t_n) (X_s - X_{t_n}) + \frac{\partial a}{\partial t}(X_{t_n}, t_n) (s - t_n) \]Similarly for \(b(X_s, s)\). Substituting these into the integral and retaining terms up to \(\mathcal{O}(\Delta t)\) yields the Milstein scheme.
- Key Term:
The additional term \(\frac{1}{2} b \frac{\partial b}{\partial x} \left( (\Delta W_n)^2 - \Delta t \right)\) arises from the Itô correction term in the expansion of \(b(X_s, s)\).
Milstein Scheme for Multi-Dimensional SDEs:
For an \(m\)-dimensional SDE with \(d\)-dimensional Brownian motion:
\[ dX_t = a(X_t, t) \, dt + \sum_{j=1}^d b_j(X_t, t) \, dW_t^j \]The Milstein scheme is:
\[ X_{n+1}^i = X_n^i + a^i(X_n, t_n) \Delta t + \sum_{j=1}^d b_j^i(X_n, t_n) \Delta W_n^j + \sum_{j_1, j_2=1}^d \sum_{k=1}^m b_{j_1}^k(X_n, t_n) \frac{\partial b_{j_2}^i}{\partial x^k}(X_n, t_n) I_{j_1, j_2} \]where \(I_{j_1, j_2} = \int_{t_n}^{t_{n+1}} \int_{t_n}^s dW_u^{j_1} \, dW_s^{j_2}\) are the iterated Itô integrals.
Worked Example: Geometric Brownian Motion
Consider the SDE for geometric Brownian motion:
\[ dX_t = \mu X_t \, dt + \sigma X_t \, dW_t \]Here, \(a(X_t, t) = \mu X_t\) and \(b(X_t, t) = \sigma X_t\). The derivative of \(b\) with respect to \(X_t\) is:
\[ \frac{\partial b}{\partial x} = \sigma \]Applying the Milstein scheme:
\[ X_{n+1} = X_n + \mu X_n \Delta t + \sigma X_n \Delta W_n + \frac{1}{2} \sigma X_n \sigma \left( (\Delta W_n)^2 - \Delta t \right) \]
\[ X_{n+1} = X_n \left( 1 + \mu \Delta t + \sigma \Delta W_n + \frac{1}{2} \sigma^2 \left( (\Delta W_n)^2 - \Delta t \right) \right) \]
Practical Applications:
- Option Pricing: The Milstein scheme is used to simulate asset price paths in the Black-Scholes model and other stochastic volatility models, providing more accurate option prices than the Euler-Maruyama scheme.
- Risk Management: Used to simulate interest rate paths, credit risk models, and other financial instruments where higher accuracy is required.
- Stochastic Control: Applied in optimal control problems where the system dynamics are governed by SDEs.
- Quantitative Finance Interviews: The Milstein scheme is a common topic in quant interviews, often tested through derivation, comparison with Euler-Maruyama, or implementation in code.
Common Pitfalls and Important Notes:
- Strong vs. Weak Convergence:
The Milstein scheme achieves a strong order of convergence of 1.0 (i.e., \(\mathbb{E}[|X_T - X_T^{\Delta t}|] \leq C \Delta t\)) and a weak order of convergence of 1.0 (i.e., \(|\mathbb{E}[f(X_T)] - \mathbb{E}[f(X_T^{\Delta t})]| \leq C \Delta t\) for sufficiently smooth \(f\)). The strong order improves on the Euler-Maruyama scheme (strong order 0.5), while the weak order is the same.
- Computational Cost:
The Milstein scheme requires computing the derivative of the diffusion coefficient \(\frac{\partial b}{\partial x}\), which can be computationally expensive or analytically intractable for complex SDEs. In such cases, finite difference approximations may be used.
- Multi-Dimensional SDEs:
For multi-dimensional SDEs, the Milstein scheme involves iterated Itô integrals \(I_{j_1, j_2}\), which are challenging to simulate. Simplified schemes or approximations (e.g., using independent Brownian motions) are often employed.
- Stability:
The Milstein scheme can exhibit numerical instability for stiff SDEs or large time steps. Implicit or semi-implicit variants may be required for stability.
- Comparison with Euler-Maruyama:
The Euler-Maruyama scheme is simpler and faster but less accurate. The Milstein scheme is preferred when higher accuracy is needed, especially for problems where the diffusion coefficient \(b\) is state-dependent.
Implementation Tips:
- For SDEs with additive noise (i.e., \(b(X_t, t) = b(t)\)), the Milstein scheme reduces to the Euler-Maruyama scheme because \(\frac{\partial b}{\partial x} = 0\).
- When implementing the Milstein scheme, ensure that the random variables \(\Delta W_n\) are generated as \(\mathcal{N}(0, \Delta t)\) and that \((\Delta W_n)^2\) is correctly handled.
- For multi-dimensional SDEs, consider using the "diagonal noise" approximation if the off-diagonal iterated integrals are difficult to compute.
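Putting these tips together, a minimal sketch (illustrative parameters) comparing the strong error of Euler-Maruyama and Milstein for GBM on a shared Brownian path; for \( b(x) = \sigma x \) the correction term uses \( b \, b' = \sigma^2 x \):

```python
import numpy as np

# Strong error of Euler-Maruyama vs. Milstein for GBM on shared paths.
rng = np.random.default_rng(7)
S0, mu, sigma, T, N, M = 1.0, 0.05, 0.5, 1.0, 64, 20_000
dt = T / N
dW = rng.normal(0.0, np.sqrt(dt), size=(M, N))

em = np.full(M, S0)
mil = np.full(M, S0)
for n in range(N):
    w = dW[:, n]
    em = em + mu * em * dt + sigma * em * w
    # extra Milstein term: 0.5 * b * b' * ((dW)^2 - dt) with b*b' = sigma^2*x
    mil = (mil + mu * mil * dt + sigma * mil * w
           + 0.5 * sigma**2 * mil * (w**2 - dt))

exact = S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * dW.sum(axis=1))
err_em = np.mean(np.abs(exact - em))     # scales like sqrt(dt)
err_mil = np.mean(np.abs(exact - mil))   # scales like dt, markedly smaller
```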
Topic 42: Variance Reduction Techniques in SDE Simulation
Variance Reduction Techniques: Methods used in Monte Carlo simulations of Stochastic Differential Equations (SDEs) to reduce the variance of estimators, thereby improving efficiency and accuracy without increasing computational cost significantly.
1. Key Concepts and Definitions
Monte Carlo Simulation: A computational algorithm that relies on repeated random sampling to obtain numerical results, typically used to estimate the expectation of a random variable.
Variance of an Estimator: For an estimator \(\hat{\theta}\) of a parameter \(\theta\), the variance is given by \(\text{Var}(\hat{\theta}) = \mathbb{E}[(\hat{\theta} - \mathbb{E}[\hat{\theta}])^2]\). Lower variance implies more precise estimates.
Control Variate (CV): A random variable \(Y\) with known expectation \(\mathbb{E}[Y]\), used to reduce the variance of an estimator \(\hat{\theta}\) by constructing a new estimator \(\hat{\theta}_{\text{CV}} = \hat{\theta} + c(Y - \mathbb{E}[Y])\).
Antithetic Variates: A variance reduction technique where pairs of negatively correlated samples are generated to reduce the overall variance of the estimator.
Importance Sampling: A technique that changes the probability measure under which expectations are computed to reduce variance, often by focusing sampling on "important" regions of the sample space.
Stratified Sampling: A method where the sample space is divided into strata, and samples are drawn from each stratum to ensure better coverage and reduce variance.
2. Important Formulas
Basic Monte Carlo Estimator:
\[ \hat{\theta} = \frac{1}{N} \sum_{i=1}^N f(X_i), \quad \text{where } X_i \sim p(x) \]The variance of \(\hat{\theta}\) is:
\[ \text{Var}(\hat{\theta}) = \frac{\text{Var}(f(X))}{N} \]
Control Variate Estimator:
\[ \hat{\theta}_{\text{CV}} = \hat{\theta} + c \left( \frac{1}{N} \sum_{i=1}^N Y_i - \mathbb{E}[Y] \right) \]The optimal choice of \(c\) to minimize variance is:
\[ c^* = -\frac{\text{Cov}(\hat{\theta}, \bar{Y})}{\text{Var}(\bar{Y})} \]where \(\bar{Y} = \frac{1}{N} \sum_{i=1}^N Y_i\).
Antithetic Variates Estimator:
\[ \hat{\theta}_{\text{AV}} = \frac{1}{2N} \sum_{i=1}^N \left( f(X_i) + f(X_i') \right) \]where \(X_i'\) is the antithetic counterpart of \(X_i\) (e.g., \(X_i' = a + b - X_i\) for uniform \(X_i \sim U(a, b)\)).
The variance of \(\hat{\theta}_{\text{AV}}\) is:
\[ \text{Var}(\hat{\theta}_{\text{AV}}) = \frac{\text{Var}(f(X)) + \text{Cov}(f(X), f(X'))}{2N} \]
Importance Sampling Estimator:
\[ \hat{\theta}_{\text{IS}} = \frac{1}{N} \sum_{i=1}^N f(X_i) \frac{p(X_i)}{q(X_i)}, \quad \text{where } X_i \sim q(x) \]The variance of \(\hat{\theta}_{\text{IS}}\) is:
\[ \text{Var}(\hat{\theta}_{\text{IS}}) = \frac{1}{N} \text{Var}_q \left( f(X) \frac{p(X)}{q(X)} \right) \]The optimal importance sampling density \(q^*(x)\) is:
\[ q^*(x) \propto |f(x)| p(x) \]
Stratified Sampling Estimator:
Divide the sample space into \(K\) strata with probabilities \(p_k\) and sample sizes \(N_k\) such that \(\sum_{k=1}^K N_k = N\). The estimator is:
\[ \hat{\theta}_{\text{SS}} = \sum_{k=1}^K p_k \left( \frac{1}{N_k} \sum_{i=1}^{N_k} f(X_{k,i}) \right), \quad \text{where } X_{k,i} \sim p(x | \text{stratum } k) \]The variance of \(\hat{\theta}_{\text{SS}}\) is:
\[ \text{Var}(\hat{\theta}_{\text{SS}}) = \sum_{k=1}^K \frac{p_k^2 \sigma_k^2}{N_k}, \quad \text{where } \sigma_k^2 = \text{Var}(f(X) | \text{stratum } k) \]The optimal allocation of samples is:
\[ N_k^* \propto p_k \sigma_k \]
3. Derivations
Derivation of Optimal Control Variate Coefficient \(c^*\)
The control variate estimator is:
\[ \hat{\theta}_{\text{CV}} = \hat{\theta} + c (\bar{Y} - \mathbb{E}[Y]) \]The variance of \(\hat{\theta}_{\text{CV}}\) is:
\[ \text{Var}(\hat{\theta}_{\text{CV}}) = \text{Var}(\hat{\theta}) + c^2 \text{Var}(\bar{Y}) + 2c \text{Cov}(\hat{\theta}, \bar{Y}) \]To minimize \(\text{Var}(\hat{\theta}_{\text{CV}})\), take the derivative with respect to \(c\) and set it to zero:
\[ \frac{d}{dc} \text{Var}(\hat{\theta}_{\text{CV}}) = 2c \text{Var}(\bar{Y}) + 2 \text{Cov}(\hat{\theta}, \bar{Y}) = 0 \]Solving for \(c\) gives:
\[ c^* = -\frac{\text{Cov}(\hat{\theta}, \bar{Y})}{\text{Var}(\bar{Y})} \]
Variance Reduction with Antithetic Variates
Assume \(X\) and \(X'\) are identically distributed and negatively correlated. The antithetic variates estimator is:
\[ \hat{\theta}_{\text{AV}} = \frac{f(X) + f(X')}{2} \]The variance is:
\[ \text{Var}(\hat{\theta}_{\text{AV}}) = \frac{\text{Var}(f(X)) + \text{Var}(f(X')) + 2 \text{Cov}(f(X), f(X'))}{4} \]Since \(\text{Var}(f(X)) = \text{Var}(f(X'))\), this simplifies to:
\[ \text{Var}(\hat{\theta}_{\text{AV}}) = \frac{\text{Var}(f(X)) + \text{Cov}(f(X), f(X'))}{2} \]If \(\text{Cov}(f(X), f(X')) < 0\), the variance is reduced compared to the basic Monte Carlo estimator.
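The variance reduction can be seen directly for a call payoff under GBM: the payoff is monotone in \( W_T \), so pairing \( W_T \) with \( -W_T \) gives negatively correlated payoffs. A sketch with illustrative parameters:

```python
import numpy as np

# Antithetic variates for a call payoff at maturity under GBM:
# pair each W_T with -W_T and average the two payoffs per pair.
rng = np.random.default_rng(11)
S0, K, r, sigma, T, n_pairs = 100.0, 100.0, 0.05, 0.2, 1.0, 100_000

WT = rng.normal(0.0, np.sqrt(T), n_pairs)
drift = (r - 0.5 * sigma**2) * T
f_plus = np.maximum(S0 * np.exp(drift + sigma * WT) - K, 0.0)
f_minus = np.maximum(S0 * np.exp(drift - sigma * WT) - K, 0.0)

var_plain = f_plus.var()                  # per-sample variance, plain MC
var_av = ((f_plus + f_minus) / 2).var()   # per-pair variance, antithetic
```

Since each antithetic pair costs two payoff evaluations, the fair comparison is `var_av` against `var_plain / 2`; the strongly negative covariance makes the pairing a clear win here.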
Optimal Importance Sampling Density
The variance of the importance sampling estimator is:
\[ \text{Var}(\hat{\theta}_{\text{IS}}) = \frac{1}{N} \mathbb{E}_q \left[ \left( f(X) \frac{p(X)}{q(X)} - \theta \right)^2 \right] \]
Minimizing this is equivalent to minimizing:
\[ \mathbb{E}_q \left[ \left( f(X) \frac{p(X)}{q(X)} \right)^2 \right] = \int \frac{f(x)^2 p(x)^2}{q(x)} dx \]
By the Cauchy-Schwarz inequality, the minimum is achieved when:
\[ q^*(x) \propto |f(x)| p(x) \]
4. Practical Applications
Control Variates in Option Pricing
When pricing an Asian option, the arithmetic average \(A_T = \frac{1}{T} \int_0^T S_t dt\) is often used. A control variate can be the geometric average \(G_T = \exp \left( \frac{1}{T} \int_0^T \log S_t dt \right)\), whose expectation can be computed analytically for geometric Brownian motion.
The control variate estimator is:
\[ \hat{V}_{\text{CV}} = \hat{V} + c \left( \frac{1}{N} \sum_{i=1}^N G_T^{(i)} - \mathbb{E}[G_T] \right) \]
where \(\hat{V}\) is the Monte Carlo estimator of the Asian option price.
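A runnable sketch of this estimator under assumed illustrative parameters and discrete monitoring: for discretely monitored geometric Brownian motion, \( \log G_T \) is Gaussian, so \( \mathbb{E}[G_T] \) is available in closed form, and the coefficient \( c \) is set to the sample estimate of \( c^* = -\text{Cov}(\text{payoff}, G)/\text{Var}(G) \):

```python
import numpy as np

# Illustrative parameters (assumed, not from the text); n monitoring dates
S0, K, r, sigma, T, n, N = 100.0, 100.0, 0.05, 0.2, 1.0, 50, 50_000
rng = np.random.default_rng(1)
dt = T / n

# Simulate N GBM paths at the monitoring dates t_i = i*dt
Z = rng.standard_normal((N, n))
logS = np.log(S0) + np.cumsum((r - 0.5 * sigma**2) * dt
                              + sigma * np.sqrt(dt) * Z, axis=1)
S = np.exp(logS)

A = S.mean(axis=1)              # arithmetic average (defines the payoff)
G = np.exp(logS.mean(axis=1))   # geometric average (the control variate)
payoff = np.exp(-r * T) * np.maximum(A - K, 0.0)

# Closed-form E[G]: log G is Gaussian with these moments under GBM
m_G = np.log(S0) + (r - 0.5 * sigma**2) * T * (n + 1) / (2 * n)
v_G = sigma**2 * T * (n + 1) * (2 * n + 1) / (6 * n**2)
EG = np.exp(m_G + 0.5 * v_G)

# Sample estimate of the optimal coefficient c* = -Cov(payoff, G)/Var(G)
C = np.cov(payoff, G)
c_star = -C[0, 1] / C[1, 1]

plain = payoff.mean()
cv = plain + c_star * (G.mean() - EG)
print("plain MC:", plain, "  with geometric control variate:", cv)
```

Because the geometric and arithmetic averages are very highly correlated, the control variate removes most of the Monte Carlo noise.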
Antithetic Variates in SDE Simulation
Consider simulating the SDE \(dS_t = \mu S_t dt + \sigma S_t dW_t\) using the Euler-Maruyama method. For each sample path \(S_t^{(i)}\), generate an antithetic path \(S_t^{(i)'}\) by using the negative of the Brownian increments:
\[ \Delta W_t^{(i)'} = -\Delta W_t^{(i)} \]
The antithetic variates estimator for \(\mathbb{E}[f(S_T)]\) is:
\[ \hat{\theta}_{\text{AV}} = \frac{1}{2N} \sum_{i=1}^N \left( f(S_T^{(i)}) + f(S_T^{(i)'}) \right) \]
Importance Sampling for Rare Events
In credit risk modeling, the probability of default \(P_{\text{default}} = \mathbb{E}[\mathbb{I}_{S_T < K}]\) can be small. Importance sampling can be used by simulating under a measure where default is more likely, e.g., by increasing the drift \(\mu\) in the SDE for \(S_t\).
The importance sampling estimator is:
\[ \hat{P}_{\text{IS}} = \frac{1}{N} \sum_{i=1}^N \mathbb{I}_{S_T^{(i)} < K} \frac{p(S_T^{(i)})}{q(S_T^{(i)})} \]
where \(q\) is the density under the new measure and the samples \(S_T^{(i)}\) are drawn under \(q\).
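A minimal sketch under assumed GBM parameters: since \( \log S_T \) is Gaussian under both the original measure \( p \) and a mean-shifted proposal \( q \), the likelihood ratio is an explicit ratio of normal densities, and the exact probability is available for comparison:

```python
import numpy as np
from math import log, sqrt, erf

# Assumed illustrative parameters; the default event S_T < K is rare since K << S0
S0, K, mu, sigma, T, N = 100.0, 50.0, 0.05, 0.2, 1.0, 100_000
rng = np.random.default_rng(2)

m = log(S0) + (mu - 0.5 * sigma**2) * T   # mean of log S_T under p
s = sigma * sqrt(T)                       # std of log S_T (same under p and q)

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

exact = norm_cdf((log(K) - m) / s)        # closed form, for comparison

# Plain Monte Carlo: almost every indicator is zero
x_p = m + s * rng.standard_normal(N)
plain = np.mean(x_p < log(K))

# Importance sampling: shift the mean of log S_T to log K so the event is common
m_q = log(K)
x_q = m_q + s * rng.standard_normal(N)
log_w = ((x_q - m_q) ** 2 - (x_q - m) ** 2) / (2 * s**2)   # log of p(x)/q(x)
is_est = np.mean((x_q < log(K)) * np.exp(log_w))

print("exact:", exact, " plain MC:", plain, " importance sampling:", is_est)
```

With the proposal centered at the boundary, the weighted estimator has a relative error of about 1% here, while the plain estimator sees only a handful of nonzero samples.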
5. Common Pitfalls and Important Notes
Choosing Control Variates
A good control variate \(Y\) should be highly correlated with the estimator \(\hat{\theta}\) and have a known expectation. Poor choices of \(Y\) can lead to increased variance or bias.
Antithetic Variates and Nonlinearity
Antithetic variates work best when the function \(f\) is close to linear. For highly nonlinear functions, the negative correlation between \(f(X)\) and \(f(X')\) may not hold, reducing the effectiveness of the technique.
Importance Sampling and Overfitting
Choosing an importance sampling density \(q\) that is too narrow can lead to high variance if the actual samples fall outside the "important" region. It is crucial to ensure that \(q\) has heavier tails than \(p\).
Stratified Sampling and Stratum Choice
The effectiveness of stratified sampling depends on the choice of strata. Strata should be chosen such that the variance within each stratum is minimized, and the variance between strata is maximized.
Computational Overhead
Some variance reduction techniques (e.g., importance sampling with complex \(q\)) can introduce significant computational overhead. It is important to weigh the reduction in variance against the increased computational cost.
Combination of Techniques
Variance reduction techniques can often be combined. For example, control variates can be used alongside antithetic variates or stratified sampling to achieve further variance reduction.
Topic 43: Stochastic Control and Hamilton-Jacobi-Bellman (HJB) Equation
Stochastic Control: A branch of control theory that deals with systems influenced by random noise. The goal is to find a control policy that optimizes a given objective function, often representing cost or reward, in the presence of uncertainty.
Control Process: A stochastic process \( u_t \) adapted to the filtration \( \mathcal{F}_t \) generated by the underlying Brownian motion \( W_t \). The control influences the dynamics of the state process \( X_t \).
Value Function: The optimal value of the objective function starting from state \( x \) at time \( t \), defined as: \[ V(t, x) = \sup_{u \in \mathcal{U}} \mathbb{E} \left[ \int_t^T f(s, X_s, u_s) \, ds + g(X_T) \mid X_t = x \right], \] where \( \mathcal{U} \) is the set of admissible controls, \( f \) is the running cost/reward, and \( g \) is the terminal cost/reward.
Hamilton-Jacobi-Bellman (HJB) Equation: A partial differential equation (PDE) that characterizes the value function \( V(t, x) \) in stochastic control problems. It provides a necessary condition for optimality.
Key Formulas
Dynamics of the Controlled Process: The state process \( X_t \) follows a stochastic differential equation (SDE): \[ dX_t = \mu(t, X_t, u_t) \, dt + \sigma(t, X_t, u_t) \, dW_t, \quad X_0 = x, \] where \( \mu \) is the drift, \( \sigma \) is the diffusion coefficient, and \( u_t \) is the control process.
HJB Equation: The value function \( V(t, x) \) satisfies the following nonlinear PDE: \[ \frac{\partial V}{\partial t} + \sup_{u \in \mathcal{U}} \left\{ \mathcal{L}^u V(t, x) + f(t, x, u) \right\} = 0, \] with terminal condition \( V(T, x) = g(x) \), where \( \mathcal{L}^u \) is the infinitesimal generator of \( X_t \) under control \( u \): \[ \mathcal{L}^u V(t, x) = \mu(t, x, u) \frac{\partial V}{\partial x} + \frac{1}{2} \sigma^2(t, x, u) \frac{\partial^2 V}{\partial x^2}. \]
Optimal Control: The optimal control \( u^*(t, x) \) is the argument that achieves the supremum in the HJB equation: \[ u^*(t, x) = \arg \sup_{u \in \mathcal{U}} \left\{ \mathcal{L}^u V(t, x) + f(t, x, u) \right\}. \]
Derivation of the HJB Equation
Step 1: Dynamic Programming Principle (DPP)
The value function satisfies the following recursive relationship for any \( h > 0 \) such that \( t + h \leq T \): \[ V(t, x) = \sup_{u \in \mathcal{U}} \mathbb{E} \left[ \int_t^{t+h} f(s, X_s, u_s) \, ds + V(t+h, X_{t+h}) \mid X_t = x \right]. \] This principle states that the optimal value from \( t \) to \( T \) is the optimal value from \( t \) to \( t+h \) plus the optimal value from \( t+h \) to \( T \).
Step 2: Taylor Expansion of \( V(t+h, X_{t+h}) \)
Using Itô's formula, expand \( V(t+h, X_{t+h}) \) around \( (t, x) \): \[ V(t+h, X_{t+h}) = V(t, x) + \int_t^{t+h} \left( \frac{\partial V}{\partial t} + \mathcal{L}^u V \right) (s, X_s) \, ds + \int_t^{t+h} \sigma \frac{\partial V}{\partial x} \, dW_s. \] Substitute this into the DPP and take expectations to eliminate the Itô integral (martingale property):
\[ V(t, x) = \sup_{u \in \mathcal{U}} \mathbb{E} \left[ \int_t^{t+h} f(s, X_s, u_s) \, ds + V(t, x) + \int_t^{t+h} \left( \frac{\partial V}{\partial t} + \mathcal{L}^u V \right) (s, X_s) \, ds \mid X_t = x \right]. \]
Step 3: Divide by \( h \) and Take \( h \to 0 \)
Rearrange the equation and divide by \( h \): \[ 0 = \sup_{u \in \mathcal{U}} \mathbb{E} \left[ \frac{1}{h} \int_t^{t+h} \left( f(s, X_s, u_s) + \frac{\partial V}{\partial t} + \mathcal{L}^u V \right) (s, X_s) \, ds \mid X_t = x \right]. \] As \( h \to 0 \), the integrand converges to its value at \( (t, x) \) (assuming continuity), yielding the HJB equation: \[ 0 = \frac{\partial V}{\partial t} + \sup_{u \in \mathcal{U}} \left\{ \mathcal{L}^u V(t, x) + f(t, x, u) \right\}. \]
Practical Applications
1. Portfolio Optimization (Merton Problem):
An investor aims to maximize expected utility of wealth at time \( T \). The wealth process \( X_t \) follows: \[ dX_t = \pi_t X_t (\mu \, dt + \sigma \, dW_t) + (1 - \pi_t) X_t r \, dt, \] where \( \pi_t \) is the fraction of wealth invested in a risky asset with drift \( \mu \) and volatility \( \sigma \), and \( r \) is the risk-free rate. The value function is: \[ V(t, x) = \sup_{\pi} \mathbb{E} \left[ U(X_T) \mid X_t = x \right], \] where \( U \) is a utility function (e.g., \( U(x) = \log x \) or \( U(x) = x^\gamma / \gamma \)). The HJB equation is solved to find the optimal portfolio \( \pi^*(t, x) \).
2. Optimal Execution in Algorithmic Trading:
A trader aims to sell \( Q \) shares of an asset over a time horizon \( T \) to minimize expected execution cost. The state process is the remaining inventory \( q_t \), and the control is the trading rate \( v_t \). The dynamics are: \[ dq_t = -v_t \, dt, \quad q_0 = Q. \] The execution cost includes temporary and permanent price impact. The HJB equation is used to derive the optimal trading strategy \( v^*(t, q) \).
3. Real Options in Corporate Finance:
A firm decides the optimal time to invest in a project (e.g., expand capacity) under uncertainty. The project value \( X_t \) follows a geometric Brownian motion: \[ dX_t = \mu X_t \, dt + \sigma X_t \, dW_t. \] The value function \( V(x) \) represents the option to invest, and the HJB equation helps determine the optimal investment threshold \( x^* \).
Common Pitfalls and Important Notes
1. Verification Theorem:
The HJB equation provides a necessary condition for optimality, but not always sufficient. A verification theorem is needed to confirm that a solution to the HJB equation is indeed the value function. The theorem typically requires the value function to be sufficiently smooth (e.g., \( C^{1,2} \)) and the optimal control to be admissible.
2. Viscosity Solutions:
The value function may not be smooth enough to satisfy the HJB equation in the classical sense. In such cases, the HJB equation is interpreted in the viscosity sense, which allows for generalized solutions. Viscosity solutions are continuous and satisfy the HJB equation in a weak sense.
3. Curse of Dimensionality:
The HJB equation is a PDE in \( n+1 \) dimensions (time + state space). For high-dimensional problems (e.g., \( n > 3 \)), numerical solutions become computationally intractable. Techniques like reinforcement learning or approximate dynamic programming are often used instead.
4. Boundary Conditions:
The terminal condition \( V(T, x) = g(x) \) is essential. For infinite-horizon problems, the HJB equation becomes stationary, and boundary conditions at infinity or other boundaries must be carefully specified.
5. Uniqueness of Solutions:
The HJB equation may have multiple solutions. The value function is typically the minimal solution that satisfies the terminal condition. Additional constraints (e.g., growth conditions) may be needed to ensure uniqueness.
6. Admissibility of Controls:
The set of admissible controls \( \mathcal{U} \) must be carefully defined to ensure the SDE for \( X_t \) has a unique solution. Controls are often required to be progressively measurable and satisfy integrability conditions (e.g., \( \mathbb{E} \left[ \int_0^T |u_t|^2 \, dt \right] < \infty \)).
Quant Interview Questions
1. Derive the HJB equation for a general stochastic control problem.
Hint: Start with the dynamic programming principle, apply Itô's formula to \( V(t+h, X_{t+h}) \), and take the limit as \( h \to 0 \).
2. Solve the Merton portfolio problem with logarithmic utility.
Solution: Assume \( V(t, x) = \log x + h(t) \). Substitute into the HJB equation and solve for \( h(t) \) and the optimal portfolio \( \pi^*(t, x) \). The optimal portfolio is constant: \( \pi^* = (\mu - r)/\sigma^2 \).
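A quick numerical check of this answer (with assumed illustrative parameters): for a constant fraction \( \pi \), Itô's formula applied to \( \log X_t \) gives \( \mathbb{E}[\log X_T] = \log x + (r + \pi(\mu - r) - \tfrac{1}{2}\pi^2\sigma^2)T \), a concave quadratic in \( \pi \) whose grid maximizer should match \( (\mu - r)/\sigma^2 \):

```python
import numpy as np

# Assumed illustrative parameters (not from the text)
mu, r, sigma, T, x0 = 0.08, 0.02, 0.25, 1.0, 1.0

# E[log X_T] as a function of the constant fraction pi (from Ito on log X_t)
pi_grid = np.linspace(-1.0, 3.0, 4001)
expected_log_wealth = (np.log(x0)
                       + (r + pi_grid * (mu - r) - 0.5 * pi_grid**2 * sigma**2) * T)

pi_num = pi_grid[np.argmax(expected_log_wealth)]   # grid maximizer
pi_closed = (mu - r) / sigma**2                    # Merton closed form
print("grid argmax:", pi_num, " closed form (mu-r)/sigma^2:", pi_closed)
```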
3. What is the difference between the HJB equation and the Black-Scholes PDE?
Answer: The Black-Scholes PDE is a special case of the HJB equation where the control is absent (or fixed), and the objective is to price a derivative. The HJB equation is more general and includes an optimization over controls.
4. Explain the concept of viscosity solutions in the context of the HJB equation.
Answer: Viscosity solutions are a generalization of classical solutions to PDEs, allowing for non-smooth value functions. A function \( V \) is a viscosity solution if it is continuous and satisfies the HJB equation in a weak sense (subsolution and supersolution properties).
5. How would you numerically solve the HJB equation for a high-dimensional problem?
Answer: For high-dimensional problems, traditional PDE methods (e.g., finite differences) are infeasible. Alternatives include:
- Reinforcement Learning: Use actor-critic methods or Q-learning to approximate the value function and optimal policy.
- Approximate Dynamic Programming: Use basis functions or neural networks to approximate the value function.
- Monte Carlo Methods: Use regression-based methods to estimate the value function from simulated paths.
Topic 44: Optimal Stopping Problems (American Options)
Optimal Stopping Problem: An optimal stopping problem involves choosing a time to take a particular action, based on sequentially observed random variables, in order to maximize an expected reward or minimize an expected cost. In the context of finance, this often relates to deciding when to exercise an American option.
American Option: An American option is a financial contract that can be exercised at any time up to and including its expiration date \( T \). This is in contrast to a European option, which can only be exercised at expiration.
Stopping Time: A stopping time \( \tau \) with respect to a filtration \( \{\mathcal{F}_t\}_{t \geq 0} \) is a random variable such that \( \{\tau \leq t\} \in \mathcal{F}_t \) for all \( t \geq 0 \). Intuitively, it is a decision to stop that depends only on the information available up to time \( t \).
Snell Envelope: The Snell envelope of a stochastic process \( X_t \) is the smallest supermartingale that dominates \( X_t \). It is used to solve optimal stopping problems and is defined as: \[ S_t = \text{ess sup}_{\tau \geq t} \mathbb{E}[X_\tau | \mathcal{F}_t], \] where the essential supremum is taken over all stopping times \( \tau \) with values in \( [t, T] \).
Key Concepts
Dynamic Programming Principle: The optimal stopping problem can be approached using the dynamic programming principle, which states that the value function \( V(t, x) \) (the maximum expected reward starting from time \( t \) with state \( x \)) satisfies: \[ V(t, x) = \max \left( \phi(t, x), \mathbb{E}[V(t + \Delta t, X_{t + \Delta t}) | X_t = x] \right), \] where \( \phi(t, x) \) is the reward for stopping immediately, and the expectation represents the reward for continuing.
Free Boundary Problem: The optimal exercise boundary for an American option is the boundary that separates the continuation region (where it is optimal to hold the option) from the stopping region (where it is optimal to exercise the option). This boundary is typically determined as part of the solution to a free boundary problem.
Important Formulas
Value of an American Option: The value \( V(t, S_t) \) of an American option at time \( t \) with underlying asset price \( S_t \) is given by the Snell envelope of the payoff process \( \Phi(t, S_t) \): \[ V(t, S_t) = \sup_{t \leq \tau \leq T} \mathbb{E}\left[ e^{-r(\tau - t)} \Phi(\tau, S_\tau) \bigg| \mathcal{F}_t \right], \] where \( \tau \) is a stopping time, \( r \) is the risk-free interest rate, and \( \Phi(t, S_t) \) is the payoff function (e.g., \( \max(S_t - K, 0) \) for a call option with strike \( K \)).
Black-Scholes PDE for American Options: The value \( V(t, S) \) of an American option satisfies the following variational inequality (a free boundary problem): \[ \min \left( -\frac{\partial V}{\partial t} - \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} - r S \frac{\partial V}{\partial S} + r V, V - \Phi(S) \right) = 0, \] with terminal condition \( V(T, S) = \Phi(S) \). Here, \( \sigma \) is the volatility of the underlying asset, and \( \Phi(S) \) is the payoff function.
Early Exercise Premium: The value of an American option can be decomposed into the value of the corresponding European option plus an early exercise premium. For an American put on a stock with dividend yield \( \delta \): \[ V_{\text{American}}(t, S) = V_{\text{European}}(t, S) + \mathbb{E}\left[ \int_t^T e^{-r(u - t)} (r K - \delta S_u) \mathbf{1}_{\{S_u \leq S^*(u)\}} du \bigg| \mathcal{F}_t \right], \] where \( K \) is the strike price and \( S^*(u) \) is the optimal exercise boundary at time \( u \); the indicator is active in the exercise region \( S_u \leq S^*(u) \).
Derivations
Derivation of the Snell Envelope for Optimal Stopping
Consider a finite-horizon optimal stopping problem where the goal is to maximize \( \mathbb{E}[X_\tau] \) over all stopping times \( \tau \leq T \). The Snell envelope \( S_t \) is defined as:
\[ S_t = \text{ess sup}_{\tau \geq t} \mathbb{E}[X_\tau | \mathcal{F}_t]. \]
Step 1: Show that \( S_t \) is a supermartingale. For \( s \leq t \), we have:
\[ \mathbb{E}[S_t | \mathcal{F}_s] = \mathbb{E}\left[ \text{ess sup}_{\tau \geq t} \mathbb{E}[X_\tau | \mathcal{F}_t] \bigg| \mathcal{F}_s \right] \leq \text{ess sup}_{\tau \geq s} \mathbb{E}[X_\tau | \mathcal{F}_s] = S_s, \]
where the inequality follows because every stopping time with \( \tau \geq t \) also satisfies \( \tau \geq s \), so the supremum on the right is taken over a larger family of stopping times.
Step 2: Show that \( S_t \) dominates \( X_t \). By definition, for any stopping time \( \tau \geq t \), \( \mathbb{E}[X_\tau | \mathcal{F}_t] \leq S_t \). Taking \( \tau = t \) gives \( X_t \leq S_t \).
Step 3: Show that \( S_t \) is the smallest supermartingale dominating \( X_t \). Suppose \( Y_t \) is another supermartingale dominating \( X_t \). Then for any stopping time \( \tau \geq t \),
\[ \mathbb{E}[X_\tau | \mathcal{F}_t] \leq \mathbb{E}[Y_\tau | \mathcal{F}_t] \leq Y_t, \]
where the second inequality follows from optional sampling applied to the supermartingale \( Y_t \). Taking the essential supremum over \( \tau \geq t \) gives \( S_t \leq Y_t \).
Step 4: The optimal stopping time \( \tau^* \) is the first time \( S_t = X_t \), i.e.,
\[ \tau^* = \inf \{ t \geq 0 : S_t = X_t \}. \]
This is because it is optimal to stop as soon as the reward for stopping immediately equals the Snell envelope (the best possible reward attainable from that time on).
Derivation of the Black-Scholes Variational Inequality
The value \( V(t, S) \) of an American option must satisfy two conditions:
- It must be at least as valuable as the payoff from immediate exercise: \( V(t, S) \geq \Phi(S) \).
- If the option is not exercised, its value must satisfy the Black-Scholes PDE (since it behaves like a European option in the continuation region).
Combining these, we obtain the variational inequality:
\[ \min \left( -\frac{\partial V}{\partial t} - \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} - r S \frac{\partial V}{\partial S} + r V, V - \Phi(S) \right) = 0. \]
This can be interpreted as follows:
- If \( V(t, S) > \Phi(S) \), then the Black-Scholes PDE must hold (it is optimal to continue).
- If \( V(t, S) = \Phi(S) \), then the Black-Scholes PDE may not hold (it is optimal to stop).
Practical Applications
Pricing American Options
Optimal stopping theory is used to price American options, which are common in financial markets. Unlike European options, American options can be exercised at any time, making their valuation more complex. The Snell envelope and dynamic programming principles provide a framework for computing their fair value.
Example: Consider an American put option with strike \( K \) and expiration \( T \). The payoff is \( \Phi(S_t) = \max(K - S_t, 0) \). The value of the option is given by:
\[ V(t, S_t) = \sup_{t \leq \tau \leq T} \mathbb{E}\left[ e^{-r(\tau - t)} \max(K - S_\tau, 0) \bigg| \mathcal{F}_t \right]. \]
Numerical methods such as finite difference methods, binomial trees, or Monte Carlo simulation (with least squares regression) are often used to compute this value.
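For instance, a compact backward-induction pricer on a Cox-Ross-Rubinstein binomial tree; the parameters below are the standard Longstaff-Schwartz test case, assumed here for illustration:

```python
import numpy as np

def american_put_crr(S0, K, r, sigma, T, n):
    """American put on a Cox-Ross-Rubinstein binomial tree (backward induction)."""
    dt = T / n
    u = np.exp(sigma * np.sqrt(dt))
    d = 1.0 / u
    p = (np.exp(r * dt) - d) / (u - d)   # risk-neutral up-probability
    disc = np.exp(-r * dt)

    j = np.arange(n + 1)
    V = np.maximum(K - S0 * u**j * d**(n - j), 0.0)   # payoff at maturity
    for step in range(n - 1, -1, -1):
        j = np.arange(step + 1)
        S = S0 * u**j * d**(step - j)
        cont = disc * (p * V[1:] + (1 - p) * V[:-1])  # continuation value
        V = np.maximum(K - S, cont)                   # max(exercise, continue)
    return V[0]

# Benchmark case: the American put value here is roughly 4.48
price = american_put_crr(S0=36.0, K=40.0, r=0.06, sigma=0.2, T=1.0, n=1000)
print("American put (CRR):", price)
```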
Real Options in Corporate Finance
Optimal stopping problems also arise in corporate finance under the guise of "real options." For example, a company may have the option to invest in a project, abandon a project, or expand a project. Each of these decisions can be modeled as an optimal stopping problem, where the goal is to maximize the expected net present value (NPV) of the project.
Example: Suppose a company has the option to invest in a project at any time \( t \leq T \). The project's value \( X_t \) follows a geometric Brownian motion, and the investment cost is \( I \). The optimal investment time \( \tau \) is the solution to:
\[ \sup_{\tau \leq T} \mathbb{E}\left[ e^{-r \tau} (X_\tau - I) \right]. \]
This is an optimal stopping problem where the payoff is \( X_t - I \).
Common Pitfalls and Important Notes
Confusing American and European Options
One common pitfall is treating American options like European options. While European options can only be exercised at expiration, American options can be exercised early, which introduces additional complexity. The early exercise premium must be accounted for in the valuation.
Numerical Methods for American Options
Numerical methods for pricing American options are more computationally intensive than those for European options. Common methods include:
- Binomial/Trinomial Trees: These methods discretize the underlying asset price and time, allowing for backward induction to compute the option value. They are intuitive but can be slow for high-dimensional problems.
- Finite Difference Methods: These methods solve the Black-Scholes PDE numerically. They are efficient for low-dimensional problems but require careful handling of the free boundary.
- Monte Carlo Methods: The least squares Monte Carlo (LSM) method is used for high-dimensional problems. It involves simulating paths of the underlying asset and using regression to estimate the continuation value.
Each method has trade-offs between accuracy, speed, and ease of implementation.
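A minimal sketch of the least squares Monte Carlo method for an American put (a quadratic polynomial basis and illustrative parameters are assumed; production implementations typically use richer bases such as Laguerre polynomials):

```python
import numpy as np

def american_put_lsm(S0, K, r, sigma, T, n_steps, n_paths, seed=3):
    """Longstaff-Schwartz least squares Monte Carlo for an American put."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    disc = np.exp(-r * dt)

    # Simulate GBM paths on the exercise grid t_1, ..., t_n
    Z = rng.standard_normal((n_paths, n_steps))
    S = S0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt
                              + sigma * np.sqrt(dt) * Z, axis=1))

    # Cash flow if held to maturity
    V = np.maximum(K - S[:, -1], 0.0)

    # Backward induction over the early-exercise dates t_{n-1}, ..., t_1
    for i in range(n_steps - 2, -1, -1):
        V *= disc                      # discount realized cash flows to t_i
        itm = K - S[:, i] > 0          # regress only on in-the-money paths
        if itm.any():
            coef = np.polyfit(S[itm, i], V[itm], deg=2)   # quadratic basis in S
            continuation = np.polyval(coef, S[itm, i])
            exercise = K - S[itm, i]
            V[itm] = np.where(exercise > continuation, exercise, V[itm])
    return disc * V.mean()             # discount from t_1 back to 0

# Benchmark case from Longstaff & Schwartz (2001): value is roughly 4.47-4.48
price = american_put_lsm(S0=36.0, K=40.0, r=0.06, sigma=0.2, T=1.0,
                         n_steps=50, n_paths=100_000)
print("American put (LSM):", price)
```

When a path exercises, its future cash flow is overwritten by the exercise value, which is exactly the "keep the cash flow from the first exercise" rule of the LSM algorithm.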
Early Exercise Boundary
The early exercise boundary for an American option is not always intuitive. For example:
- For American call options on non-dividend-paying stocks, it is never optimal to exercise early, so the American call has the same value as the European call.
- For American put options, early exercise may be optimal, especially when the stock price is very low (deep in the money).
- For American options on dividend-paying stocks, early exercise may be optimal just before a dividend payment.
Dividends and Early Exercise
Dividends play a crucial role in the early exercise decision for American options. For a call option, it may be optimal to exercise just before a dividend payment to capture the dividend. For a put option, dividends make early exercise less attractive because the stock price is expected to drop by the dividend amount.
Snell Envelope and Martingale Properties
The Snell envelope is a supermartingale, but it is a martingale up to the optimal stopping time. Specifically, if \( \tau^* \) is the optimal stopping time, then \( S_t \) is a martingale on \( [0, \tau^*] \). This property is useful for verifying the optimality of a candidate stopping time.
Topic 45: Reflected SDEs and Their Applications (Barrier Options)
Reflected Stochastic Differential Equation (SDE): A reflected SDE is a stochastic process that is constrained to remain within a certain domain, typically \([0, \infty)\) or \([a, b]\), by introducing a "reflection" term that instantaneously pushes the process back into the domain whenever it attempts to leave. The reflection is typically modeled using the local time of the process at the boundary.
Formally, a reflected SDE for a process \( X_t \) with reflection at 0 can be written as: \[ dX_t = \mu(X_t) dt + \sigma(X_t) dW_t + dL_t, \] where \( L_t \) is the local time of \( X \) at 0, a non-decreasing process that increases only when \( X_t = 0 \).
Local Time: The local time \( L_t^a \) of a process \( X \) at level \( a \) measures the amount of time \( X \) spends at \( a \) up to time \( t \). For a continuous semimartingale \( X \), the local time satisfies the Tanaka formula: \[ |X_t - a| = |X_0 - a| + \int_0^t \text{sgn}(X_s - a) dX_s + L_t^a, \] where \( \text{sgn}(x) = \mathbb{I}_{x > 0} - \mathbb{I}_{x < 0} \).
Reflected Brownian Motion (RBM): The simplest reflected SDE is the reflected Brownian motion, where \( \mu(x) = \mu \) and \( \sigma(x) = \sigma \) are constants. The RBM \( X_t \) on \([0, \infty)\) satisfies: \[ dX_t = \mu dt + \sigma dW_t + dL_t, \] where \( L_t \) is the local time of \( X \) at 0. The solution can be expressed as: \[ X_t = X_0 + \mu t + \sigma W_t + L_t. \] The local time \( L_t \) can be written as: \[ L_t = -\min\left\{0, \inf_{0 \leq s \leq t} (X_0 + \mu s + \sigma W_s)\right\}. \]
Skorokhod Problem: The Skorokhod problem provides a way to construct reflected processes. Given a continuous function \( y_t \) with \( y_0 \geq 0 \), the Skorokhod problem seeks a pair \( (x_t, l_t) \) such that:
- \( x_t = y_t + l_t \),
- \( x_t \geq 0 \),
- \( l_t \) is non-decreasing, \( l_0 = 0 \), and \( l_t \) increases only when \( x_t = 0 \).
Example: Reflected Brownian Motion with Drift
Consider the reflected SDE: \[ dX_t = \mu dt + \sigma dW_t + dL_t, \quad X_0 = x \geq 0. \] The solution is: \[ X_t = x + \mu t + \sigma W_t + L_t, \] where \( L_t \) is the local time at 0. Using the Skorokhod problem, we can write: \[ L_t = -\min\left\{0, \inf_{0 \leq s \leq t} (x + \mu s + \sigma W_s)\right\}. \] For \( \mu = 0 \) and \( \sigma = 1 \), this reduces to the reflected standard Brownian motion: \[ X_t = x + W_t + L_t, \quad L_t = -\min\left\{0, \inf_{0 \leq s \leq t} (x + W_s)\right\}. \]
Transition Density of Reflected Brownian Motion: The transition density \( p(t, x, y) \) of a reflected Brownian motion \( X_t \) with drift \( \mu \), volatility \( \sigma \), and reflection at 0 is given by: \[ p(t, x, y) = \frac{1}{\sigma \sqrt{2 \pi t}} \left[ \exp\left(-\frac{(y - x - \mu t)^2}{2 \sigma^2 t}\right) + e^{2 \mu y / \sigma^2} \exp\left(-\frac{(y + x + \mu t)^2}{2 \sigma^2 t}\right) \right] - \frac{2 \mu}{\sigma^2} e^{2 \mu y / \sigma^2} \Phi\left(-\frac{y + x + \mu t}{\sigma \sqrt{t}}\right), \] where \( \Phi \) is the standard normal cumulative distribution function. For \( \mu = 0 \), this simplifies to: \[ p(t, x, y) = \frac{1}{\sigma \sqrt{2 \pi t}} \left[ \exp\left(-\frac{(y - x)^2}{2 \sigma^2 t}\right) + \exp\left(-\frac{(y + x)^2}{2 \sigma^2 t}\right) \right]. \]
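As a sanity check on the drifted density (written in Harrison's form, with the \( e^{2\mu y/\sigma^2} \) factor on the reflected Gaussian term), the following sketch verifies numerically that it integrates to 1 over \( [0, \infty) \) for assumed parameters:

```python
import numpy as np
from math import sqrt, pi, erf

def norm_cdf(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def rbm_density(t, x, y, mu, sigma):
    """Transition density of Brownian motion with drift mu, reflected at 0."""
    g = 1.0 / (sigma * np.sqrt(2.0 * pi * t))
    term1 = g * np.exp(-(y - x - mu * t) ** 2 / (2.0 * sigma**2 * t))
    term2 = g * np.exp(2.0 * mu * y / sigma**2
                       - (y + x + mu * t) ** 2 / (2.0 * sigma**2 * t))
    term3 = (2.0 * mu / sigma**2) * np.exp(2.0 * mu * y / sigma**2) \
        * np.vectorize(norm_cdf)(-(y + x + mu * t) / (sigma * np.sqrt(t)))
    return term1 + term2 - term3

# Integrate over a grid wide enough that the tail beyond it is negligible
y = np.linspace(0.0, 20.0, 40_001)
p = rbm_density(t=1.0, x=0.5, y=y, mu=-0.3, sigma=0.4)
total = (p.sum() - 0.5 * (p[0] + p[-1])) * (y[1] - y[0])   # trapezoidal rule
print("integral of p over [0, inf):", total)
```

For \( \mu < 0 \) and \( t \to \infty \) the same expression recovers the exponential stationary density \( (2|\mu|/\sigma^2) e^{-2|\mu| y/\sigma^2} \), which is a useful cross-check.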
Barrier Options: Barrier options are path-dependent options whose payoff depends on whether the underlying asset's price reaches a certain barrier level during the option's life. There are two main types:
- Knock-in options: Become active only if the barrier is hit.
- Knock-out options: Become inactive if the barrier is hit.
Pricing Barrier Options Using Reflected SDEs: The price of a barrier option can often be expressed in terms of the prices of standard options with modified boundary conditions. For example, the price of a down-and-out call option with barrier \( H \), strike \( K \), and maturity \( T \) can be written as: \[ C_{\text{DO}}(S_0, K, H, T) = C_{\text{BS}}(S_0, K, T) - \left(\frac{S_0}{H}\right)^{1 - 2r/\sigma^2} C_{\text{BS}}(H^2 / S_0, K, T), \] where \( C_{\text{BS}}(S, K, T) \) is the Black-Scholes price of a call option with initial stock price \( S \), strike \( K \), and maturity \( T \), and \( r \) is the risk-free rate. This formula assumes \( S_0 > H \) and \( K > H \).
For more general cases, the price of a barrier option can be derived using the transition density of the underlying process killed (absorbed) at the barrier, which is obtained via the reflection principle. For example, the price of a down-and-out call option is: \[ C_{\text{DO}}(S_0, K, H, T) = e^{-rT} \int_H^\infty (S_T - K)^+ p(T, S_0, S_T) dS_T, \] where \( p(T, S_0, S_T) \) is the transition density of geometric Brownian motion absorbed at the barrier \( H \).
Example: Pricing a Down-and-Out Call Option
Consider a down-and-out call option with barrier \( H = 90 \), strike \( K = 100 \), maturity \( T = 1 \) year, initial stock price \( S_0 = 100 \), risk-free rate \( r = 0.05 \), and volatility \( \sigma = 0.2 \).
First, compute the Black-Scholes price of a standard call option: \[ d_1 = \frac{\ln(S_0 / K) + (r + \sigma^2 / 2)T}{\sigma \sqrt{T}} = \frac{\ln(100 / 100) + (0.05 + 0.2^2 / 2) \cdot 1}{0.2 \cdot \sqrt{1}} = 0.35, \] \[ d_2 = d_1 - \sigma \sqrt{T} = 0.35 - 0.2 = 0.15. \] The Black-Scholes price is: \[ C_{\text{BS}}(100, 100, 1) = 100 \Phi(0.35) - 100 e^{-0.05} \Phi(0.15) \approx 100 \cdot 0.6368 - 95.12 \cdot 0.5596 \approx 10.45. \] Next, compute the reflection factor: \[ \left(\frac{S_0}{H}\right)^{1 - 2r/\sigma^2} = \left(\frac{100}{90}\right)^{1 - 2 \cdot 0.05 / 0.04} = \left(\frac{10}{9}\right)^{-1.5} \approx 0.8538. \] The price of the down-and-out call is therefore: \[ C_{\text{DO}}(100, 100, 90, 1) = C_{\text{BS}}(100, 100, 1) - 0.8538 \, C_{\text{BS}}(81, 100, 1). \] Compute \( C_{\text{BS}}(81, 100, 1) \): \[ d_1 = \frac{\ln(81 / 100) + (0.05 + 0.2^2 / 2) \cdot 1}{0.2 \cdot \sqrt{1}} = \frac{-0.2107 + 0.07}{0.2} = -0.7036, \] \[ d_2 = d_1 - 0.2 = -0.9036. \] \[ C_{\text{BS}}(81, 100, 1) = 81 \Phi(-0.7036) - 100 e^{-0.05} \Phi(-0.9036) \approx 81 \cdot 0.2408 - 95.12 \cdot 0.1831 \approx 2.09. \] Thus, the price of the down-and-out call is: \[ C_{\text{DO}}(100, 100, 90, 1) \approx 10.45 - 0.8538 \cdot 2.09 \approx 8.67. \]
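The arithmetic above can be reproduced programmatically; the sketch below implements the Black-Scholes call and the reflection-principle formula for the down-and-out call:

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, r, sigma, T):
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

def down_and_out_call(S, K, H, r, sigma, T):
    """Reflection-principle formula, valid for S > H and K >= H."""
    factor = (S / H) ** (1.0 - 2.0 * r / sigma**2)
    return bs_call(S, K, r, sigma, T) - factor * bs_call(H**2 / S, K, r, sigma, T)

print("C_BS(100,100):", bs_call(100, 100, 0.05, 0.2, 1.0))
print("C_BS(81,100) :", bs_call(81, 100, 0.05, 0.2, 1.0))
print("C_DO         :", down_and_out_call(100, 100, 90, 0.05, 0.2, 1.0))
```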
Reflection Principle for Brownian Motion: The reflection principle is a key tool in deriving the distribution of reflected processes. For a standard Brownian motion \( W_t \), the reflection principle states that for any \( a > 0 \): \[ \mathbb{P}\left(\sup_{0 \leq s \leq t} W_s \geq a\right) = 2 \mathbb{P}(W_t \geq a) = 2 \left(1 - \Phi\left(\frac{a}{\sqrt{t}}\right)\right). \] This can be extended to drifted Brownian motion \( X_t = \mu t + \sigma W_t \): \[ \mathbb{P}\left(\sup_{0 \leq s \leq t} X_s \geq a\right) = \Phi\left(\frac{\mu t - a}{\sigma \sqrt{t}}\right) + e^{2 \mu a / \sigma^2} \Phi\left(\frac{-\mu t - a}{\sigma \sqrt{t}}\right). \]
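A small Monte Carlo check of the driftless identity (note that sampling the path on a discrete grid slightly underestimates the true supremum):

```python
import numpy as np
from math import sqrt, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

rng = np.random.default_rng(4)
a, t = 1.0, 1.0
n_paths, n_steps = 5_000, 1_000
dt = t / n_steps

# Brownian paths; the discretely sampled maximum is a slight underestimate of the sup
W = np.cumsum(sqrt(dt) * rng.standard_normal((n_paths, n_steps)), axis=1)
mc = np.mean(W.max(axis=1) >= a)

exact = 2.0 * (1.0 - norm_cdf(a / sqrt(t)))
print("Monte Carlo:", mc, " reflection principle:", exact)
```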
Important Notes and Pitfalls:
- Local Time and Reflection: The local time \( L_t \) is not absolutely continuous with respect to Lebesgue measure, meaning it cannot be expressed as an integral of some density. It is a singular process that increases only when \( X_t \) is at the boundary.
- Skorokhod Problem: The Skorokhod problem is well-posed only for continuous functions \( y_t \). For discontinuous processes (e.g., jump diffusions), the problem becomes more complex and may not have a unique solution.
- Barrier Options:
- Barrier options are highly sensitive to the volatility of the underlying asset, especially near the barrier. This is known as the "barrier effect."
- The reflection principle assumes continuous paths. In practice, discrete monitoring (e.g., daily) can lead to significant pricing errors, especially for barriers close to the initial price.
- The closed-form price of a down-and-out call via the reflection principle assumes continuous monitoring, constant coefficients, and \( K \geq H \); under these assumptions it prices the knock-out feature exactly. For discretely monitored barriers, time-dependent parameters, or more complex payoffs, numerical methods (e.g., finite difference, Monte Carlo) are often required.
- Numerical Methods: When analytical solutions are not available, reflected SDEs can be simulated using the Skorokhod problem. For example, to simulate a reflected Brownian motion:
- Simulate a standard Brownian motion \( W_t \).
- Compute the unreflected path \( Y_t = x + \mu t + \sigma W_t \).
- Set \( L_t = -\min\{0, \inf_{0 \leq s \leq t} Y_s\} \).
- The reflected process is \( X_t = Y_t + L_t \).
- Connection to PDEs: The transition density of a reflected diffusion satisfies a partial differential equation (PDE) with a Neumann boundary condition at the reflection barrier. For example, the density \( p(t, x, y) \) of a reflected Brownian motion satisfies: \[ \frac{\partial p}{\partial t} = \frac{\sigma^2}{2} \frac{\partial^2 p}{\partial x^2} + \mu \frac{\partial p}{\partial x}, \quad \frac{\partial p}{\partial x}(t, 0, y) = 0. \]
Example: Simulating a Reflected Brownian Motion
Simulate a reflected Brownian motion \( X_t \) with \( X_0 = 1 \), \( \mu = 0.1 \), \( \sigma = 0.2 \), and reflection at 0 over the interval \([0, 1]\) using a time step \( \Delta t = 0.01 \).
Steps:
- Initialize \( X_0 = 1 \), \( L_0 = 0 \), and \( t = 0 \).
- For each time step \( i = 1, 2, \dots, 100 \):
- Generate a standard normal random variable \( Z_i \).
- Update the Brownian motion: \( W_{t + \Delta t} = W_t + \sqrt{\Delta t} Z_i \).
- Compute the unreflected process: \( Y_{t + \Delta t} = X_t + \mu \Delta t + \sigma \sqrt{\Delta t} Z_i \).
- Compute the local time increment: \( \Delta L = -\min\{0, Y_{t + \Delta t}\} \).
- Update the reflected process: \( X_{t + \Delta t} = Y_{t + \Delta t} + \Delta L \).
- Update the local time: \( L_{t + \Delta t} = L_t + \Delta L \).
- Update time: \( t = t + \Delta t \).
This simulation ensures that \( X_t \) never becomes negative and that \( L_t \) increases only when \( X_t \) attempts to cross 0.
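The steps above can be sketched in Python (a minimal sketch; the seed is an arbitrary choice, and the parameters are those of the example):

```python
import numpy as np

# Discrete reflection scheme from the example above:
# X0 = 1, mu = 0.1, sigma = 0.2, dt = 0.01, reflection at 0.
rng = np.random.default_rng(42)

x0, mu, sigma, dt, n = 1.0, 0.1, 0.2, 0.01, 100
X = np.empty(n + 1)          # reflected process
L = np.zeros(n + 1)          # cumulative local time at 0
X[0] = x0
for i in range(n):
    Z = rng.standard_normal()
    Y = X[i] + mu * dt + sigma * np.sqrt(dt) * Z   # unreflected step
    dL = max(0.0, -Y)                              # push needed to stay >= 0
    X[i + 1] = Y + dL                              # reflected value
    L[i + 1] = L[i] + dL                           # increases only when Y < 0
```

With \( X_0 = 1 \) and this small volatility the path may never reach 0, in which case \( L_t \) stays at 0, consistent with local time increasing only at the boundary.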
Common Interview Questions:
- Explain the difference between a reflected SDE and a standard SDE.
Answer: A reflected SDE includes an additional term (local time) that ensures the process remains within a certain domain. This term acts as a "push" at the boundary, whereas a standard SDE has no such constraint and can freely leave the domain.
- What is the Skorokhod problem, and how is it used in reflected SDEs?
Answer: The Skorokhod problem provides a way to construct a reflected process from a given unreflected process. It ensures that the reflected process stays within the desired domain by adding a non-decreasing process (local time) that increases only when the process is at the boundary.
- How would you price a down-and-out call option using the reflection principle?
Answer: The price of a down-and-out call option can be expressed as the price of a standard call option minus the price of a call option evaluated at the "reflected" spot \( H^2/S_0 \). Specifically, for a barrier \( H \) (with \( H \leq K \) and no dividends), the price is: \[ C_{\text{DO}}(S_0, K, H, T) = C_{\text{BS}}(S_0, K, T) - \left(\frac{S_0}{H}\right)^{1 - 2r/\sigma^2} C_{\text{BS}}(H^2 / S_0, K, T). \] This formula assumes continuous monitoring and is valid while the barrier has not yet been hit.
- What are the challenges in pricing barrier options?
Answer: Challenges include:
- Discrete Monitoring: Most barrier options are monitored discretely (e.g., daily), but analytical formulas assume continuous monitoring. This can lead to significant pricing errors.
- Volatility Sensitivity: Barrier options are highly sensitive to volatility, especially near the barrier. This can make them difficult to hedge.
- Numerical Methods: For complex barriers or payoffs, numerical methods (e.g., Monte Carlo, finite difference) are often required, which can be computationally intensive.
- How does the local time \( L_t \) behave for a reflected Brownian motion?
Answer: The local time \( L_t \) is a non-decreasing process that increases only when the reflected Brownian motion \( X_t \) is at 0. It is singular with respect to Lebesgue measure, meaning it does not have a density. The local time can be interpreted as the "amount of time" \( X_t \) spends at 0, though this is a heuristic since \( X_t \) is a continuous process.
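The down-and-out call formula from the reflection principle can be implemented directly (a sketch; the parameter values are hypothetical, and this simple form assumes \( H \leq K \), continuous monitoring, and no dividends):

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, T):
    # Black-Scholes call price (no dividends).
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def down_and_out_call(S0, K, H, r, sigma, T):
    # Reflection-principle formula quoted above; valid while S0 > H,
    # for a continuously monitored barrier at or below the strike.
    factor = (S0 / H) ** (1.0 - 2.0 * r / sigma**2)
    return bs_call(S0, K, r, sigma, T) - factor * bs_call(H**2 / S0, K, r, sigma, T)

# Hypothetical inputs: barrier well below spot, so the knockout
# discount relative to the vanilla call is small.
price = down_and_out_call(S0=100.0, K=100.0, H=80.0, r=0.05, sigma=0.2, T=1.0)
```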
Topic 46: Local Time of Brownian Motion and Its Properties
Definition (Local Time of Brownian Motion): The local time \( L_t^a(B) \) of a standard Brownian motion \( B_t \) at level \( a \) and time \( t \) is a stochastic process that measures the amount of time the Brownian motion spends at level \( a \) up to time \( t \). Formally, it is defined as the almost sure limit:
\[ L_t^a(B) = \lim_{\epsilon \to 0} \frac{1}{2\epsilon} \int_0^t \mathbf{1}_{[a-\epsilon, a+\epsilon]}(B_s) \, ds. \]
Intuitively, \( L_t^a(B) \) quantifies the "sojourn time" of \( B_t \) in the vicinity of \( a \).
Key Formulas:
- Occupation Time Formula: For any non-negative measurable function \( f \), \[ \int_0^t f(B_s) \, ds = \int_{-\infty}^{\infty} f(a) L_t^a(B) \, da. \] This formula relates the integral of \( f(B_s) \) over time to an integral of \( f \) against the local time.
- Tanaka's Formula: For a standard Brownian motion \( B_t \) and any \( a \in \mathbb{R} \), \[ |B_t - a| = |B_0 - a| + \int_0^t \text{sgn}(B_s - a) \, dB_s + L_t^a(B), \] where \( \text{sgn}(x) = \mathbf{1}_{x > 0} - \mathbf{1}_{x < 0} \) is the sign function. This is a generalization of Itô's formula for the non-differentiable function \( f(x) = |x - a| \).
- Joint Continuity: The local time \( L_t^a(B) \) has a modification that is jointly continuous in \( (t, a) \). This property is crucial for many applications.
- Scaling Property: For any \( c > 0 \), \[ L_{ct}^{a \sqrt{c}}(B) \stackrel{\mathcal{D}}{=} \sqrt{c} \, L_t^{a}(B). \] This reflects the self-similarity of Brownian motion, \( B_{ct} \stackrel{\mathcal{D}}{=} \sqrt{c} \, B_t \): both time and the level must be rescaled.
- Occupation Density: For fixed \( t \), the map \( a \mapsto L_t^a(B) \) is the density of the occupation measure \( \Gamma_t(A) = \int_0^t \mathbf{1}_A(B_s) \, ds \) with respect to Lebesgue measure, and it is almost surely continuous in \( a \).
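The occupation time formula can be checked numerically on a simulated path (a sketch; the step count, window width \( \epsilon \), grid range, and test function \( f(x) = x^2 \) are arbitrary choices):

```python
import numpy as np

# Compare int_0^t f(B_s) ds with int f(a) L_t^a da, where L_t^a is
# estimated via the (1/2eps)-window definition on a grid of levels.
rng = np.random.default_rng(0)
n, t = 200_000, 1.0
dt = t / n
B = np.concatenate([[0.0], np.cumsum(np.sqrt(dt) * rng.standard_normal(n))])

f = lambda x: x**2
lhs = np.sum(f(B[:-1])) * dt                       # time integral

eps = 0.01
levels = np.arange(-5.0, 5.0, 2 * eps)             # grid spacing = window width
local_time = np.array([np.sum(np.abs(B[:-1] - a) < eps) * dt / (2 * eps)
                       for a in levels])           # window estimate of L_t^a
rhs = np.sum(f(levels) * local_time) * (2 * eps)   # space integral
# lhs and rhs agree up to discretization error
```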
Example (Derivation of Tanaka's Formula):
Consider the function \( f(x) = |x - a| \). While \( f \) is not differentiable at \( x = a \), we can approximate it by smooth functions and apply Itô's formula. Let \( f_\epsilon(x) \) be a smooth approximation to \( |x - a| \) such that:
- \( f_\epsilon(x) = |x - a| \) for \( |x - a| \geq \epsilon \),
- \( f_\epsilon(x) \) is smooth everywhere, with \( f_\epsilon'(x) = \text{sgn}(x - a) \) for \( |x - a| \geq \epsilon \), and \( f_\epsilon''(x) = 0 \) for \( |x - a| \geq \epsilon \).
Applying Itô's formula to \( f_\epsilon(B_t) \):
\[ f_\epsilon(B_t) = f_\epsilon(B_0) + \int_0^t f_\epsilon'(B_s) \, dB_s + \frac{1}{2} \int_0^t f_\epsilon''(B_s) \, ds. \]
Taking \( \epsilon \to 0 \), the first two terms converge to \( |B_t - a| \) and \( \int_0^t \text{sgn}(B_s - a) \, dB_s \), respectively. The third term converges to \( L_t^a(B) \) by the definition of local time, yielding Tanaka's formula.
Local Time and the Itô-Tanaka Formula:
The Itô-Tanaka formula generalizes Itô's formula to convex functions. For a convex function \( f \), define its left derivative \( f'_- \) and second derivative measure \( \mu \) (a non-negative measure) such that:
\[ f(x) = f(0) + f'_-(0) x + \int_{-\infty}^{\infty} (x - a)^+ \, \mu(da). \]
The Itô-Tanaka formula states:
\[ f(B_t) = f(B_0) + \int_0^t f'_-(B_s) \, dB_s + \frac{1}{2} \int_{-\infty}^{\infty} L_t^a(B) \, \mu(da). \]
For \( f(x) = |x - a| \), \( \mu \) is twice the Dirac measure at \( a \), recovering Tanaka's formula.
Example (Application to Reflected Brownian Motion):
Reflected Brownian motion \( R_t \) is defined as \( R_t = |B_t| \). Using Tanaka's formula with \( a = 0 \):
\[ R_t = |B_t| = |B_0| + \int_0^t \text{sgn}(B_s) \, dB_s + L_t^0(B). \]
The process \( M_t = \int_0^t \text{sgn}(B_s) \, dB_s \) is a martingale (since \( \text{sgn}(B_s) \) is bounded), and \( L_t^0(B) \) is the local time at 0. Thus, reflected Brownian motion can be decomposed into a martingale and an increasing process (the local time).
Important Notes and Common Pitfalls:
- Non-Differentiability: Local time arises precisely because Brownian motion is not differentiable, and standard calculus tools (e.g., Itô's formula) must be extended to handle non-smooth functions like \( |x - a| \).
- Joint Continuity: While \( L_t^a(B) \) is continuous in \( t \) for fixed \( a \), and continuous in \( a \) for fixed \( t \), the joint continuity in \( (t, a) \) is a deeper result. This property is essential for many applications, such as the construction of stochastic integrals with respect to local time.
- Dependence on \( a \): The local time \( L_t^a(B) \) is not differentiable in \( a \), but it is Hölder continuous of order \( \alpha \) for any \( \alpha < 1/2 \). This reflects the fractal nature of Brownian motion.
- Martingale Properties: The process \( L_t^a(B) \) is not a martingale (it is non-decreasing and non-constant). However, for fixed \( a \), the process \( (L_t^a(B))^2 - 2 |B_t - a| \, L_t^a(B) \) is a martingale: by Tanaka's formula and the product rule, its differential is \( -2 L_t^a(B) \, \text{sgn}(B_t - a) \, dB_t \), because \( dL_t^a(B) \) is carried by \( \{B_t = a\} \), where \( |B_t - a| = 0 \).
- Connection to PDEs: Local time appears in the probabilistic representation of solutions to partial differential equations (PDEs) with boundary conditions. For example, the local time at 0 of a Brownian motion is related to the solution of the heat equation with Neumann boundary conditions.
- Common Misconception: Local time is not the same as the number of times the Brownian motion visits \( a \). Brownian motion visits every level \( a \) infinitely often in any time interval \( [0, t] \), but the local time measures the "density" of these visits.
Local Time and the Skorokhod Problem:
The Skorokhod problem is a method to construct reflected Brownian motion. Given a Brownian motion \( B_t \), the reflected Brownian motion \( R_t \) is the solution to:
\[ R_t = B_t + \ell_t, \quad \ell_t = \sup_{s \leq t} (-B_s)^+, \]
where \( \ell_t \) is the minimal non-decreasing process such that \( R_t \geq 0 \). By Lévy's theorem, \( \ell_t \) has the same law as the local time \( L_t^0(B) \) (the precise pathwise identification depends on the normalization convention for local time; one-sided conventions introduce a factor of 2), showing the connection between local time and reflection.
Practical Applications:
- Financial Mathematics:
- Barrier Options: Local time is used to price barrier options, where the payoff depends on whether the underlying asset's price hits a certain barrier. The local time at the barrier level appears in the pricing formula.
- Stochastic Volatility Models: In models where volatility is driven by a reflected Brownian motion (e.g., the Heston model with reflection), local time is used to ensure the volatility process remains non-negative.
- Stochastic Control: Local time appears in the dynamic programming equations for optimal control problems with state constraints. For example, in singular control problems, the optimal control may involve reflecting the state process at a boundary, leading to local time terms in the value function.
- Queueing Theory: In heavy-traffic limits of queueing systems, the workload process converges to a reflected Brownian motion. Local time describes the "idle time" of the system when the workload is zero.
- Mathematical Physics: Local time is used in the study of polymer models and interface growth, where it describes the interaction of a random path with a boundary or interface.
- Probability Theory: Local time is a fundamental object in the study of Markov processes and diffusion processes. It is used to analyze the behavior of processes at boundaries, construct excursion theory, and study the fine properties of sample paths.
Example (Local Time and the Running Maximum):
Let \( M_t = \sup_{s \leq t} B_s \) be the running maximum of Brownian motion. The joint distribution of \( (B_t, M_t) \) can be described using local time. Specifically, the process \( M_t - B_t \) is a reflected Brownian motion, and \( M_t \) itself plays the role of its local time at 0, since \( M_t \) increases only when \( B_t = M_t \) (Lévy's theorem). This is useful for deriving the joint density of \( (B_t, M_t) \):
\[ \mathbb{P}(B_t \in dx, M_t \in dy) = \frac{2(2y - x)}{\sqrt{2\pi t^3}} e^{-(2y - x)^2 / (2t)} \, dx \, dy, \quad y \geq 0, \ x \leq y. \]
Quant Interview Questions:
- Derive Tanaka's formula for \( f(x) = |x - a| \).
Hint: Approximate \( |x - a| \) by smooth functions and apply Itô's formula, then take limits.
- Explain the difference between local time and the number of times Brownian motion hits a level \( a \).
Local time measures the "sojourn time" near \( a \), while the number of hits is infinite in any time interval. Local time is a continuous process, whereas the number of hits is not well-defined as a process.
- How is local time used in the pricing of barrier options?
For a barrier option with payoff depending on whether the underlying asset hits a barrier, the local time at the barrier appears in the risk-neutral pricing formula. For example, the price of a down-and-out call option can be expressed using the local time of the underlying asset's log-price at the barrier level.
- Show that the local time \( L_t^a(B) \) is not a martingale.
Use the fact that \( L_t^a(B) \) is non-decreasing with \( \mathbb{E}[L_t^a(B)] \) strictly increasing in \( t \); a martingale has constant expectation, so \( L_t^a(B) \) cannot be one (equivalently, a continuous martingale of finite variation is constant).
- What is the distribution of \( L_t^0(B) \) for a standard Brownian motion \( B_t \)?
The local time \( L_t^0(B) \) has the same distribution as \( \sup_{s \leq t} B_s \), which follows the half-normal distribution with density:
\[ \mathbb{P}(L_t^0(B) \in dx) = \sqrt{\frac{2}{\pi t}} e^{-x^2 / (2t)} \, dx, \quad x \geq 0. \]
- Explain the occupation time formula and its significance.
The occupation time formula relates the integral of a function of Brownian motion over time to an integral of the function against the local time. It is significant because it allows us to "change variables" from time to space, which is useful in deriving properties of local time and solving PDEs.
- How does local time appear in the Skorokhod problem?
In the Skorokhod problem, the regulator \( \ell_t \) of the reflected process \( R_t = B_t + \ell_t \) increases only when \( R_t = 0 \) and, by Lévy's theorem, has the same law as the local time \( L_t^0(B) \) (up to the normalization convention). This shows that local time naturally arises in reflection problems.
Topic 47: Tanaka's Formula and Its Applications
Tanaka’s Formula: A generalization of Itô's formula for functions that are not twice continuously differentiable. It provides a way to express the local time of a semimartingale in terms of a stochastic integral, particularly useful for studying the behavior of Brownian motion at points where the function fails to be smooth (e.g., the absolute value function).
Local Time: For a continuous semimartingale \( X \), the local time \( L_t^a(X) \) at level \( a \) is a stochastic process that measures the amount of time \( X \) spends near \( a \) up to time \( t \). Formally, it is defined as the limit:
\[ L_t^a(X) = \lim_{\epsilon \to 0} \frac{1}{2\epsilon} \int_0^t \mathbf{1}_{\{a - \epsilon < X_s < a + \epsilon\}} \, d\langle X \rangle_s, \] where \( \langle X \rangle_s \) is the quadratic variation of \( X \).
Tanaka’s Formula for Brownian Motion: Let \( W_t \) be a standard Brownian motion and \( a \in \mathbb{R} \). Then, for the function \( f(x) = |x - a| \), Tanaka’s formula states:
\[ |W_t - a| = |W_0 - a| + \int_0^t \text{sgn}(W_s - a) \, dW_s + L_t^a(W), \] where \( \text{sgn}(x) \) is the sign function defined as: \[ \text{sgn}(x) = \begin{cases} 1 & \text{if } x > 0, \\ -1 & \text{if } x < 0, \\ 0 & \text{if } x = 0, \end{cases} \] and \( L_t^a(W) \) is the local time of \( W \) at level \( a \).
General Tanaka’s Formula for Semimartingales: Let \( X \) be a continuous semimartingale and \( a \in \mathbb{R} \). Then:
\[ |X_t - a| = |X_0 - a| + \int_0^t \text{sgn}(X_s - a) \, dX_s + L_t^a(X). \]
Derivation of Tanaka’s Formula for \( f(x) = |x| \):
Consider the function \( f(x) = |x| \). This function is not differentiable at \( x = 0 \), so Itô's formula cannot be directly applied. However, we can approximate \( f \) by smooth functions.
Define a sequence of smooth functions \( f_n(x) \) such that \( f_n(x) \to |x| \) as \( n \to \infty \). For example, let:
\[ f_n(x) = \sqrt{x^2 + \frac{1}{n}}. \]
Apply Itô's formula to \( f_n(W_t) \):
\[ f_n(W_t) = f_n(W_0) + \int_0^t f_n'(W_s) \, dW_s + \frac{1}{2} \int_0^t f_n''(W_s) \, ds. \] Here, \[ f_n'(x) = \frac{x}{\sqrt{x^2 + \frac{1}{n}}}, \quad f_n''(x) = \frac{1/n}{(x^2 + \frac{1}{n})^{3/2}}. \]
Take the limit as \( n \to \infty \). Note that:
\[ f_n'(x) \to \text{sgn}(x), \quad f_n''(x) \, dx \to 2 \delta_0(x) \, dx, \] where \( \delta_0 \) is the Dirac delta function at 0. The term \( \frac{1}{2} \int_0^t f_n''(W_s) \, ds \) converges to the local time \( L_t^0(W) \). Thus, we obtain Tanaka’s formula: \[ |W_t| = |W_0| + \int_0^t \text{sgn}(W_s) \, dW_s + L_t^0(W). \]
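Tanaka's formula can be illustrated on a simulated path (a sketch; the step count and window width \( \epsilon \) are arbitrary discretization choices, and the two estimates agree only up to discretization error):

```python
import numpy as np

# Estimate L_1^0 two ways: as the Tanaka residual |W_1| - Ito sum of
# sgn(W) dW, and via the (1/2eps)-window definition of local time.
rng = np.random.default_rng(5)
n, t = 400_000, 1.0
dt = t / n
dW = np.sqrt(dt) * rng.standard_normal(n)
W = np.concatenate([[0.0], np.cumsum(dW)])

# np.sign(0.0) = 0.0, matching the sgn(0) = 0 convention used above.
stoch_int = np.sum(np.sign(W[:-1]) * dW)   # discrete Ito sum of sgn(W) dW
L_tanaka = np.abs(W[-1]) - stoch_int       # Tanaka: |W_t| = Ito sum + L_t^0

eps = 0.01
L_window = np.sum(np.abs(W[:-1]) < eps) * dt / (2 * eps)
# L_tanaka and L_window both approximate the path's local time at 0.
```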
Application: Reflecting Brownian Motion
Tanaka’s formula is used to construct reflecting Brownian motion. Let \( W_t \) be a standard Brownian motion and define:
\[ Y_t = |W_t|. \] By Tanaka’s formula: \[ Y_t = \int_0^t \text{sgn}(W_s) \, dW_s + L_t^0(W). \] The process \( Y_t \) is called reflecting Brownian motion, and \( L_t^0(W) \) is its local time at 0, which pushes \( Y_t \) away from 0 whenever it hits 0.
Application: Pricing of Barrier Options
Tanaka’s formula is useful in the pricing of barrier options, particularly those with payoffs involving the absolute value of the underlying asset (e.g., double-barrier options). The local time term captures the effect of the asset price hitting the barrier, which is critical for accurate pricing.
For example, consider a down-and-out call option with barrier \( B \) and strike \( K \). The payoff at maturity \( T \) is \( (S_T - K)^+ \mathbf{1}_{\{\min_{0 \leq t \leq T} S_t > B\}} \). Tanaka’s formula helps in expressing the indicator function in terms of local time, facilitating the derivation of pricing formulas.
Important Notes:
The sign function \( \text{sgn}(x) \) is typically defined as 0 at \( x = 0 \) in Tanaka’s formula, but other conventions (e.g., \( \text{sgn}(0) = 1 \)) may appear in the literature. Consistency in definition is crucial for correct application.
Tanaka’s formula is a powerful tool for handling functions with singularities (e.g., \( |x| \), \( \max(x, 0) \)), where classical Itô calculus fails due to lack of smoothness.
The local time \( L_t^a(X) \) is a continuous, non-decreasing process that increases only when \( X_t = a \). It is not absolutely continuous with respect to Lebesgue measure.
In quantitative finance, Tanaka’s formula is often used in conjunction with the Skorokhod problem to model reflected processes (e.g., in interest rate models or stochastic control problems).
Common Pitfalls:
Misapplying Itô’s Formula: Attempting to apply Itô’s formula directly to \( |W_t| \) without accounting for the non-differentiability at 0 leads to incorrect results. Tanaka’s formula is the correct tool here.
Ignoring Local Time: Omitting the local time term \( L_t^a(X) \) in applications (e.g., barrier options) can lead to significant errors in modeling the behavior of the process at the barrier.
Confusing Local Time Definitions: Local time can be defined in different ways (e.g., right-continuous vs. left-continuous versions). Ensure consistency with the definition used in the problem context.
Overlooking Continuity Assumptions: Tanaka’s formula applies to continuous semimartingales. For processes with jumps, a more general version (e.g., involving jump terms) is required.
Quant Interview Question: Local Time and Tanaka’s Formula
Question: Let \( W_t \) be a standard Brownian motion. Use Tanaka’s formula to compute \( \mathbb{E}[L_t^0(W)] \), where \( L_t^0(W) \) is the local time of \( W \) at 0 up to time \( t \).
Solution:
Apply Tanaka’s formula to \( f(x) = |x| \):
\[ |W_t| = |W_0| + \int_0^t \text{sgn}(W_s) \, dW_s + L_t^0(W). \] Since \( W_0 = 0 \), this simplifies to: \[ |W_t| = \int_0^t \text{sgn}(W_s) \, dW_s + L_t^0(W). \]
Take expectations on both sides. The stochastic integral \( \int_0^t \text{sgn}(W_s) \, dW_s \) is a martingale with mean 0 (since \( \text{sgn}(W_s) \) is bounded), so:
\[ \mathbb{E}[|W_t|] = \mathbb{E}[L_t^0(W)]. \]
Compute \( \mathbb{E}[|W_t|] \). Since \( W_t \sim \mathcal{N}(0, t) \), we have:
\[ \mathbb{E}[|W_t|] = \int_{-\infty}^{\infty} |x| \frac{1}{\sqrt{2 \pi t}} e^{-x^2/(2t)} \, dx = 2 \int_0^{\infty} x \frac{1}{\sqrt{2 \pi t}} e^{-x^2/(2t)} \, dx. \] Let \( u = x^2/(2t) \), so that \( x \, dx = t \, du \), and the integral becomes: \[ \mathbb{E}[|W_t|] = \frac{2t}{\sqrt{2 \pi t}} \int_0^{\infty} e^{-u} \, du = \sqrt{\frac{2t}{\pi}}. \]
Thus,
\[ \mathbb{E}[L_t^0(W)] = \sqrt{\frac{2t}{\pi}}. \]
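The final expectation can be sanity-checked by Monte Carlo, using the identity \( \mathbb{E}[L_t^0(W)] = \mathbb{E}[|W_t|] \) derived above (a sketch; the sample size and seed are arbitrary choices):

```python
import numpy as np

# Estimate E[|W_t|] by simulation and compare with sqrt(2t/pi).
rng = np.random.default_rng(1)
t, n_paths = 1.0, 1_000_000
W_t = np.sqrt(t) * rng.standard_normal(n_paths)
mc = np.abs(W_t).mean()            # Monte Carlo estimate of E[|W_t|]
exact = np.sqrt(2 * t / np.pi)     # closed form, about 0.798 for t = 1
```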
Topic 48: Change of Numéraire Technique in Derivatives Pricing
Numéraire: In financial mathematics, a numéraire is a reference asset used to express the value of other assets. Common numéraires include the money-market account (risk-free asset), a zero-coupon bond, or a stock. The choice of numéraire affects the probability measure under which asset prices are martingales.
Change of Numéraire Technique: A method used in derivatives pricing to simplify the valuation of contingent claims by changing the reference asset (numéraire) and the corresponding risk-neutral measure. This technique is particularly useful for pricing options with payoffs dependent on multiple assets or when the standard risk-neutral measure is not convenient.
Risk-Neutral Measure (ℚ): A probability measure under which the discounted prices of all traded assets are martingales. The choice of numéraire determines the specific risk-neutral measure.
Martingale: A stochastic process \( M_t \) is a martingale with respect to a filtration \( \mathcal{F}_t \) and a probability measure \( \mathbb{P} \) if: \[ \mathbb{E}^\mathbb{P}[M_t | \mathcal{F}_s] = M_s \quad \text{for all } s \leq t. \] Under the risk-neutral measure, the discounted asset price process is a martingale.
Radon-Nikodym Derivative: The change of measure from one risk-neutral measure \( \mathbb{Q}^0 \) (with numéraire \( N_t^0 \)) to another \( \mathbb{Q}^1 \) (with numéraire \( N_t^1 \)) is given by the Radon-Nikodym derivative: \[ \frac{d\mathbb{Q}^1}{d\mathbb{Q}^0} \bigg|_t = \frac{N_t^1}{N_t^0} \cdot \frac{N_0^0}{N_0^1}. \] This is a \( \mathbb{Q}^0 \)-martingale with initial value 1, and it ensures that the ratio of any asset price \( S_t \) to the numéraire \( N_t^1 \) is a martingale under \( \mathbb{Q}^1 \).
Change of Numéraire Formula: Let \( V_t \) be the price of a derivative at time \( t \) with payoff \( V_T \) at maturity \( T \). Under the risk-neutral measure \( \mathbb{Q}^0 \) with numéraire \( N_t^0 \), the price is: \[ V_t = N_t^0 \cdot \mathbb{E}^{\mathbb{Q}^0} \left[ \frac{V_T}{N_T^0} \bigg| \mathcal{F}_t \right]. \] If we switch to a new numéraire \( N_t^1 \) and the corresponding measure \( \mathbb{Q}^1 \), the price can be rewritten as: \[ V_t = N_t^1 \cdot \mathbb{E}^{\mathbb{Q}^1} \left[ \frac{V_T}{N_T^1} \bigg| \mathcal{F}_t \right]. \]
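Numéraire invariance can be verified numerically in the Black-Scholes model (a sketch with hypothetical parameters): price the same call under the money-market numéraire, \( e^{-rT} \mathbb{E}^{\mathbb{Q}}[(S_T - K)^+] \), and under the stock numéraire, \( S_0 \, \mathbb{E}^{\mathbb{Q}^S}[(1 - K/S_T)^+] \), where the Girsanov shift changes the log-drift from \( r - \sigma^2/2 \) to \( r + \sigma^2/2 \):

```python
import numpy as np

# Same call priced under two numeraires; both estimates converge to the
# same Black-Scholes price.
rng = np.random.default_rng(7)
S0, K, r, sigma, T, n = 100.0, 110.0, 0.03, 0.25, 1.0, 2_000_000

Z = rng.standard_normal(n)

# Terminal stock price under Q (money-market) and Q^S (stock measure).
ST_q = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)
ST_qs = S0 * np.exp((r + 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)

price_q = np.exp(-r * T) * np.maximum(ST_q - K, 0.0).mean()   # B numeraire
price_qs = S0 * np.maximum(1.0 - K / ST_qs, 0.0).mean()       # S numeraire
```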
Derivation: Change of Numéraire
We start with the pricing formula under the original numéraire \( N_t^0 \): \[ V_t = N_t^0 \cdot \mathbb{E}^{\mathbb{Q}^0} \left[ \frac{V_T}{N_T^0} \bigg| \mathcal{F}_t \right]. \] By the abstract Bayes rule, for any \( \mathcal{F}_T \)-measurable payoff \( X \), \[ \mathbb{E}^{\mathbb{Q}^1}[X | \mathcal{F}_t] = \frac{\mathbb{E}^{\mathbb{Q}^0}[X Z_T | \mathcal{F}_t]}{Z_t}, \quad \text{where } Z_t = \frac{d\mathbb{Q}^1}{d\mathbb{Q}^0} \bigg|_t = \frac{N_t^1}{N_t^0} \cdot \frac{N_0^0}{N_0^1}. \] Applying this with \( X = V_T / N_T^1 \): \[ \mathbb{E}^{\mathbb{Q}^1} \left[ \frac{V_T}{N_T^1} \bigg| \mathcal{F}_t \right] = \frac{N_t^0}{N_t^1} \cdot \frac{N_0^1}{N_0^0} \cdot \mathbb{E}^{\mathbb{Q}^0} \left[ \frac{V_T}{N_T^1} \cdot \frac{N_T^1}{N_T^0} \cdot \frac{N_0^0}{N_0^1} \bigg| \mathcal{F}_t \right] = \frac{N_t^0}{N_t^1} \cdot \mathbb{E}^{\mathbb{Q}^0} \left[ \frac{V_T}{N_T^0} \bigg| \mathcal{F}_t \right] = \frac{V_t}{N_t^1}. \] Multiplying through by \( N_t^1 \) gives: \[ V_t = N_t^1 \cdot \mathbb{E}^{\mathbb{Q}^1} \left[ \frac{V_T}{N_T^1} \bigg| \mathcal{F}_t \right], \] so the price is numéraire-invariant and \( V_t / N_t^1 \) is a \( \mathbb{Q}^1 \)-martingale. This completes the derivation.
Example: Pricing a Foreign Exchange Option
Consider a foreign exchange (FX) option where the domestic currency is USD and the foreign currency is EUR. Let \( X_t \) be the exchange rate (units of domestic currency per unit of foreign currency). We want to price a call option on EUR with strike \( K \) in USD.
Step 1: Choose Numéraires
- Original numéraire: Domestic money-market account \( B_t^d = e^{r_d t} \).
- New numéraire: Foreign money-market account converted into domestic currency, \( X_t B_t^f \) with \( B_t^f = e^{r_f t} \) (the numéraire must be an asset tradable in the domestic market).
Step 2: Express the Option Price
Under the domestic risk-neutral measure \( \mathbb{Q}^d \), the option price is: \[ V_t = B_t^d \cdot \mathbb{E}^{\mathbb{Q}^d} \left[ \frac{(X_T - K)^+}{B_T^d} \bigg| \mathcal{F}_t \right]. \] Changing numéraire to \( N_t^1 = X_t B_t^f \) with corresponding measure \( \mathbb{Q}^f \): \[ V_t = X_t B_t^f \cdot \mathbb{E}^{\mathbb{Q}^f} \left[ \frac{(X_T - K)^+}{X_T B_T^f} \bigg| \mathcal{F}_t \right] = X_t e^{-r_f (T-t)} \cdot \mathbb{E}^{\mathbb{Q}^f} \left[ \frac{(X_T - K)^+}{X_T} \bigg| \mathcal{F}_t \right], \] since \( B_t^f / B_T^f = e^{-r_f (T-t)} \) is deterministic. Simplifying the payoff: \[ \frac{(X_T - K)^+}{X_T} = \left(1 - \frac{K}{X_T}\right)^+ = K \left(\frac{1}{K} - \frac{1}{X_T}\right)^+, \] which is \( K \) times the payoff of a put option on the inverse exchange rate \( \frac{1}{X_T} \) with strike \( \frac{1}{K} \). Evaluating this expectation leads to the Garman-Kohlhagen formula for FX options.
Step 3: Final Pricing Formula
The price of the call option on EUR is: \[ V_t = X_t e^{-r_f (T-t)} N(d_1) - K e^{-r_d (T-t)} N(d_2), \] where: \[ d_1 = \frac{\ln(X_t / K) + (r_d - r_f + \sigma^2 / 2)(T-t)}{\sigma \sqrt{T-t}}, \quad d_2 = d_1 - \sigma \sqrt{T-t}. \]
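The Garman-Kohlhagen formula above translates directly to code (a sketch; the quoted rates and volatility are hypothetical):

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gk_call(X, K, r_d, r_f, sigma, T):
    # Garman-Kohlhagen call on the foreign currency, as in the formula
    # above (X = spot rate, r_d / r_f = domestic / foreign rates).
    d1 = (math.log(X / K) + (r_d - r_f + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return X * math.exp(-r_f * T) * norm_cdf(d1) - K * math.exp(-r_d * T) * norm_cdf(d2)

# Hypothetical market data: at-the-money-spot EURUSD-style call.
price = gk_call(X=1.10, K=1.10, r_d=0.05, r_f=0.02, sigma=0.10, T=1.0)
```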
Forward Measure: A special case of the change of numéraire technique where the numéraire is a zero-coupon bond \( P(t, T) \) maturing at time \( T \). The corresponding risk-neutral measure is called the forward measure \( \mathbb{Q}^T \). The price of a derivative \( V_t \) is: \[ V_t = P(t, T) \cdot \mathbb{E}^{\mathbb{Q}^T} \left[ V_T \bigg| \mathcal{F}_t \right]. \] This is particularly useful for pricing interest rate derivatives.
Example: Pricing a Caplet Using Forward Measure
A caplet is an option on an interest rate (e.g., LIBOR). Let \( L(T, T + \delta) \) be the LIBOR rate at time \( T \) for the period \( [T, T + \delta] \). The payoff of a caplet with strike \( K \) is: \[ V_T = \delta \cdot (L(T, T + \delta) - K)^+. \] Using the forward measure \( \mathbb{Q}^{T + \delta} \) with numéraire \( P(t, T + \delta) \), the price at time \( t \) is: \[ V_t = P(t, T + \delta) \cdot \mathbb{E}^{\mathbb{Q}^{T + \delta}} \left[ \delta \cdot (L(T, T + \delta) - K)^+ \bigg| \mathcal{F}_t \right]. \] Under \( \mathbb{Q}^{T + \delta} \), the forward LIBOR rate \( F(t; T, T + \delta) \) is a martingale. Assuming log-normal dynamics: \[ dF(t; T, T + \delta) = \sigma F(t; T, T + \delta) dW_t^{\mathbb{Q}^{T + \delta}}, \] the caplet price can be derived using the Black-76 formula: \[ V_t = \delta \cdot P(t, T + \delta) \left[ F(t; T, T + \delta) N(d_1) - K N(d_2) \right], \] where: \[ d_1 = \frac{\ln(F(t; T, T + \delta) / K) + \sigma^2 (T - t) / 2}{\sigma \sqrt{T - t}}, \quad d_2 = d_1 - \sigma \sqrt{T - t}. \]
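The Black-76 caplet formula above can be sketched as follows (parameter values are hypothetical; `P` denotes \( P(t, T + \delta) \) and `T` the time to the fixing date):

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black76_caplet(P, F, K, sigma, T, delta):
    # Black-76 caplet value per unit notional, as in the formula above:
    # P = P(t, T + delta), F = forward LIBOR, T = time to the fixing date.
    d1 = (math.log(F / K) + 0.5 * sigma**2 * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return delta * P * (F * norm_cdf(d1) - K * norm_cdf(d2))

# Hypothetical inputs: quarterly at-the-money caplet fixing in one year.
price = black76_caplet(P=0.95, F=0.03, K=0.03, sigma=0.20, T=1.0, delta=0.25)
```

At the money (\( F = K \)), \( d_1 = \sigma\sqrt{T}/2 \), so the value reduces to \( \delta P F (2N(d_1) - 1) \), a convenient sanity check.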
Practical Applications
- Foreign Exchange Options: As shown in the example, the change of numéraire simplifies the pricing of FX options by converting them into options on the inverse exchange rate.
- Interest Rate Derivatives: The forward measure is extensively used for pricing caps, floors, swaptions, and other interest rate derivatives.
- Basket Options: For options on a basket of assets, choosing an appropriate numéraire (e.g., one of the assets in the basket) can simplify the correlation structure.
- Commodity Derivatives: The change of numéraire can be used to switch between spot and forward measures, which is useful for pricing options on commodities with convenience yields or storage costs.
- Credit Derivatives: In reduced-form credit models, the change of numéraire can be used to switch between the risk-neutral measure and the survival measure, simplifying the pricing of credit default swaps (CDS) and other credit derivatives.
Common Pitfalls and Important Notes
- Martingale Property: Always ensure that the ratio of the asset price to the numéraire is a martingale under the new measure. Failure to do so can lead to incorrect pricing formulas.
- Numéraire Invariance: The choice of numéraire does not affect the price of a derivative, but it can simplify the calculation. Always verify that the final price is consistent across different numéraires.
- Volatility and Correlation: When changing numéraires, the dynamics of asset prices (e.g., volatility, correlation) may change under the new measure. For example, the volatility of the exchange rate in the FX example depends on the measure.
- Dividends and Costs: If the numéraire asset pays dividends or has carrying costs (e.g., storage costs for commodities), these must be accounted for in the change of measure.
- Forward Measure Limitations: The forward measure is only applicable for pricing derivatives with payoffs at the maturity of the numéraire bond. For path-dependent options, other techniques (e.g., change of numéraire to the annuity) may be required.
- Girsanov's Theorem: The change of numéraire is closely related to Girsanov's theorem, which describes how the drift of a stochastic process changes under a change of measure. Understanding this theorem is crucial for deriving the dynamics of asset prices under the new measure.
Girsanov's Theorem: Let \( W_t \) be a Brownian motion under the measure \( \mathbb{Q}^0 \). If we change to a new measure \( \mathbb{Q}^1 \) with Radon-Nikodym derivative: \[ \frac{d\mathbb{Q}^1}{d\mathbb{Q}^0} \bigg|_t = \exp \left( -\int_0^t \gamma_s dW_s - \frac{1}{2} \int_0^t \gamma_s^2 ds \right), \] then the process \( \tilde{W}_t = W_t + \int_0^t \gamma_s ds \) is a Brownian motion under \( \mathbb{Q}^1 \).
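Girsanov's theorem can be illustrated by Monte Carlo with a constant \( \gamma \) (a sketch; \( \gamma \), \( T \), and the sample size are arbitrary choices): under \( \mathbb{Q}^1 \), \( W_t + \gamma t \) is a Brownian motion, so \( \mathbb{E}^{\mathbb{Q}^1}[W_T] = -\gamma T \), which can be computed under \( \mathbb{Q}^0 \) by weighting with the Radon-Nikodym density:

```python
import numpy as np

# Weight Q0-samples of W_T by the Radon-Nikodym density to recover
# Q1-expectations.
rng = np.random.default_rng(3)
gamma, T, n = 0.5, 1.0, 1_000_000
W_T = np.sqrt(T) * rng.standard_normal(n)
Z = np.exp(-gamma * W_T - 0.5 * gamma**2 * T)   # dQ1/dQ0 on each path
mean_weight = Z.mean()                          # ~ 1: Z is a Q0-martingale
mean_W_under_Q1 = (Z * W_T).mean()              # ~ -gamma * T
```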
Key Takeaways
- The change of numéraire technique is a powerful tool for simplifying the pricing of complex derivatives by switching to a more convenient risk-neutral measure.
- The choice of numéraire should be guided by the structure of the derivative's payoff and the underlying assets.
- Always verify the martingale property of the ratio of the asset price to the numéraire under the new measure.
- Practical applications include FX options, interest rate derivatives, basket options, commodity derivatives, and credit derivatives.
Topic 49: Forward LIBOR Model (BGM) and Its SDEs
Forward LIBOR Model (BGM/LIBOR Market Model): The Forward LIBOR Model, also known as the Brace-Gatarek-Musiela (BGM) model or the LIBOR Market Model (LMM) (a closely related swap market model is due to Jamshidian), is a market model used for pricing interest rate derivatives, particularly caps, floors, and swaptions. It models the evolution of forward LIBOR rates directly, with each rate lognormal under its own forward measure, so that each forward rate stays positive and caplets are priced by a Black-Scholes-like (Black-76) formula.
Forward LIBOR Rate: The forward LIBOR rate \( L(t, T_i, T_{i+1}) \) at time \( t \) for the period \([T_i, T_{i+1}]\) is the fixed rate agreed at time \( t \) that makes the value of a forward rate agreement (FRA) zero. It is defined as: \[ L(t, T_i, T_{i+1}) = \frac{1}{\delta_i} \left( \frac{P(t, T_i)}{P(t, T_{i+1})} - 1 \right), \] where \( \delta_i = T_{i+1} - T_i \) is the day count fraction, and \( P(t, T) \) is the price at time \( t \) of a zero-coupon bond maturing at time \( T \).
Forward Measure: The forward measure \( \mathbb{Q}^{T_{i+1}} \) is the equivalent martingale measure associated with the numéraire \( P(t, T_{i+1}) \). Under this measure, the discounted prices of tradable assets are martingales.
Key Formulas and SDEs
Dynamics of Forward LIBOR Rates under Their Forward Measures: Under the forward measure \( \mathbb{Q}^{T_{i+1}} \), the forward LIBOR rate \( L(t, T_i, T_{i+1}) \) follows a lognormal process: \[ dL(t, T_i, T_{i+1}) = L(t, T_i, T_{i+1}) \sigma_i(t) \cdot dW^{T_{i+1}}(t), \] where:
- \( \sigma_i(t) \) is the volatility vector of the forward LIBOR rate,
- \( W^{T_{i+1}}(t) \) is a Brownian motion under \( \mathbb{Q}^{T_{i+1}} \),
- \( \cdot \) denotes the dot product.
Change of Numéraire and Drift Adjustment: When changing from the forward measure \( \mathbb{Q}^{T_{i+1}} \) to another forward measure \( \mathbb{Q}^{T_j} \), the Brownian motion undergoes a drift adjustment. The relationship between Brownian motions under different forward measures is given by: \[ dW^{T_j}(t) = dW^{T_{i+1}}(t) + \left( \sum_{k=i+1}^{j-1} \frac{\delta_k L(t, T_k, T_{k+1})}{1 + \delta_k L(t, T_k, T_{k+1})} \sigma_k(t) \right) dt, \quad \text{for } j > i+1. \] Consequently, the dynamics of \( L(t, T_i, T_{i+1}) \) under \( \mathbb{Q}^{T_j} \) include a (negative) drift term: \[ dL(t, T_i, T_{i+1}) = L(t, T_i, T_{i+1}) \sigma_i(t) \cdot \left( dW^{T_j}(t) - \sum_{k=i+1}^{j-1} \frac{\delta_k L(t, T_k, T_{k+1})}{1 + \delta_k L(t, T_k, T_{k+1})} \sigma_k(t) \, dt \right), \] so earlier-maturity rates drift downward under later-maturity forward measures (and upward when moving to an earlier-maturity numéraire).
Caplet Pricing Formula: A caplet paying \( \delta_i (L(T_i, T_i, T_{i+1}) - K)^+ \) at time \( T_{i+1} \) can be priced at time \( t \) using the Black formula: \[ \text{Caplet}(t) = \delta_i P(t, T_{i+1}) \left[ L(t, T_i, T_{i+1}) N(d_1) - K N(d_2) \right], \] where: \[ d_1 = \frac{\ln \left( \frac{L(t, T_i, T_{i+1})}{K} \right) + \frac{1}{2} v(t, T_i)^2}{v(t, T_i)}, \quad d_2 = d_1 - v(t, T_i), \] and \( v(t, T_i)^2 = \int_t^{T_i} \|\sigma_i(s)\|^2 ds \) is the integrated variance.
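The drift-adjustment sum appearing in the change-of-measure formulas above can be sketched as a small helper (one driving factor; the function name, flat forwards, and flat volatilities are illustrative assumptions):

```python
import numpy as np

# Sum over k = i+1 .. j-1 of delta_k L_k sigma_k / (1 + delta_k L_k),
# the quantity appearing in the BGM drift adjustment.

def lmm_drift(i, j, L, delta, sigma):
    """Drift-adjustment sum for L_i when pricing under Q^{T_j}."""
    return sum(delta[k] * L[k] * sigma[k] / (1.0 + delta[k] * L[k])
               for k in range(i + 1, j))

# Example: quarterly tenors, flat 3% forwards, flat 20% volatilities.
delta = np.full(8, 0.25)
L = np.full(8, 0.03)
sigma = np.full(8, 0.20)
adj = lmm_drift(i=2, j=6, L=L, delta=delta, sigma=sigma)   # three terms
```

For \( j = i+1 \) the sum is empty and the adjustment vanishes, recovering the driftless dynamics under the rate's own forward measure.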
Derivations
Derivation of Forward LIBOR SDE under Its Forward Measure:
- Define the Forward LIBOR Rate: The forward LIBOR rate is given by: \[ L(t, T_i, T_{i+1}) = \frac{1}{\delta_i} \left( \frac{P(t, T_i)}{P(t, T_{i+1})} - 1 \right). \]
- Express as a Martingale: Under the forward measure \( \mathbb{Q}^{T_{i+1}} \), the ratio \( \frac{P(t, T_i)}{P(t, T_{i+1})} \) is a martingale. Thus, we can write: \[ \frac{P(t, T_i)}{P(t, T_{i+1})} = \mathbb{E}^{T_{i+1}} \left[ \frac{P(T_i, T_i)}{P(T_i, T_{i+1})} \bigg| \mathcal{F}_t \right] = 1 + \delta_i L(t, T_i, T_{i+1}). \]
- Apply Itô's Lemma: Let \( f(x) = \frac{1}{\delta_i} (x - 1) \). Then: \[ L(t, T_i, T_{i+1}) = f \left( \frac{P(t, T_i)}{P(t, T_{i+1})} \right). \] Applying Itô's Lemma to \( f \), and noting that \( \frac{P(t, T_i)}{P(t, T_{i+1})} \) is a martingale (hence driftless), we obtain: \[ dL(t, T_i, T_{i+1}) = \frac{1}{\delta_i} d \left( \frac{P(t, T_i)}{P(t, T_{i+1})} \right) = L(t, T_i, T_{i+1}) \sigma_i(t) \cdot dW^{T_{i+1}}(t), \] where \( \sigma_i(t) \) is defined from the volatility of the bond price ratio: if the ratio has volatility vector \( \gamma_i(t) \), then \( \sigma_i(t) = \frac{1 + \delta_i L(t, T_i, T_{i+1})}{\delta_i L(t, T_i, T_{i+1})} \gamma_i(t) \).
- Solution to the SDE: The SDE is a geometric Brownian motion, and its solution is: \[ L(T_i, T_i, T_{i+1}) = L(t, T_i, T_{i+1}) \exp \left( -\frac{1}{2} \int_t^{T_i} \|\sigma_i(s)\|^2 ds + \int_t^{T_i} \sigma_i(s) \cdot dW^{T_{i+1}}(s) \right). \]
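Under \( \mathbb{Q}^{T_{i+1}} \) the forward rate is a driftless lognormal martingale, so Monte Carlo draws from the terminal solution above should have sample mean equal to today's forward. A quick sketch, assuming a hypothetical flat scalar volatility and NumPy:

```python
import numpy as np

rng = np.random.default_rng(42)
L0, sigma, T = 0.03, 0.25, 2.0        # illustrative flat scalar volatility
Z = rng.standard_normal(500_000)
# Terminal solution of the driftless lognormal SDE under Q^{T_{i+1}}:
L_T = L0 * np.exp(-0.5 * sigma**2 * T + sigma * np.sqrt(T) * Z)
print(L_T.mean())   # ≈ 0.03: the martingale property E[L_T] = L0
```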
Derivation of Drift Adjustment for Change of Measure:
- Numéraire Change: Consider changing from \( \mathbb{Q}^{T_{i+1}} \) to \( \mathbb{Q}^{T_j} \). The Radon-Nikodym derivative is: \[ \frac{d\mathbb{Q}^{T_j}}{d\mathbb{Q}^{T_{i+1}}} \bigg|_t = \frac{P(t, T_j) / P(0, T_j)}{P(t, T_{i+1}) / P(0, T_{i+1})}. \]
- Girsanov's Theorem: The Brownian motion under \( \mathbb{Q}^{T_j} \) is related to that under \( \mathbb{Q}^{T_{i+1}} \) by: \[ dW^{T_j}(t) = dW^{T_{i+1}}(t) - \gamma(t) dt, \] where \( \gamma(t) \) is the volatility vector of the Radon-Nikodym density, i.e., of the bond price ratio \( \frac{P(t, T_j)}{P(t, T_{i+1})} \). To find \( \gamma(t) \), write this ratio in terms of forward LIBOR rates: \[ \frac{P(t, T_j)}{P(t, T_{i+1})} = \prod_{k=i+1}^{j-1} \frac{P(t, T_{k+1})}{P(t, T_k)} = \prod_{k=i+1}^{j-1} \frac{1}{1 + \delta_k L(t, T_k, T_{k+1})}. \] Applying Itô's Lemma to the logarithm of this product, each factor contributes \( -\frac{\delta_k L(t, T_k, T_{k+1})}{1 + \delta_k L(t, T_k, T_{k+1})} \sigma_k(t) \) to the diffusion, so: \[ \gamma(t) = -\sum_{k=i+1}^{j-1} \frac{\delta_k L(t, T_k, T_{k+1})}{1 + \delta_k L(t, T_k, T_{k+1})} \sigma_k(t). \] Substituting into the Girsanov relation yields the drift adjustment: \[ dW^{T_j}(t) = dW^{T_{i+1}}(t) + \sum_{k=i+1}^{j-1} \frac{\delta_k L(t, T_k, T_{k+1})}{1 + \delta_k L(t, T_k, T_{k+1})} \sigma_k(t) dt. \]
Practical Applications
Pricing a Cap: A cap is a series of caplets. Using the BGM model, each caplet can be priced using the Black formula as derived above. The price of the cap is the sum of the prices of the individual caplets: \[ \text{Cap}(t) = \sum_{i=0}^{n-1} \delta_i P(t, T_{i+1}) \left[ L(t, T_i, T_{i+1}) N(d_{1,i}) - K N(d_{2,i}) \right], \] where \( d_{1,i} \) and \( d_{2,i} \) are computed for each caplet using the corresponding forward LIBOR rate and volatility.
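The summation above can be sketched as a small, self-contained cap pricer. The schedule and market inputs below are purely illustrative, and each caplet's integrated variance is approximated as \( \sigma_i^2 (T_i - t) \) (flat Black vol per caplet):

```python
import math

def phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_cap(discounts, forwards, vols, deltas, fixings, K):
    """Cap price as a sum of Black caplets.
    discounts[i] = P(t, T_{i+1}); forwards[i] = L(t, T_i, T_{i+1});
    vols[i] = Black volatility of caplet i; deltas[i] = T_{i+1} - T_i;
    fixings[i] = T_i - t (time to each caplet's fixing)."""
    total = 0.0
    for P, L, sig, d, tau in zip(discounts, forwards, vols, deltas, fixings):
        v = sig * math.sqrt(tau)                  # sqrt of integrated variance
        d1 = (math.log(L / K) + 0.5 * v * v) / v
        total += d * P * (L * phi(d1) - K * phi(d1 - v))
    return total

cap = black_cap(discounts=[0.97, 0.95, 0.93], forwards=[0.030, 0.032, 0.031],
                vols=[0.20, 0.22, 0.24], deltas=[0.5, 0.5, 0.5],
                fixings=[0.5, 1.0, 1.5], K=0.03)
```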
Calibrating the Model: The BGM model is calibrated to market data by choosing the volatility functions \( \sigma_i(t) \) such that the model prices of caps and swaptions match the market prices. Common choices for \( \sigma_i(t) \) include:
- Piecewise constant volatilities,
- Time-dependent volatilities (e.g., \( \sigma_i(t) = a_i + b_i (T_i - t) \)),
- Stochastic volatilities (though this complicates the model).
Risk Management: The BGM model allows for the computation of sensitivities (Greeks) of interest rate derivatives. For example:
- Delta: Sensitivity to changes in the underlying forward LIBOR rates,
- Vega: Sensitivity to changes in volatility,
- Gamma: Second-order sensitivity to changes in the underlying rates.
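In practice these Greeks are often obtained by finite differences on the pricing formula. A minimal sketch on a Black caplet, with illustrative default parameters (`P`, `K`, `delta`, `tau` are hypothetical inputs, not model outputs):

```python
import math

def phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def caplet(L, sig, P=0.95, K=0.03, delta=0.5, tau=1.0):
    """Black caplet as a function of the forward L and its Black vol sig."""
    v = sig * math.sqrt(tau)
    d1 = (math.log(L / K) + 0.5 * v * v) / v
    return delta * P * (L * phi(d1) - K * phi(d1 - v))

h = 1e-4
L0, sig0 = 0.032, 0.20
delta_ = (caplet(L0 + h, sig0) - caplet(L0 - h, sig0)) / (2 * h)   # dV/dL
vega   = (caplet(L0, sig0 + h) - caplet(L0, sig0 - h)) / (2 * h)   # dV/dsig
gamma  = (caplet(L0 + h, sig0) - 2 * caplet(L0, sig0)
          + caplet(L0 - h, sig0)) / h**2                           # d2V/dL2
```

Central differences give \( O(h^2) \) accuracy; the bump size must balance truncation error against floating-point cancellation, especially for gamma.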
Common Pitfalls and Important Notes
1. Measure Consistency: It is crucial to ensure that the dynamics of forward LIBOR rates are specified under the correct forward measure. Mixing measures can lead to incorrect drift terms and mispricing. Always verify the measure under which the SDE is written.
2. Correlation Structure: The BGM model assumes a correlation structure between different forward LIBOR rates. This correlation is not directly observable and must be estimated or calibrated. Incorrect correlation assumptions can lead to mispricing of multi-LIBOR derivatives like swaptions.
3. Negative Rates: The lognormal assumption in the BGM model prevents negative rates, which can be a limitation in low or negative interest rate environments. Extensions of the model (e.g., displaced diffusion) can address this issue.
4. Volatility Smile: The BGM model, in its basic form, does not account for the volatility smile observed in the market. To fit the smile, more sophisticated volatility structures (e.g., stochastic volatility or local volatility) are required.
5. Numerical Implementation: When implementing the BGM model numerically, care must be taken with the discretization of the SDEs, especially when dealing with multiple forward rates and their correlations. Monte Carlo simulation is a common approach, but it can be computationally intensive.
6. Day Count Conventions: The day count fraction \( \delta_i \) is critical in the definition of forward LIBOR rates. Ensure that the correct day count convention (e.g., Actual/360, 30/360) is used to match market conventions.
7. Initial Term Structure: The BGM model is initialized with the current term structure of interest rates. This term structure must be bootstrapped accurately from market data to ensure that the model's initial conditions are correct.
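To illustrate pitfall 5, here is a minimal log-Euler Monte Carlo sketch of a hypothetical two-rate LMM under the terminal measure \( \mathbb{Q}^{T_3} \), with equal accruals, flat scalar volatilities, and a single driving factor (all parameters are illustrative). The last spanning rate is driftless under the terminal measure, while the earlier rate carries the standard (negative) LMM drift adjustment:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical setup: L[0] spans [T1, T2], L[1] spans [T2, T3],
# delta = 0.5, one factor, simulated to T1 = 1 under Q^{T3}.
delta, h, n_steps = 0.5, 0.01, 100
L = np.full((2, 100_000), 0.03)
sig = np.array([[0.20], [0.25]])

for _ in range(n_steps):
    Z = rng.standard_normal(L.shape[1])
    # L2 is driftless under Q^{T3}; L1 picks up the drift adjustment
    # through the spanning rate L2.
    mu1 = -sig[0, 0] * sig[1, 0] * delta * L[1] / (1 + delta * L[1])
    mu = np.vstack([mu1, np.zeros_like(mu1)])
    # Log-Euler step: exact for driftless lognormal dynamics, and it
    # keeps every simulated rate strictly positive.
    L = L * np.exp((mu - 0.5 * sig**2) * h + sig * np.sqrt(h) * Z)

print(L[1].mean())   # ≈ 0.03: L2 remains a martingale under Q^{T3}
```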
Topic 50: Common Quant Interview Questions on SDEs and Itô's Lemma (Tricky Cases)
Stochastic Differential Equation (SDE): An SDE is a differential equation in which one or more terms are stochastic processes, resulting in a solution that is also a stochastic process. The general form is:
\[ dX_t = \mu(X_t, t) dt + \sigma(X_t, t) dW_t \] where \(X_t\) is the stochastic process, \(\mu(X_t, t)\) is the drift term, \(\sigma(X_t, t)\) is the diffusion term, and \(W_t\) is a Wiener process (Brownian motion).
Itô’s Lemma: Itô’s Lemma is the stochastic calculus counterpart of the chain rule in ordinary calculus. It provides a way to compute the differential of a function of a stochastic process. For a twice-differentiable function \(f(X_t, t)\), Itô’s Lemma states:
\[ df(X_t, t) = \left( \frac{\partial f}{\partial t} + \mu \frac{\partial f}{\partial x} + \frac{1}{2} \sigma^2 \frac{\partial^2 f}{\partial x^2} \right) dt + \sigma \frac{\partial f}{\partial x} dW_t \]
Key Formulas:
- Geometric Brownian Motion (GBM): \[ dS_t = \mu S_t dt + \sigma S_t dW_t \]
- Ornstein-Uhlenbeck Process: \[ dX_t = \theta (\mu - X_t) dt + \sigma dW_t \]
- Itô’s Lemma for \(f(S_t, t)\): \[ df(S_t, t) = \left( \frac{\partial f}{\partial t} + \mu S_t \frac{\partial f}{\partial S} + \frac{1}{2} \sigma^2 S_t^2 \frac{\partial^2 f}{\partial S^2} \right) dt + \sigma S_t \frac{\partial f}{\partial S} dW_t \]
- Itô’s Product Rule: For two Itô processes \(X_t\) and \(Y_t\), \[ d(X_t Y_t) = X_t dY_t + Y_t dX_t + dX_t dY_t \]
Example 1: Deriving the SDE for \( \ln(S_t) \) under GBM
Let \(S_t\) follow a GBM:
\[ dS_t = \mu S_t dt + \sigma S_t dW_t \] Define \(f(S_t) = \ln(S_t)\). Apply Itô’s Lemma:
- Compute partial derivatives: \[ \frac{\partial f}{\partial S} = \frac{1}{S}, \quad \frac{\partial^2 f}{\partial S^2} = -\frac{1}{S^2}, \quad \frac{\partial f}{\partial t} = 0 \]
- Substitute into Itô’s Lemma: \[ d(\ln S_t) = \left( 0 + \mu S_t \cdot \frac{1}{S_t} + \frac{1}{2} \sigma^2 S_t^2 \cdot \left(-\frac{1}{S_t^2}\right) \right) dt + \sigma S_t \cdot \frac{1}{S_t} dW_t \] \[ d(\ln S_t) = \left( \mu - \frac{1}{2} \sigma^2 \right) dt + \sigma dW_t \]
Note: This shows that the log of a GBM follows an arithmetic Brownian motion with drift \(\mu - \frac{1}{2} \sigma^2\). This is a key result in quantitative finance, particularly for modeling stock prices and deriving the Black-Scholes formula.
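The \(\mu - \frac{1}{2}\sigma^2\) correction can be checked numerically: Euler-simulate the GBM SDE itself (no log transform) and estimate the drift of \(\ln S_t\) from the sample. A sketch assuming NumPy, with illustrative parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
S0, mu, sigma, T = 100.0, 0.05, 0.2, 1.0
n_paths, n_steps = 100_000, 500
h = T / n_steps

# Euler-Maruyama on dS = mu*S dt + sigma*S dW
S = np.full(n_paths, S0)
for _ in range(n_steps):
    S = S * (1 + mu * h + sigma * np.sqrt(h) * rng.standard_normal(n_paths))

drift_est = (np.log(S).mean() - np.log(S0)) / T
print(drift_est)   # ≈ mu - sigma**2 / 2 = 0.03, not mu = 0.05
```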
Example 2: Itô’s Lemma for \( f(S_t) = S_t^2 \)
Let \(S_t\) follow a GBM as above. Apply Itô’s Lemma to \(f(S_t) = S_t^2\):
- Compute partial derivatives: \[ \frac{\partial f}{\partial S} = 2S, \quad \frac{\partial^2 f}{\partial S^2} = 2, \quad \frac{\partial f}{\partial t} = 0 \]
- Substitute into Itô’s Lemma: \[ d(S_t^2) = \left( 0 + \mu S_t \cdot 2S_t + \frac{1}{2} \sigma^2 S_t^2 \cdot 2 \right) dt + \sigma S_t \cdot 2S_t dW_t \] \[ d(S_t^2) = \left( 2\mu S_t^2 + \sigma^2 S_t^2 \right) dt + 2\sigma S_t^2 dW_t \] \[ d(S_t^2) = S_t^2 \left( (2\mu + \sigma^2) dt + 2\sigma dW_t \right) \]
Note: This result is useful in deriving moments of \(S_t\) or in applications like variance swaps.
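Taking expectations in the SDE for \(S_t^2\) gives the ODE \( \frac{d}{dt}\mathbb{E}[S_t^2] = (2\mu + \sigma^2)\mathbb{E}[S_t^2] \), hence \( \mathbb{E}[S_t^2] = S_0^2 e^{(2\mu + \sigma^2)t} \). A quick Monte Carlo check, sampling \(S_T\) from the exact GBM solution (NumPy assumed, parameters illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
S0, mu, sigma, T = 100.0, 0.05, 0.2, 1.0
Z = rng.standard_normal(400_000)
S_T = S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)

mc = (S_T**2).mean()
# Second moment implied by the drift of d(S^2):
closed = S0**2 * np.exp((2 * mu + sigma**2) * T)
print(mc / closed)   # ≈ 1.0
```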
Example 3: Tricky Case - Itô’s Lemma for \( f(S_t, t) = e^{S_t + \alpha t} \)
Let \(S_t\) follow an arithmetic Brownian motion:
\[ dS_t = \mu dt + \sigma dW_t \] Apply Itô’s Lemma to \(f(S_t, t) = e^{S_t + \alpha t}\):
- Compute partial derivatives: \[ \frac{\partial f}{\partial S} = e^{S_t + \alpha t}, \quad \frac{\partial^2 f}{\partial S^2} = e^{S_t + \alpha t}, \quad \frac{\partial f}{\partial t} = \alpha e^{S_t + \alpha t} \]
- Substitute into Itô’s Lemma: \[ d(f) = \left( \alpha e^{S_t + \alpha t} + \mu e^{S_t + \alpha t} + \frac{1}{2} \sigma^2 e^{S_t + \alpha t} \right) dt + \sigma e^{S_t + \alpha t} dW_t \] \[ d(f) = e^{S_t + \alpha t} \left( \alpha + \mu + \frac{1}{2} \sigma^2 \right) dt + \sigma e^{S_t + \alpha t} dW_t \]
Note: This example highlights the importance of correctly identifying the drift and diffusion terms when the function depends explicitly on time. A common mistake is to forget the \(\frac{\partial f}{\partial t}\) term.
Example 4: Itô’s Product Rule for \(X_t Y_t\)
Let \(X_t\) and \(Y_t\) be two Itô processes:
\[ dX_t = \mu_X dt + \sigma_X dW_t \] \[ dY_t = \mu_Y dt + \sigma_Y dW_t \] Apply Itô’s product rule to find \(d(X_t Y_t)\):
- Write the product rule: \[ d(X_t Y_t) = X_t dY_t + Y_t dX_t + dX_t dY_t \]
- Substitute the differentials: \[ d(X_t Y_t) = X_t (\mu_Y dt + \sigma_Y dW_t) + Y_t (\mu_X dt + \sigma_X dW_t) + (\mu_X dt + \sigma_X dW_t)(\mu_Y dt + \sigma_Y dW_t) \]
- Simplify, noting that \(dt \cdot dW_t = 0\), \(dt^2 = 0\), and \((dW_t)^2 = dt\) (both processes are driven by the same \(W_t\)): \[ d(X_t Y_t) = (X_t \mu_Y + Y_t \mu_X + \sigma_X \sigma_Y) dt + (X_t \sigma_Y + Y_t \sigma_X) dW_t \]
Note: The term \(dX_t dY_t = \sigma_X \sigma_Y dt\) is often overlooked but is crucial in stochastic calculus. This rule is essential for deriving the dynamics of products of stochastic processes, such as in foreign exchange or quanto options.
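With constant coefficients, the cross term is easy to exhibit numerically: integrating the product-rule drift gives \( \mathbb{E}[X_T Y_T] = (X_0 + \mu_X T)(Y_0 + \mu_Y T) + \sigma_X \sigma_Y T \), and the last term would vanish if \(dX_t\,dY_t\) were dropped. A Monte Carlo sketch with illustrative parameters (NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(3)
X0, Y0, muX, muY, sigX, sigY, T = 1.0, 2.0, 0.1, -0.05, 0.3, 0.2, 2.0

# Both processes are driven by the same W_t, so the terminal values can
# be sampled directly from W_T ~ N(0, T):
W_T = np.sqrt(T) * rng.standard_normal(400_000)
mc = ((X0 + muX * T + sigX * W_T) * (Y0 + muY * T + sigY * W_T)).mean()

# Integrating the product-rule drift E[X]muY + E[Y]muX + sigX*sigY gives:
closed = (X0 + muX * T) * (Y0 + muY * T) + sigX * sigY * T
print(mc - closed)   # ≈ 0; omitting sigX*sigY*T would leave an error of 0.12
```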
Common Quant Interview Questions (Tricky Cases)
Question 1: What is the SDE for \( \frac{1}{S_t} \) if \(S_t\) follows a GBM?
Solution: Let \(f(S_t) = \frac{1}{S_t} = S_t^{-1}\). Apply Itô’s Lemma:
- Compute partial derivatives: \[ \frac{\partial f}{\partial S} = -S_t^{-2}, \quad \frac{\partial^2 f}{\partial S^2} = 2S_t^{-3}, \quad \frac{\partial f}{\partial t} = 0 \]
- Substitute into Itô’s Lemma: \[ d\left(\frac{1}{S_t}\right) = \left( 0 + \mu S_t \cdot (-S_t^{-2}) + \frac{1}{2} \sigma^2 S_t^2 \cdot 2S_t^{-3} \right) dt + \sigma S_t \cdot (-S_t^{-2}) dW_t \] \[ d\left(\frac{1}{S_t}\right) = \left( -\mu S_t^{-1} + \sigma^2 S_t^{-1} \right) dt - \sigma S_t^{-1} dW_t \] \[ d\left(\frac{1}{S_t}\right) = \frac{1}{S_t} \left( (\sigma^2 - \mu) dt - \sigma dW_t \right) \]
Note: This result is useful in fixed-income modeling or when dealing with inverse contracts (e.g., USD/JPY vs. JPY/USD).
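Taking expectations in the SDE for \(1/S_t\) gives \( \mathbb{E}[1/S_T] = (1/S_0) e^{(\sigma^2 - \mu)T} \), which can be checked against exact lognormal samples (NumPy assumed, parameters illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
S0, mu, sigma, T = 100.0, 0.08, 0.3, 1.0
Z = rng.standard_normal(400_000)
S_T = S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)

mc = (1.0 / S_T).mean()
# Expectation implied by the drift coefficient sigma^2 - mu:
closed = (1.0 / S0) * np.exp((sigma**2 - mu) * T)
print(mc / closed)   # ≈ 1.0; note E[1/S_T] > 1/E[S_T] (Jensen's inequality)
```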
Question 2: Derive the SDE for \( \sqrt{S_t} \) where \(S_t\) follows a GBM.
Solution: Let \(f(S_t) = \sqrt{S_t} = S_t^{1/2}\). Apply Itô’s Lemma:
- Compute partial derivatives: \[ \frac{\partial f}{\partial S} = \frac{1}{2} S_t^{-1/2}, \quad \frac{\partial^2 f}{\partial S^2} = -\frac{1}{4} S_t^{-3/2}, \quad \frac{\partial f}{\partial t} = 0 \]
- Substitute into Itô’s Lemma: \[ d(\sqrt{S_t}) = \left( 0 + \mu S_t \cdot \frac{1}{2} S_t^{-1/2} + \frac{1}{2} \sigma^2 S_t^2 \cdot \left(-\frac{1}{4} S_t^{-3/2}\right) \right) dt + \sigma S_t \cdot \frac{1}{2} S_t^{-1/2} dW_t \] \[ d(\sqrt{S_t}) = \left( \frac{\mu}{2} S_t^{1/2} - \frac{\sigma^2}{8} S_t^{1/2} \right) dt + \frac{\sigma}{2} S_t^{1/2} dW_t \] \[ d(\sqrt{S_t}) = \sqrt{S_t} \left( \left( \frac{\mu}{2} - \frac{\sigma^2}{8} \right) dt + \frac{\sigma}{2} dW_t \right) \]
Note: This is relevant in volatility modeling (e.g., square-root diffusion processes like the Cox-Ingersoll-Ross model).
Question 3: Suppose \(X_t\) follows \(dX_t = \mu dt + \sigma dW_t\). What is the SDE for \(Y_t = e^{X_t} + t\)?
Solution: Let \(f(X_t, t) = e^{X_t} + t\). Apply Itô’s Lemma:
- Compute partial derivatives: \[ \frac{\partial f}{\partial X} = e^{X_t}, \quad \frac{\partial^2 f}{\partial X^2} = e^{X_t}, \quad \frac{\partial f}{\partial t} = 1 \]
- Substitute into Itô’s Lemma: \[ dY_t = \left( 1 + \mu e^{X_t} + \frac{1}{2} \sigma^2 e^{X_t} \right) dt + \sigma e^{X_t} dW_t \] \[ dY_t = \left( 1 + e^{X_t} \left( \mu + \frac{1}{2} \sigma^2 \right) \right) dt + \sigma e^{X_t} dW_t \]
Note: This question tests the ability to handle both the stochastic and deterministic components of the function. A common mistake is to ignore the \(\frac{\partial f}{\partial t}\) term.
Question 4: Let \(S_t\) follow a GBM. What is the SDE for \( \frac{S_t}{1 + S_t} \)?
Solution: Let \(f(S_t) = \frac{S_t}{1 + S_t}\). Apply Itô’s Lemma:
- Compute partial derivatives: \[ \frac{\partial f}{\partial S} = \frac{(1 + S_t) - S_t}{(1 + S_t)^2} = \frac{1}{(1 + S_t)^2} \] \[ \frac{\partial^2 f}{\partial S^2} = -\frac{2}{(1 + S_t)^3} \] \[ \frac{\partial f}{\partial t} = 0 \]
- Substitute into Itô’s Lemma: \[ d\left(\frac{S_t}{1 + S_t}\right) = \left( 0 + \mu S_t \cdot \frac{1}{(1 + S_t)^2} + \frac{1}{2} \sigma^2 S_t^2 \cdot \left(-\frac{2}{(1 + S_t)^3}\right) \right) dt + \sigma S_t \cdot \frac{1}{(1 + S_t)^2} dW_t \] \[ d\left(\frac{S_t}{1 + S_t}\right) = \frac{1}{(1 + S_t)^2} \left( \mu S_t - \frac{\sigma^2 S_t^2}{1 + S_t} \right) dt + \frac{\sigma S_t}{(1 + S_t)^2} dW_t \] \[ d\left(\frac{S_t}{1 + S_t}\right) = \frac{S_t}{(1 + S_t)^2} \left( \mu - \frac{\sigma^2 S_t}{1 + S_t} \right) dt + \frac{\sigma S_t}{(1 + S_t)^2} dW_t \]
Note: This type of transformation appears in utility functions or when normalizing prices. The key is to carefully compute the second derivative, which can be error-prone.
Question 5: Suppose \(dX_t = \mu X_t dt + \sigma X_t dW_t\) (GBM) and \(Y_t = X_t^3\). What is \(dY_t\)?
Solution: Let \(f(X_t) = X_t^3\). Apply Itô’s Lemma:
- Compute partial derivatives: \[ \frac{\partial f}{\partial X} = 3X_t^2, \quad \frac{\partial^2 f}{\partial X^2} = 6X_t, \quad \frac{\partial f}{\partial t} = 0 \]
- Substitute into Itô’s Lemma: \[ dY_t = \left( 0 + \mu X_t \cdot 3X_t^2 + \frac{1}{2} \sigma^2 X_t^2 \cdot 6X_t \right) dt + \sigma X_t \cdot 3X_t^2 dW_t \] \[ dY_t = \left( 3\mu X_t^3 + 3\sigma^2 X_t^3 \right) dt + 3\sigma X_t^3 dW_t \] \[ dY_t = 3X_t^3 (\mu + \sigma^2) dt + 3\sigma X_t^3 dW_t \] \[ dY_t = 3Y_t (\mu + \sigma^2) dt + 3\sigma Y_t dW_t \]
Note: This question tests the ability to handle higher-order polynomials. The drift term includes an additional \(\sigma^2\) term due to the second derivative in Itô’s Lemma.
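Since \(Y_t = X_t^3\) has linear drift coefficient \(3(\mu + \sigma^2)\), taking expectations gives \( \mathbb{E}[X_T^3] = X_0^3 e^{3(\mu + \sigma^2)T} \). A Monte Carlo sanity check with illustrative parameters (NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(5)
X0, mu, sigma, T = 1.0, 0.05, 0.1, 1.0
Z = rng.standard_normal(400_000)
X_T = X0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)

mc = (X_T**3).mean()
# Third moment implied by dY = 3Y(mu + sigma^2)dt + 3 sigma Y dW:
closed = X0**3 * np.exp(3 * (mu + sigma**2) * T)
print(mc / closed)   # ≈ 1.0
```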
Practical Applications
- Option Pricing: Itô’s Lemma is fundamental in deriving the Black-Scholes PDE. For example, applying Itô’s Lemma to the price of a European call option \(C(S_t, t)\) leads to the Black-Scholes equation.
- Interest Rate Modeling: In the Vasicek or Cox-Ingersoll-Ross models, Itô’s Lemma is used to derive the dynamics of bond prices or yields.
- Stochastic Volatility Models: In models like Heston, Itô’s Lemma is applied to the variance process to derive the joint dynamics of the asset price and its volatility.
- Risk Management: Itô’s Lemma helps in computing the Greeks (delta, gamma, vega) for hedging purposes.
- Portfolio Optimization: In Merton’s portfolio problem, Itô’s Lemma is used to derive the dynamics of the wealth process.
Common Pitfalls and Important Notes
1. Forgetting the Second Derivative Term: A frequent mistake is to apply the ordinary chain rule and ignore the \(\frac{1}{2} \sigma^2 \frac{\partial^2 f}{\partial x^2}\) term in Itô’s Lemma. This term arises from the quadratic variation of the Brownian motion.
2. Misidentifying the Drift and Diffusion Terms: When the function \(f\) depends on both \(X_t\) and \(t\), it is easy to overlook the \(\frac{\partial f}{\partial t}\) term. Always check if the function has explicit time dependence.
3. Incorrectly Handling Products of Stochastic Processes: When dealing with products like \(X_t Y_t\), remember to include the \(dX_t dY_t\) term in Itô’s product rule. This term is often non-zero and can significantly affect the drift.
4. Confusing Arithmetic and Geometric Brownian Motion: The SDE for \(\ln(S_t)\) under GBM is an arithmetic Brownian motion, not a GBM. This distinction is crucial for solving problems involving log-returns.
5. Overlooking the Initial Conditions: When solving SDEs, the initial condition \(X_0\) is often critical. For example, the solution to \(dX_t = \mu X_t dt + \sigma X_t dW_t\) is \(X_t = X_0 e^{(\mu - \frac{1}{2} \sigma^2)t + \sigma W_t}\). Forgetting \(X_0\) leads to incorrect results.
6. Itô vs. Stratonovich Calculus: In some physical sciences, Stratonovich calculus is used, where the chain rule resembles the ordinary calculus rule. However, in finance, Itô calculus is standard. Mixing the two can lead to errors.
7. Numerical Instability: When discretizing SDEs for simulation (e.g., Euler-Maruyama method), the choice of time step \(\Delta t\) is critical. Too large a step can lead to instability or incorrect convergence, especially for processes with high volatility or mean-reverting properties.
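The step-size effect in pitfall 7 can be seen directly by comparing Euler-Maruyama against the exact GBM solution on the same Brownian paths at two step sizes. A sketch with illustrative parameters (NumPy assumed); the strong error of Euler-Maruyama is of order \( \sqrt{\Delta t} \) here:

```python
import numpy as np

rng = np.random.default_rng(6)
S0, mu, sigma, T = 100.0, 0.05, 0.4, 1.0
n_paths, n_fine = 20_000, 256
h = T / n_fine
dW = np.sqrt(h) * rng.standard_normal((n_fine, n_paths))

# Exact GBM terminal value on the same Brownian paths
S_exact = S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * dW.sum(axis=0))

def euler(step):
    """Euler-Maruyama with time step step*h, reusing the fine increments."""
    S = np.full(n_paths, S0)
    for k in range(0, n_fine, step):
        dWk = dW[k:k + step].sum(axis=0)
        S = S + mu * S * (step * h) + sigma * S * dWk
    return S

err_coarse = np.abs(euler(8) - S_exact).mean()   # dt = 8h
err_fine = np.abs(euler(1) - S_exact).mean()     # dt = h
print(err_coarse > err_fine)   # True: strong error shrinks with the step size
```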