# On Our Way to Reduction: Substitution

## Introduction

The ability to properly substitute is vital to reduction. In this post, I will show proper and improper definitions of substitution.

## Semantics Primer

From my earlier post on currying you would have seen a substitution syntax like this, (λx.t1)t2 ↦ [x ↦ t2]t1 where [x ↦ t2]t1 means the term obtained by replacing all free occurrences of x in t1 by t2. See my last post for the definition of free variables.

## Some Wrong Ways to Do It

The definition of the λ-calculus is simple. We have a term broken into three additional parts: variable, abstraction, and application. To define substitution, we must define it for the each part of a term. For variables, it must be defined when there is a free variable and when there is not. ###Naive Recursive Definition

# Replace all free occurrences of x in x with s
[x ↦ s]x        = s
# Replace all free occurrences of x in y with s
[x ↦ s]y        = y     if x ≠ y
# Replace all free occurrences of x in λy.t1 with s
[x ↦ s](λy.t1)  = λy.[x ↦ s]t1
# Replace all free occurrences of x in t1 t2 with s
[x ↦ s](t1 t2)  = ([x ↦ s]t1)([x ↦ s]t2)


The problem with the definition from above is that we are relying on the name of the variable or term to define substitution. This works fine if we are careful with our choice of bound variable names, however take the example [x ↦ y](λx.x). If we replace all free x terms within the abstraction λx.x with y we would end up with the following: λx.y, clearly this is no longer the identity function.

What should be taken away from this? The names of bound variables do not matter

### Almost Fixed Definition

Let’s ensure that names of bound variables don’t matter.

# Replace all free occurrences of x in x with s
[x ↦ s]x        = s
# Replace all free occurrences of x in y with s
[x ↦ s]y        = y     if x ≠ y
# Replace all free occurrences of x in λy.t1 with s
[x ↦ s](λy.t1)  = λy.t1         if y = x
[x ↦ s](λy.t1)  = λy.[x ↦ s]t1  if y ≠ x
# Replace all free occurrences of x in t1 t2 with s
[x ↦ s](t1 t2)  = ([x ↦ s]t1)([x ↦ s]t2)


So, in this case, [x ↦ y](λx.x) = λy.y which is indeed the identity function. However, this is still wrong. Take this example: [x ↦ z](λz.x). Using the definition above, we would be substituting z for all free variables in (λz.x) which would leave us with λz.z, the identity function. But λz.x is not the identity function. So names are still an issue.

## Variable Capture and Capture Avoidance

When a free variable in a term s is bound when s is substituted into a term t naively is called variable capture. We want to avoid this. We do this by using a substitution operation that avoids mixing bound variable names of t and free variable names of s. This is called capture-avoiding substitution and is often what is implicitly meant by the term substitution. It is easily achieved by one more condition on the abstraction case:

# Replace all free occurrences of x in x with s
[x ↦ s]x        = s
# Replace all free occurrences of x in y with s
[x ↦ s]y        = y     if x ≠ y
# Replace all free occurrences of x in λy.t1 with s
[x ↦ s](λy.t1)  = λy.t1         if y = x
[x ↦ s](λy.t1)  = λy.[x ↦ s]t1  if y ≠ x and y ∉ FV(s)
# Replace all free occurrences of x in t1 t2 with s
[x ↦ s](t1 t2)  = ([x ↦ s]t1)([x ↦ s]t2)


This however, is not a complete definition, since it has now changed substitution into a partial operation. For example: [x ↦ y z](λy.xy) should equal λy.yzy, but because it does not appear free in (y z) it never hits one of our definitions.

## A Convention: Alpha-Conversion

In order to fix the issue of bound and free variable names, we must decide to work with terms “up to renaming of bound variables” or what Church called alpha-conversion. This is the operation of consistently renaming a bound variable in a term. In other words, “Terms that differ only in the names of bound variables are interchangeable in all contexts”.

This actually simplifies our definition, since we can change names as soon as we get to a place where we are trying to apply the substitution to arguments where it is undefined. This means we can completely drop the first clause of the abstraction section.

## Final Definition

# Replace all free occurrences of x in x with s
[x ↦ s]x        = s
# Replace all free occurrences of x in y with s
[x ↦ s]y        = y     if x ≠ y
# Replace all free occurrences of x in λy.t1 with s
[x ↦ s](λy.t1)  = λy.[x ↦ s]t1  if y ≠ x and y ∉ FV(s)
# Replace all free occurrences of x in t1 t2 with s
[x ↦ s](t1 t2)  = ([x ↦ s]t1)([x ↦ s]t2)