Composition of functions is associative

This post was first published on my WordPress blog on April 1 2012 with the title Is composition of functions associative?
In fact, at the time of writing, that post is still my most popular blog post so far, for some reason!
Because of the date of the post, it was mistaken by some people as a joke. I certainly doubt that many people share my concerns at all. Nevertheless, for anyone who is even remotely worried about the theory hidden behind the mathematical interpretation of nested brackets, the approach here does give a slightly different way to think about this proof. I may be almost alone in finding this reassuring! Meanwhile, I still continue to give the standard proof of associativity of composition of functions in my first-year introductory pure maths module at the University of Nottingham.

Is composition of functions associative?

Well, of course it is.

[Notes in italics added 30/7/12: In spite of the [original] date of this post, it is not intended to be a joke (except in as much as my concerns here may appear amusing!). However, the [original] title of the post might be somewhat misleading. I am not suggesting that there is actually any doubt about the truth of these standard results, namely that composition of functions is associative, and that evaluation of expressions with nested brackets is unambiguous. What I am really asking is whether the standard proof of the fact that composition of functions is associative is a fully convincing one.]

I'm sure you have seen the standard proof that composition of functions is associative, but let me remind you how it goes.

Let \(W,X,Y\) and \(Z\) be sets, and suppose that we are given functions

\(h:W\to X\), \(g:X\to Y\) and \(f:Y\to Z\)

We show that \((f\circ g)\circ h = f\circ (g\circ h)\) as follows.

Let \(w\in W\). Then

\(((f\circ g)\circ h)(w) = (f\circ g)(h(w)) = f(g(h(w)))\)

and

\((f\circ (g\circ h))(w) = f( (g\circ h)(w)) = f(g(h(w)))\).

Since \(((f\circ g)\circ h)(w) = (f\circ (g\circ h))(w)\)

for all \(w\in W\), we have

\((f\circ g)\circ h = f\circ (g\circ h)\,.\)

Now perhaps it is just me, but I suspect that seeing so many brackets around here might make for some initial confusion for the students. But is there more to it than that? As mathematicians, we know how to evaluate expressions involving lots of brackets. In fact we all learned how to prioritize brackets over the other mathematical operations when we were at school. But somewhere implicit in this is a confidence that we know how to evaluate expressions involving nested brackets and that these expressions are unambiguous.

But how do we know that evaluating expressions with nested brackets is unambiguous? Are we assuming here that composition of functions is associative? [Or maybe we need, at some earlier stage, to perform a check that is essentially equivalent to proving that composition of functions is associative?] That would then lead to circularity in the argument above. [Or at least it might render the standard proof less satisfactory?]

Here is my suggestion for breaking out of this possible problem.

First of all, how else could we define the function \(f\circ g\). Perhaps we are happy at least with the definition \((f\circ g)(x) = f(g(x))\): after all, there doesn't seem to be any ambiguity there. Still, let's see this another way.

Given \(x\in X\), set \(y = g(x)\in Y\), and then set \(z = f(y)\in Z\). Then the function \(f\circ g\) is the function \(x\mapsto z\), where \(z\) is defined in terms of \(x\) as above.

How can this possibly help? Well, if we return to the composition of functions, let's see what the new definitions of

\((f\circ g)\circ h\) and \(f\circ (g\circ h)\)

look like.

Let \(w\in W\). Set \(x=h(w)\in X\), set \(y=g(x)\in Y\) and set \(z=f(y)\in Z\).

Examining our definitions above, we discover that

\(z= (f\circ g)(x)\) where \(x=h(w)\), and so

\(z = ((f\circ g)\circ h)(w)\)

(and you could rewrite this in longhand in English if you want to avoid some of the brackets round the functions!)

Similarly

\(y = (g\circ h)(w)\) and \(z = f(y)\), giving us

\(z= (f\circ (g\circ h))(w)\)

(and again, we can rewrite everything in English to avoid some of the brackets round the functions).

The rest of the proof is as before.

I'm not really sure whether this helps, or hinders in general! But perhaps there are a few more people like me out there who might be glad to see that there isn't really any circularity in the proof after all. That is, if I haven't introduced some new problems above ...

[Most people I have spoken to think that it is obvious that unambiguous evaluation of nested brackets can't possibly depend on associativity of function composition. It is not so obvious to me. I would have to look carefully at the inductive definition needed to explain how to evaluate complex mathematical expressions in order to be sure what was used where. I am sure that it is fine in the end, but that inductive construction is not something that a first-year undergraduate will have met. 
See also comments on the original WordPress post
https://explainingmaths.wordpress.com/2012/04/01/is-composition-of-functions-associative/]



Comments

Popular posts from this blog

An introduction to the Hahn-Banach extension theorem: Part VI

Discussion of the proof that the uniform norm really is a norm

An introduction to the Weierstrass M-test: Part I