Exponential functions have the form bx, where b > 0 is the base and x the exponent.

Exponential functions are easily introduced as growth processes. The comparison of x² and 2^x is an eye-opener, with the stories of duckweed or the grain on the chess board. The introduction of the exponential number e is a next step. What intuitions can we use for smooth didactics on e ?

The “discover-e” plot

There is the following “intuitive graph” for the exponential number e = 2,71828…. The line y = e is found by requiring that the inclines (tangents) to bx all run through the origin at {0, 0}. The (dashed) value at x = 1 helps to identify the function ex itself. (Check that the red curve indicates 2^x).

Functions 2^x, e^x and 4^x, and tangents through {0, 0}

2^x, e^x and 4^x, and inclines through {0, 0}

Remarkably, Michael Range (2016:xxix) also looks at such an outcome = 2^(1 / c), where is the derivative of = 2^x at x = 0, or c = ln[2]. NB. Instead of the opaque term “logarithm” let us use “recovered exponent”, denoted as rex[y].

Perhaps above plot captures a good intuition of the exponential number ? I am not convinced yet but find that it deserves a fair chance.

NB. Dutch mathematics didactician Hessel Pot, in an email to me of April 7 2013, suggested above plot. There appears to be a Wolfram Demonstrations Project item on this too. Their reference is to Helen Skala, “A discover-e,” The College Mathematics Journal, 28(2), 1997 pp. 128–129 (Jstor), and it has been included in the “Calculus Collection” (2010).


The point-slope version of the incline (tangent) of function f[x] at x = a is:

y – f[a] = s (x a)

The function b^x has derivative rex[b] b^x. Thus at arbitrary a:

y – b^a = rex[b] b^a (x a)

This line runs through the origin {xy} = {0, 0} iff

0 – b^a = rex[b] b^a (0 – a)

1 = rex[ba

Thus with H = -1, a = rex[b]H = 1 / rex[b]. Then also:

yf[a] = b^a = b^rex[b]H = e^(rex[b]  rex[b]H) = e^1 = e

The inclines running through {0, 0} also run through {rex[b]H, e}. Alternatively put, inclines can thus run through the origin and then cut y = e .

For example, in above plot, with 2^x as the red curve, rex[2] ≈ 0.70 and ≈ 1.44, and there we find the intersection with the line y = e.

Subsequently also at a = 1, the point of tangency is {1, e}, and we find with e that rex[e] = 1,

The drawback of this exposition is that it presupposes some algebra on e and the recovered exponents. Without this deduction, it is not guaranteed that above plot is correct. It might be a delusion. Yet since the plot is correct, we may present it to students, and it generates a sense of wonder what this special number e is. Thus it still is possible to make the plot and then begin to develop the required math.

Another drawback of this plot is that it compares different exponential functions and doesn’t focus on the key property of e^x, namely that it is its own derivative. A comparison of different exponential functions is useful, yet for what purpose exactly ?


Our recent weblog text discussed how Cartesius used Euclid’s criterion of tangency of circle and line to determine inclines to curves. The following plots use this idea for e^x at point x = a, for a = 0 and a = 1.

Incline to e^x at x = 0 (left) and x = 1 (right)

Incline to e^x at x = 0 (left) and x = 1 (right)

Let us now define the number e such that the derivative of e^x is given by e^x itself. At point x = a we have s = e^a. Using the point-slope equation for the incline:

y – f[a] = s (x a)

y – e^ae^a (x a)

y e^a (x – (a – 1))

Thus the inclines cut the horizontal axis at {x, y} = {a – 1, 0}, and the slope indeed is given by the tangent s = (f[a] – 0) / (a – (a – 1)) = f[a] / 1 = e^a.

The center {u, 0} and radius r of the circle can be found from the formulas of the mentioned weblog entry (or Pythagoras), and check e.g. a = 0:

u = a + s f[a] = a + (e^a

r = f[a] √ (1 + s²) = e^a √ (1 + (e^a)²)

A key problem with this approach is that the notion of “derivative” is not defined yet. We might plug in any number, say e^2 = 10 and e^3 = 11. For any location the Pythagorean Theorem allows us to create a circle. The notion of a circle is not essential here (yet). But it is nice to see how Cartesius might have done it, if he had had e = 2.71828….

Conquest of the Plane (COTP) (2011)

Conquest of the Plane (2011:167+), pdf online, has the following approach:

  • §12.1.1 has the intuition of the “fixed point” that the derivative of e^x is given by e^x itself. For didactics it is important to have this property firmly established in the minds of the students, since they tend to forget this. This might be achieved perhaps in other ways too, but COTP has opted for the notion of a fixed point. The discussion is “hand waiving” and not intended as a real development of fixed points or theory of function spaces.
  • §12.1.2 defines e with some key properties. It holds by definition that the derivative of e^x is given by e^x itself, but there are also some direct implications, like the slope of 1 at x = 0. Observe that COTP handles integral and derivative consistently as interdependent notions. (Shen & Lin (2014) use this approach too.)
  • §12.1.3 gives the existence proof. With the mentioned properties, such a number and function appears to exist. This compares e^x with other exponential functions b^x and the recovered exponents rex[y] – i.e. logarithm ln[y].
  • §12.1.4 uses the chain rule to find the derivatives of b^x in general. The plot suggested by Hessel Pot above would be a welcome addition to confirm this deduction and extension of the existence proof.
  • §12.1.5-7 have some relevant aspects that need not concern us here.
  • § shows that the definition is consistent with the earlier formal definition of a derivative. Application of that definition doesn’t generate an inconsistency. No limits are required.
  • § gives the numerical development of = 2.71828… There is a clear distinction between deduction that such a number exists and the calculation of its value. (The approach with limits might confuse these aspects.)
  • § shows that also the notion of the dynamic quotient (COTP p57)  is consistent with above approach to e. Thus, the above hasn’t used the dynamic quotient. Using it, we can derive that 1 = {(e^h – 1) // h, set h = 0}. Thus the latter expression cannot be simplified further but we don’t need to do so since we can determine that its value is 1. If we would wish so, we could use this (deduced) property to define e as well (“the formal approach”).

The key difference between COTP and above “approach of Cartesius” is that COTP shows how the (common) numerical development of e can be found. This method relies on the formula of the derivative, which Cartesius didn’t have (or didn’t want to adopt from Fermat).

Difference of COTP and a textbook introduction of e

In my email of March 27 2013 to Hessel Pot I explained how COTP differed from a particular Dutch textbook on the introduction of e.

  • The textbook suggests that f ‘[0] = 1 would be an intuitive criterion. This is only partly true.
  • It proceeds in reworking f ‘[0] = 1 into a more general formula. (I didn’t mention unstated assumptions in 2013.)
  • It eventually boils down to indeed positing that e^x has itself as its derivative, but this definition thus is not explicitly presented as a definition. The clarity of positing this is obscured by the path leading there. Thus, I feel that the approach in COTP is a small but actually key innovation to explicitly define e^x as being equal to its derivative.
  • It presents e only with three decimals.

There are more ways to address the intuition for the exponential number, like the growth process or the surface area under 1 / x. Yet the above approaches are more fitting for the algebraic approach. Of these, COTP has a development that is strong and appealing. The plots by Cartesius and Pot are useful and supportive but no alternatives.

The Appendix contains a deduction that was done in the course of writing this weblog entry. It seems useful to include it, but it is not key to above argument.

Appendix. Using the general formula on factor x a

The earlier weblog entry on Cartesius and Fermat used a circle and generated a “general formula” on a factor x a. This is not really factoring, since the factor only holds when the curve lies on a circle.

Using the two relations:

f[x] – f[a]  = (x a)  (2u – x – a) / (f[x] + f[a])    … (* general)

u = a + s f[a]       … (for a tangent to a circle)

we can restate the earlier theorem that s defined in this manner generates the slope that is tangent to a circle. 

f[x] – f[a]  = (x a)  (2 s f[a](x – a)) / (f[x] + f[a]) 

It will be useful to switch to x a = h:

f[a + h] – f[a]  = h (2 s f[a] – h) / (f[a + h] + f[a]) 

Thus with the definition of the derivative via the dynamic quotient we have:

df / dx = {Δf // Δx, set Δx = 0}

= {(f[a + h] – f[a]) // h, set h = 0}

= { (2 s f[a] – h) / (f[a + h] + f[a]), set h = 0}

= s

This merely shows that the dynamic quotient restates the earlier theorem on the tangency of a line and circle for a curve.

This holds for any function and thus also for the exponential function. Now we have s = e^a by definition. For e^x this gives:

ea + hea  = h (2 s eah) / (ea + h + ea)

For COTP § we get, with Δx = h:

df / dx = {Δf // Δx, set Δx = 0}

= {(ea + hea  ) // h, set h = 0}

= {(2 s eah) / (ea + h + ea) , set h = 0}

= s

This replaces Δf // Δx by the expression from the general formula, while the general formula was found by assuming a tangent circle, with s as the slope of the incline. There is the tricky aspect that we might choose any value of s as long as it satisfies u = a + s f[a]. However, we can refer to the earlier discussion in § on the actual calculation.

The basic conclusion is that this “general formula” enhances the consistency of § The deduction however is not needed, since we have §, but it is useful to see that this new elaboration doesn’t generate an inconsistency. In a way this new elaboration is distractive, since the conclusion that 1 = {(e^h – 1) // h, set h = 0} is much stronger.

A number is what satisfies the axioms of its number system. For elementary and secondary education we use the real numbers R. It suffices to take their standard form as: sign, a finite sequence of digits (not starting with zero unless there is a single zero and no other digits), a decimal point, and a finite or infinite sequence of digits. We also use the isomorphism with the number line.

Thus a limited role for group theory

Group theory creates different number systems, from natural numbers N, to integers Z, to rationals Q, to reals R, and complex plane C, and on to higher dimensions. For elementary and secondary education it is obviously useful to have the different subsets of R. But we don’t do group theory, for the notion of number is given by R.

It should be possible to agree on this (*):

  1. that N ⊂ Z ⊂ Q R,
  2. that the elements in R are called numbers,
  3. whence the elements in the subsets are called numbers too.

Timothy Gowers has an exposition, though with some group theory , and thus we would do as much group theory as Gowers needs. There is also my book Foundations of mathematics. A neoclassical approach to infinity (FMNAI) (2015) (pdf online) so that highschool students need not be overly bothered by complexities of infinity. FMNAI namely distinguishes:

  • potential infinity with the notion of a limit to infinity
  • actual infinity created by abstraction, with the notion of “bijection by abstraction”.

There arises a conceptual knot. When A is a subset of B, or A ⊂ B, then saying that x is in A implies that it is in B, but not necessarily conversely. Who focuses on A, and forgets about B, may protest against a person who discusses B. When we say that the rational numbers are “numbers” because they are in R, then group theorists might protest that the rationals are “only” numbers because (1) Q is an extension of Z by including division, and (2) then we decide that these can be called “number” too. Group theorists who reason like this are advised to consider the dictum that “after climbing one can throw the ladder away”. In the real world there are points of view. When Putin took the Crimea, then his argument was that it already belonged to Russia, while others called it an annexation. In mathematics, it may be that mathematicians are people and have their own personal views. Yet above (*) should be acceptable.

It should suffice to adopt this approach for primary and secondary education. Research mathematicians are free to do what they want at the academia, but let they not meddle in this education.

Division as a procept

The expression 1 / 2 represents both the operation of division and the resulting number. This is an example of the “procept“, the combination of process and concept.

The procept property of y / x is the cause of a lot of confusion. The issue has some complexity of itself and we need even more words to resolve the confusion. Wikipedia (a portal and no source) has separate entries for “division“, “quotient“, “fraction“, “ratio“, “proportionality“.

In my book Conquest of the Plane (COTP) (2011), p47-58, I gave a consistent nomenclature (pdf online):

“Ratio is the input of division. Number is the result of division, if it succeeds.” (COTP p51)

This is not a definition of number but a distinction between input and output of division. My suggestion is to use the word (static) quotient also for the form with numerator y divided by denominator x.

(static) quotient[y, x] = y / x

This fits the use in calculus of “difference and differential quotients”. The form doesn’t have to use a bar. Also a computer statement Div[numerator y, denominator x] would be a quotient.

This suggestion differs a bit from another usage in which the quotient would be the outcome of the division process, potentially with a remainder. We saw this usage for the polynomials. This convention is not universal, see the use of “difference quotient”. However, if there would be confusion between outcome and form, then use “static quotient” for the form. This is in opposition to the dynamic quotient that is relevant for the derivative, as Conquest of the Plane shows.

Proportionality and number

Check also the notion of proportionality in COTP, page 77-78 with the notion of proportion space: {denominator x, numerator y}. Division as a process is a multidimensional notion. The wikipedia article (of today) on proportionality fits this exposition, remarkably with also a diagram of proportion space, with the denominator (cause) on the horizontal axis and the numerator (effect) on the vertical axis (instead of reversed), as it should be because of the difference quotient in calculus. In Conquest of the Plane there is also a vertical line at x = 1, where the numerators give our numbers (a.k.a. slope or tangent).

Conquest of the Plane, p78

Conquest of the Plane, p78

Avoiding the word “fraction”

My nomenclature uses the quotient and the distinction in subsets of numbers, and I tend to avoid the word fraction because of apparent confusions that people have. When someone gives a potential confusing definition of fractions, my criticism doesn’t consist of providing a proper definition for fractions, but I point out the confusion, and then refer to the above.

Below, I will also refer to the suggestion by Pierre van Hiele (1973) to abolish fractions (i.e. what people call these), and I will mention a neat trick that provides a much better alternative.

Number means also satisfying a standard form

Number means also satisfying a standard form. Thus “number” is not something mysterious but is a form, like the other forms, yet standardised.

For example, we have 2 / 4 = 1 / 2, yet 1 / 2 has the standard form of the rationals so that 2 / 4 needs to be simplified by eliminating common prime factors. The algebra of 2 / (2 2) = 1 / 2 can be seen as “rewriting the form”.

What the standard is, depends upon the context. We can do sums on natural numbers, integers, rationals, reals. In education students have to learn how to rewrite particular forms into a particular standard. Student need to know the standard forms, not the group theory about the subset of numbers they are working in.

The equality sign in a is ambiguous. Computer algebra tends to avoid ambiguity. For example in Mathematica: Set (=) vs Equal (==) vs (identically) SameQ (===). Doing computer algebra would help students to become more precise, compared to current textbooks. Learning is going from vague to precise.

The equality sign in highschool tends to mean “of equal value”, which is above “==”. But two expressions can only be of equal value when they represent the identically same value. Thus x == a would amount to Num[x] === Num[a]. The standard mathematical phrase is “equivalence class” for a number in whichever format, e.g. with the numerical value at the vertical position at line at x = 1 (also for the denominator 1).

The standard form takes one element of an “equivalence class” (depending upon the context of what numbers are on the table, e.g. 1 / 2 for the rationals and 0.5 for the reals). (See COTP p45-48 for issues of “approximation”.)

Multiplication is no procept

Multiplication is no procept. For multiplication there is a clear distinction between the operation 2 * 3 and the resulting number 6. When your teacher asks you to calculate 2 * 3 then the answer of 2 * 3 is correct but likely not accepted. The smart-aleck answer 2 * 3 = 3 * 2 is also correct, but then the context better be group theory.

It is a pity that group theory adopted the name “group theory”. My proposal for elementary school is to replace the complicated word “multiplication” by “group, grouping”. With 12 identical elements, you can make 4 groups of 3. (With identical elements this isn’t combinatorics.) See A child wants nice and no mean numbers (CWNN) (2015). If this use of “group, grouping” is confusing for group theory, then they better change to something like “generalised arithmetic”.

The hijack of number by group theory

The world originally had the notion of number, like counting fingers or measuring distance, but then group theory hijacked the word, and assigned it with a generalised meaning, whence communication has become complicated. Their use of language might cause the need for the term numerical value. I would like to say that 2 is identically the same number in N, Z, Q and R, but group theorists tend to pedantically assert that the notion of number is relative to the set of axioms. In the Middle Ages, people didn’t know negative numbers, and they couldn’t even think about -2. Only by defining -2 as a number too, it could be included as a number. This sounds like Baron von Muenchhausen lifting himself from the swamp. The answer to this is rather that -2 is still a number even though it wasn’t recognised as this. I would like to insist that we use the term “number” for the numerical value in R, so that we can use the word “number” in elementary school in this safe sense. Group theorists then must invent a word of their own, e.g. “generalised number” or “gnumber”, for their systems.

Changing the meaning of words is like that your car is stolen, given another colour, and parked in front of your house as if it isn’t your car. Group theorists tend to focus on group theory. They tend not to look at didactics and teaching. When group theorists hear teachers speaking about numbers, and how 2 is the same number in N and R, then group theorists might smile arrogantly, for they “know better” that N and R are different number systems. This would be misplaced behaviour, for it are the group theorists themselves who hijacked the notion of number and changed its meaning. When research mathematicians have the idea that teachers of mathematics have no training about group theory, then they better read Richard Skemp (1971, 1975), The psychology of learning mathematics, first. This was written with an eye on teaching mathematics (and training teachers) and contains an extensive discussion of group theory. (Though I don’t need to agree with all that Skemp writes.)

Quote on human folly

Peter van ‘t Riet edited Vredenduin (1991) “De geschiedenis van positief en negatief“, Wolters-Noordhoff, on the history of positive and negative numbers. Van ‘t Riet allows himself a concluding observation:

“Kijken wij er achteraf op terug, dan kan een gevoel van verwondering opkomen, dat begrippen die ons zo vanzelfsprekend en helder lijken, zo’n lange ontwikkelingsgeschiedenis hebben gehad waarin vooruitgang, terugval en nieuwe vooruitgang elkaar afwisselden. Opmerkelijk is dat begrippen zich soms pas echt ontwikkelen als zij bevrijd worden van een dominerende idee die eeuwenlang hun ontwikkeling in de weg stond. Dat is bij de negatieve getallen het geval geweest met de geometrisering van de algebra: de gedachte dat getallen representanten waren van meetkundige grootheden is eeuwen achtereen een obstakel geweest teneinde tot een helder begrip van negatieve getallen te komen. Achteraf vraag men zich af: hoe was het mogelijk dat eeuwenlang deze idee de algebra bleef domineren?” (p121)

Since we sometimes check Google Translate for the fun ways of its expressions, it is nice to let the machine speak again:

If we look afterwards back, then bring up a sense of wonder that concepts which seem to us so obvious and clear, have had such a long history in which progress, relapse and further progress alternating. Remarkably concepts sometimes only really develop as they freed from a dominant idea that for centuries had their development path that is in the negative numbers was the case with the geometrization of algebra:. the idea that numbers representatives were of geometric quantities is centuries successively been an obstacle in order to achieve a clear understanding of negative numbers retrospect one question himself:. how was it possible that for centuries the idea continued to dominate the algebra?” (Google Translate)

Just to be sure: analytic geometry has the number line with negative numbers too. Van ‘t Riet means the line section, that always has a nonnegative length.

A step to answering his question is that mathematicians focus on abstraction, whence they are more guided by their own concepts rather than by empirical applications or the observations in didactics. I included this quote in the hope that group theorists reading this will again grow aware of human folly, and realise that they should support empirical didactics and not block it.

More sources for confusion on formats

More noise is generated by the different “number formats” that have been developed over the course of history. We have forms 2 + ½ = 2½ = 5 / 2 = 25 / 10 = 2.5 = 2 + 2-1 (neglecting the Egyptians and such). We should not forget that the decimals are actually also a form or result of division. Another example is 0.365 = 3 / 10 + 6 / 100 + 5 / 1000. Only the infinite decimals present a problem, since then we need an infinite series of divisions, yet this can be solved. The various formats have their uses, and thus education must teach students what these are.

An approach might be to only use numbers in decimal notation. However, the expression 1 / 3 is often easier than 0.33333…. Students must learn algebra. Compare 1 / 2 + 1 / 3 with 1 / a + 1 / b.

“But to understand algebra without ever really understood arithmetic is an impossibility, for much of the algebra we learn at school is a generalized arithmetic. Since many pupils learn to do the manipulations of arithmetic with a very imperfect understanding of the underlying principles, it is small wonder that mathematics remain a closed book to them.” (Skemp, p35)

The KNAW 2009 study on arithmetic education and its evidence and research is invalid. It forgot that pupils in elementary school have to learn particular algorithms in arithmetic in preparation for algebra in secondary education. It scored answers to sums as true / false and didn’t assign points to the intermediate steps, so that pupils who used trial and error also had the option to score well. In a 2011 thesis on the psychometrics of arithmetic, the word “algebra” isn’t mentioned, and various of its research results are invalid. There is a rather big Dutch drama on failure of education on arithmetic, failure of supervision, and breaches of integrity of science.

Irrational numbers started as a ratio. Consider a triangle with perpendicular sides 1 and then consider the ratio of the hypothenuse to one of those sides. The input √2 : 1 reduces to number √2.

Standard form for the rationals

There are students who do 2 + ½ = 2½ = 2 ½ = 1, because in handwriting there might appear to be a space that indicates multiplication, compare 2a or 2√2 or 2 km where such a space can be inserted without problem. See the earlier weblog text how Jan van de Craats tortures students. A proposal of mine since 2008 is to use 2 + ½ and stop using 2½.

Yesterday I discovered Poisard & Barton (2007) who compare the teaching of fractions in France and New Zealand, and who also advise 2 + ½. The German wikipedia has also a comment on the confusing notation of 2½. I haven’t looked at the thesis by Rollnik yet.

For a standard form for the rationals, the rules are targeted at facilitating the location on the number line, while we distinguish the operation minus from the sign of a negative number (as -2 = negative 2).

  1. If a rational number is equal to an integer, it is written as this integer, and otherwise:
  2. The rational number is written as an integer plus or minus a quotient of natural numbers.
  3. The integer part is not written when it is 0, unless the quotient part is 0 too (and then the whole is the integer 0).
  4. The quotient part has a denominator that isn’t 0 or 1.
  5. The quotient part is not written when the numerator is 0 (and then the whole is an integer).
  6. The quotient part consists of a quotient (form) with an (absolute) value smaller than 1.
  7. The quotient part is simplified by elimination of common primes.
  8. When the integer part is 0 then plus is not written and minus is transformed into the negative sign written before the quotient part.
  9. When the integer part is nonzero then there is plus or minus for the quotient part in the same direction as the sign of the integer part (reasoning in the same direction).

Thus (- 2 – ½) = (-3 + ½) but only the first is the standard form.

PM 1. Mathematica has the standard form 5 / 2. Conquest of the Plane p54 provides the routine RationalHold[expr] that puts all Rational[x, y] in expr into HoldForm[IntegerPart[expr] + FractionalPart[expr]].

PM 2. Digits are combined into numbers, so that we don’t have 28 = 2 * 8 = 16 = 6. Nice is:

“For example, 7 (4 + a) is equal to 28 + 7a and no 74 + 7a.” (Skemp, p230)

H = -1

A new suggestion is to use = -1. Then we get 2 + ½ = 2 + 2H= 5 2H. Pierre van Hiele (1973) suggested to abolish fractions as we know them. He observed that y / x is a tedious notation, and students have to learn powers anyhow. I agree that the notation y / x generates so-called “mathematics” which is no real mathematics but only is forced by the notation. Using the power of -1 can be confusing because students might think of subtraction, but the use of (abstract) H for the inverse clinches it. See here and my sheets for a workshop of NVvW November 2016.

Above quotient form then becomes (y xH) and the dynamic quotient (y xD), in which the brackets may be required in the dynamic case to indicate the scope of the simplification process.

There are students who struggle with a – (-b) = a – (-1) b, perhaps because subtraction actually is a form of multiplication. Curiously, this is another issue of inversion that is made easier by using H, with a – (-b) = a H b = a + H H b = a + b. See the last weblog entry that division is repeated subtraction. The only requirement is that each number has also an inverse, zero excluded, so that these inverses can be subtracted too. For example 4 3H = (3 + 1) 3H = 1 + 3H translates as repeated subtraction (not for the classroom but for reasons of current exposition):

4 – (1 + 3H) – (1 + 3H) – (1 + 3H) = 4 – 3 (1 + 3H) = 4 – 3 – 3 3H = 4 – 3 – 1 = 0

Group theory is for numbers. It is not for education on number formats

The last weblog entry on group theory showed that group theory concentrates on numbers, whence it (cowardly) avoids the perils of education on the various number formats.

Group theory mathematicians will tend to say that 1 / 2 = 2 / 4 = 50 / 100 = .. .are all member of the same “equivalence class” of the number 1 / 2, whence their formats are no longer interesting and can be neglected.

In itself it is a laudable achievement that mathematics has developed a framework that starts with the natural numbers, extends with negative integers, develops the rationals, and finally creates the reals (and then more dimensions). This construction comes along with algorithms, so that we know what works and what doesn’t work for what kind of number. For example, there are useful prime numbers, that help for simplifying rationals. For example 3 * (1 / 3) = 1 whence 3 * 0.3333… = 0.9999… = 1.000… = 1. (Thus the decimal representation is not quite unique, and this is another reason to keep on using rational formats (when possible).)

When these group theory research mathematicians design a training course for aspiring teachers of mathematics, they tend to put most emphasis on group theory, and forget about the various number formats. This has the consequences:

  • Teachers from their training become deficient in knowledge about number formats (e.g. Timothy Gowers’s article), even though those are more relevant to teachers because these are relevant for their students.
  • There is also conditioning for a future lack of knowledge. The aspiring teachers are trained on abstraction and they will tend to grow blind on the problems that students have when dealing with the various formats.
  • All this supports the delusion:

“We should teach group theory so that the students will have less problems with the algebra w.r.t. the various number formats. (For, they can neglect much algebra, like we do, since most forms are all in the same equivalence classes.)” (No quote)

Bas Edixhoven chairs the delusion

Bas Edixhoven (Leiden) is chair of the executive board of Mastermath, a joint Dutch universities effort for the academic education of mathematicians. They also do remedial teaching for students who want to enroll into the regular training for teacher of mathematics but who have deficiencies in terms of mathematics. Think about a biologist who wants to become a teacher of mathematics. For those students the background in empirical science is important, because didactics is an empirical science too. Such students are an asset to education, and they should not be scared away by treating them as if they want to become research mathematicians. Obviously there are high standards of mathematical competence, but this standard is not the same as for doing research in mathematics.

  • The “Foundations” syllabus for remedial teaching 2015 written by Edixhoven indeed looks at group theory with the neglect of number formats. The term “fraction” (Dutch “breuk”) is used without definition, while there is also the expression “fraction form” (Dutch “breukvorm”). I get the impression that Edixhoven uses fraction and fraction format as identical. Perhaps he means the procept ? The fractions are not the rationals since apparently π / 2 has a fractional form too.
  • At a KNAW conference in 2014 on the education of arithmetic Edixhoven presented standard group theory, presumably thinking that his audience had never heard about it and hadn’t already decided that its role for non-university education is limited. Edixhoven insulted his audience (including me) by not first studying what didacticians like Skemp had already said before about group theory in education.

I find it quite bizarre that mathematics courses at university for training aspiring teachers would neglect the number formats and treat these (remedial) student-teachers as if they want to become research mathematicians. Obviously I cannot really judge this, since I am no research mathematician and thus don’t know what it takes to become one. I only know that I have a serious dislike of it. Yet, the group theory taught is out of focus for what would be helpful for teaching mathematics.

PM 1. The Edixhoven 2014 approach at KNAW fits Van Hiele (1973) who also suggests to have a bit of group theory in highschool. Yet, there is the drawback of confusion about the power -1 that students might read as subtraction. I would agree on this idea of having some group theory, but with the use of H = -1 and not without it. Let us first introduce the universal constant H = -1, thus also in elementary school where pupils should learn about division, and then proceed with some group theory in junior highschool.

PM 2. Edixhoven wrote this “Foundations” syllabus together with Theo van den Bogaard who wrote his thesis with Edixhoven. Van den Bogaard has only a few years of experience as teacher of mathematics. Van den Bogaard was secretary of a commission cTWO that redesigned mathematics education in Holland, with a curious idea about “mathematical think activities” (MTA). Van den Bogaard has an official position as trainer of teachers of mathematics but failed to see the error by the psychometricians in the KNAW 2009 study on education on arithmetic. I informed him about my comments on cTWO, MTA and KNAW 2009 but he didn’t respond. Now there is the additional issue of this curious “Foundations” syllabus. Four counts down on didactics and still training aspiring teachers.

Letter to Mastermath

These and other considerations caused me to write this letter to Mastermath.

The following indicates that research mathematicians can have their own subgroups or individuals who meddle with education. None is qualified for education, and one wonders whether they can keep each other in check.

Research mathematicians are at a distance from didactics

Research mathematicians may develop a passion for education and interfere in education, and then start to invent their own interpretations, and then teach those to elementary schools and their aspiring teachers. These mathematicians are not qualified for primary education and apparently think that elementary school allows loose standards (since they can observe errors indeed). Then we get the blind (research mathematicians) helping the deaf (elementary school teachers), but the blind can also be arrogant, and lead the two of them into the abyss.

A September 2015 protest concerned Jan van de Craats, now emeritus at UvA. For the topic of division, his name pops up again. In this lecture on fractions for a workshop of 2010 for primary education, Van de Craats for example argues as follows (my translation). It may seem unfair to criticise this since these are only sheets. Yet, even sheets should have a consistent set of definitions behind them. These sheets contribute to confusion. Remember that I didn’t give a definition of “fraction”, and that I propose an abolition of what many people apparently call “fraction”.

  • Sheet 3: “Three sorts of numbers: integers, decimals, fractions”.
    (a) The main problem is the word “sort”. If he merely means “form” (with the decimals as the standard form that gives “the” number) then this is okay, but if he means that there are really differences (as in group theory) then this is problematic. A professor of mathematics should try to be accurate, and I don’t see why Van de Craats regards “sorts of” as accurate.
    (b) If he identifies fractions with the rationals (but see sheet 26) then we might agree that Z ⊂ Q ⊂ R, though there are group theorists who argue that these are different number systems, and it is not clear whether Van de Craats would ask the group theorists not to meddle in education as he himself is doing.
    (c) My answer: for education it seems best to stick to “various forms, one number (for standard form)”.
  • Sheet 30: “A fraction is the outcome of a division.”
    (a) As a fraction is a number (Sheet 3), presumably 8 : 4 → 4 / 2 might be acceptable: (i) It is an outcome, (ii) the answer is numerically correct (as it belongs to the equivalence class), (iii) there is no requirement on a standard form (here).
    (b) This doesn’t imply the converse, that the outcome of a division is always a fraction. Then it is either an integer (but then also a fraction (Sheet 25)) or a decimal (but then also a fraction (Sheet 26)). Thus fraction iff outcome of division.
    (c) PM. My definition was: “Ratio is the input of division. Number is the result of division, if it succeeds.” (COTP p51), which doesn’t define number but distinguishes input and output.
  • Sheet 8: “Cito doesn’t test (mixed) fractions anymore in the primary school final examination.” As an observation this might be correct, but if Van de Craats had had proper background in didactics, then he should have been able to spot the error by the psychometricians in the KNAW 2009 report, which should have been sufficient to effect change, instead of setting up this “course in fractions” (that he isn’t qualified for).
  • Sheet 18: Pizza model. Didactics shows that students find this difficult. Use a rectangle.
  • Sheet 25: “Integers are also fractions (with denominator 1).” On form, students must know the difference between integers and fractions (whatever those might be, see Sheet 30). The answer of (3 – 1) / (2 – 1) = ? better be 2 and not 2 / 1 because the latter can be simplified.
  • Sheet 26: “Decimals are also fractions.” Thus fractions are not the rational numbers. The example is that √2 is irrational, also in decimal expansion (a “fraction”). Van de Craats apparently holds fractions and the decimals as identical, only written in different form. Thus also an infinite sum of fractions still is a fraction. A fraction is not just the form of the quotient as defined in Conquest of the Plane and above (though perhaps it can be written like this ?).
  • Sheet 27: “However, not all fractions are also decimals.” This is a mystery. There are only three “sorts of” numbers, and w.r.t. Sheet 30 we found that fraction iff division, and all numbers should be divisible by 1. Also, the real numbers contain all numbers we have seen till now (not the complex numbers). Thus there would be phenomena called “fractions” (but still numbers, not algebra) not in the reals ? It cannot be 0 / 0 since the latter would be a result that cannot be accepted. Division 0 : 0 might be a proper question with the answer that the result is undefined. Perhaps he means to say that “1 / 2” doesn’t have the form of “0.5”, and that the expressions differ ? But then we are speaking about form again, and Van de Craats spoke about “sorts of numbers” and not about “same numbers with different forms”.
  • Sheet 28: “This course doesn’t offer a one-to-one model for discussion at school.” It sounds modest but I don’t know what this means. Perhaps he means that the sheets aren’t a textbook.
  • Sheet 30: “A fraction is the outcome of a division.”  (I moved this up.)
  • Sheet 33: “4 : 7 = 4 / 7”. Apparently the ” : ” stands for the operation of division and “4 / 7” for the result. Apparently Van de Craats wants to get rid of the procept. The equality sign cannot mean identically the same, because otherwise there would be no difference between input and output. Is only 4 / 7 the right answer or is 8 / 14 allowed too ? Perhaps one can teach students that 4 : 7 is a proper question and that 8 / 14 is unacceptable since this must be 4 / 7. However, 4 : 1 would be a proper question too, and then Van de Craats also argues that 4 / 1 would be a fraction (and result of division).
  • Sheet 65: “Actually 2 4/5 means 2 + 4/5.” (Van de Craats read an article of mine.) It would have been better if he had stated that the first is a horrible convention, and then proceeded with the second. He calls the form a “mixed fraction” while the English has “mixed number”. Lawyers might have to decide whether “fractions are numbers” implies that a “mixed fraction” is also a “mixed number”.

If a professor of mathematics becomes confused on such an “elementary (school)” issue of fractions (I still don’t know what is meant by this), why would the student believe that anyone can master this apparently superhumanly difficult subject ?

Will the ivory tower stop the blind ?

Would research mathematicians who do group theory be able to correct Van de Craats ?

Let us consider Bas Edixhoven again, see again his sheets.

Or would Edixhoven argue that he himself looks at natural numbers, integers, rationals and reals, so that he has no view on “fractions”, as apparently defined by Van de Craats ? Though the “Foundations” syllabus refers to the word without definition and Edixhoven might presume that aspiring teachers of mathematics know what those fractions are.

Edixhoven in the 2014 lecture only suggests that there better be more proofs and axiomatics in the highschool programme, and he gives the example of a bit of group theory for arithmetic. He also explains  modestly that he speaks “from his own ivory tower” (quote). Thus we can only infer that Edixhoven will remain in this ivory tower and will not stop the blind (but also arrogant) Van de Craats from leading (or at least trying to lead) the deaf (elementary school teachers) into the abyss.

However, professor Edixhoven also left the ivory tower and joined the real world. At Mastermath he is involved in training aspiring teachers. Since February 2015 he has been a member of the Scientific Advisory Board of the mathematics department of the University of Amsterdam, where professor Van de Craats still has his homepage with this confusing “course on fractions”. I informed this board in Autumn 2015 about the problematic situation that Van de Craats propounds on primary and secondary education but is not qualified for this. I have seen no correction yet. Apparently Edixhoven doesn’t care or is too busy scaring aspiring teachers away. Apparently, when a teacher of mathematics criticises him, then this teacher obviously must be deficient in mathematics, and should follow a course for due indoctrination in the neglect of didactics of mathematics.

Jan van de Craats, Workshop 2010, page 28


When we take a ring and include division then we get a field. For example, the integers Z = { … -3, -2, -1, 0, 1, 2, 3, … } form a ring, and with division we get the rational numbers Q and also (with completion) the real numbers R. These are concepts from “group theory”. I have always wondered what the use of this group theory actually is.

The change from ring Z to field R is not quite the inclusion of division – since the ring already has implied division, namely as repeated subtraction – but the change consists of extending the set of “accepted numbers” with inverse elements x^H for H = -1. In that case the results of division are also included in the same set. In terms of Z the expression 2^H is not a number, but for Q and R we accept this.

If the ring has variables and expressions, then we can form the expression 1 = 2 z, and we effectively have z = 2^H, and then we might wonder whether it actually matters much whether this z belongs to Z or not.

Part of the confusion in this discussion is caused by the fact that we might regard 2^H as the operation 1 / 2, while we might also regard it as the resulting number. Thus when some people say that the difference between the ring and the field concerns the operation of division, another perspective is that the ring already has an implied notion of division but merely lacks the numbers to fit all answers.

The discussion within group theory might be a victim of the phenomenon of the procept. When the discussion is confused, perhaps group theory itself is confused. We should get enhanced clarity by removing the ambiguity of operation and result, but perhaps textbooks then become thicker.

Subsequently, we get a distinction between:

  • Mathematics for which group theory isn’t so relevant – such that there is a logical sequence from natural numbers to integers, to rationals, to reals, to multidimensional reals, for, all is implied by logic and algebra, and only the end result matters,
  • Mathematics for models for which group theory is relevant – i.e. for models for which it is crucial that e.g. Z has no z such that 1 = 2 z. The crux lies in the elements of the sets, as the operations themselves are actually implied.

A model might be the number of people. Take an empty building. A biologist, physicist and mathematician watch the events. Two people enter the building, and some time later three people leave the building. The biologist says: “They have reproduced.” The physicist says: “There was a quantum fluctuation.” The mathematician says: “There is -1 person in the building.”

The following develops the example of implied division. This discussion has been inspired by both the recent discussion of the “ring of polynomials” (thus without division but still with divisor and remainder) and the observation that “realistic mathematics education” (RME) allows students to avoid long division and allows “partial quotients” (repeated subtraction).

An example from Z, the integers

Z rewrites repeated addition 3 + 3 + 3 + 3 = 12 as multiplication 4 * 3 = 12.

Z allows the converse 12 – 3 – 3 – 3 – 3 = 0 and also the expression 12 – 4 * 3 = 0.

Z doesn’t allow the rewrite of the latter into 12 / 4 = 3.

Yet 12 – 4 * 3 = 0 gives the notion of “implied division”, namely, find the z such that 12 – 4 * z = 0.

This notion of “implied division” is well defined, but the only problem is that we cannot find a number in Z that satisfies 1 – 2z = 0.

If we extend Z with basic elements n^H for n ≠ 0 then we can find a z that satisfies 1 = 2z, but the extension generates a new set of elements that we call Q, the rational numbers. Since we cannot list all these numbers, it is not irrational of mathematicians to say that they actually include the operation itself.
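This distinction can be made concrete with a small Python sketch (the helper `implied_division` and its finite search window are my own illustration, not part of the original argument): implied division is a well-defined question, but Z may simply lack an element that answers it.

```python
def implied_division(y, x, window=range(-100, 101)):
    """Search a finite window of Z for a z with y - x*z == 0.

    Returns z if found, else None: the question is well defined,
    but Z need not contain an element that satisfies it.
    """
    for z in window:
        if y - x * z == 0:
            return z
    return None

print(implied_division(12, 4))  # 3: Z contains the answer
print(implied_division(1, 2))   # None: no z in Z satisfies 1 = 2z
```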

The following discusses this with formulas.

A ring has implied division

Multiplication is repeated addition. The ring of integers has the notion of subtraction. Define “implied division” of y by x as the repeated subtraction from y of some quantity z, x times, with remainder 0. For x ≠ 0:

y – x z = 0                   (* definition)

To refer to this property, we use abstract symbol H, though we later use H = -1.

x^H y =  z    ⇔    y = x z          (** notation)

For x itself:

x^H x = x x^H = 1

For zero

We have 0 z = 0 for all z in the ring. Then for implied division by zero we have:

y – 0 z = 0    ⇒   y = 0

As above, for y = 0:

0 – 0 z = 0   for any z

0^H 0 = z    for any z

Thus the rule is: For implied division within the ring, the denominator cannot be 0, unless the numerator is 0 too, in which case any number would satisfy the equation.

This is not necessarily “infinity” or “undefined” but rather “any z in Z“. The solution set is equal to Z itself. There is a difference between functions (only one answer) and correspondences (more answers).
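For illustration only, this correspondence view can be sketched in Python (the helper `solution_set` and its finite window are my own device): implied division by zero yields a solution set rather than a single number.

```python
def solution_set(y, x, window=range(-5, 6)):
    """All z in a finite window of Z that satisfy y - x*z == 0."""
    return [z for z in window if y - x * z == 0]

print(solution_set(12, 4))   # [3]: a unique answer, as for a function
print(solution_set(1, 0))    # []: nonzero numerator, division by 0 fails
print(solution_set(0, 0))    # the whole window: any z satisfies 0 - 0*z = 0
```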

Compare to the common definition

A ring is commonly turned into a field by including the normal definition of division:

  x ≠ 0     ⇒     x^H x = x x^H = 1

With this definition we get (multiplying left or right):

x^H y = z    ⇔    x x^H y = x z    ⇔    y = x z

The curious observation is that a definition of division seems superfluous, since we already have implied division. The operation (*) already exists within the ring. We included a special notation for it, but this should not distract from this basic observation. If you have a left foot then it doesn’t matter whether you call it George or Harry.

An aspect is the algorithm

The natural numbers can be factored into prime numbers. When we solve 6 / 3 = 2, then we mean that 6 can be factored as 2 times 3, and that we can eliminate the common factor.

6 / 3 = z    ⇔    6 = 3 z    ⇔   2 * 3 = 3 z    ⇔   3 (2 – z) = 0     ⇔

3 = 0   or    (2 – z) = 0

But, again, this algorithm doesn’t work for a case like 1 = 2 z.

The “problem” are the elements

Let us consider the implied division of 1 by 2. This generates:

2^H 1 = z

2^H = z

1 = 2 z

Thus we don’t actually need to know what this z is, since we have the relevant expressions to deal with it.

The point is: when we run through all elements in Z = { … -3, -2, -1, 0, 1, 2, 3, … }, then we can prove that none of these satisfies 1 = 2 z.

Thus the core of group theory lies in the elements of the sets, and less in the operations, since these are implied.

The basic notion is that 0 has successor 1 = s[0], and so on, and this gives us N. That 0 is the predecessor of s[0] generates the idea of inversion, so that s[H] = 0. This gives us Z. Addition leads to subtraction, to multiplication, to division. The core of addition doesn’t change, only the “numbers” do.

Thus, group theory might have a confusing language that focuses on the operations, while the actual discussion is about the numbers (since the operations are already available and implied).

The fundamental impact of algebra

Thus, once we accept algebra, then the real numbers can be developed logically, and it is a bit silly to speak about “group theory”, since there are only steps, and all is implied. It only makes sense for applications to models, such as the notion that there aren’t half people and such.

It remains relevant that some algorithms may only apply to some domains and not others. Factoring natural numbers into prime numbers still works for the natural numbers embedded in the reals, yet, it is not clear whether such a notion of factoring would be relevant for other real numbers.

Appendix. Potential extension with an inverse for zero ?

We might consider including the element 0^H in the ring, to create 〈ring, 0^H〉.

(1) If we maintain that 0 z = 0 for all z in 〈ring, 0^H〉 then:

0^H 0 = 0   with 0^H in 〈ring, 0^H〉

Observe that this is not a deduction, but a definition that 0 z = 0 for all z.

One viewpoint is that there is a conflict between “any z” and “only z = 0”, so that we cannot adopt this definition. Another viewpoint is that the latter uses the freedom of the former.

(2) When we write 0^H as ∞ then it might be clearer that 0^H 0 remains a problematic form.

If we create the 〈ring, 0^H〉, then we might also hold: 0 z = 0 for all numbers except 0^H. In that case, the result is maintained that

0^H 0 = z    for any z

(3) An option is to slightly revise the definition as repeated subtraction by z until the remainder equals that very quantity z again. Thus:

y – (x – 1) z = z                   (*** definition 2)

x^H y = y – (x – 1) z = z                  (**** definition and notation 2)

For x = 0 we would now use z – z = 0, which might be less controversial.

0^H y = y – (0 – 1) z = z

y = z – z = 0

0^H y = 0^H 0 = z

However, the more common approach is that 0^H is ∞, and that ∞ 0 is undefined too, while we cannot exclude that the answer would be z = ∞.

PM. Partial quotients

PM. See also the earlier discussion on this weblog.

I wouldn't want to be caught before a blackboard like that (Screenshot UChicago)


Our protagonists are Cartesius (1596-1650) and Fermat (1607-1665). As Judith Grabiner states, in a recommendable text:

“One could claim that, just as the history of Western philosophy has been viewed as a series of footnotes to Plato, so the past 350 years of mathematics can be viewed as a series of footnotes to Descartes’ Geometry.”  (Grabiner) (But remember Michel Onfray‘s observation that followers of Plato have been destroying texts by opponents. (Dutch readers check here.))

Both Cartesius and Fermat were involved in the early development of calculus. Both worked on the algebraic approach without limits. Cartesius developed the method of normals and Fermat the method of adequality.

Fermat and Δf / Δx

Fermat’s method was algebraic itself, but it was later developed into the method of limits anyhow. When asked what the slope of a ray y = s x is at the point x = 0, the answer y / x = s runs into problems, since we cannot use 0 / 0. The conventional answer is to use limits. This problem is more striking when one considers the special ray that is defined everywhere except at the origin itself. The crux of the problem lies in the notion of slope Δf / Δx, which obviously has a problematic division. With set theory we can now define the “dynamic quotient”, so that we can use Δf // Δx = s even when Δx = 0, so that Fermat’s problem is resolved and his algebraic approach can be maintained. This originated in 2007, see Conquest of the Plane (2011).
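A sketch of how the dynamic quotient might be mimicked with the sympy computer algebra system (the choice of f = x² and the use of `cancel` are my own illustration): first simplify Δf / Δx under the assumption Δx ≠ 0, and only then substitute Δx = 0.

```python
from sympy import symbols, cancel

x, dx = symbols('x dx')

f = x**2                          # example function
delta_f = (x + dx)**2 - x**2      # Δf for a step Δx = dx

# Static division Δf / Δx is undefined at dx = 0. The dynamic quotient
# first simplifies on the assumption dx != 0, and only then sets dx = 0.
quotient = cancel(delta_f / dx)   # simplifies to 2*x + dx
slope = quotient.subs(dx, 0)      # 2*x, the slope of x**2
print(slope)
```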

Cartesius and Euclid’s notion of tangency

Cartesius followed Euclid’s notion of tangency. Scholars tend to assign this notion to Cartesius as well, since he embedded the approach within his new idea of analytic geometry.

I thank Roy Smith for this eye-opening question:

“Who first defined a tangent to a circle as a line meeting it only once? From googling, it seems commonly believed that Euclid did this, but it seems nowhere in Euclid does he even state this property of a tangent line explicitly. Rather Euclid gives 4 other equivalent properties, that the line does not cross the circle, that it is perpendicular to the radius, that is a limit of secant lines, and that it makes an angle of zero with the circle, the first of which is his definition, the others being in Proposition III.16. I am wondering where the “meets only once” definition got started. I presume once it got going, and people stopped reading Euclid, (which seems to have occurred over 100 years ago), the currently popular definition took over. Perhaps I should consult Legendre or Hadamard? Thank you for any leads.” (Roy Smith, at StackExchange)

In this notion of tangency there is no problematic division, whence there is no urgency to use limits.

The reasoning is:

  • (Circle & Line) A line is tangent to a circle when there is only one common point (or the two intersecting points overlap).
  • (Circle & Curve) A smooth curve is tangent to a circle when the  two intersecting points overlap (but the curve might cross the circle at that point so that the notion of “two points” is even more abstract).
  • (Curve & Line) A curve is tangent to a line when the above two properties hold (but the line might cross the curve, whence we better speak about incline rather than tangent).
Example of line and circle

Consider the line y = f[x] = c + s x and the point {a, f[a]}. The line can also be written with c = f[a] – s a:

y – f[a] = s (x – a)

The normal has slope –s^H, where we use H = -1. The formula for the normal is the line y – f[a] = –s^H (x – a). We can choose the center of the circle anywhere on this line. A handy choice is {u, 0}, so that we choose the center on the horizontal axis. (If we looked at a ray and point {0, 0}, then the issue would be similar for {0, c} for nonzero c, and thus the approach remains general.) Substituting the point into the normal gives:

0 – f[a] = –s^H (u – a)

s = (u – a) / f[a]

u = a + s f[a]

The circle has the formula (x – u)² + y² = r². Substituting {a, f[a]} generates the value for the radius r² = (a – (a + s f[a]))² + f[a]² = (1 + s²) f[a]². The following diagram has {c, s, a} = {0, 2, 3} and thus u = 15 and r = 6√5.
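The numbers in this diagram can be checked with a few lines of Python (a sketch; the variable names are my own):

```python
from math import isclose, sqrt

c, s, a = 0, 2, 3               # the values used for the diagram
f_a = c + s * a                 # f[a] = 6

u = a + s * f_a                 # center on the horizontal axis: u = a + s f[a]
r = sqrt(1 + s**2) * abs(f_a)   # radius from r^2 = (1 + s^2) f[a]^2

print(u, r)                     # 15 and 6*sqrt(5) ≈ 13.416
```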


Method of normals

For the method of normals and arbitrary function f[x], Cartesius’s trick is to substitute y = f[x] into the formula for the circle, and then solve for the unknown center of the circle.

(x – u)² + (y – 0)² = r²

(x – u)² + f[x]² – r² = 0         … (* circle)

This expression is only true for x = a, but we treat it as if it were more general. The key property is:

Since {a, f[a]} satisfies the circle, this equation has a solution for x = a with a double root.

Thus there might be some g such that the root can be isolated:

(x – a)² g[x, u] = 0         … (* roots)

Thus, if we succeed in rewriting the formula for the circle into the form of the formula with the two roots, then we can use information about the structure of the latter to say something about u.

The method works for polynomials, that obviously have roots, but not necessarily for trigonometry and the exponential function.


The algorithm thus is: (1) Substitute f[x] in the formula for the circle. (2) Compare with the expression with the double root. (3) Derive u. (4) Then the line through {a, f[a]} and {u, 0} will give slope –s^H. Thus s = (u – a) / f[a] gives the slope of the incline (tangent) of the curve. (5) If f[a] = 0, add a constant or choose center {u, v}.
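The algorithm can be sketched with sympy (my own illustration; instead of matching coefficients by hand, the double root at x = a is imposed by requiring that the circle expression and its x-derivative both vanish there). For f[x] = x² this recovers the slope s = 2a:

```python
from sympy import symbols, diff, solve, simplify

x, u, r, a = symbols('x u r a', positive=True)

f = x**2
circle = (x - u)**2 + f**2 - r**2          # (* circle) with center {u, 0}

# A double root at x = a: the expression and its x-derivative vanish there.
eqs = [circle.subs(x, a), diff(circle, x).subs(x, a)]
sol = solve(eqs, [u, r], dict=True)[0]

s = (sol[u] - a) / f.subs(x, a)            # s = (u - a) / f[a]
print(simplify(s))                         # 2*a, the derivative of x**2 at a
```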

Application to the line itself

Consider the line y = f[x] = c + s x again. Let us apply the algorithm. The formula for the circle gives:

(x – u)² + (c + s x)² – r² = 0

x² – 2ux + u² + c² + 2csx + s²x² – r² = 0

(1 + s²) x² – 2 (u – cs) x +  u² + c² – r² = 0

This is a polynomial. It suffices to choose g[x, u] = 1 + s²  so that the coefficients of x² are the same. Also the coefficients of x must be the same. Thus expanding (x – a)²:

(1 + s²) (x² – 2ax +  a²) = 0

– 2 (u – cs)  = -2 a (1 + s²)

u = a (1 + s²) + cs = a + s (c + s a) = a + s f[a]

which is the same result as above.

A general formula with root x – a

We can deduce a general form that may be useful on occasion. When we substitute the point {a, f[a]} into the formula for the circle, then we can find r, and actually eliminate it.

(x – u)² + f[x]² = r² = (a – u)² + f[a]²

f[x]² – f[a]² = (a – u)² – (x – u)²

(f[x] – f[a]) (f[x] + f[a])  = ((a – u) – (x – u))  ((a – u) + (x – u))

(f[x] – f[a]) (f[x] + f[a]) = (a – x)   (a + x – 2u)

f[x] – f[a]  = (a – x)  (a + x – 2u) / (f[x] + f[a])

f[x] – f[a]  = (x – a)  (2u – x – a) / (f[x] + f[a])       … (* general)

f[x] – f[a]  = (x – a) q[x, a, u]

We cannot do much with this, since this is basically only true for x = a and f[x] – f[a] = 0. Yet we have this “branch cut”:

(1)      q[x, a, u] = (f[x] – f[a]) / (x – a)        if x ≠ a

(2)      q[a, a, u]      potentially found by other means

If it is possible to “simplify” (1) into another expression Simplify[q[x, a, u]] without the division, then the tantalising question becomes whether we can “simply” substitute x = a. Or, if we were to find q[a, a, u] via other means in (2), whether it links up with (1). These are questions of continuity, and those are traditionally studied by means of limits.

Theorem on the slope

We can still use the general formula to state a theorem.

Theorem. If we can eliminate factors without division, then there is an expression q[x, a, u] such that evaluation at x = a gives the slope s of the line, or q[a, a, u] = s, such that at this point both curve and line are touching the same circle.

Proof. Eliminating factors without division in above general formula gives:

q[x, a, u] = (2u – x – a) / (f[x] + f[a])

Setting x = a gives:

q[a, a, u] = (u – a) / f[a]

And the above s = (u – a) / f[a] implies that q[a, a, u] = s. QED

This theorem gives us the general form of the incline (tangent).

y[x, a, u] = (x – a) q[a, a, u] + f[a]       …  (* incline)

y[x, a, u] = (x – a) (u – a) / f[a] + f[a]

PM. Dynamic division satisfies the condition “without division” in the theorem. For, the term “division” in the theorem concerns the standard notion of static division.

Corollary. Polynomials as the showcase

Polynomials are the showcase. For polynomials p[x], there is the polynomial remainder theorem:

When a polynomial p[x] is divided by (x – a) then the remainder is p[a].
(Also, x – a is called a “divisor” of the polynomial if and only if p[a] = 0.)

Using this property we now have a dedicated proof for the particular case of polynomials.

Corollary. For polynomials, q[a] = s, with no need for u.

Proof. Now, p[x] – p[a] = 0 at x = a implies that a is a root, and then there is a “quotient” polynomial q[x] such that:

p[x] – p[a] = (x – a) q[x]

From the general theorem we also have:

p[x] – p[a]  = (x – a) q[x, a, u]

Eliminating the common factor (x – a) without division and then setting x = a gives q[a] = q[a, a, u] = s. QED

We now have a sound explanation why this polynomial property gives us the slope of the polynomial at that point. The slope is given by the incline (tangent), and it must also be slope of the polynomial because of the mutual touching of the same circle.
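The corollary can be verified with sympy’s polynomial division (a sketch, with an arbitrary example polynomial of my own choosing):

```python
from sympy import symbols, div, diff

x, a = symbols('x a')

p = x**3 - 2*x + 5                       # example polynomial

# p[x] - p[a] has root x = a, so division by (x - a) leaves remainder 0.
q, rem = div(p - p.subs(x, a), x - a, x)

slope = q.subs(x, a)                     # q[a]
print(rem, slope)                        # 0 and 3*a**2 - 2, which is p'(a)
```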

See the earlier discussion about techniques to eliminate factors of polynomials without division. We have seen a new technique here: comparing the coefficients of factors.

Second corollary

Since q[x] is a polynomial too, we can apply the polynomial remainder theorem again, and thus we have q[x] = (x – a) w[x] + q[a] for some w[x]. Thus we can write:

p[x] = (x – a) q[x] + p[a]

p[x] = (x – a) ( (x – a) w[x] + q[a] ) + p[a]       … (* Ruffini’s Rule twice)

p[x] = (x – a)² w[x] + (x – a) q[a] + p[a]           … (* Range’s proof)

p[x] = (x – a)² w[x] + y[x, a]                             … (* with incline)

We see two properties:

  • The repeated application of Ruffini’s Rule uses the indicated relation to find both s = q[a] and the constant p[a], as we have seen in the last discussion.
  • Evaluating p[x] / (x – a)² gives the remainder y[x, a], which is the formula for the incline.
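The two passes of Ruffini’s Rule can be sketched in plain Python (the helper `ruffini` and the example polynomial are my own illustration):

```python
def ruffini(coeffs, a):
    """One synthetic-division pass: divide by (x - a).

    coeffs list the polynomial from the highest power down;
    returns (quotient coefficients, remainder).
    """
    out = [coeffs[0]]
    for c in coeffs[1:]:
        out.append(c + a * out[-1])
    return out[:-1], out[-1]

p = [1, 0, -2, 5]          # p[x] = x^3 - 2x + 5, evaluated at a = 2
q1, p_a = ruffini(p, 2)    # first pass: remainder p[a] = 9
w, q_a = ruffini(q1, 2)    # second pass: remainder q[a] = 10, the slope

# Incline: y[x, a] = q[a] (x - a) + p[a] = 10 (x - 2) + 9
print(p_a, q_a)
```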
Range’s proof method

Michael Range proves q[a] = s as follows (in this article (p406) or book (p32)). Take above (*) and determine the error by subtracting the line y = s (x – a) + p[a]:

error = p[x] – y = (x – a)² w[x] + (x – a) q[a] – s (x – a)

= (x – a)² w[x] + (x – a) (q[a] – s)

The error = 0 has a root x = a with multiplicity greater than one if and only if s = q[a].

Direct application to the incline itself

Now that we have established this theory, there may be no need to refer to the circle explicitly. It can suffice to use the property of the double root. Michael Range (2014) gives the example of the incline (tangent) at x² at {a, a²}. The formula for the incline is:

f[x] – f[a]  = s (x – a)

x² – a² – s (x – a) = 0

(x – a) (x + a – s) = 0

There is only a double root or (x – a)² when s = 2a.

Working directly on the line allows us to focus on s, and we don’t need to determine q[x] and plug in x = a.
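This double-root check is a few lines in sympy (my own illustration of Range’s example):

```python
from sympy import symbols, factor

x, a = symbols('x a')

s = 2*a                           # the candidate slope
expr = x**2 - a**2 - s*(x - a)    # incline condition for x**2 at {a, a**2}
print(factor(expr))               # a perfect square in (x - a): s = 2a works
```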

Michael Range (2011) clarifies – with thanks to a referee – that the “point-slope” form of a line was introduced by Gaspard Monge (1746-1818), and that Descartes apparently did not think about this himself, and thus did not plug in y = f[x] here either. However, observe that we can only maintain that there must be a double root on this line form too, since {a, f[a]} still lies on a tangent circle.

[Addendum 2017-01-10: The later argument in a subsequent weblog entry becomes: If the function can be factored twice, then there is no need to refer to the circle. But when this would be equivalent to the circle then such a distinction is immaterial.]

Addendum. Example of function crossing a circle

When a circle touches a curve, it still remains possible that the curve crosses the circle. The original idea of two points merging into an overlapping point then doesn’t apply anymore, since there would be only one intersecting point on either side if the circle were smaller or bigger.

An example is the spline function g[x] = {If x < 0 then 4 – x² / 4 else 4 + x² / 4}. This function is C1 continuous at 0, meaning that the sections meet and that the slopes of the two sections are equal at 0, while the second and higher derivatives differ. The circle with center at {0, 0} and radius 4 still fits the point {0, 4}, and the incline is the line y = 4.


An application of above algorithm would look at the sections separately and paste the results together. Thus this might not be the most useful example of crossing.

In this example there might be no clear two “overlapping” points. However, observe:

  • Lines through {0, 4} might have three points with the curve, so that the incline might be seen as having three overlapping points.
  • Points on the circle can always be seen as the double root solutions for tangency at that point.
Addendum. Discussion

There is still quite a conceptual distance between (i) the story about the two overlapping points on the circle and (ii) the condition of double roots in the error between line and polynomial.

The proof given by Range uses the double root to infer the slope of the incline. This is mathematically fine, but this deduction doesn’t contain a direct concept that identifies q[a] as the slope of an incline (tangent): it might be any line.

We see this distinction between concept and algorithm also in the direct application to Monge’s point-slope formulation of the line. Requiring a double root works, but we can only do so because we know the theory of the tangent circle.

The combination of circle and line remains the fundamental reason why there are two roots. Thus the more general proof given above, that reasons from the circle and unpacks f[x]² – f[a]² into the conditions for incline and its normal, is conceptually more attractive. I am new to this topic and don’t know whether there are references for this general proof.


(1) We now understand where the double root comes from. See the earlier discussion on polynomials, Ruffini’s rule and the meaning of division (see the section on “method 2”).

(2) There, we referred to polynomial division, with the comment: “Remarkably, the method presumes x ≠ a, and still derives q[a]. I cannot avoid the impression that this method still has a conceptual hole.” However, we now observe that we can compare the values of the coefficients of the powers of x, whence we can also avoid polynomial division.

(3) There, we had a problem that developing p[x] = (x – a) w[x] + y[x, a] didn’t have a notion of tangency, in terms of Δf / Δx. However, we actually have a much older definition of tangency.

(4) The above states an algorithm and a general theorem with the requirements that must be satisfied.

(5) Cartesius beats Fermat on this issue of the incline (tangent), and actually also on providing an exact method for polynomials, whereas Fermat introduced the problem of error.

(6) For trigonometry and exponentials we know that these can be written as power series, and thus the Cartesian method would also apply. However, the power series are based upon derivatives, and this would introduce circularity. However, the method of the dynamic quotient from 2007 still allows an algebraic result. The further development from Fermat into the approach with limits would become relevant for more complex functions.

PM. The earlier discussion referred to Peter Harremoës (2016) and John Suzuki (2005) on this approach. New to me (and the book unread) are: Michael Range (2011), the recommendable Notices article, or the book (2015) – reviewed by Ruane (2016) – and Shen & Lin (2014).

Cartesius, Portrait by Frans Hals 1648




We continue the earlier discussion on (1) differentials and (2) polynomials. There is also this earlier discussion about (static or dynamic) division.

At issue is: Can we avoid the use of limits when determining the derivative of a polynomial ?

A sub-issue is: Can we avoid division that requires a limit ?

We use the term incline instead of tangent (line), since this line can also cross a function and not just touch it.

We use H = -1, so that we can write x xH = xH x = 1 for x ≠ 0. Check that xH = 1 / x, and that the use of H is more effective and efficient. The notation 1 / x is superfluous since students must learn about exponents anyway.

Ruffini’s Rule

Ruffini’s Rule is a method not only to factor polynomials but also to isolate the factors. A generalised version is called “synthetic division”, for the reason that it isn’t actually division. On wikipedia, Ruffini’s Rule is called “Horner’s Method”. On mathworld, the label “Horner’s Method” is used for something else, though related. My suggestion is to stick to mathworld.

Thus, the issue at hand would seem to have been answered by Ruffini’s Rule already. When we can avoid division then we don’t need a limit around it. However, our discussion is about whether this really answers our question and whether we really understand the answer.
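To make the rule concrete, here is a minimal Python sketch of Ruffini’s Rule (the function name and the example polynomial are my own choices, not from any of the sources mentioned). Note that the loop uses only multiplication and addition: no division occurs.

```python
# Ruffini's Rule (synthetic division): factor out (x - a) from a
# polynomial using only multiplication and addition -- no division occurs.
# Coefficients run from the highest power down. A sketch, not a definitive
# implementation.

def ruffini(coeffs, a):
    """Return (quotient coefficients, remainder) for p[x] = (x - a) q[x] + r."""
    acc = [coeffs[0]]
    for c in coeffs[1:]:
        acc.append(c + a * acc[-1])
    return acc[:-1], acc[-1]

# Example: p[x] = 2x^3 - 5x^2 + x + 2 with the factor (x - 2)
q, r = ruffini([2, -5, 1, 2], 2)
print(q, r)   # [2, -1, -1] 0 : q[x] = 2x^2 - x - 1, remainder zero
```

The remainder equals p[a], so a zero remainder confirms that (x – a) is a factor.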

Historical note

I thank Peter Harremoës for informing me about both Ruffini’s Rule and some neat properties that we will see below. His lecture note in Danish is here. Surprising for me, he traced the history back to Descartes. Following this further, we can find this paper by John Suzuki, who identifies two key contributions by Jan Hudde in Amsterdam 1657-1658. Looking into my copy of Boyer, “The history of the calculus”, page 186, I must admit that this didn’t register with me when I read it originally, as it does now. We see the tug and push of history with various authors and influences, and thus we should be cautious about claiming who did what when. Suzuki’s statement remains an eye-opener.

“We examine the evolution of the lost calculus from its beginnings in the work of Descartes and its subsequent development by Hudde, and end with the intriguing possibility that nearly every problem of calculus, including the problems of tangents, optimization, curvature, and quadrature, could have been solved using algorithms entirely free from the limit concept.” (John Suzuki)

Apparently Newton dropped the algebra because it didn’t work on trigonometry and such, but with modern set theory we can show that the algebraic approach to the derivative works there too. For the discussion below: check that limits can be avoided.

Division is also a way to isolate factors

When we have 2 x = 6, then we can determine 2 x = 2 3, and recognize the common factor 2. By the human eye, we can see that x = 3 and then we have isolated the factor 3. But in mathematics, we must follow procedures as if we were a computer programme. Hence, we have the procedure of eliminating 2, which is called division:

2H 2 x = 2H 2 3

x = 3

The latter example abuses the property that 2 is nonzero. We must actually check that the divisor is nonzero. If we don’t check then we get:

4 x = 9 x

4 x xH = 9 x xH

4 = 9

Checking for zero is not as simple as it seems. Also expressions with only numbers might contain zero in hidden form, for example (4 + 2 – 6)H. Thus it would seem to be an essential part of mathematics to develop a sound theory for the algebra of expressions and the testing on zero.

Calculus uses the limit around the difference quotient to prevent division by zero. But the real question might rather be whether we can isolate a factor. When we can isolate that factor without division that requires a limit, then we hopefully have a simpler exposition. Polynomials are a good place to start this enquiry.

Shifting to rings without division ?

The real numbers form a “field” and when we drop the idea of division, then we get a “ring“. Above 2 x = 6 might also be solved in a ring without division. For we can do:

2 x – 2 3 = 6 – 2 3

2 (x – 3) = 0

2 = 0    or    x – 3 = 0

We again use that 2 ≠ 0. Thus x = 3.

This example doesn’t show a material difference w.r.t. the assumption of division by 2. We also used that 6 can be factored and that 2 was a common factor. Perhaps this is the more relevant notion. Whatever the case, it doesn’t seem to be so useful to leave the realm of the real numbers.

Properties of polynomials

Our setup has a polynomial p[x] with focus of attention at x = a with point {a, b} = {a, p[a]}. When we regard (x – a) as a factor, then we get a “quotient” q[x] and a “remainder” r[x].

p[x] = (x – a) q[x] + r[x]

It is a nontrivial issue that q and r are polynomials again (proof of polynomial division algorithm, or proofwiki). These proofs don’t use limits but assume that the divisor is nonzero. Thus we might be making a circular argument when we use that q and r are polynomials to argue that limits aren’t needed. Examples can be given of polynomial long division. Such examples tend not to mention explicitly that the divisor cannot be zero. Nevertheless, let us proceed with what we have.

Since (xa) has degree 1, the remainder must be a constant, and thus be equal to p[a]. Thus the “core equation” is:

p[x] = (x – a) q[x] + p[a]      …  (* core)

p[x] – p[a] = (x – a) q[x]

At x = a we get 0 = 0 q[a], whence we are at a loss about how to isolate q[x] or q[a].

When we have defined derivatives via other ways, then we can check that the derivative of (*) is:

p’[x] = q[x] + (x – a) q’[x]

p’[a] = q[a]
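This identity can be checked for a concrete case. Below is a Python sketch (helper names are mine) with p[x] = x^3 – 12 x^2 – 42 at a = 1, the example that returns later for method 2:

```python
# Check p'[a] = q[a]: compute q[x] from the core equation by synthetic
# division, compute p'[x] from the power rule, and compare at x = a.
# Coefficient lists run from the highest power down.

def ruffini(coeffs, a):
    """Synthetic division by (x - a): quotient coefficients and remainder p[a]."""
    acc = [coeffs[0]]
    for c in coeffs[1:]:
        acc.append(c + a * acc[-1])
    return acc[:-1], acc[-1]

def evaluate(coeffs, x):
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

def derivative(coeffs):
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

p = [1, -12, 0, -42]               # p[x] = x^3 - 12 x^2 - 42
a = 1
q, pa = ruffini(p, a)              # p[x] = (x - a) q[x] + p[a]
print(evaluate(q, a))              # -21
print(evaluate(derivative(p), a))  # -21 as well: p'[a] = q[a]
```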

We can also rewrite (*) so that it indeed looks like a difference quotient.

q[x] = (p[x] – p[a]) (x – a)H       …  (** slope = tan[θ], see Spiegel’s diagram)

We cannot divide by (x – a) for x = a, for this factor would be zero.

PM. In the world of limits, we could define the derivative of p at a by taking the Limit[x → a, q[x]]. This generates again (Spiegel’s diagram):

q[a] = tan[α]

But our issue is that we want to avoid limits.


The incline of the polynomial at point {a, b} = {a, p[a]} is the line with the same slope as the polynomial.

y – p[a] = s (x – a)    …  (*** incline)

The difference between polynomial and incline might be called the error. Thus:

error = p[x] – y = (p[x] – p[a]) – (y – p[a])

= (x – a) q[x] – s (x – a)

= (x – a) (q[x] – s)

When we take s = q[a] then:

error = p[x] – y = (x – a) (q[x] – q[a])

Key question

A key question becomes: can we isolate q[x] by some method ? We already have (**), but this format contains the problematic division. Is there another way to isolate q ? There appear to be three ways. Likely these ways are essentially the same but emphasize different aspects.

Method 1. Dynamic quotient

The dynamic quotient manipulates the domain and relies on algebraic simplification. Instead of H we use D, with y xD = y // x.

q[x] = (p[x] – p[a]) (x – a)D

means: we first take x ≠ a,

then take D = H, so that this is normal division again,

then simplify,

and then declare the result also valid for x = a.

The idea was presented in ALOE 2007 while COTP 2011 is a proof of concept. COTP shows that it works for polynomials, trigonometry, exponentials and recovered exponents (logarithms). For polynomials it is shown by means of recursion.

Looking at this from the current perspective of the polynomial division algorithm, we can say that the method also works because division of a polynomial of degree n > 0 by a polynomial of degree m = 1 generates a neat polynomial of degree n – m. Thus we can indeed isolate q[x]. Since q[x] is a polynomial, substitution of x = a provides no problem.

The condition on manipulating the domain nicely plugs the hole in the polynomial division algorithm. It is actually necessary to prevent circularity.
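For polynomials, method 1 can be sketched in a few lines: the simplification of (p[x] – p[a]) / (x – a) is exactly the factoring that synthetic division performs, so no division by (x – a) ever occurs. The function names and the example p[x] = x² at a = 3 are my own choices, not the text’s notation.

```python
# Dynamic quotient for a polynomial: for x != a simplify
# (p[x] - p[a]) / (x - a) to the polynomial q[x]; since the remainder of
# p[x] on (x - a) is exactly p[a], the "simplification" is synthetic
# division, and the result is then declared valid for x = a too.

def dynamic_quotient(p, a):
    """q[x] with p[x] - p[a] = (x - a) q[x]; q[a] is the slope at a."""
    acc = [p[0]]
    for c in p[1:]:
        acc.append(c + a * acc[-1])
    return acc[:-1]                      # drop the remainder p[a]

def evaluate(coeffs, x):
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

q = dynamic_quotient([1, 0, 0], 3)       # p[x] = x^2 at a = 3
print(q, evaluate(q, 3))                 # [1, 3] and 6: q[x] = x + 3, slope 6
```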

Method 2. Incline

Via Descartes (and Suzuki’s article above) we understand that perpendicular to the incline (tangent) there is a line on which there is a circle that touches the incline too. This implies that x = a must be a double root, i.e. that (x – a)² divides the error between polynomial and incline.

We may consider p[x] / (x – a)² and determine the remainder v[x]. The line y = v[x] then is the incline. Or, the equation of the tangent of the polynomial at point {a, p[a]}. It is relatively easy to determine the slope of this line, and then we have q[a].

Check the wikipedia example. In Mathematica we get PolynomialRemainder[x^3 – 12 x^2 – 42, (x – 1)^2, x] = -21 x – 32 indeed. At a = 1, q[a] = -21.

This method assumes “algebraic ways” to separate quotient and remainder. We can find the slope for polynomials without using the limit for the derivative. Potentially the same theory is required for the simplification used in the dynamic quotient.
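The PolynomialRemainder computation can also be mimicked without a computer algebra system: two rounds of synthetic division give p[a] and the slope q[a], and hence the remainder on (x – a)². A sketch (helper name mine) reproducing the example above:

```python
# Method 2 in code: the remainder of p[x] on (x - a)^2 is the incline
# (tangent line). Two rounds of synthetic division yield p[a] and the
# slope q[a]; the incline is then y = slope * (x - a) + p[a].

def ruffini(coeffs, a):
    acc = [coeffs[0]]
    for c in coeffs[1:]:
        acc.append(c + a * acc[-1])
    return acc[:-1], acc[-1]

p = [1, -12, 0, -42]            # p[x] = x^3 - 12 x^2 - 42
a = 1
q, pa = ruffini(p, a)           # p[x] = (x - a) q[x] + p[a]
_, slope = ruffini(q, a)        # second round: q[a] is the slope
intercept = pa - slope * a      # rewrite as y = slope * x + intercept
print(slope, intercept)         # -21 -32, i.e. y = -21 x - 32
```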

Remarkably, the method presumes x ≠ a, and still derives q[a]. I cannot avoid the impression that this method still has a conceptual hole.

Addendum 2017-01-11: By now we have identified these methods to isolate a factor “algebraically”:

  1. Look at the form (powers) and coefficients. This is basically Ruffini’s rule, see below. Michael Range works directly with coefficients.
  2. Dynamic quotient that relies on the algebra of expressions.
  3. Divide away nonzero factors so that only the problematic factor remains that we need to isolate. (This however is a version of the dynamic quotient, so why not apply it directly ?)

An example of the latter is p[x] = x^3 – 6 x^2 + 11 x – 6. Trial and error or a graph indicates that zeros are at 1 and 2. Assuming that those points don’t apply we can isolate p[x] / ((x – 1) (x – 2)) = (x – 3) by means of long division. Subsequently we have identified the separate factors, and the total is p[x] = (x – 1) (x – 2) (x – 3).

Check also that “division” is repeated subtraction, whence the method is fairly “algebraic” by itself too.
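As a sketch, the long division in this example can be replaced by two rounds of synthetic division, one per known zero; the zero remainders confirm that (x – 1) and (x – 2) really are factors (function name is my own):

```python
# Isolate the remaining factor of p[x] = x^3 - 6 x^2 + 11 x - 6 by
# dividing away the known zeros at 1 and 2 via synthetic division.

def ruffini(coeffs, a):
    acc = [coeffs[0]]
    for c in coeffs[1:]:
        acc.append(c + a * acc[-1])
    return acc[:-1], acc[-1]

p = [1, -6, 11, -6]
q1, r1 = ruffini(p, 1)          # divide by (x - 1)
q2, r2 = ruffini(q1, 2)         # divide by (x - 2)
print(q1, r1)                   # [1, -5, 6] 0
print(q2, r2)                   # [1, -3] 0 : the remaining factor x - 3
```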

Addendum 2016-12-26: However, check the next weblog entry.

PM 1. General method to find the slope

The traditional method is to use the derivative p’[x] = 3 x^2 – 24 x, find slope p’[1] = -21, and construct the line y = -21 (x – 1) + p[1]. This method remains didactically preferable since it applies to all functions.

PM 2. Double root in error too

If p[x] = 0 has solution x = a, then the latter is called a root, and we can factor p[x] = (x – a) q[x] with remainder zero.

For example, p[x] – p[a] = 0 has solution x = a. Thus p[x] – p[a] = (x – a) q[x] with remainder zero.

Also q[x] – q[a] = 0 has solution x = a. Thus q[x] – q[a] = (x – a) u[x] with remainder zero.

Thus the error has a double root.

error = p[x] – y = (x – a)² u[x]

Unfortunately, this insight only allows us to check a given line y = s x + c, for then we can eliminate y.
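The double root in the error can be checked for a concrete case. The sketch below (names mine) uses p[x] = x^3 – 12 x^2 – 42 with its incline y = –21 x – 32 at a = 1, the example from method 2; dividing the error by (x – 1) twice leaves remainder zero both times.

```python
# Verify that error = p[x] - y has a double root at x = a: two synthetic
# divisions by (x - a) must both leave remainder zero.

def ruffini(coeffs, a):
    acc = [coeffs[0]]
    for c in coeffs[1:]:
        acc.append(c + a * acc[-1])
    return acc[:-1], acc[-1]

error = [1, -12, 21, -10]        # (x^3 - 12 x^2 - 42) - (-21 x - 32)
u1, r1 = ruffini(error, 1)
u2, r2 = ruffini(u1, 1)
print(r1, r2)                    # 0 0 : (x - 1)^2 divides the error
```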

Method 3. Ruffini’s Rule

See above for the summary of Ruffini’s Rule and the links. For the application below you might want to become more familiar with it: check why it works and how it works.

The observation of the double root generates the idea of applying Ruffini’s Rule twice.

I don’t think that it would be so useful to teach this method in highschool. Mathematics undergraduates and teachers better know about its existence, but that is all. The method might be at the core of efficient computer programmes, but human beings better deal with computer algebra at the higher level of interface.

The assumption that x ≠ a goes without saying, but it remains useful to say it, because at some stage we still use q[a], and we better be able to explain the paradox.

Application of Ruffini’s Rule to the derivative

Let us use the example of Ruffini’s Rule at MathWorld to determine the incline (tangent) to their example polynomial 3 x^3 – 6 x + 2, at x = 2. They already did most of the work, and we only include the derivative.

The first round of application gives us p[a] = p[2] = 14, namely the constant following from MathWorld.

A second round of application gives the slope, q[a] = 30.

2 |   3    6    6
           6   24
      3   12   30

Using the traditional method, the derivative is p’[x] = 9 x^2 – 6, with p’[2] = 30.

The incline (tangent) in both cases is y = 30 (x – 2) + 14 = 30 x – 46.
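The two rounds above can be written out in a few lines of code (a sketch with my own helper name, exact integer arithmetic):

```python
# Ruffini's Rule applied twice to 3 x^3 - 6 x + 2 at x = 2: the first
# round gives p[2] = 14, the second round gives the slope q[2] = 30.

def ruffini(coeffs, a):
    acc = [coeffs[0]]
    for c in coeffs[1:]:
        acc.append(c + a * acc[-1])
    return acc[:-1], acc[-1]

p = [3, 0, -6, 2]               # 3 x^3 - 6 x + 2
q, p2 = ruffini(p, 2)           # first round: remainder is p[2]
_, slope = ruffini(q, 2)        # second round: remainder is the slope
print(p2, slope)                # 14 30
# incline: y = 30 (x - 2) + 14 = 30 x - 46
```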

The major conceptual issue

The major conceptual issue is: while s is the slope of a line, and we take s = q[a], why would we call q[a] the slope of the polynomial at x = a ? Where is the element of “inclination” ? We might have just a formula of a line, without the notion of slope that fits the function. In other words, q[a] is just a number and not a concept.

The key question w.r.t. this issue of the limit – and whether division causes a limit – lies not so much with Ruffini’s Rule but with the definition of slope, first for the line itself, and secondly for the incline of a function. We represent the incline of a function by a line, but only because it has the property of having a slope and an angle with the horizontal axis.

The only reason to speak about an incline is the recognition that above equation (**) generates a slope. We are only interested in q[a] = tan[α] since this is the special case at the point x = a itself.

It is only after this notion of having a slope has been established, that Ruffini’s Rule comes into play. It focuses on “factoring as synthetic division” since that is how it has been designed. There is nothing in Ruffini’s Rule that clarifies what the calculation is about. It is an algorithm, no more.

Thus, for the argument that q[a] provides the slope at x = a, we still need the reasoning that first x ≠ a, then find a general expression q[x] and only then find x = a.

And this is what the algebraic approach to the derivative was designed to accomplish.

Addendum 2016-12-26: See the next weblog entry for another approach to the notion of the incline (tangency).

Ruffini’s Rule corroborates that the method works, but that it works had already been shown. However, it is likely a mark of mathematics that all these approaches are actually quite related. In that perspective, the algebraic approach to the derivative supplements the application of Ruffini’s Rule to clarify what it does.

Obviously, mathematicians have been working in this manner for ages, but implicitly. It really helps to state explicitly that the domain of a function can be manipulated around (supposed) singularities. The method can be generalised as

f ’[x] = {Δf (Δx)D, then set Δx = 0} = {Δf // Δx, then set Δx = 0}

It also has been shown to work for trigonometry and the exponential function.

Joost Hulshof & Ronald Meester (2010) suggest introducing the derivative in highschool by means of polynomials (pdf p16-17). My problem is that they first hide the limit and then let it ambush the student. Thus:

  • When they say that “you can present the derivative for polynomials without limits” then they mean this only for didactics and not for mathematics.
  • But they are not trained in didactics, so they are arguing this as a hobby, as mathematicians with a peculiar view on didactics. They provide a course for mathematics teachers, but this concerns mathematics and not didactics.
  • They only hide the limit, but they do not deny that fundamentally you must refer to limits.
  • Eventually they still present the limit to maintain exactness, but then it has no other role than to link up to a later course (perhaps only for mathematicians).
  • Thus, they make the gap between “didactics” and proper “mathematics” larger on purpose.
  • This is quite different from the algebraic approach (see here), that really avoids limits, and also argues that limits are fundamentally irrelevant (for the functions used in highschool).

I have invited Hulshof since at least 2013 (presentation at the NVvW study day) to look at the algebraic approach to the derivative. He refuses to look into it and write a report on it, though he was so kind as to look at this recent skirmish.

Hulshof perhaps regards his own approach as sufficient. It is quite unclear what he thinks about all this, since he doesn’t discuss the proposal of the algebraic approach to the derivative.

Let me explain what is wrong with their approach with the polynomials.

Please let mathematicians stop infringing upon didactics of mathematics. It is okay to check the quality of mathematics in texts that didacticians produce, but stop this “hobby” of second-guessing.

PM. A recent text is Hulshof & Meester (2015), “Wiskunde in je vingers“, VU University Press (EUR 29.95). Potentially they have improved upon the exposition in the pdf, but I am looking at the pdf only. Meester lists this book as “books mathematics” (p14). Hulshof calls it “concepts from mathematics” with “uncommon viewpoints” for “teacher, student” and for “education and curriculum”. When you address students then it is didactics. It is unclear why VU University Press thinks that he and Meester are qualified for this.

The incline

A standard notation for a line is y = c + s x, for constant c and slope s.

The line gives us the possibility of a definition of the incline (Dutch: richtlijn). An incline is defined for a function and a point. An incline of a function f at a point {a, f[a]} is a line that gives the slope of that function at that point.

It is wrong to say that the incline “has the same slope”. You are not comparing two lines. You are looking at the slope. You only know the slope of the function because of the incline (the line with that slope).

Incline versus tangent

The incline is often called the tangent. Students tend to think that tangents cannot cross the function, while tangents actually can. Thus incline can be a better term.

Hulshof & Meester refer in horror to the Oxford Advanced Learner’s Dictionary, which has:

ERROR “Tangent: (geometry) a straight line that touches the outside of a curve but does not cross it. The cart track branches off at a tangent.”

I don’t think that “incline” will quickly replace “tangent”. But it is useful to discuss the issue with students and offer them an alternative word if “tangent” continues to confuse them. It is useful to start a discussion with students by mentioning the (quite universal) intuition of not-crossing. An orange touches a table, and doesn’t cross it. But mathematically it would be quite complex to test whether there is any crossing or not. Thus it is simpler to focus on the idea of incline, straight course, alignment.

When you swing a ball and then let go, then the ball will continue in the incline of the last moment. The incline captures that idea, by giving the line with that very slope.

I thank Peter Harremoës for a discussion on this (quite universal) confusion by students (and the OALD) and potential alternative terms. (Incline is still a suggestion.) (The word “directive” was rejected as too confusing with “derivative”. But Dutch “richtlijn” is better than raaklijn.)

Polynomials and their division

A polynomial of degree n has powers of x up to size n:

p[x] = c + s x + c2 x² + … + cn x^n.

In this, we take c = c0 and s = c1. For n = 1 we get the line again. We allow that the line has s = 0, so that we can have a horizontal line, which would strictly be a polynomial of n = 0. There is also the vertical line, that cannot be represented by a polynomial.

If p[a] = 0 then x = a is called a zero of the polynomial. Then (x – a) is called a factor, and the polynomial can be written as

p[x] = (x – a) q[x]

where q[x] is a polynomial of a lower degree.

If p[a] ≠ 0 then we can still try to factor with (x – a) but then there will be a remainder, as p[x] = (x – a) q[x] + r[x]. When we consider p[x] – r[x] then x = a is a zero of this. Thus:

p[x] – r[x] = (x – a) q[x]

With polynomials we can do long division as with numbers. The following example is the division of x³ – 7x – 6 by x – 4 that generates a remainder.
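The same division can be sketched in code as repeated subtraction of the leading term, which is all that long division does (the helper name poly_divmod is mine; Fraction keeps the arithmetic exact):

```python
# Polynomial long division as repeated subtraction of the leading term,
# here x^3 - 7x - 6 divided by x - 4. Coefficient lists run from the
# highest power down. A sketch, not a definitive implementation.

from fractions import Fraction

def poly_divmod(num, den):
    """Quotient and remainder of num / den (den must be nonzero)."""
    num = [Fraction(c) for c in num]
    quotient = []
    while len(num) >= len(den):
        factor = num[0] / den[0]             # match the leading terms
        quotient.append(factor)
        # subtract factor * den, shifted to align with the leading term
        num = [c - factor * d for c, d in
               zip(num, den + [0] * (len(num) - len(den)))][1:]
    return quotient, num                      # remainder has lower degree

q, r = poly_divmod([1, 0, -7, -6], [1, -4])
print(q, r)    # quotient x^2 + 4x + 9, remainder 30
```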

Incline or tangent at a polynomial

Regard the polynomial p[x] at x = a, so that b = p[a]. We consider point {a, b}. What incline does the curve have ?

(A) For the incline we have the line in {a, b}:

y – b = s (x – a)

(B) We have p[a] – b = 0 and thus x = a is a zero of the polynomial p[x] – b. Thus:

p[x] – b = (x – a) q[x]

(C) Thus (A) and (B) allow us to assume y ≈ p[x] and to look at the common term x – a, “so that” (quotes because this is problematic):

s = q[a]

The example by Hulshof & Meester is p[x] = x² – 2 at the point {a, b} = {1, -1}.

p[x] – b = (x² – 2) – (-1) = x² – 1

Factor:  (x² – 1) = (x – 1) q[x]

Or divide: q[x] = (x² – 1) / (x – 1) = x + 1

Substituting the value x = a = 1 in x + 1 gives q[a] = q[1] = 2.

H&M apparently avoid division by using the process of factoring.

Later they mention the limiting process for the division: Limit[x → 1, q[x]] = Limit[x → 1, (x² – 1) / (x – 1)] = 2.
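The factoring step of the H&M example can be sketched via synthetic division, which gives q[x] = x + 1 and the slope q[1] = 2 without any limit (function name is my own):

```python
# H&M example in code: p[x] = x^2 - 2 at {1, -1}. Factor
# p[x] - b = x^2 - 1 by dividing out (x - 1); the zero remainder
# confirms the factoring, and q[1] gives the slope.

def ruffini(coeffs, a):
    acc = [coeffs[0]]
    for c in coeffs[1:]:
        acc.append(c + a * acc[-1])
    return acc[:-1], acc[-1]

q, rem = ruffini([1, 0, -1], 1)   # x^2 - 1 divided by (x - 1)
print(q, rem)                      # [1, 1] 0 : q[x] = x + 1
slope = q[0] * 1 + q[1]            # evaluate q[x] at x = 1
print(slope)                       # 2
```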


As said, the H&M approach is convoluted. They have no background in didactics and they hide the limit (rather than explaining its relevance since they still deem it relevant).

Mathematically, they might argue that they don’t divide but only factor polynomials.

  • But, when you are “factoring systematically” then you are actually dividing.
  • When you use “realistic mathematics education” then you can approximate division by trial and error of repeated subtraction, but I don’t think that they propose this. See the “partial quotient method” and my comments.
  • Addendum December 22: there is a way to look only at coefficients, Ruffini’s Rule, in wikipedia called Horner’s method. A generalisation is known as synthetic division, which expresses that it is no real division, but a method of factoring. (MathWorld has a different entry on “Horner’s method“.) See the next weblog entry.

When dividing systematically, you are using algebra, and you are assuming that a denominator like x – 1 isn’t zero but an abstract algebraic term. Well, this is precisely what the algebraic approach to the derivative has been proposing. Thus, their suggestion provides support for the algebraic approach, albeit somewhat crummily and non-systematically, whence it is of little use to refer to this kind of support.

Didactically, their approach is undeveloped. They compare the slopes of the polynomial and the line, but there is no clear discussion why this would be a slope, or why you would make such a comparison. Basically, you can compare polynomials of order n with those of order m, and this would be a mathematical exercise, but devoid of interpretation. For didactics it does make sense to discuss: (a) the notion of “slope” of a function is given by the incline, (b) we want to find the incline of a polynomial for a particular reason (e.g. instantaneous velocity), (c) we can find it by a procedure called “derivative”. NB. My book Conquest of the Plane starts with surface and integral, and only later looks at slopes.

A main criticism, however, is that H&M overlooked the fundamental problem with the notion of the slope of a line itself. They rely on some hidden issues here too. I discussed this recently, and repeat it below.

PM. See a discussion of approximating a function by polynomials. Observe that we are not “approximating” a function by its incline now. At {a, b} the values and slope are exactly the same, and there is nothing approximate about this. Only at other points we might say that there is an “error” by looking at the incline rather than the polynomial, but we are not looking at such errors now, and this would be a quite different topic of discussion.

Copy of December 8 2016: Ray through an origin

Let us first consider a ray through the origin, with horizontal axis x and vertical axis y. The ray makes an angle α with the horizontal axis. The ray can be represented by a function as y = f[x] = s x, with the slope s = tan[α]. Observe that there is no constant term (c = 0).


The quotient y / x is defined everywhere, with the outcome s, except at the point x = 0, where we get an expression 0 / 0. This is quite curious. We tend to regard y / x as the slope (there is no constant term), and at x = 0 the line has that slope too, but we seem unable to say so.

There are at least three responses:

(i) Standard mathematics then takes off, with limits and continuity.

(ii) A quick fix might be to define a separate function to find the slope of a ray, but we can wonder whether this is all nice and proper, since we can only state the value s at 0 when we have solved the value elsewhere. If we substitute a y that isn’t a ray, for example x², then we get a curious construction, and thus the definition isn’t quite complete since there ought to be a test on being a ray.




(iii) The algebraic approach uses the following definition of the dynamic quotient:

y // x ≡ { y / x, unless x is a variable and then: assume x ≠ 0, simplify the expression y / x, declare the result valid also for the domain extension x = 0 }

Thus in this case we can use y // x = s x // x = s, and this slope also holds for the value x = 0, since this has now been included in the domain too.
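A sketch of this simplification for the ray, working on the coefficients directly (the helper name and the choice s = 3 are mine):

```python
# Dynamic quotient for the ray y = s x: with constant term zero,
# simplifying y / x means dropping the common factor x, and the
# result is then declared valid at x = 0 as well.

def divide_out_x(coeffs):
    """Simplify p[x] / x when x is a common factor (constant term zero)."""
    assert coeffs[-1] == 0, "x is not a common factor"
    return coeffs[:-1]

s = 3
ray = [s, 0]                 # y = s x, coefficients from highest power down
slope = divide_out_x(ray)    # the constant polynomial [s]
print(slope)                 # [3] : the slope, now valid at x = 0 too
```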

Line with constant

When we have a line y = c + s x, then a hidden part of the definition is that the slope is s everywhere, even though we cannot compute (y – c) / x when x = 0. (One might say: “This is what it means to be linear.”)

When we look at x = a and determine the slope by taking a difference Δx, then we get:

b = c + s a

b + Δy = c + s (a + Δx)

Δy = s Δx

The slope at a would be s, but it is also Δy / Δx, which is undefined for Δx = 0.

Thus, the slope of a line is either given as s for all points (critically, for x = 0 too, perhaps with the rule: if you find a slope somewhere then it holds everywhere), or we must use limits.

The latter can be more confusing when s has not been given and must be calculated from other resources. In the case of differentials dy = s dx, the notation dy / dx causes conceptual problems when s itself is found by a limit on the difference quotient.

  1. The H&M claim that polynomials can be used without limits is basically a didactic claim, since they evidently still rely on limits (perhaps to fend off fellow mathematicians). This didactic claim is a wild-goose chase since they are not involved in didactics research.
  2. If they really would hold that factoring can be done systematically without division, then they might have a point, but then they still must give an adequate explanation how you get from (A) & (B) to (C). Saying that differences are “small” is not enough (not even for polynomials). Addendum December 22: see the next weblog entry on Ruffini’s rule.
  3. They present this in a “reminder course in mathematics” for teachers of mathematics, but it isn’t really mathematics, nor is it useful for teaching mathematics.
  4. A serious development that avoids limits and relies on algebraic methods, that covers the same area of polynomials but also trigonometry and exponential functions, is the algebraic approach to the derivative, available since 2007 with a proof of concept in Conquest of the Plane in 2011.
  5. It is absurd that Hulshof & Meester neglect the algebraic approach. But they are mathematicians, and didactics is not their field of research. I think that the algebraic method provides a fundamental redefinition of calculus, but I prefer the realm of didactics above the realm of mathematics with its culture of contempt for empirical science.
  6. The H&M exposition and neglect is just an example of Holland as Absurdistan, and the need to boycott Holland till the censorship of science by the directorate of the Dutch Central Planning Bureau has been lifted.
I wouldn't want to be caught before a blackboard like that (Screenshot UChicago)


I am looking for a story on continuity and limits that can be told in junior highschool and still makes sense. We would like an isomorphy between space and numbers. For some aspects, mathematical theory sends us to number theory, and for isomorph aspects, mathematical theory sends us to topology. It is awkward to have to translate similar notions, and to eliminate the overload of notions that are not directly relevant for this search for this junior highschool story.

For example, topology has rephrased results into statements on open and closed sets and boundaries, but I am wondering whether that is an effective manner of communication, when the relevant distinction is whether you are assuming a well-ordering or not. But I am not at home in number theory or topology. These comments on continuity and limits have been caused precisely because I am testing the waters.

Basically, I already designed such a story on continuity and limits (pdf, weblog), but now I am noticing that I can include a question mark on infinitesimals.

Addendum December 16

This weblog text is a rewrite of yesterday’s text.


The framework contains both the handling of real numbers on the calculator and a development of theory.

Example 1

Also in junior highschool, we want students to be aware that 0.999… = 1.000…, so that these are the same number. You can see this by checking that 1/3 = 0.333… and 3 * 1/3 = 1, while 3 * 0.333… = 0.999….
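The check can be made explicit with exact rational arithmetic; a small sketch (Python’s standard Fraction type, my choice of tool):

```python
# Exact check that 3 * 1/3 = 1: with fractions the identity is exact,
# whereas writing 1/3 as a decimal gives the repeating 0.333..., so
# 3 * 0.333... = 0.999... must equal 1.000...

from fractions import Fraction

third = Fraction(1, 3)        # the number written as 0.333...
print(3 * third)              # 1
print(3 * third == 1)         # True
```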

Example 2

When we approximate numbers with n decimals, then these basically behave like the natural numbers, and there remains a well-ordering. Numbers are δ[n] = 10^(-n) apart.

When we shift to the use of the infinite number of decimals then we lose this “infinitesimal”. At issue is now whether the infinitesimal can be retained in some manner.

Standard definition of density causes contradiction

Discussing the continuum and the set of real numbers R recently, I suggested (here, property (a)) that R would be a dense set, according to the standard definition of density. This definition is that for any two elements x < y there would be at least one z in between, with x < z < y. This would allow you to make cuts everywhere.

Oops. I retract.

Wikipedia (no source but a portal) has:

“From the ZFC axioms of set theory (including the axiom of choice) one can show that there is a well order of the reals.”

I don’t know quite what to think about this. Elsewhere I deduced that ZFC is inconsistent. But perhaps in a revised set theory, the well order can be retained.

We would like R to have a well-order for finite intervals too. Thus every number x has a next number x’. When you select y = x’, then you cannot find anything between x and y. This contradicts the above statement on density.

Thus, the standard definition of density doesn’t fit a well-ordered R.
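The clash is already visible at the approximation level δ[n]: on the 3-decimal grid nothing lies between a number and its successor, while the standard density argument would offer the midpoint. A small Python sketch (names are for illustration only):

```python
from fractions import Fraction

step = Fraction(1, 10**3)          # delta[3] on the 3-decimal grid
x = Fraction(271, 1000)
y = x + step                       # the successor of x on the grid
# No 3-decimal number lies strictly between x and its successor ...
between = [Fraction(k, 1000) for k in range(1001)
           if x < Fraction(k, 1000) < y]
print(between)              # []
# ... although the rational midpoint (x + y) / 2 does exist off the grid.
print(x < (x + y) / 2 < y)  # True
```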

Designing a new definition for density of the reals R

We can design a new definition of density.

  • The standard definition is useful for the rationals Q. If we restrict our freedom to making cuts along Q, then we are safe again. In this manner, the distinction between rational and irrational numbers is useful to explain a property of R.
  • R is then defined as “more dense”, because Q is already dense w.r.t. that original definition (a).
  • This proposal is quite similar to the Dedekind cut, with the distinction that we now allow that R might retain a well-ordering. That is, this issue on the ordering is no longer forced by the standard definition of density.
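For Q itself the standard definition can be verified constructively: between any two rationals lies their midpoint, which is again rational. A minimal sketch:

```python
from fractions import Fraction

def midpoint(x, y):
    # For rationals x < y, the midpoint is again rational
    # and lies strictly between them: the standard density of Q.
    return (x + y) / 2

x, y = Fraction(1, 3), Fraction(1, 2)
z = midpoint(x, y)
print(z)          # 5/12
print(x < z < y)  # True
```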
Surprise consequence as a bonus

Switching to another notion of density generates the bonus that we have more scope to introduce the infinitesimal.

When every number x has a next x’, then we can define the infinitesimal as the difference:

δ = x’ – x

It also means that an open interval (a, b) can be seen as a closed interval [a + δ, b – δ].

Wikipedia (no source but a portal) claims today:

“The standard ordering ≤ of any real interval is not a well ordering, since, for example, the open interval (0, 1) ⊆ [0,1] does not contain a least element.”

Yet now we have (0, 1) = [δ, 1 – δ] and the least element is δ. Only the intervals with negative infinity might be excluded, check (-∞, ∞).

Properties of these infinitesimals

Some properties are:

(1) We still have 0.999… = 1.000….

(2) A current statement is that a line consists of points, and each point is a co-ordinate without length. We can now better express that length consists of a sum of short lengths. A sum of these infinitesimals Σδ makes sense if we regard it as the sum from x = 0 to x = 1 of x’ – x. The trick is that the length is determined by the statement on x and not by the coefficient of δ.
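Property (2) can at least be illustrated at the approximation level δ[n], where the sum of all steps x’ – x from 0 to 1 gives exactly the length 1 (a sketch of the finite case, not a definition of Σδ itself):

```python
from fractions import Fraction

n = 3
step = Fraction(1, 10**n)                   # delta[n] = 10^(-n)
# Grid points 0, 0.001, ..., 0.999; each step x' - x equals delta[n].
grid = [Fraction(k, 10**n) for k in range(10**n)]
length = sum((x + step) - x for x in grid)  # sum of x' - x over the grid
print(length)  # 1
```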

(3) Using H = -1, then δ · δH = 1. That is, δ ≠ 0, and thus there is no problem with division. The discussion about differentials is quite different from the discussion about these (new) infinitesimals. Much time in history has been spent looking for a connection, but there isn’t one.

Separate arithmetic for infinity and infinitesimal

Students already know that they cannot apply the ordinary rules of arithmetic to infinity, e.g. ∞ + ∞ = ∞. The same now holds for the above hypothetical notion of the infinitesimal.

Property (2) carries over from δ[n] with n the number of decimals. Property (1) arises when n → ∞ . Potentially, these notions cannot be combined without some conflict.

We are accustomed to think that any real number can be divided. But e.g. δ / 2 is nonsense, because δ gives the distance between two adjacent numbers, and there is nothing smaller. Thus, the normal rules of arithmetic only hold for reals that are not these infinitesimals.

With δ = x’ – x we also want to consider y = x / n. When the numbers are halved for n = 2, is the distance halved or isn’t it ? In the approximation δ[n] the distance can become smaller when more digits are included. For an infinite number of digits, presumably, the distance cannot be halved. Thus δ = y’ – y. Multiplication by n gives nδ = n (x / n)’ – x. Thus x’ = n (x / n)’ – nδ + δ for any n. This would make (most) sense by the choice δ = 0 and x’ / n = (x / n)’. But then we are back in the classical approach again, without the well-ordering. (The next number is the number itself.)
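The halving argument above can be recapped in a few lines (with y = x / n, and the presumption that the distance δ is not halved):

```latex
\begin{align*}
y &= x / n, \qquad \delta = y' - y \\
n\delta &= n\,y' - n\,y \;=\; n\,(x/n)' - x \\
x' &= x + \delta \;=\; n\,(x/n)' - n\delta + \delta
\end{align*}
```

For arbitrary n this is most naturally resolved by δ = 0 and x’ / n = (x / n)’, which is the classical approach again.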

Presumably, though, we can argue that n · δ is as problematic as δ / n. The notion of Σδ, namely, has been handled by putting the consideration of length into the Σ sign.

I don’t know yet whether it is sufficient (consistent) to state δ = x’ – x and that the rules of arithmetic don’t apply to δ, just as they don’t apply to ∞. Potentially, we might write δH → ∞ (and this doesn’t mean that δ · ∞ → 1).

All this depends upon whether we can develop a consistent set of definitions. Students in junior high school might agree that they aren’t much interested in that.

Thus, it might only be in senior high school, when we discuss the “classical” approach to the reals, that (a, b) is treated as an open interval only. We would be forced to this not because of the definition of density but because of the rules of arithmetic.


In itself, notions like these are not world-shocking, but they would tend to fit the intuitions of space and number for junior high school.

At some point in history, the mainstream in mathematics opted for an approach to the reals so that they have no well-ordering. The obstacle of the standard definition of density can be removed, as shown above. A problem still resides in arithmetic with δ = x’ – x. It is not clear to me whether this can be resolved. Nor is it clear to me whether it is okay to have benign neglect till senior high school, and face the consequences of losing the well-ordering only there.