Joost Hulshof & Ronald Meester (2010) suggest introducing the derivative in highschool by means of polynomials (pdf p16-17). My problem is that they first hide the limit and then let it ambush the student. Thus:

  • When they say that “you can present the derivative for polynomials without limits” then they mean this only for didactics and not for mathematics.
  • But they are not trained in didactics, so they are arguing this as a hobby, as mathematicians with a peculiar view on didactics. They provide a course for mathematics teachers, but this concerns mathematics and not didactics.
  • They only hide the limit, but they do not deny that fundamentally you must refer to limits.
  • Eventually they still present the limit to maintain exactness, but then it has no other role than to link up to a later course (perhaps only for mathematicians).
  • Thus, they make the gap between “didactics” and proper “mathematics” larger on purpose.
  • This is quite different from the algebraic approach (see here), that really avoids limits, and also argues that limits are fundamentally irrelevant (for the functions used in highschool).

I have invited Hulshof since at least 2013 (presentation at the NVvW study day) to look at the algebraic approach to the derivative. He refuses to look into it and write a report on it, though he was kind enough to look at this recent skirmish.

Hulshof refers to his approach perhaps as sufficient. It is quite unclear what he thinks about all this, since he doesn’t discuss the proposal of the algebraic approach to the derivative.

Let me explain what is wrong with their approach with the polynomials.

Please let mathematicians stop infringing upon didactics of mathematics. It is okay to check the quality of mathematics in texts that didacticians produce, but stop this “hobby” of second-guessing.

PM. A recent text is Hulshof & Meester (2015), “Wiskunde in je vingers“, VU University Press (EUR 29.95). Potentially they have improved upon the exposition in the pdf, but I am looking at the pdf only. Meester lists this book as “books mathematics” (p14). Hulshof calls it “concepts from mathematics” with “uncommon viewpoints” for “teacher, student” and for “education and curriculum”. When you address students then it is didactics. It is unclear why VU University Press thinks that he and Meester are qualified for this.

The incline

A standard notation for a line is y = c + s x, for constant c and slope s.

The line gives us the possibility of a definition of the incline (Dutch: richtlijn). An incline is defined for a function and a point. An incline of a function f at a point {a, f[a]} is a line that gives the slope of that function at that point.

It is wrong to say that the incline “has the same slope”. You are not comparing two lines. You are looking at the slope. You only know the slope of the function because of the incline (the line with that slope).

Incline versus tangent

The incline is often called the tangent. Students tend to think that tangents cannot cross the function, while tangents actually can. Thus incline can be a better term.

Hulshof & Meester refer in horror to the Oxford Advanced Learner’s Dictionary, that has:

ERROR “Tangent: (geometry) a straight line that touches the outside of a curve but does not cross it. The cart track branches off at a tangent.”

I don’t think that “incline” will quickly replace “tangent”. But it is useful to discuss the issue with students and offer them an alternative word if “tangent” continues to confuse them. It is useful to start a discussion with students by mentioning the (quite universal) intuition of not-crossing. An orange touches a table, and doesn’t cross it. But mathematically it would be quite complex to test whether there is any crossing or not. Thus it is simpler to focus on the idea of incline, straight course, alignment.

When you swing a ball and then let go, then the ball will continue in the incline of the last moment. The incline captures that idea, by giving the line with that very slope.

I thank Peter Harremoës for a discussion on this (quite universal) confusion by students (and the OALD) and potential alternative terms. (Incline is still a suggestion.) (The word “directive” was rejected as too confusing with “derivative”. But Dutch “richtlijn” is better than raaklijn.)

Polynomials and their division

A polynomial of degree n has powers of x up to n:

p[x] = c + s x + c₂ x² + … + cₙ xⁿ.

In this, we take c = c0 and s = c1. For n = 1 we get the line again. We allow that the line has s = 0, so that we can have a horizontal line, which would strictly be a polynomial of n = 0. There is also the vertical line, that cannot be represented by a polynomial.

If p[a] = 0 then x = a is called a zero of the polynomial. Then (x – a) is called a factor, and the polynomial can be written as

p[x] = (x – a) q[x]

where q[x] is a polynomial of a lower degree.

If p[a] ≠ 0 then we can still try to factor with (x – a) but then there will be a remainder, as p[x] = (x – a) q[x] + r[x]. When we consider p[x] – r[x] then x = a is a zero of this. Thus:

p[x] – r[x] = (x – a) q[x]

With polynomials we can do long division as with numbers. The following example is the division of x³ – 7x – 6 by x – 4 that generates a remainder.
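The division can also be checked numerically. The following is a minimal sketch, not taken from H&M or from the picture: it uses the coefficient-only scheme (Ruffini's rule) for division by a linear term x – a. The function name `divide_by_linear` and the convention of listing coefficients from the highest power down are assumptions made for this illustration.

```python
def divide_by_linear(coeffs, a):
    """Divide a polynomial by (x - a) with Ruffini's rule (synthetic division).

    coeffs: coefficients from the highest power down to the constant.
    Returns (quotient_coeffs, remainder).
    """
    running = []
    acc = 0
    for c in coeffs:
        # Multiply the running value by a and add the next coefficient.
        acc = acc * a + c
        running.append(acc)
    # All running values but the last are the quotient; the last is the remainder.
    return running[:-1], running[-1]


# x^3 + 0 x^2 - 7 x - 6 divided by (x - 4)
quotient, remainder = divide_by_linear([1, 0, -7, -6], 4)
print(quotient, remainder)  # [1, 4, 9] 30
```

Read as polynomials, this gives the quotient x² + 4x + 9 with remainder 30, and the remainder agrees with the value p[4] = 64 – 28 – 6 = 30.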

purplemath-division

Incline or tangent at a polynomial

Regard the polynomial p[x] at x = a, so that b = p[a]. We consider point {a, b}. What incline does the curve have ?

(A) For the incline we have the line in {a, b}:

y – b = s (x – a)

(B) We have p[a] – b = 0 and thus x = a is a zero of the polynomial p[x] – b. Thus:

p[x] – b = (x – a) q[x]

(C) Thus (A) and (B) allow us to assume y ≈ p[x] and to look at the common term x – a, “so that” (quotes because this is problematic):

s = q[a]

The example by Hulshof & Meester is p[x] = x² – 2 at the point {a, b} = {1, -1}.

p[x] – b = (x² – 2) – (-1) = x² – 1

Factor:  (x² – 1) =  (x – 1) q[x]

Or divide: q[x] = (x² – 1) / (x – 1)  = x + 1

Substituting the value x = a = 1 in x + 1 gives q[a] = q[1] = 2.

H&M apparently avoid division by using the process of factoring.

Later they mention the limiting process for the division: Limit[x → 1, q[x]] = Limit[x → 1, (x² – 1) / (x – 1)] = 2.
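Steps (B) and (C) can be traced in a few lines of code. This is only a sketch under assumptions: the function name `slope_by_factoring` is hypothetical, coefficients are listed from the highest power down, and the factoring is done with the coefficient scheme (Ruffini's rule) rather than by whatever procedure H&M have in mind.

```python
def slope_by_factoring(coeffs, a):
    """Slope of a polynomial at x = a by factoring, without a limit.

    coeffs: coefficients from the highest power down to the constant.
    Step (B): p[x] - b = (x - a) q[x], with b = p[a].
    Step (C): s = q[a].
    Returns (b, q_coeffs, s).
    """
    # Ruffini's rule: the last running value is b = p[a],
    # the earlier running values are the coefficients of q.
    running, acc = [], 0
    for c in coeffs:
        acc = acc * a + c
        running.append(acc)
    b, q = running[-1], running[:-1]

    # Evaluate q at x = a.
    s = 0
    for c in q:
        s = s * a + c
    return b, q, s


# p[x] = x^2 - 2 at a = 1
b, q, s = slope_by_factoring([1, 0, -2], 1)
print(b, q, s)  # -1 [1, 1] 2
```

The output reproduces the example: b = -1, q[x] = x + 1 and s = q[1] = 2. No division by x – 1 at the value x = 1 occurs anywhere in the computation.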

Critique

As said, the H&M approach is convoluted. They have no background in didactics and they hide the limit (rather than explaining its relevance since they still deem it relevant).

Mathematically, they might argue that they don’t divide but only factor polynomials.

  • But, when you are “factoring systematically” then you are actually dividing.
  • When you use “realistic mathematics education” then you can approximate division by trial and error of repeated subtraction, but I don’t think that they propose this. See the “partial quotient method” and my comments.
  • Addendum December 22: there is a way to look only at coefficients, Ruffini’s Rule, in wikipedia called Horner’s method. A generalisation is known as synthetic division, which expresses that it is no real division, but a method of factoring. (MathWorld has a different entry on “Horner’s method“.) See the next weblog entry.

When dividing systematically, you are using algebra, and you are assuming that a denominator like x – 1 isn’t zero but an abstract algebraic term. Well, this is precisely what the algebraic approach to the derivative has been proposing. Thus, their suggestion provides support for the algebraic approach, albeit in a somewhat crummy and non-systematic manner, so that it is of little use to refer to this kind of support.

Didactically, their approach is undeveloped. They compare the slopes of the polynomial and the line, but there is no clear discussion why this would be a slope, or why you would make such a comparison. Basically, you can compare polynomials of order n with those of order m, and this would be a mathematical exercise, but devoid of interpretation. For didactics it does make sense to discuss: (a) the notion of “slope” of a function is given by the incline, (b) we want to find the incline of a polynomial for a particular reason (e.g. instantaneous velocity), (c) we can find it by a procedure called “derivative”. NB. My book Conquest of the Plane starts with surface and integral, and only later looks at slopes.

A main criticism however is also that H&M overlooked the fundamental problem with the notion of a slope of a line itself. They rely on some hidden issues here too. I discussed this recently, and repeat this below.

PM. See a discussion of approximating a function by polynomials. Observe that we are not “approximating” a function by its incline now. At {a, b} the values and slope are exactly the same, and there is nothing approximate about this. Only at other points we might say that there is an “error” by looking at the incline rather than the polynomial, but we are not looking at such errors now, and this would be a quite different topic of discussion.

Copy of December 8 2016: Ray through an origin

Let us first consider a ray through the origin, with horizontal axis x and vertical axis y. The ray makes an angle α with the horizontal axis. The ray can be represented by a function as y =  f [x] = s x, with the slope s = tan[α]. Observe that there is no constant term (c = 0).

2016-12-08-ray

The quotient y / x is defined everywhere, with the outcome s, except at the point x = 0, where we get an expression 0 / 0. This is quite curious. We tend to regard y / x as the slope (there is no constant term), and at x = 0 the line has that slope too, but we seem unable to say so.

There are at least three responses:

(i) Standard mathematics then takes off, with limits and continuity.

(ii) A quick fix might be to try to define a separate function to find the slope of a ray, but we can wonder whether this is all nice and proper, since we can only state the value s at 0 when we have solved the value elsewhere. If we substitute y when it isn’t a ray, for example x², then we get a curious construction, and thus the definition isn’t quite complete since there ought to be a test on being a ray.

2016-12-10-slopeofray

(iii) The algebraic approach uses the following definition of the dynamic quotient:

y // x ≡ { y / x, unless x is a variable and then: assume x ≠ 0, simplify the expression y / x, declare the result valid also for the domain extension x = 0 }

Thus in this case we can use y // x = s x // x = s, and this slope also holds for the value x = 0, since this has now been included in the domain too.

Line with constant

When we have a line y = c + s x, then a hidden part of the definition is that the slope is s everywhere, even though we cannot compute (y – c) / x when x = 0. (One might say: “This is what it means to be linear.”)

When we look at x = a and determine the slope by taking a difference Δx, then we get:

b = c + s a

b + Δy = c + s (a + Δx)

Δy = s Δx

The slope at x = a would be s but is also Δy / Δx, undefined for Δx = 0

Thus, the slope of a line is either given as s for all points (or, critically for x = 0 too) (perhaps with a rule: if you find a slope somewhere then it holds everywhere), or we must use limits.

The latter can be more confusing when s has not been given and must be calculated from other resources. In the case of differentials dy = s dx, the notation dy / dx causes conceptual problems when s itself is found by a limit on the difference quotient.

Conclusions
  1. The H&M claim that polynomials can be used without limits is basically a didactic claim since they evidently still rely on limits (perhaps to fend off fellow mathematicians). This didactic claim is a wild-goose chase since they are not involved in didactics research.
  2. If they really would hold that factoring can be done systematically without division, then they might have a point, but then they still must give an adequate explanation how you get from (A) & (B) to (C). Saying that differences are “small” is not enough (not even for polynomials). Addendum December 22: see the next weblog entry on Ruffini’s rule.
  3. They present this for a “reminder course in mathematics” for teachers of mathematics, but it isn’t really mathematics, nor is it useful for teaching mathematics.
  4. A serious development that avoids limits and relies on algebraic methods, that covers the same area of polynomials but also trigonometry and exponential functions, is the algebraic approach to the derivative, available since 2007 with a proof of concept in Conquest of the Plane in 2011.
  5. It is absurd that Hulshof & Meester neglect the algebraic approach. But they are mathematicians, and didactics is not their field of research. I think that the algebraic method provides a fundamental redefinition of calculus, but I prefer the realm of didactics above the realm of mathematics with its culture of contempt for empirical science.
  6. The H&M exposition and neglect is just an example of Holland as Absurdistan, and the need to boycott Holland till the censorship of science by the directorate of the Dutch Central Planning Bureau has been lifted.
I wouldn't want to be caught before a blackboard like that (Screenshot UChicago)

Isaac Newton (1642-1727) invented the differentials, calling them evanescent quantities. Since then, the world has been wondering what these are. Just to be sure, Newton wrote his Principia (1687) by using the methods of Euclidean geometry, so that his results could be accepted in the standard of his day (context of reconstruction and presentation), and so that his results were not lost in a discussion about the new method of these differentials (context of discovery). However, this only increased the enigma. What can these quantities be, that are so efficient for science, and that actually disappear when mathematically interesting ?

Gottfried Leibniz (1646-1716) gave these infinitesimals their common labels dy and dx, and thus they became familiar as household names in academic circles, but this didn’t reduce their mystery.

Charles Dodgson (1832-1898) as Lewis Carroll had great fun with the Cheshire Cat, who disappears but leaves its grin.

Abraham Robinson (1918-1974) presented an interpretation called “non-standard analysis“. Many people think that he clinched it, but when I start reading then my intuition warns me that this is making things more difficult. (Perhaps I should read more though.)

In 2007, I developed an algebraic approach to the derivative. This was in the book “A Logic of Exceptions” (ALOE), later also included in “Elegance with Substance” (EWS) (2009, 2015), and a bit later there was a “proof of concept” in “Conquest of the Plane” (COTP) (2011). The pdfs are online, and a recent overview article is here. A recent supplement is the discussion on continuity.

In this new algebraic approach there wasn’t a role for differentials, yet. The notation dy / dx = f ‘[x] for y = f [x] can be used to link up to the literature, but up to now there was no meaning attached to the symbolism. In my perception this was (a bit of) a pity since the notation with differentials can be useful on occasion, see the example below.

Last month, reading Joop van Dormolen (1970) on the didactics of derivatives and the differential calculus – in a book for teachers Wansink (1970) volume III – I was struck by his admonition (p213) that dy / dx really is a quotient of two differentials, and that a teacher should avoid identifying it as a single symbol and as the definition of the derivative. However, when he proceeded, I was disappointed, since his treatment didn’t give the clarity that I looked for. In fact, his treatment is quite in line with that of Murray Spiegel (1962), “Advanced calculus (Metric edition)”, Schaum’s outline series, see below. (But Van Dormolen very usefully discusses the didactic questions, that Spiegel doesn’t look into.)

Thus, I developed an interpretation of my own. In my impression this finally gives the clarity that people have been looking for starting with Newton. At least: I am satisfied, and you may check whether you are too.

I don’t want to repeat myself too much, and thus I assume that you read up on the algebraic approach to the derivative in case of questions. (A good place to start is the recent overview.)

Ray through an origin

Let us first consider a ray through the origin, with horizontal axis x and vertical axis y. The ray makes an angle α with the horizontal axis. The ray can be represented by a function as y =  f [x] = s x, with the slope s = tan[α]. Observe that there is no constant term (c = 0).

2016-12-08-ray

The quotient y / x is defined everywhere, with the outcome s, except at the point x = 0, where we get an expression 0 / 0. This is quite curious. We tend to regard y / x as the slope (there is no constant term), and at x = 0 the line has that slope too, but we seem unable to say so.

There are at least three responses:

(i) Standard mathematics then takes off, with limits and continuity.

(ii) A quick fix might be to try to define a separate function to find the slope of a ray, but we can wonder whether this is all nice and proper, since we can only state the value s at 0 when we have solved the value elsewhere. If we substitute y when it isn’t a ray, for example x², then we get a curious construction, and thus the definition isn’t quite complete since there ought to be a test on being a ray.

2016-12-10-slopeofray

(iii) The algebraic approach uses the following definition of the dynamic quotient:

y // x ≡ { y / x, unless x is a variable and then: assume x ≠ 0, simplify the expression y / x, declare the result valid also for the domain extension x = 0 }

Thus in this case we can use y // x = s x // x = s, and this slope also holds for the value x = 0, since this has now been included in the domain too.
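For readers who like to see the mechanics, here is a rough sketch of the dynamic quotient restricted to polynomials. It is an illustration under assumptions, not the definition itself: numerator and denominator are given as coefficient lists in the variable x (highest power first), "simplify" is taken to mean exact polynomial division, and the names `poly_divmod`, `evaluate` and `dynamic_quotient` are made up for this example.

```python
from fractions import Fraction


def poly_divmod(num, den):
    """Polynomial long division; coefficients from the highest power down."""
    num = [Fraction(c) for c in num]  # working copy that becomes the remainder
    den = [Fraction(c) for c in den]
    quotient = []
    while len(num) >= len(den):
        factor = num[0] / den[0]
        quotient.append(factor)
        # Subtract factor * den, aligned at the leading term.
        for i, d in enumerate(den):
            num[i] -= factor * d
        num.pop(0)  # the leading term is now zero
    return quotient, num


def evaluate(coeffs, x):
    """Value of the polynomial at x (Horner scheme)."""
    value = Fraction(0)
    for c in coeffs:
        value = value * x + c
    return value


def dynamic_quotient(num, den, x):
    """Sketch of num // den for polynomials in the variable x.

    If den divides num exactly, the simplified quotient is evaluated,
    also at values of x where den is zero (the domain extension).
    Otherwise this falls back on the ordinary quotient num / den.
    """
    quotient, remainder = poly_divmod(num, den)
    if any(remainder):
        return evaluate(num, x) / evaluate(den, x)
    return evaluate(quotient, x)


# The ray y = s x with s = 3: y // x at x = 0
print(dynamic_quotient([3, 0], [1, 0], 0))  # 3
```

With s = 3 the ordinary quotient y / x is 0 / 0 at x = 0, while the sketch returns the slope 3 there. When the denominator does not divide the numerator, the fallback raises the usual ZeroDivisionError at a zero of the denominator.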

In a nutshell for dy / dx

In a nutshell, we get the following situation for dy / dx:

2016-12-08-dydx

Properties are exactly as Van Dormolen explained:

  • “dy” and “dx” are names for variables, and thus they have their own realm with their own axes.
  • The definition of their relationship is dy = f ‘[x] dx.

The news is:

  • The mistake in history was to write dy / dx instead of dy // dx.

The latter “mistake” can be understood, since the algebraic approach uses notions of set theory, domain and range, and dynamics as in computer algebra, and thus we can forgive Newton for not getting there yet.

To link up with history, we might define that the “symbol dy / dx as a whole” is a shortcut for dy // dx. This causes additional yards to develop the notion of “symbol as a whole” however. My impression is that it is better to use dy // dx unless it is so accepted that it might become pedantic. (You must only explain that the Earth isn’t flat while people don’t know that yet.)

Application to Spiegel 1962 gives clarity

Let us look at Spiegel (1962) p58-59, and see how above discussion can bring clarity. The key points can all be discussed with reference to his figure 4-1.

1962-murrayspiegel-p58-fig4-1

Looking at this with a critical eye, we find:

  • At the point P, there is actually the creation of two new sets of axes, namely, both the {Δx, Δy} plane and the {dx, dy} plane.
  • These two new planes have both rays through the origin, one with angle θ and one with angle α.
  • The two planes help to define the error. An error is commonly defined from the relation “true value = estimate + error”. The true value of the angle is θ and our estimate is α.
  • Thus we get absolute error Δf = s Δx + ε where s = dy / dx. This error is a function of Δx, or ε = ε[Δx]. It solves as ε = Δf – s Δx.
  • The relative error is Δf / Δx =  dy / dx + r which solves as r = Δf / Δx – dy / dx. This is still a function r[Δx]. We use the quotient of the differentials instead of the true quotient of the differences.
  • We better re-consider the error in terms of the dynamic quotient, replacing / by // in the above, because at P we like the error to be zero. Thus in above figure we have ε = Δf – s Δx, where s = dy // dx.
  • A source of confusion is that Spiegel suggests that dx ≈ Δx or even dx = Δx but this is numerically true only sometimes and conceptually there surely is no identity since these are different axes.
  • In the algebraic approach, Δx is set to zero to create the derivative, in particular the value of f ‘[x] = tan[α] at point P.  In this situation, Δx = 0 thus clearly differs from the values of dx that are still available on dx ‘s whole own axis. This explains why the creation of the differentials is useful. For, while Δx is set to 0, then the differentials can take any value, including 0.

Just to be sure, the algebraic approach uses this definition:

f ’[x] = {Δf // Δx, then set Δx = 0}

Subsequently, we define dy = f ‘[x] dx, so that we can discuss the relative error r = Δf // Δx – dy // dx.
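The recipe can be followed step by step for polynomials. The sketch below is an illustration under assumptions: f is given by its coefficients (highest power first), Δf is expanded with the binomial theorem as a polynomial in Δx, and the function names are hypothetical.

```python
from math import comb


def delta_f_coefficients(coeffs, x):
    """Coefficients k1, k2, ... of Δf = f[x + Δx] - f[x] = k1 Δx + k2 Δx² + ...

    coeffs: coefficients of f from the highest power down to the constant.
    """
    low = coeffs[::-1]  # low[k] is the coefficient of x^k
    degree = len(low) - 1
    return [
        sum(low[k] * comb(k, j) * x ** (k - j) for k in range(j, degree + 1))
        for j in range(1, degree + 1)
    ]


def derivative_at(coeffs, x):
    """f'[x] = {Δf // Δx, then set Δx = 0}."""
    ks = delta_f_coefficients(coeffs, x)
    # Δf // Δx cancels one factor Δx: k1 + k2 Δx + ...
    # Setting Δx = 0 afterwards leaves k1.
    return ks[0] if ks else 0


def absolute_error(coeffs, x, dx):
    """ε = Δf - s Δx with s = f'[x], for a chosen value of Δx."""
    ks = delta_f_coefficients(coeffs, x)
    delta_f = sum(k * dx ** (j + 1) for j, k in enumerate(ks))
    return delta_f - derivative_at(coeffs, x) * dx


# f[x] = x^2 - 2 at x = 1
print(derivative_at([1, 0, -2], 1))        # 2
print(absolute_error([1, 0, -2], 1, 0.5))  # 0.25
```

There is no constant term in Δf, so the factor Δx always cancels and nothing is divided by zero. For Δx = 0 the error is zero, as desired at the point P.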

PM. Check COTP p224 for the discussion of (relative) error, with the same notation. This present discussion still replaces the statement on differentials in COTP p155, step number 10.

A subsequent point w.r.t. the standard approach

Our main point thus is that the mistake in history was to write dy / dx instead of dy // dx. There arises a subsequent point of didactics. When you have real variables y and z, then these have their own axes, and you don’t put them on the same axis just because they are both reals.

See Appendix A for a quote from Spiegel (1962), and check that it is convoluted at times.

Appendix B contains a quote from p236 from Adams & Essex (2013). We can see the same confusions as in Spiegel (1962). It really is a standard approach, and convoluted.

The standard approach takes Δx = dx and joins the axis for the variable Δy with the axis for the variable dy, with the common idea of “a change from y“. The idea of this setup is that it shows the error for values of Δx = dx.

2016-12-10-delta-and-d

It remains an awkward setup. It may well be true that John from Los Angeles is called Harry in New York, but when John calls his mother back home and introduces himself as “Mom, this is Harry”, then she will be confused. Eventually she can get used to this kind of phonecalls, but it remains awkward didactics to introduce students to these new concepts in this manner. (Especially when John adds: “Mom, actually I don’t exist anymore because I have been set to zero.”)

Thus, in good didactics we should drop this Δx = dx.

Alternatively put: We might define dy = f ’[x] Δx = {Δf // Δx, then set Δx = 0} Δx. In the latter expression Δx occurs twice: both as a local and bound variable within { … } and as a global free variable outside of { … }. This is okay. In the past, mathematicians apparently thought that it might make things clearer to write dx for the free global variable: dy = f ’[x] dx. In a way this is okay too. But for didactics it doesn’t work. We should rather avoid an expression in which the same variable (name) is used both locally bound and globally free.

Clear improvement

Remarkably, we are using 99% of the same apparatus as the standard approach, but there are clear improvements:

  • There is no use of limits. All information is contained in the algebra of both the function f and the dynamic quotient. See here for continuity.
  • There is a clear distinction between the three realms {x, y}, {Δx, Δy} and {dx, dy}.
  • There is the new tool of the {dx, dy} space that can be used for analysis of variations.
  • Didactically, it is better to first define the derivative in chapter 1, and then introduce the differentials in chapter 2, since the differentials aren’t needed to understand chapter 1.
  • There is clarity about the error, that one doesn’t take dx ≈ Δx but considers ε = Δf – s Δx, where s has been found from the recipe s = f ’[x] = {Δf // Δx, then set Δx = 0}.

Example by Van Dormolen (1970:219)

This example assumes the total differential of the function f[x, y]:

df = (∂f // ∂x) dx + (∂f // ∂y) dy

Question. Give the slope of the tangent in the point {3, 4} of the circle x² + y²  = 25.

Answer. The point is on the circle indeed. We write the equation as f[x, y] = x² + y²  = 25. The total differential gives 2x dx + 2y dy = 0. Thus dy // dx = – x // y. Evaluation at the point {3, 4} gives the slope – 3/4.  □
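The arithmetic of this answer can be checked with exact fractions. This is a small sketch for the circle only; the function name `slope_on_circle` is made up for the illustration, and the partial derivatives 2x and 2y are written in by hand rather than computed.

```python
from fractions import Fraction


def slope_on_circle(x, y):
    """dy // dx on x² + y² = constant, from 2x dx + 2y dy = 0."""
    df_dx = 2 * x  # ∂f // ∂x
    df_dy = 2 * y  # ∂f // ∂y
    return Fraction(-df_dx, df_dy)


# The point {3, 4} lies on x² + y² = 25.
assert 3**2 + 4**2 == 25
slope = slope_on_circle(3, 4)
print(slope)  # -3/4
```

At a point with y = 0 the tangent is vertical and `Fraction` raises a ZeroDivisionError, which matches the earlier remark that a vertical line has no slope s.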

PM. We might develop y algebraically as a function of x and then use the +√ rather than the -√. However, more abstractly, we can use y = g[x], and use dy = g ‘[x] dx, so that the slope of the tangent is g ‘[x] at the point {3, 4}. Subsequently we use g ‘[x] = dy // dx.

PM. In the Dutch highschool programme, partial derivatives aren’t included, but when we can save time by a clear presentation, then they surely should be introduced.

Conclusion

The conclusion is that the algebraic approach to the derivative also settles the age-old question about the meaning of the differentials.

For texts in the past the interpretation of the differential is a mess. For the future, textbooks now have the option of above clarity.

Again, a discussion about didactics is an inspiration for better mathematics. Perhaps research mathematicians have abandoned this topic for ages, and it is only looked at by researchers on didactics.

Appendix A. Spiegel (1962)

Quote from Murray Spiegel (1962), “Advanced calculus (Metric edition)”, Schaum’s outline series, p58-59.

1962-spiegel-p58-59-gray

Appendix B. Adams & Essex (2013)

The following quote is from Robert A. Adams & Christopher Essex (2013), “Calculus. A Complete Course”, Pearson, p236.

  • It is a pity that they use c as a value of x rather than as a universal name for a constant (value on the y axis).
  • For them, the differential cannot be zero, while Spiegel conversely states that it is “not necessarily zero”.
  • They clearly show that you can take f ‘[x] Δx in {Δx, Δy} space, and that you then need a new symbol for the outcome, since Δy already has been defined differently. However, it is awkward to say: “For such an approximation, the quantity Δx is traditionally denoted as dx (…)”. It may well be true that John from Los Angeles is called Harry in New York, … etcetera, see above.

2013-03-adams-calculus-acompletecourse-p236-figure

2013-03-adams-calculus-acompletecourse-p236-text