Tag Archives: Differential

Our protagonists are Cartesius (1596-1650) and Fermat (1607-1665). As Judith Grabiner states, in a recommendable text:

“One could claim that, just as the history of Western philosophy has been viewed as a series of footnotes to Plato, so the past 350 years of mathematics can be viewed as a series of footnotes to Descartes’ Geometry.” (Grabiner) (But remember Michel Onfray’s observation that followers of Plato have been destroying texts by opponents. (Dutch readers check here.))

Both Cartesius and Fermat were involved in the early development of calculus. Both worked on the algebraic approach without limits. Cartesius developed the method of normals and Fermat the method of adequality.

Fermat and Δf / Δx

Fermat’s method was itself algebraic, but was later developed into the method of limits anyhow. When asked what the slope of a ray y = s x is at the point x = 0, then the answer y / x = s runs into problems, since we cannot use 0 / 0. The conventional answer is to use limits. This problem is more striking when one considers the special ray that is defined everywhere except at the origin itself. The crux of the problem lies in the notion of slope Δf / Δx that obviously has a problematic division. With set theory we can now define the “dynamic quotient”, so that we can use Δf // Δx = s even when Δx = 0, so that Fermat’s problem is resolved, and his algebraic approach can be maintained. This originated in 2007, see Conquest of the Plane (2011).

Cartesius and Euclid’s notion of tangency

Cartesius followed Euclid’s notion of tangency. Scholars tend to assign this notion to Cartesius as well, since he embedded the approach within his new idea of analytic geometry.

I thank Roy Smith for this eye-opening question:

“Who first defined a tangent to a circle as a line meeting it only once? From googling, it seems commonly believed that Euclid did this, but it seems nowhere in Euclid does he even state this property of a tangent line explicitly. Rather Euclid gives 4 other equivalent properties, that the line does not cross the circle, that it is perpendicular to the radius, that is a limit of secant lines, and that it makes an angle of zero with the circle, the first of which is his definition, the others being in Proposition III.16. I am wondering where the “meets only once” definition got started. I presume once it got going, and people stopped reading Euclid, (which seems to have occurred over 100 years ago), the currently popular definition took over. Perhaps I should consult Legendre or Hadamard? Thank you for any leads.” (Roy Smith, at StackExchange)

In this notion of tangency there is no problematic division, whence there is no urgency to use limits.

The reasoning is:

  • (Circle & Line) A line is tangent to a circle when there is only one common point (or the two intersecting points overlap).
  • (Circle & Curve) A smooth curve is tangent to a circle when the  two intersecting points overlap (but the curve might cross the circle at that point so that the notion of “two points” is even more abstract).
  • (Curve & Line) A curve is tangent to a line when the above two properties hold (but the line might cross the curve, whence we better speak about incline rather than tangent).
Example of line and circle

Consider the line y = f[x] = c + s x and the point {a, f[a]}. The line can also be written with c = f[a] – s a:

y – f[a] = s (x – a)

The normal has slope –s^H, where we use H = -1 (so that s^H = 1 / s). The formula for the normal is the line y – f[a] = –s^H (x – a). We can choose the center of the circle anywhere on this line. A handy choice is {u, 0}, so that we choose the center on the horizontal axis. (If we looked at a ray and point {0, 0}, then the issue would be similar for {0, c} for nonzero c and thus the approach remains general.) Substituting the point into the normal gives

0 – f[a] = –s^H (u – a)

s = (u – a) / f[a]

u = a + s f[a]

The circle has the formula (x – u)² + y² = r². Substituting {a, f[a]} generates the value for the radius r² = (a – (a + s f[a]))² + f[a]² = (1 + s²) f[a]². The following diagram has {c, s, a} = {0, 2, 3} and thus u = 15 and r = 6√5.
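
As a quick numerical check of this example, here is a minimal Python sketch. The function name tangent_circle is only illustrative and not from the text. It assumes the center is taken on the horizontal axis at {u, 0}, with u = a + s f[a] and r² found by substituting {a, f[a]} into the circle.

```python
from fractions import Fraction

def tangent_circle(c, s, a):
    """Center coordinate u (center is {u, 0}) and squared radius r^2 of the
    circle that touches the line y = c + s x at x = a."""
    fa = c + s * a                # f[a]
    u = a + s * fa                # from s = (u - a) / f[a]
    r2 = (a - u) ** 2 + fa ** 2   # substitute {a, f[a]} into the circle
    return u, r2

u, r2 = tangent_circle(Fraction(0), Fraction(2), Fraction(3))
print(u, r2)   # 15 180
```

Here r² = 180 = 36 · 5, which agrees with r = 6√5.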

 

descartes

Method of normals

For the method of normals and arbitrary function f[x], Cartesius’s trick is to substitute y = f[x] into the formula for the circle, and then solve for the unknown center of the circle.

(x – u)² + (y – 0)² = r²

(x – u)² + f[x]² – r² = 0         … (* circle)

This expression is only true for x = a, but we treat it as if it were more general. The key property is:

Since {a, f[a]} satisfies the circle, this equation has a solution for x = a with a double root.

Thus there might be some g such that the root can be isolated:

(x – a)² g[x, u] = 0         … (* roots)

Thus, if we succeed in rewriting the formula for the circle into the form of the formula with the two roots, then we can use information about the structure of the latter to say something about u.

The method works for polynomials, that obviously have roots, but not necessarily for trigonometry and the exponential function.

Algorithm

The algorithm thus is: (1) Substitute f[x] in the formula for the circle. (2) Compare with the expression with the double root. (3) Derive u. (4) Then the line through {a, f[a]} and {u, 0} will give slope –s^H. Thus s = (u – a) / f[a] gives the slope of the incline (tangent) of the curve. (5) If f[a] = 0, add a constant or choose center {u, v}.

Application to the line itself

Consider the line y = f[x] = c + s x again. Let us apply the algorithm. The formula for the circle gives:

(x – u)² + (c + s x)² – r² = 0

x² – 2ux + u² + c² + 2csx + s²x² – r² = 0

(1 + s²) x² – 2 (u – cs) x + u² + c² – r² = 0

This is a polynomial. It suffices to choose g[x, u] = 1 + s² so that the coefficients of x² are the same. Also the coefficient of x must be the same. Thus expanding (x – a)²:

(1 + s²) (x² – 2ax + a²) = 0

– 2 (u – cs) = – 2 a (1 + s²)

u = a (1 + s²) + cs = a + s (c + s a) = a + s f[a]

which is the same result as above.
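
The double root can also be checked numerically. The following Python sketch is only an illustration, with hypothetical function names. It assumes u = a + s f[a] and r² = (1 + s²) f[a]², builds the quadratic in x that results from substituting y = c + s x into the circle, and tests that its discriminant vanishes with the root at x = a.

```python
from fractions import Fraction

def circle_line_quadratic(c, s, a):
    """Coefficients (A, B, C) of A x^2 + B x + C = 0, obtained by substituting
    y = c + s x into the circle (x - u)^2 + y^2 = r^2."""
    fa = c + s * a
    u = a + s * fa               # center on the horizontal axis
    r2 = (1 + s * s) * fa * fa   # squared radius through {a, f[a]}
    A = 1 + s * s
    B = -2 * (u - c * s)
    C = u * u + c * c - r2
    return A, B, C

def has_double_root_at(coeffs, a):
    """True when the quadratic has zero discriminant and its root is x = a."""
    A, B, C = coeffs
    return B * B - 4 * A * C == 0 and -B / (2 * A) == a

coeffs = circle_line_quadratic(Fraction(1), Fraction(2), Fraction(3))
print(has_double_root_at(coeffs, Fraction(3)))   # True
```

Exact fractions are used so that the test on the discriminant is not disturbed by rounding.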

A general formula with root x – a

We can deduce a general form that may be useful on occasion. When we substitute the point {a, f[a]} into the formula for the circle, then we can find r, and actually eliminate it.

(x – u)² + f[x]² = r² = (a – u)² + f[a]²

f[x]² – f[a]² = (a – u)² – (x – u)²

(f[x] – f[a]) (f[x] + f[a]) = ((a – u) – (x – u)) ((a – u) + (x – u))

(f[x] – f[a]) (f[x] + f[a]) = (a – x) (a + x – 2u)

f[x] – f[a] = (a – x) (a + x – 2u) / (f[x] + f[a])

f[x] – f[a] = (x – a) (2u – x – a) / (f[x] + f[a])       … (* general)

f[x] – f[a] = (x – a) q[x, a, u]

We cannot do much with this, since this is basically only true for x = a and f[x] – f[a] = 0. Yet we have this “branch cut”:

(1)      q[x, a, u] = (f[x] – f[a]) / (x – a)        if x ≠ a

(2)      q[a, a, u]      potentially found by other means

If it is possible to “simplify” (1) into another expression Simplify[q[x, a, u]] without the division, then the tantalising question becomes whether we can “simply” substitute x = a. Or, if we were to find q[a, a, u] via other means in (2), whether it links up with (1). These are questions of continuity, and those are traditionally studied by means of limits.

Theorem on the slope

We can still use the general formula to state a theorem.

Theorem. If we can eliminate factors without division, then there is an expression q[x, a, u] such that evaluation at x = a gives the slope s of the line, or q[a, a, u] = s, such that at this point both curve and line are touching the same circle.

Proof. Eliminating factors without division in above general formula gives:

q[x, a, u] = (2u – x – a) / (f[x] + f[a])

Setting x = a gives:

q[a, a, u] = (u – a) / f[a]

And the above s = (u – a) / f[a] implies that q[a, a, u] = s. QED

This theorem gives us the general form of the incline (tangent).

y[x, a, u] = (x – a) q[a, a, u] + f[a]       …  (* incline)

y[x, a, u] = (x – a) (u – a) / f[a] + f[a]

PM. Dynamic division satisfies the condition “without division” in the theorem. For, the term “division” in the theorem concerns the standard notion of static division.

Corollary. Polynomials as the showcase

Polynomials are the showcase. For polynomials p[x], there is the polynomial remainder theorem:

When a polynomial p[x] is divided by (x – a) then the remainder is p[a].
(Also, x – a is called a “divisor” of the polynomial if and only if p[a] = 0.)

Using this property we now have a dedicated proof for the particular case of polynomials.

Corollary. For polynomials q[a] = s, with no need for u.

Proof. Now, p[x] – p[a] = 0 at x = a implies that x = a is a root, and then there is a “quotient” polynomial q[x] such that:

p[x] – p[a] = (x – a) q[x]

From the general theorem we also have:

p[x] – p[a] = (x – a) q[x, a, u]

Eliminating the common factor (x – a) without division and then setting x = a gives q[a] = q[a, a, u] = s. QED

We now have a sound explanation why this polynomial property gives us the slope of the polynomial at that point. The slope is given by the incline (tangent), and it must also be the slope of the polynomial because of the mutual touching of the same circle.

See the earlier discussion about techniques to eliminate factors of polynomials without division. We have seen a new technique here: comparing the coefficients of factors.

Second corollary

Since q[x] is a polynomial too, we can apply the polynomial remainder theorem again, and thus we have q[x] = (x – a) w[x] + q[a] for some w[x]. Thus we can write:

p[x] = (x – a) q[x] + p[a]

p[x] = (x – a) ( (x – a) w[x] + q[a] ) + p[a]       … (* Ruffini’s Rule twice)

p[x] = (x – a)² w[x] + (x – a) q[a] + p[a]           … (* Range’s proof)

p[x] = (x – a)² w[x] + y[x, a]                             … (* with incline)

We see two properties:

  • The repeated application of Ruffini’s Rule uses the indicated relation to find both s = q[a] and constant f[a], as we have seen in the last discussion.
  • Evaluating f[x] / (x – a)² gives the remainder y[x, a], which is the formula for the incline.
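
The repeated application of Ruffini’s Rule can be sketched in Python as follows. This is an illustrative sketch and not code from the cited sources. The names ruffini and value_and_slope are assumptions, coefficients are listed from the highest power down, and the polynomial is assumed to have degree of at least 1.

```python
def ruffini(coeffs, a):
    """Divide a polynomial by (x - a) using Ruffini's Rule.
    coeffs lists the coefficients from the highest power down.
    Returns (quotient coefficients, remainder)."""
    row = []
    acc = 0
    for coef in coeffs:
        acc = acc * a + coef   # bring down, multiply by a, add
        row.append(acc)
    return row[:-1], row[-1]

def value_and_slope(coeffs, a):
    """Apply the rule twice: the first remainder is p[a], the second is q[a]."""
    q, p_a = ruffini(coeffs, a)
    w, q_a = ruffini(q, a)
    return p_a, q_a, w

# p[x] = x^3 - 7x - 6 at a = 4
print(value_and_slope([1, 0, -7, -6], 4))   # (30, 41, [1, 8])
```

In this example p[4] = 30 and q[4] = 41, so that the incline at x = 4 would be y = 41 (x – 4) + 30, with w[x] = x + 8.
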
Range’s proof method

Michael Range proves q[a] = s as follows (in this article (p406) or book (p32)). Take (* Range’s proof) above and determine the error by subtracting the line y = s (x – a) + p[a]:

error = p[x] – y = (x – a)² w[x] + (x – a) q[a] – s (x – a)

= (x – a)² w[x] + (x – a) (q[a] – s)

The error = 0 has a root x = a with multiplicity greater than one if and only if s = q[a].

Direct application to the incline itself

Now that we have established this theory, there may be no need to refer to the circle explicitly. It can suffice to use the property of the double root. Michael Range (2014) gives the example of the incline (tangent) at x² at {a, a²}. The formula for the incline is:

f[x] – f[a]  = s (x – a)

x² – a² – s (x – a) = 0

(x – a) (x + a – s) = 0

There is only a double root or (x – a)² when s = 2a.

Working directly on the line allows us to focus on s, and we don’t need to determine q[x] and plug in x = a.
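
The double-root condition for this example can be checked with a small Python sketch. The function names are hypothetical. The equation x² – a² – s (x – a) = 0 is written as x² – s x + (s a – a²) = 0, so that its discriminant is s² – 4 (s a – a²) = (s – 2a)².

```python
def error_discriminant(a, s):
    """Discriminant of x^2 - s x + (s a - a^2) = 0,
    i.e. of x^2 - a^2 - s (x - a) = 0."""
    return s * s - 4 * (s * a - a * a)

def error_roots(a, s):
    """The two roots read off from the factored form (x - a) (x + a - s) = 0."""
    return a, s - a

a = 3
print(error_discriminant(a, 2 * a))   # 0
print(error_roots(a, 2 * a))          # (3, 3)
print(error_roots(a, 5))              # (3, 2)
```

Only for s = 2a do the two roots coincide; for any other s the line cuts the parabola in a second point x = s – a.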

Michael Range (2011) clarifies, with thanks to a referee, that the “point-slope” form of a line was introduced by Gaspard Monge (1746-1818), and that Descartes apparently did not think of this himself, and thus also not of plugging in y = f[x] here. However, observe that we can only maintain that there must be a double root in this line form too because {a, f[a]} still lies on a tangent circle.

[Addendum 2017-01-10: The later argument in a subsequent weblog entry becomes: If the function can be factored twice, then there is no need to refer to the circle. But when this would be equivalent to the circle then such a distinction is immaterial.]

Addendum. Example of function crossing a circle

When a circle touches a curve, it still remains possible that the curve crosses the circle. The original idea of two points merging together into an overlapping point then doesn’t apply anymore, since there is only one intersecting point on either side if the circle were smaller or bigger.

An example is the spline function g[x] = {If x < 0 then 4 – x² / 4 else 4 + x² / 4}. This function is C1 continuous at 0, meaning that the sections meet and that the slopes of the two sections are equal at 0, while the second and higher derivatives differ. The circle with center at {0, 0} and radius 4 still fits the point {0, 4}, and the incline is the line y = 4.

descartes-spline

An application of above algorithm would look at the sections separately and paste the results together. Thus this might not be the most useful example of crossing.
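
That the spline indeed crosses the circle at {0, 4} can be checked numerically. This Python sketch is only an illustration with hypothetical function names. It evaluates the sign of x² + g[x]² – 16, which is negative inside the circle of radius 4 around the origin and positive outside it.

```python
from fractions import Fraction

def g(x):
    """The spline: 4 - x^2 / 4 for x < 0, and 4 + x^2 / 4 otherwise."""
    return 4 - x * x / 4 if x < 0 else 4 + x * x / 4

def circle_side(x):
    """Sign of x^2 + g[x]^2 - 16: -1 inside the circle, 0 on it, +1 outside."""
    d = x * x + g(x) ** 2 - 16
    return (d > 0) - (d < 0)

for x in (Fraction(-1, 2), Fraction(0), Fraction(1, 2)):
    print(x, circle_side(x))
# -1/2 -1
# 0 0
# 1/2 1
```

Just left of 0 the curve lies inside the circle and just right of 0 it lies outside, so that the curve touches and crosses the circle at the same point.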

In this example there might be no clear two “overlapping” points. However, observe:

  • Lines through {0, 4} might have three points with the curve, so that the incline might be seen as having three overlapping points.
  • Points on the circle can always be seen as the double root solutions for tangency at that point.
Addendum. Discussion

There is still quite a conceptual distance between (i) the story about the two overlapping points on the circle and (ii) the condition of double roots in the error between line and polynomial.

The proof given by Range uses the double root to infer the slope of the incline. This is mathematically fine, but this deduction doesn’t contain a direct concept that identifies q[a] as the slope of an incline (tangent): it might be any line.

We see this distinction between concept and algorithm also in the direct application to Monge’s point-slope formulation of the line. Requiring a double root works, but we can only do so because we know about the theory about the tangent circle.

The combination of circle and line remains the fundamental reason why there are two roots. Thus the more general proof given above, that reasons from the circle and unpacks f[x]² – f[a]² into the conditions for incline and its normal, is conceptually more attractive. I am new to this topic and don’t know whether there are references for this general proof.

Conclusions

(1) We now understand where the double root comes from. See the earlier discussion on polynomials, Ruffini’s rule and the meaning of division (see the section on “method 2”).

(2) There, we referred to polynomial division, with the comment: “Remarkably, the method presumes x ≠ a, and still derives q[a]. I cannot avoid the impression that this method still has a conceptual hole.” However, we now observe that we can compare the values of the coefficients of the powers of x, whence we can avoid also polynomial division.

(3) There, we had a problem that developing p[x] = (x – a)² w[x] + y[x, a] didn’t have a notion of tangency, in terms of Δf / Δx. However, we actually have a much older definition of tangency.

(4) The above states an algorithm and a general theorem with the requirements that must be satisfied.

(5) Cartesius beats Fermat on this issue of the incline (tangent), and actually also on providing an exact method for polynomials, where Fermat introduced the problem of error.

(6) For trigonometry and exponentials we know that these can be written as power series, and thus the Cartesian method would also apply. However, the power series are based upon derivatives, and this would introduce circularity. However, the method of the dynamic quotient from 2007 still allows an algebraic result. The further development from Fermat into the approach with limits would become relevant for more complex functions.

PM. The earlier discussion referred to Peter Harremoës (2016) and John Suzuki (2005) on this approach. New to me (and the book unread) are: Michael Range (2011), the recommendable Notices, or the book (2015) – review Ruane (2016) – and Shen & Lin (2014).

Cartesius, Portrait by Frans Hals 1648


Joost Hulshof & Ronald Meester (2010) suggest to introduce the derivative in highschool by means of polynomials (pdf p16-17). My problem is that they first hide the limit and then let it ambush the student. Thus:

  • When they say that “you can present the derivative for polynomials without limits” then they mean this only for didactics and not for mathematics.
  • But they are not trained in didactics, so they are arguing this as a hobby, as mathematicians with a peculiar view on didactics. They provide a course for mathematics teachers, but this concerns mathematics and not didactics.
  • They only hide the limit, but they do not deny that fundamentally you must refer to limits.
  • Eventually they still present the limit to maintain exactness, but then it has no other role than to link up to a later course (perhaps only for mathematicians).
  • Thus, they make the gap between “didactics” and proper “mathematics” larger on purpose.
  • This is quite different from the algebraic approach (see here), that really avoids limits, and also argues that limits are fundamentally irrelevant (for the functions used in highschool).

I have invited Hulshof since at least 2013 (presentation at the NVvW study day) to look at the algebraic approach to the derivative. He refuses to look into it and write a report on it, though he was so kind to look at this recent skirmish.

Hulshof refers to his approach perhaps as sufficient. It is quite unclear what he thinks about all this, since he doesn’t discuss the proposal of the algebraic approach to the derivative.

Let me explain what is wrong with their approach with the polynomials.

Please let mathematicians stop infringing upon didactics of mathematics. It is okay to check the quality of mathematics in texts that didacticians produce, but stop this “hobby” of second-guessing.

PM. A recent text is Hulshof & Meester (2015), “Wiskunde in je vingers“, VU University Press (EUR 29.95). Potentially they have improved upon the exposition in the pdf, but I am looking at the pdf only. Meester lists this book as “books mathematics” (p14). Hulshof calls it “concepts from mathematics” with “uncommon viewpoints” for “teacher, student” and for “education and curriculum”. When you address students then it is didactics. It is unclear why VU University Press thinks that he and Meester are qualified for this.

The incline

A standard notation for a line is y = c + s x, for constant c and slope s.

The line gives us the possibility of a definition of the incline (Dutch: richtlijn). An incline is defined for a function and a point. An incline of a function f at a point {a, f[a]} is a line that gives the slope of that function at that point.

It is wrong to say that the incline “has the same slope”. You are not comparing two lines. You are looking at the slope. You only know the slope of the function because of the incline (the line with that slope).

Incline versus tangent

The incline is often called the tangent. Students tend to think that tangents cannot cross the function, while tangents actually can. Thus incline can be a better term.

Hulshof & Meester refer in horror to the Oxford Advanced Learner’s Dictionary, that has:

ERROR “Tangent: (geometry) a straight line that touches the outside of a curve but does not cross it. The cart track branches off at a tangent.”

I don’t think that “incline” will quickly replace “tangent”. But it is useful to discuss the issue with students and offer them an alternative word if “tangent” continues to confuse them. It is useful to start a discussion with students by mentioning the (quite universal) intuition of not-crossing. An orange touches a table, and doesn’t cross it. But mathematically it would be quite complex to test whether there is any crossing or not. Thus it is simpler to focus on the idea of incline, straight course, alignment.

When you swing a ball and then let go, then the ball will continue in the incline of the last moment. The incline captures that idea, by giving the line with that very slope.

I thank Peter Harremoës for a discussion on this (quite universal) confusion by students (and the OALD) and potential alternative terms. (Incline is still a suggestion.) (The word “directive” was rejected as too confusing with “derivative”. But Dutch “richtlijn” is better than raaklijn.)

Polynomials and their division

A polynomial of degree n has powers of x up to n:

p[x] = c + s x + c_2 x² + … + c_n x^n.

In this, we take c = c_0 and s = c_1. For n = 1 we get the line again. We allow that the line has s = 0, so that we can have a horizontal line, which would strictly be a polynomial of n = 0. There is also the vertical line, that cannot be represented by a polynomial.

If p[a] = 0 then x = a is called a zero of the polynomial. Then (x – a) is called a factor, and the polynomial can be written as

p[x] = (x – a) q[x]

where q[x] is a polynomial of a lower degree.

If p[a] ≠ 0 then we can still try to factor with (x – a) but then there will be a remainder, as p[x] = (x – a) q[x] + r[x]. When we consider p[x] – r[x] then x = a is a zero of this. Thus:

 p[x] – r[x] = (x – a) q[x]

With polynomials we can do long division as with numbers. The following example is the division of x³ – 7x – 6 by x – 4 that generates a remainder.
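
The long division of this example can be reproduced with a short Python sketch. The function name poly_divmod is an assumption for illustration. Coefficients are listed from the highest power down, and the leading coefficient of the divisor is assumed to be nonzero.

```python
from fractions import Fraction

def poly_divmod(num, den):
    """Long division of polynomials given as coefficient lists,
    highest power first. Returns (quotient, remainder)."""
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    quotient = []
    while len(num) >= len(den):
        factor = num[0] / den[0]
        quotient.append(factor)
        # subtract factor * den from the leading terms of num
        for i, d in enumerate(den):
            num[i] -= factor * d
        num.pop(0)   # the leading term is now zero
    return quotient, num   # what is left of num is the remainder

q, r = poly_divmod([1, 0, -7, -6], [1, -4])
# q holds 1, 4, 9 (quotient x^2 + 4x + 9) and r holds 30 (the remainder)
```

The remainder 30 is also the value p[4] = 64 – 28 – 6, in line with the polynomial remainder theorem.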

purplemath-division

Incline or tangent at a polynomial

Regard the polynomial p[x] at x = a, so that b = p[a]. We consider point {a, b}. What incline does the curve have?

(A) For the incline we have the line in {a, b}:

y – b = s (x – a)

(B) We have p[a] – b = 0 and thus x = a is a zero of the polynomial p[x] – b. Thus:

p[x] – b = (x – a) q[x]

(C) Thus (A) and (B) allow us to assume y ≈ p[x] and to look at the common term x – a, “so that” (quotes because this is problematic):

s = q[a]

The example by Hulshof & Meester is p[x] = x² – 2 at the point {a, b} = {1, -1}.

p[x] – b = (x² – 2) – (-1) = x² – 1

Factor:  (x² – 1) =  (x – 1) q[x]

Or divide: q[x] = (x² – 1) / (x – 1)  = x + 1

Substituting the value x = a = 1 in x + 1 gives q[a] = q[1] = 2.

H&M apparently avoid division by using the process of factoring.

Later they mention the limiting process for the division: Limit[x → 1, q[x]] = Limit[x → 1, (x² – 1) / (x – 1)] = 2.
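
The difference between dividing and factoring in this example can be made concrete with a small Python sketch. The function names are hypothetical. The unsimplified quotient (x² – 1) / (x – 1) fails at x = 1, while the factored form x + 1 can be evaluated there.

```python
from fractions import Fraction

def static_quotient(x):
    """(x^2 - 1) / (x - 1), evaluated as it stands."""
    return (x * x - 1) / (x - 1)   # raises ZeroDivisionError at x = 1

def factored_quotient(x):
    """q[x] = x + 1, the result of factoring out (x - 1)."""
    return x + 1

x = Fraction(3, 2)
print(static_quotient(x) == factored_quotient(x))   # True
print(factored_quotient(Fraction(1)))               # 2

try:
    static_quotient(Fraction(1))
except ZeroDivisionError:
    print("undefined at x = 1")
```

Away from x = 1 both forms agree, and only the factored form gives a value at x = 1. The step from the one to the other is exactly what needs an explanation.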

Critique

As said, the H&M approach is convoluted. They have no background in didactics and they hide the limit (rather than explaining its relevance since they still deem it relevant).

Mathematically, they might argue that they don’t divide but only factor polynomials.

  • But, when you are “factoring systematically” then you are actually dividing.
  • When you use “realistic mathematics education” then you can approximate division by trial and error of repeated subtraction, but I don’t think that they propose this. See the “partial quotient method” and my comments.
  • Addendum December 22: there is a way to look only at coefficients, Ruffini’s Rule, in wikipedia called Horner’s method. A generalisation is known as synthetic division, which expresses that it is no real division, but a method of factoring. (MathWorld has a different entry on “Horner’s method“.) See the next weblog entry.

When dividing systematically, you are using algebra, and you are assuming that a denominator like x – 1 isn’t zero but an abstract algebraic term. Well, this is precisely what the algebraic approach to the derivative has been proposing. Thus, their suggestion provides support for the algebraic approach, be it, that they do it somewhat crummy and non-systematically, whence it is little use to refer to this kind of support.

Didactically, their approach is undeveloped. They compare the slopes of the polynomial and the line, but there is no clear discussion why this would be a slope, or why you would make such a comparison. Basically, you can compare polynomials of order n with those of order m, and this would be a mathematical exercise, but devoid of interpretation. For didactics it does make sense to discuss: (a) the notion of “slope” of a function is given by the incline, (b) we want to find the incline of a polynomial for a particular reason (e.g. instantaneous velocity), (c) we can find it by a procedure called “derivative”. NB. My book Conquest of the Plane starts with surface and integral, and only later looks at slopes.

A main criticism however is also that H&M overlooked the fundamental problem with the notion of a slope of a line itself. They rely on some hidden issues here too. I discussed this recently, and repeat this below.

PM. See a discussion of approximating a function by polynomials. Observe that we are not “approximating” a function by its incline now. At {a, b} the values and slope are exactly the same, and there is nothing approximate about this. Only at other points we might say that there is an “error” by looking at the incline rather than the polynomial, but we are not looking at such errors now, and this would be a quite different topic of discussion.

Copy of December 8 2016: Ray through an origin

Let us first consider a ray through the origin, with horizontal axis x and vertical axis y. The ray makes an angle α with the horizontal axis. The ray can be represented by a function as y =  f [x] = s x, with the slope s = tan[α]. Observe that there is no constant term (c = 0).

2016-12-08-ray

The quotient y / x is defined everywhere, with the outcome s, except at the point x = 0, where we get an expression 0 / 0. This is quite curious. We tend to regard y / x as the slope (there is no constant term), and at x = 0 the line has that slope too, but we seem unable to say so.

There are at least three responses:

(i) Standard mathematics then takes off, with limits and continuity.

(ii) A quick fix might be to try to define a separate function to find the slope of a ray, but we can wonder whether this is all nice and proper, since we can only state the value s at 0 when we have solved the value elsewhere. If we substitute y when it isn’t a ray, for example x², then we get a curious construction, and thus the definition isn’t quite complete since there ought to be a test on being a ray.

2016-12-10-slopeofray

 

 

(iii) The algebraic approach uses the following definition of the dynamic quotient:

y // x ≡ { y / x, unless x is a variable and then: assume x ≠ 0, simplify the expression y / x, declare the result valid also for the domain extension x = 0 }

Thus in this case we can use y // x = s x // x = s, and this slope also holds for the value x = 0, since this has now been included in the domain too.
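
For the restricted case in which y is a polynomial in x, the idea can be sketched in Python. This is only an illustration under that assumption and not the full definition of the dynamic quotient. The function names are hypothetical, and coefficients are listed from the constant term upward.

```python
def dynamic_quotient_by_x(coeffs):
    """y // x for y = coeffs[0] + coeffs[1] x + coeffs[2] x^2 + ...
    Simplifies y / x algebraically by dropping one power of x; the result
    is then declared valid for x = 0 as well."""
    if coeffs[0] != 0:
        raise ValueError("y / x does not simplify: nonzero constant term")
    return coeffs[1:]

def evaluate(coeffs, x):
    """Value of the polynomial with the given coefficients at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

ray = [0, 2]                          # y = 2 x, so s = 2
slope = dynamic_quotient_by_x(ray)    # [2], the constant polynomial s
print(evaluate(slope, 0))             # 2
```

The simplification is done on the coefficients, before any value for x is substituted, so that no division by zero occurs at x = 0.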

Line with constant

When we have a line y = c + s x, then a hidden part of the definition is that the slope is s everywhere, even though we cannot compute (y – c) / x when x = 0. (One might say: “This is what it means to be linear.”)

When we look at x = a and determine the slope by taking a difference Δx, then we get:

b = c + s a

b + Δy = c + s (a + Δx)

Δy = s Δx

The slope at x = a would be s but is also Δy / Δx, undefined for Δx = 0.

Thus, the slope of a line is either given as s for all points (or, critically for x = 0 too) (perhaps with a rule: if you find a slope somewhere then it holds everywhere), or we must use limits.

The latter can be more confusing when s has not been given and must be calculated from other resources. In the case of differentials dy = s dx, the notation dy / dx causes conceptual problems when s itself is found by a limit on the difference quotient.

Conclusions
  1. The H&M claim that polynomials can be used without limits is basically a didactic claim since they evidently still rely on limits (perhaps to fend off fellow mathematicians). This didactic claim is a wild-goose chase since they are not involved in didactics research.
  2. If they really would hold that factoring can be done systematically without division, then they might have a point, but then they still must give an adequate explanation how you get from (A) & (B) to (C). Saying that differences are “small” is not enough (not even for polynomials). Addendum December 22: see the next weblog entry on Ruffini’s rule.
  3. They present this for a “reminder course in mathematics” for teachers of mathematics, but it isn’t really mathematics, nor is it useful for teaching mathematics.
  4. A serious development that avoids limits and relies on algebraic methods, that covers the same area of polynomials but also trigonometry and exponential functions, is the algebraic approach to the derivative, available since 2007 with a proof of concept in Conquest of the Plane in 2011.
  5. It is absurd that Hulshof & Meester neglect the algebraic approach. But they are mathematicians, and didactics is not their field of research. I think that the algebraic method provides a fundamental redefinition of calculus, but I prefer the realm of didactics above the realm of mathematics with its culture of contempt for empirical science.
  6. The H&M exposition and neglect is just an example of Holland as Absurdistan, and the need to boycott Holland till the censorship of science by the directorate of the Dutch Central Planning Bureau has been lifted.

I wouldn’t want to be caught before a blackboard like that (Screenshot UChicago)

I am looking for a story on continuity and limits that can be told in junior highschool and still makes sense. We would like an isomorphism between space and numbers. For some aspects, mathematical theory sends us to number theory, and for isomorphic aspects, mathematical theory sends us to topology. It is awkward to have to translate similar notions, and to eliminate the overload of notions that are not directly relevant for this search for this junior highschool story.

For example, topology has rephrased results into statements on open and closed sets and boundaries, but I am wondering whether that is an effective manner of communication, when the relevant distinction is whether you are assuming a well-ordering or not. But I am not at home in number theory or topology. These comments on continuity and limits have been caused precisely because I am testing the waters.

Basically, I already designed such a story on continuity and limits (pdf, weblog), but now I am noticing that I can include a question mark on infinitesimals.

Addendum December 16

This weblog text is a rewrite of yesterday’s text.

Principle

The framework contains both the handling of real numbers on the calculator and a development of theory.

Example 1

Also in junior highschool, we want students to be aware that 0.999…. = 1.000…. so that these are the same number. You can see this by checking 3 * 1/3 = 1.

Example 2

When we approximate numbers with n the number of decimals, then these basically are like the natural numbers, and there remains a well ordering. Numbers are δ[n] = 10^(-n) apart.
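
The grid of numbers with n decimals can be sketched in Python with exact fractions. The function names are hypothetical and only illustrate the spacing δ[n] = 10^(-n) and the “next number” on such a grid.

```python
from fractions import Fraction

def delta(n):
    """Spacing between neighbouring numbers that have n decimals."""
    return Fraction(1, 10 ** n)

def on_grid(x, n):
    """True when x can be written with at most n decimals."""
    return (x * 10 ** n).denominator == 1

def successor(x, n):
    """The next n-decimal number after x (x is assumed to lie on the grid)."""
    return x + delta(n)

x = Fraction(314, 100)            # 3.14
print(successor(x, 2))            # 63/20, which is 3.15
print(on_grid(Fraction(1, 3), 2)) # False
```

For each fixed n every number on the grid has a next number, as with the natural numbers. The number 1/3 lies on none of these grids, which is where the infinite number of decimals comes in.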

When we shift to the use of the infinite number of decimals then we lose this “infinitesimal”. At issue is now whether the infinitesimal can be retained in some manner.

Standard definition of density causes contradiction

Discussing the continuum and the set of real numbers R recently, I suggested (here, property (a)) that R would be a dense set, according to the standard definition of density. This definition is that for any two elements x < y there would be at least one z between, as x < z < y. This would allow you to make cuts everywhere.

Oops. I retract.

Wikipedia (no source but a portal) has:

“From the ZFC axioms of set theory (including the axiom of choice) one can show that there is a well order of the reals.”

I don’t know quite what to think about this. Elsewhere I deduced that ZFC is inconsistent. But perhaps in a revised set theory, the well order can be retained.

We would like R to have a well-order for finite intervals too. Thus every number x has a next number x’. When you select y = x’  then you couldn’t find anything between x and y. This contradicts above statement on density.

Thus, the standard definition of density doesn’t fit a well-ordered R.

Designing a new definition for density of the reals R

We can design a new definition of density.

  • The standard definition is useful for the rationals Q. If we restrict your freedom to making cuts along Q, then we are safe again. In this manner, the distinction between rational and irrational numbers is useful to explain a property of R.
  • R is defined as “more dense” because Q is dense w.r.t. that original definition ((a)).
  • This proposal is quite similar to the Dedekind cut, with the distinction that we now allow that R might retain a well-ordering. That is, this issue on the ordering is no longer forced by the standard definition of density.

Surprise consequence as a bonus

Switching to another notion of density generates the bonus that we have more scope to introduce the infinitesimal.

When every number x has a next x’, then we can define the infinitesimal as the difference:

δ = x’ – x

It also means that an open set (a, b) can also be seen as a closed set [a + δ, b – δ].

Wikipedia (no source but a portal) claims today:

“The standard ordering ≤ of any real interval is not a well ordering, since, for example, the open interval (0, 1) ⊆ [0,1] does not contain a least element.”

Yet now we have (0, 1) = [δ, 1 – δ] and the least element is δ. Only the intervals with negative infinity might be excluded, check (-∞, ∞).
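In the n-decimal approximation this identity can be checked directly. The sketch below is only an illustration for finite n, not a statement about the reals themselves, and `open_unit_interval` is a hypothetical helper name.

```python
from fractions import Fraction


def open_unit_interval(n: int) -> list:
    """All n-decimal numbers strictly between 0 and 1."""
    d = Fraction(1, 10 ** n)
    return [k * d for k in range(1, 10 ** n)]


n = 2
d = Fraction(1, 10 ** n)
points = open_unit_interval(n)

print(min(points) == d)        # least element is delta[n]
print(max(points) == 1 - d)    # greatest element is 1 - delta[n]
```

For every finite n the open interval (0, 1) on the grid coincides with the closed interval [δ[n], 1 – δ[n]].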

Properties of these infinitesimals

Some properties are:

(1) We still have 0.999… = 1.000…

(2) A current statement is that a line consists of points, and each point is a co-ordinate without length. We now can better express that length consists of a sum of short lengths. A sum of these infinitesimals Σδ makes sense if we regard it as the sum from x = 0 to x = 1 for x’ – x. The trick is that the length is determined by the statement on x and not by the coefficient of δ.
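For the n-decimal approximation, the sum in property (2) can be written out as a telescoping sum. The sketch below assumes the sum runs over the grid numbers x = 0, δ[n], ..., 1 – δ[n], and `length_as_sum` is a hypothetical name.

```python
from fractions import Fraction


def length_as_sum(n: int) -> Fraction:
    """Sum of the gaps x' - x over all n-decimal x with 0 <= x < 1."""
    d = Fraction(1, 10 ** n)
    xs = [k * d for k in range(10 ** n)]    # 0, d, 2d, ..., 1 - d
    return sum((x + d) - x for x in xs)     # each term is one gap


for n in range(1, 5):
    print(n, length_as_sum(n) == 1)
```

The number of terms grows with n while each gap shrinks, and the total stays the length of the interval. In that sense the length is fixed by the range of x rather than by counting how many δ there are.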

(3) Using H = -1, then δ δ^H = 1. That is, δ ≠ 0, and thus there is no problem with division. The discussion about differentials is quite different from the discussion about these (new) infinitesimals. Much time has been spent in history in looking whether there might be a connection, but there isn’t.

Separate arithmetic for infinity and infinitesimal

Students already know that they cannot apply the rules of arithmetic to infinity. E.g. ∞ + ∞ = ∞. The same now holds for the above hypothetical notion of the infinitesimal.

Property (2) carries over from δ[n] with n the number of decimals. Property (1) arises when n → ∞ . Potentially, these notions cannot be combined without some conflict.

We are accustomed to think that any real number can be divided. But e.g. δ / 2 is nonsense, because δ is the distance between two neighbouring numbers, and there is nothing smaller. Thus, the normal rules for arithmetic only hold for reals that are not these infinitesimals.

With δ = x’ – x we also want to consider y = (x / n). When the numbers are halved for n = 2, is the distance halved or isn’t it? In the approximation δ[n] the distance can become smaller when more digits are included. For an infinite number of digits, presumably, the distance cannot be halved. Thus δ = y’ – y. Multiplication by n gives nδ = n (x / n)’ – x. Thus x’ = n (x / n)’ – nδ + δ for any n. This would make (most) sense by the choice δ = 0 and x’ / n = (x / n)’. But then we are back in the classical approach again, without the well ordering. (The next number is the number itself.)
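The remark on the approximation δ[n] can be illustrated numerically. The sketch below assumes n = 2 decimals and halves two neighbouring grid numbers. The helper `decimals_needed` is hypothetical, and it assumes that its argument has a finite decimal expansion (otherwise the loop would not end).

```python
from fractions import Fraction


def decimals_needed(z: Fraction) -> int:
    """Smallest n such that z is an n-decimal number.

    Assumes z has a finite decimal expansion.
    """
    n = 0
    while (z * 10 ** n).denominator != 1:
        n += 1
    return n


d2 = Fraction(1, 100)             # delta[2]
x = Fraction(37, 100)             # 0.37
x_next = x + d2                   # 0.38, the neighbour of x for n = 2

y, y_next = x / 2, x_next / 2     # halve both numbers
gap = y_next - y

print(gap == d2 / 2)              # the distance is halved as well
print(decimals_needed(y))         # y needs more than 2 decimals
print(gap == 5 * Fraction(1, 1000))
```

With a finite number of decimals the halved distance is available only by moving to a finer grid, where y and y’ are no longer neighbours. With an infinite number of decimals there is no finer grid to move to, which is where the trouble in the paragraph above starts.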

Presumably, we can argue that n * δ is as problematic as δ / n, though. After all, the notion of Σδ has been solved only by putting the consideration of length into the Σ sign.

I don’t know yet whether it is sufficient (consistent) to state δ = x’ – x and that the rules for arithmetic don’t apply to δ, just like they don’t apply to ∞. Potentially, we might write δ^H → ∞ (and this doesn’t mean that δ ∞ → 1).

All this depends upon whether we can develop a consistent set of definitions. Students at junior high school might agree that they aren’t much interested in that.

Thus, it might only be in senior high school that we discuss the “classical” approach to the reals, which has (a, b) as an open interval only. We would be forced to this not because of the definition of density but because of the rules of arithmetic.

Conclusion

In itself, notions like these are not world shocking but they would tend to fit the intuitions of space and number for junior high school.

At some point in history, the mainstream in mathematics opted for an approach to the reals in which they have no well ordering. The obstacle of the standard definition of density can be removed, as shown above. A problem still resides in arithmetic with δ = x’ – x. It is not clear to me whether this can be resolved. It is not clear to me either whether it is okay to have benign neglect till senior high school, and face the consequences of losing the well ordering only there.