When we take a ring and include division, we get a field. For example, the integers Z = { … -3, -2, -1, 0, 1, 2, 3, … } form a ring, and with division we get the rational numbers Q and also (with completion) the real numbers R. These are concepts from “group theory”. I have always wondered what the use of this group theory actually is.

The change from ring Z to field R is not quite the inclusion of division – since the ring already has implied division, namely as repeated subtraction – but the change consists of extending the set of “accepted numbers” with inverse elements x^H for H = -1. In that case the results of division are also included in the same set. In terms of Z the expression 2^H is not a number, but for Q and R we accept this.

If the ring has variables and expressions, then we can form the expression 1 = 2 z, and we effectively have z = 2^H, and then we might wonder whether it actually matters much whether this z belongs to Z or not.

Part of the confusion in this discussion arises because we might regard 2^H as the operation 1 / 2, while we might also regard it as the number. Thus when some people say that the difference between the ring and the field concerns the operation of division, another perspective is that the ring already has an implied notion of division but merely lacks the numbers to fit all answers.

The discussion within group theory might be a victim of the phenomenon of the procept. When the discussion is confused, perhaps group theory itself is confused. We should get enhanced clarity by removing the ambiguity of operation and result, but perhaps textbooks then become thicker.

Subsequently, we get a distinction between:

• Mathematics for which group theory isn’t so relevant – such that there is a logical sequence from natural numbers to integers, to rationals, to reals, to multidimensional reals, for, all is implied by logic and algebra, and only the end result matters,
• Mathematics for models for which group theory is relevant – i.e. for models for which it is crucial that e.g. Z has no z such that 1 = 2 z. The crux lies in the elements of the sets, as the operations themselves are actually implied.

A model might be the number of people. Take an empty building. A biologist, physicist and mathematician watch the events. Two people enter the building, and some time later three people leave the building. The biologist says: “They have reproduced.” The physicist says: “There was a quantum fluctuation.” The mathematician says: “There is -1 person in the building.”

The following develops the example of implied division. This discussion has been inspired by both the recent discussion of the “ring of polynomials” (thus without division but still with divisor and remainder) and the observation that “realistic mathematics education” (RME) allows students to avoid long division and allows “partial quotients” (repeated subtraction).

An example from Z, the integers

Z rewrites repeated addition 3 + 3 + 3 + 3 = 12 as multiplication 4 * 3 = 12.

Z allows the converse 12 – 3 – 3 – 3 – 3 = 0 and also the expression 12 – 4 * 3 = 0.

Z doesn’t allow the rewrite of the latter into 12 / 4 = 3.

Yet 12 – 4 * 3 = 0 gives the notion of “implied division”, namely, find the z such that 12 – 4 * z = 0.

This notion of “implied division” is well defined; the only problem is that we cannot find a number in Z that satisfies 1 – 2 z = 0.

If we extend Z with basic elements n^H for n ≠ 0 then we can find a z that satisfies 1 = 2 z, but the extension generates a new set of elements that we call Q, the rational numbers. Since we cannot list all these numbers, it is not irrational of mathematicians to say that they actually include the operation itself.

The following discusses this with formulas.

A ring has implied division

Multiplication is repeated addition. The ring of integers has the notion of subtraction. Define “implied division” of y by x as the repeated subtraction from y of some quantity z, for x times with remainder 0. For x ≠ 0:

y – x z = 0                   (* definition)

To refer to this property, we use abstract symbol H, though we later use H = -1.

x^H y = z    ⇔    y = x z          (** notation)

For x itself:

x^H x = x x^H = 1
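The definition (*) can be sketched as a search within Z. This is only an illustration under stated assumptions: the function name `implied_division` is hypothetical, and the search is a direct lookup for z rather than literal repeated subtraction. It returns the z with y – x z = 0 when Z contains one, and `None` when it does not.

```python
from typing import Optional


def implied_division(y: int, x: int) -> Optional[int]:
    """Return the integer z with y - x*z == 0, or None if Z has no such z.

    Assumes x != 0; the case x == 0 is treated separately in the text.
    """
    if x == 0:
        raise ValueError("x = 0 is handled separately")
    # Any solution satisfies |z| <= |y|, so a finite search suffices.
    for z in range(-abs(y), abs(y) + 1):
        if y - x * z == 0:
            return z
    return None


print(implied_division(12, 4))  # 3
print(implied_division(1, 2))   # None
```

The second call shows the point of the text: the operation is available, but Z lacks an element that answers 1 = 2 z.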

For zero

We have 0 z = 0 for all z in the ring. Then for implied division by zero we have:

y – 0 z = 0    ⇒   y = 0

As above, for y = 0:

0 – 0 z = 0   for any z

0^H 0 = z    for any z

Thus the rule is: For implied division within the ring, the denominator cannot be 0, unless the numerator is 0 too, in which case any number would satisfy the equation.

This is not necessarily “infinity” or “undefined” but rather “any z in Z“. The solution set is equal to Z itself. There is a difference between functions (only one answer) and correspondences (more answers).

Compare to the common definition

A ring is commonly turned into a field by including the normal definition of division:

x ≠ 0     ⇒     x^H x = x x^H = 1

With this definition we get (multiplying left or right):

x^H y = z    ⇔     x x^H y = x z     ⇔    y = x z

The curious observation is that a definition of division seems superfluous, since we already have implied division. The operation (*) already exists within the ring. We included a special notation for it, but this should not distract from this basic observation. If you have a left foot then it doesn’t matter whether you call it George or Harry.

An aspect is the algorithm

The natural numbers can be factored into prime numbers. When we solve 6 / 3 = 2, then we mean that 6 can be factored as 2 times 3, and that we can eliminate the common factor.

6 / 3 = z    ⇔    6 = 3 z    ⇔   2 3 = 3 z    ⇔   3 (2 – z) = 0     ⇔

3 = 0   or    (2 – z) = 0

But, again, this algorithm doesn’t work for a case like 1 = 2 z.

The “problem” are the elements

Let us consider the implied division of 1 by 2. This generates:

2^H 1 = z

2^H = z

1 = 2 z

Thus we don’t actually need to know what this z is, since we have the relevant expressions to deal with it.

The point is: when we run through all elements in Z = { … -3, -2, -1, 0, 1, 2, 3, … }, then we can prove that none of these satisfies 1 = 2 z.

Thus the core of group theory are the elements of the sets, and less the operations, since these are implied.

The basic notion is that 0 has successor 1 = s[0], and so on, and this gives us N. That 0 is a predecessor of s[0] generates the idea of inversion that s[H] = 0. This gives us Z. Addition leads to subtraction, to multiplication, to division. The core of addition doesn’t change, only the “numbers”.

Thus, group theory might have a confusing language that focuses on the operations, while the actual discussion is about the numbers (since the operations are already available and implied).

The fundamental impact of algebra

Thus, once we accept algebra, then the real numbers can be developed logically, and it is a bit silly to speak about “group theory”, since there are only steps, and all is implied. It only makes sense for applications to models, such as the notion that there aren’t half people and such.

It remains relevant that some algorithms may only apply to some domains and not others. Factoring natural numbers into prime numbers still works for the natural numbers embedded in the reals, yet, it is not clear whether such a notion of factoring would be relevant for other real numbers.

Appendix. Potential extension with an inverse for zero ?

We might consider including the element 0^H in the ring, to create 〈ring, 0^H〉.

(1) If we maintain that 0 z = 0 for all z in 〈ring, 0^H〉 then:

0^H 0 = 0   with 0^H in 〈ring, 0^H〉

Observe that this is not a deduction, but a definition that 0 z = 0 for all z.

One viewpoint is that there is a conflict between “any z” and “only z = 0″ so that we cannot adopt this definition. Another viewpoint is that the latter uses the freedom of the former.

(2) When we write 0^H as ∞ then it might be clearer that 0^H 0 remains a problematic form.

If we create the 〈ring, 0^H〉, then we might also hold: 0 z = 0 for all numbers except 0^H. In that case, the result is maintained that

0^H 0 = z    for any z

(3) An option is to slightly revise the definition as repeated subtraction by z until the remainder equals that very quantity z again. Thus:

y – (x – 1) z = z                   (*** definition 2)

x^H y = y – (x – 1) z = z                  (**** definition and notation 2)

For x = 0 we would now use z – z = 0 which might be less controversial.

0^H y = y – (0 – 1) z = z

y + z – z = 0

0^H y = 0^H 0 = z

However, the more common approach is that 0^H is ∞ and that ∞ – ∞ is undefined too, while we cannot exclude that the answer would be z = ∞.

PM. Partial quotients

I wouldn’t want to be caught before a blackboard like that (Screenshot UChicago)

Our protagonists are Cartesius (1596-1650) and Fermat (1607-1665). As Judith Grabiner states, in a recommendable text:

“One could claim that, just as the history of Western philosophy has been viewed as a series of footnotes to Plato, so the past 350 years of mathematics can be viewed as a series of footnotes to Descartes’ Geometry.”  (Grabiner) (But remember Michel Onfray‘s observation that followers of Plato have been destroying texts by opponents. (Dutch readers check here.))

Both Cartesius and Fermat were involved in the early development of calculus. Both worked on the algebraic approach without limits. Cartesius developed the method of normals and Fermat the method of adequality.

Fermat and Δf / Δx

Fermat’s method was algebraic itself, but was later developed into the method of limits anyhow. When asked what the slope of a ray y = s x is at the point x = 0, then the answer y / x = s runs into problems, since we cannot use 0 / 0. The conventional answer is to use limits. This problem is more striking when one considers the special ray that is defined everywhere except at the origin itself. The crux of the problem lies in the notion of slope Δf / Δx that obviously has a problematic division. With set theory we can now define the “dynamic quotient”, so that we can use Δf // Δx = s even when Δx = 0, so that Fermat’s problem is resolved, and his algebraic approach can be maintained. This originated in 2007, see Conquest of the Plane (2011).

Cartesius and Euclid’s notion of tangency

Cartesius followed Euclid’s notion of tangency. Scholars tend to assign this notion to Cartesius as well, since he embedded the approach within his new idea of analytic geometry.

I thank Roy Smith for this eye-opening question:

“Who first defined a tangent to a circle as a line meeting it only once? From googling, it seems commonly believed that Euclid did this, but it seems nowhere in Euclid does he even state this property of a tangent line explicitly. Rather Euclid gives 4 other equivalent properties, that the line does not cross the circle, that it is perpendicular to the radius, that is a limit of secant lines, and that it makes an angle of zero with the circle, the first of which is his definition, the others being in Proposition III.16. I am wondering where the “meets only once” definition got started. I presume once it got going, and people stopped reading Euclid, (which seems to have occurred over 100 years ago), the currently popular definition took over. Perhaps I should consult Legendre or Hadamard? Thank you for any leads.” (Roy Smith, at StackExchange)

In this notion of tangency there is no problematic division, whence there is no urgency to use limits.

The reasoning is:

• (Circle & Line) A line is tangent to a circle when there is only one common point (or the two intersecting points overlap).
• (Circle & Curve) A smooth curve is tangent to a circle when the two intersecting points overlap (but the curve might cross the circle at that point so that the notion of “two points” is even more abstract).
• (Curve & Line) A curve is tangent to a line when the above two properties hold (but the line might cross the curve, whence we had better speak about incline rather than tangent).

Example of line and circle

Consider the line y = f[x] = c + s x and the point {a, f[a]}. The line can also be written with c = f[a] – s a:

y – f[a] = s (x – a)

The normal has slope –s^H, where we use H = -1. The formula for the normal is the line y – f[a] = –s^H (x – a). We can choose the center of the circle anywhere on this line. A handy choice is {u, 0}, so that we choose the center on the horizontal axis. (If we looked at a ray and point {0, 0}, then the issue would be similar for {0, c} for nonzero c and thus the approach remains general.) Substituting the point into the normal gives

0 – f[a] = –s^H (u – a)

s = (u – a) / f[a]

u = a + s f[a]

The circle has the formula (x – u)² + y² = r². Substituting {a, f[a]} generates the value for the radius r² = (a – (a + s f[a]))² + f[a]² = (1 + s²) f[a]². The following diagram has {c, s, a} = {0, 2, 3} and thus u = 15 and r = 6√5.
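The numbers for this example can be checked with a few lines of Python. This is a minimal sketch; the function name `tangent_circle_for_line` is hypothetical and only wraps the two formulas u = a + s f[a] and r² = (1 + s²) f[a]² derived above.

```python
import math


def tangent_circle_for_line(c, s, a):
    """Center {u, 0} and radius r of the circle touching y = c + s x at x = a."""
    fa = c + s * a                           # f[a]
    u = a + s * fa                           # from s = (u - a) / f[a]
    r = math.sqrt((1 + s * s) * fa * fa)     # from r^2 = (1 + s^2) f[a]^2
    return u, r


u, r = tangent_circle_for_line(0, 2, 3)
print(u, r)  # 15 and approximately 13.416, which is 6 * sqrt(5)
```

The point {a, f[a]} = {3, 6} indeed lies on this circle, since (3 – 15)² + 6² = 180 = r².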

Method of normals

For the method of normals and arbitrary function f[x], Cartesius’s trick is to substitute y = f[x] into the formula for the circle, and then solve for the unknown center of the circle.

(x – u)² + (y – 0)² = r²

(x – u)² + f[x]² – r² = 0         … (* circle)

This expression is only true for x = a, but we treat it as if it were more general. The key property is:

Since {a, f[a]} satisfies the circle, this equation has a solution for x = a with a double root.

Thus there might be some g such that the root can be isolated:

(x – a)² g[x, u] = 0         … (* roots)

Thus, if we succeed in rewriting the formula for the circle into the form of the formula with the two roots, then we can use information about the structure of the latter to say something about u.

The method works for polynomials, that obviously have roots, but not necessarily for trigonometry and the exponential function.

Algorithm

The algorithm thus is: (1) Substitute f[x] in the formula for the circle. (2) Compare with the expression with the double root. (3) Derive u. (4) Then the line through {a, f[a]} and {u, 0} will give slope –s^H. Thus s = (u – a) / f[a] gives the slope of the incline (tangent) of the curve. (5) If f[a] = 0, add a constant or choose center {u, v}.

Application to the line itself

Consider the line y = f[x] = c + s x again. Let us apply the algorithm. The formula for the circle gives:

(x – u)² + (c + s x)² – r² = 0

x² – 2ux + u² + c² + 2csx + s²x² – r² = 0

(1 + s²) x² – 2 (u – cs) x + u² + c² – r² = 0

This is a polynomial. It suffices to choose g[x, u] = 1 + s² so that the coefficients of x² are the same. Also the coefficient of x must be the same. Thus expanding (x – a)²:

(1 + s²) (x² – 2ax + a²) = 0

– 2 (u – cs) = – 2 a (1 + s²)

u = a (1 + s²) + cs = a + s (c + sa) = a + s f[a]

which is the same result as above.
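The comparison of coefficients can also be checked mechanically. The sketch below uses exact fractions and two hypothetical helper names; it builds the quadratic from the circle and the quadratic with the double root, and compares them term by term for one choice of {c, s, a}.

```python
from fractions import Fraction


def circle_line_quadratic(c, s, a):
    """Coefficients [x^2, x, 1] of (x - u)^2 + (c + s x)^2 - r^2, with u and r^2 as in the text."""
    c, s, a = Fraction(c), Fraction(s), Fraction(a)
    fa = c + s * a
    u = a + s * fa
    r2 = (1 + s * s) * fa * fa
    return [1 + s * s, -2 * (u - c * s), u * u + c * c - r2]


def double_root_quadratic(s, a):
    """Coefficients [x^2, x, 1] of (1 + s^2) (x - a)^2."""
    s, a = Fraction(s), Fraction(a)
    g = 1 + s * s
    return [g, -2 * a * g, a * a * g]


print(circle_line_quadratic(1, 2, 3) == double_root_quadratic(2, 3))  # True
```

A single numerical case is of course no proof; the algebra above is what establishes the identity in general.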

A general formula with root x – a

We can deduce a general form that may be useful on occasion. When we substitute the point {a, f[a]} into the formula for the circle, then we can find r, and actually eliminate it.

(x – u)² + f[x]² = r² = (a – u)² + f[a]²

f[x]² – f[a]² = (a – u)² – (x – u)²

(f[x] – f[a]) (f[x] + f[a]) = ((a – u) – (x – u)) ((a – u) + (x – u))

(f[x] – f[a]) (f[x] + f[a]) = (a – x) (a + x – 2u)

f[x] – f[a] = (a – x) (a + x – 2u) / (f[x] + f[a])

f[x] – f[a] = (x – a) (2u – x – a) / (f[x] + f[a])       … (* general)

f[x] – f[a] = (x – a) q[x, a, u]

We cannot do much with this, since this is basically only true for x = a and f[x] – f[a] = 0. Yet we have this “branch cut”:

(1)      q[x, a, u] = (f[x] – f[a]) / (x – a)        if x ≠ a

(2)      q[a, a, u]      potentially found by other means

If it is possible to “simplify” (1) into another expression Simplify[q[x, a, u]] without the division, then the tantalising question becomes whether we can “simply” substitute x = a. Or, if we were to find q[a, a, u] via other means in (2), whether it links up with (1). These are questions of continuity, and those are traditionally studied by means of limits.

Theorem on the slope

We can still use the general formula to state a theorem.

Theorem. If we can eliminate factors without division, then there is an expression q[x, a, u] such that evaluation at x = a gives the slope s of the line, or q[a, a, u] = s, such that at this point both curve and line are touching the same circle.

Proof. Eliminating factors without division in above general formula gives:

q[x, a, u] = (2u – x – a) / (f[x] + f[a])

Setting x = a gives:

q[a, a, u] = (u – a) / f[a]

And the above s = (u – a) / f[a] implies that q[a, a, u] = s. QED

This theorem gives us the general form of the incline (tangent).

y[x, a, u] = (x – a) q[a, a, u] + f[a]       …  (* incline)

y[x, a, u] = (x – a) (u – a) / f[a] + f[a]

PM. Dynamic division satisfies the condition “without division” in the theorem. For, the term “division” in the theorem concerns the standard notion of static division.

Corollary. Polynomials as the showcase

Polynomials are the showcase. For polynomials p[x], there is the polynomial remainder theorem:

When a polynomial p[x] is divided by (x – a) then the remainder is p[a].
(Also, x – a is called a “divisor” of the polynomial if and only if p[a] = 0.)

Using this property we now have a dedicated proof for the particular case of polynomials.

Corollary. For polynomials q[a] = s, with no need for u.

Proof. Now, p[x] – p[a] = 0 at x = a implies that x = a is a root, and then there is a “quotient” polynomial q[x] such that:

p[x] – p[a] = (x – a) q[x]

From the general theorem we also have:

p[x] – p[a] = (x – a) q[x, a, u]

Eliminating the common factor (x – a) without division and then setting x = a gives q[a] = q[a, a, u] = s. QED

We now have a sound explanation why this polynomial property gives us the slope of the polynomial at that point. The slope is given by the incline (tangent), and it must also be slope of the polynomial because of the mutual touching of the same circle.

See the earlier discussion about techniques to eliminate factors of polynomials without division. We have seen a new technique here: comparing the coefficients of factors.

Second corollary

Since q[x] is a polynomial too, we can apply the polynomial remainder theorem again, and thus we have q[x] = (x – a) w[x] + q[a] for some w[x]. Thus we can write:

p[x] = (x – a) q[x] + p[a]

p[x] = (x – a) ( (x – a) w[x] + q[a] ) + p[a]       … (* Ruffini’s Rule twice)

p[x] = (x – a)² w[x] + (x – a) q[a] + p[a]           … (* Range’s proof)

p[x] = (x – a)² w[x] + y[x, a]                             … (* with incline)

We see two properties:

• The repeated application of Ruffini’s Rule uses the indicated relation to find both s = q[a] and the constant p[a], as we have seen in the last discussion.
• Evaluating p[x] / (x – a)² gives the remainder y[x, a], which is the formula for the incline.

Range’s proof method

Michael Range proves q[a] = s as follows (in this article (p406) or book (p32)). Take above (*) and determine the error by subtracting the line y = s (x – a) + p[a]:

error = p[x] – y = (x – a)² w[x] + (x – a) q[a] – s (x – a)

= (x – a)² w[x] + (x – a) (q[a] – s)

The error = 0 has a root x = a with multiplicity greater than one if and only if s = q[a].
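The decomposition p[x] = (x – a)² w[x] + (x – a) q[a] + p[a] can be sketched in Python by applying Ruffini’s Rule twice. The function names are hypothetical, and coefficients are listed from the highest power down; this is one possible rendering, not a quotation of Range’s or Ruffini’s own presentation.

```python
def ruffini(coeffs, a):
    """Divide a polynomial by (x - a) with Ruffini's Rule.

    coeffs lists the coefficients from the highest power down.
    Returns (quotient coefficients, remainder).
    """
    if not coeffs:
        return [], 0
    row, carry = [], 0
    for coefficient in coeffs:
        carry = coefficient + a * carry
        row.append(carry)
    return row[:-1], row[-1]


def incline_by_ruffini(coeffs, a):
    """Return (w, s, p_a) with p[x] = (x - a)^2 w[x] + s (x - a) + p_a."""
    q, p_a = ruffini(coeffs, a)   # p[x] = (x - a) q[x] + p[a]
    w, s = ruffini(q, a)          # q[x] = (x - a) w[x] + q[a]
    return w, s, p_a


print(incline_by_ruffini([1, 0, 0], 3))  # ([1], 6, 9): for x^2 at a = 3 the slope is 6
```

With s = q[a] the error p[x] – y reduces to (x – a)² w[x], which is the double root that the criterion above asks for.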

Direct application to the incline itself

Now that we have established this theory, there may be no need to refer to the circle explicitly. It can suffice to use the property of the double root. Michael Range (2014) gives the example of the incline (tangent) at x² at {a, a²}. The formula for the incline is:

f[x] – f[a]  = s (x – a)

x² – a² – s (x – a) = 0

(x – a) (x + a – s) = 0

There is only a double root or (x – a)² when s = 2a.

Working directly on the line allows us to focus on s, and we don’t need to determine q[x] and plug in x = a.

Michael Range (2011) clarifies – with thanks to a referee – that the “point-slope” form of a line was introduced by Gaspard Monge (1746-1818), and that Descartes apparently did not think of this himself, and thus neither of plugging in y = f[x] here. However, observe that we can only maintain that there must be a double root in this line form too, since {a, f[a]} still lies on a tangent circle.

[Addendum 2017-01-10: The later argument in a subsequent weblog entry becomes: If the function can be factored twice, then there is no need to refer to the circle. But when this would be equivalent to the circle then such a distinction is immaterial.]

Addendum. Example of function crossing a circle

When a circle touches a curve, it still remains possible that the curve crosses the circle. The original idea of two points merging together into an overlapping point then doesn’t apply anymore, since there is only one intersecting point on either side if the circle were smaller or bigger.

An example is the spline function g[x] = {If x < 0 then 4 – x² / 4 else 4 + x² / 4}. This function is C1 continuous at 0, meaning that the sections meet and that the slopes of the two sections are equal at 0, while the second and higher derivatives differ. The circle with center at {0, 0} and radius 4 still fits the point {0, 4}, and the incline is the line y = 4.

An application of above algorithm would look at the sections separately and paste the results together. Thus this might not be the most useful example of crossing.

In this example there might be no clear two “overlapping” points. However, observe:

• Lines through {0, 4} might have three points with the curve, so that the incline might be seen as having three overlapping points.
• Points on the circle can always be seen as the double root solutions for tangency at that point.

There is still quite a conceptual distance between (i) the story about the two overlapping points on the circle and (ii) the condition of double roots in the error between line and polynomial.

The proof given by Range uses the double root to infer the slope of the incline. This is mathematically fine, but this deduction doesn’t contain a direct concept that identifies q[a] as the slope of an incline (tangent): it might be any line.

We see this distinction between concept and algorithm also in the direct application to Monge’s point-slope formulation of the line. Requiring a double root works, but we can only do so because we know about the theory about the tangent circle.

The combination of circle and line remains the fundamental reason why there are two roots. Thus the more general proof given above, that reasons from the circle and unpacks f[x]² – f[a]² into the conditions for incline and its normal, is conceptually more attractive. I am new to this topic and don’t know whether there are references for this general proof.

Conclusions

(1) We now understand where the double root comes from. See the earlier discussion on polynomials, Ruffini’s rule and the meaning of division (see the section on “method 2”).

(2) There, we referred to polynomial division, with the comment: “Remarkably, the method presumes x ≠ a, and still derives q[a]. I cannot avoid the impression that this method still has a conceptual hole.” However, we now observe that we can compare the values of the coefficients of the powers of x, whence we can avoid also polynomial division.

(3) There, we had a problem that developing p[x] = (x – a)² w[x] + y[x, a] didn’t have a notion of tangency, in terms of Δf / Δx. However, we actually have a much older definition of tangency.

(4) The above states an algorithm and a general theorem with the requirements that must be satisfied.

(5) Cartesius beats Fermat on this issue of the incline (tangent), and actually also on providing an exact method for polynomials, where Fermat introduced the problem of error.

(6) For trigonometry and exponentials we know that these can be written as power series, and thus the Cartesian method would also apply. However, the power series are based upon derivatives, and this would introduce circularity. However, the method of the dynamic quotient from 2007 still allows an algebraic result. The further development from Fermat into the approach with limits would become relevant for more complex functions.

PM. The earlier discussion referred to Peter Harremoës (2016) and John Suzuki (2005) on this approach. New to me (and the book unread) are: Michael Range (2011), the recommendable Notices, or the book (2015) – review Ruane (2016) – and Shen & Lin (2014).

Cartesius, Portrait by Frans Hals 1648

We continue the earlier discussion on (1) differentials and (2) polynomials. There is also this earlier discussion about (static or dynamic) division.

At issue is: Can we avoid the use of limits when determining the derivative of a polynomial ?

A sub-issue is: Can we avoid division that requires a limit ?

We use the term incline instead of tangent (line), since this line can also cross a function and not just touch it.

We use H = -1, so that we can write x x^H = x^H x = 1 for x ≠ 0. Check that x^H = 1 / x, and that the use of H is much more effective and efficient. The use of 1 / x is superfluous since students must learn about exponents anyway.

Ruffini’s Rule

Ruffini’s Rule is a method not only to factor polynomials but also to isolate the factors. A generalised version is called “synthetic division” for the reason that it isn’t actually division. On wikipedia, Ruffini’s Rule is called “Horner’s Method“. On mathworld, the label “Horner’s Method” is used for something else but related again. My suggestion is to stick to mathworld.

Thus, the issue at hand would seem to have been answered by Ruffini’s Rule already. When we can avoid division then we don’t need a limit around it. However, our discussion is about whether this really answers our question and whether we really understand the answer.

Historical note

I thank Peter Harremoës for informing me about both Ruffini’s Rule and some neat properties that we will see below. His lecture note in Danish is here. Surprisingly for me, he traced the history back to Descartes. Following this further, we can find this paper by John Suzuki, who identifies two key contributions by Jan Hudde in Amsterdam 1657-1658. Looking into my copy of Boyer “The history of the calculus” now, page 186, I must admit that this didn’t register with me when I read this originally, as it registers now. We see the tug and push of history with various authors and influences, and thus we should be cautious about claiming who did what when. Suzuki’s statement remains an eye-opener.

“We examine the evolution of the lost calculus from its beginnings in the work of Descartes and its subsequent development by Hudde, and end with the intriguing possibility that nearly every problem of calculus, including the problems of tangents, optimization, curvature, and quadrature, could have been solved using algorithms entirely free from the limit concept.” (John Suzuki)

Apparently Newton dropped the algebra because it didn’t work on trigonometry and such, but with modern set theory we can show that the algebraic approach to the derivative works there too. For the discussion below: check that limits can be avoided.

Division is also a way to isolate factors

When we have 2 x = 6, then we can determine 2 x = 2 3, and recognize the common factor 2. By the human eye, we can see that x = 3 and then we have isolated the factor 3. But in mathematics, we must follow procedures as if we were a computer programme. Hence, we have the procedure of eliminating 2, which is called division:

2^H 2 x = 2^H 2 3

x = 3

The latter example abuses the property that 2 is nonzero. We must actually check that the divisor is nonzero. If we don’t check then we get:

4 x = 9 x

4 x x^H = 9 x x^H

4 = 9

Checking for zero is not as simple as it seems. Also expressions with only numbers might contain zero in hidden format, as for example (4 + 2 – 6)^H. Thus it would seem to be an essential part of mathematics to develop a sound theory for the algebra of expressions and the testing on zero.
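For purely numerical expressions the test on zero can be sketched as follows. The helper name `inverse` is hypothetical; it evaluates the expression exactly and refuses to invert when the value turns out to be zero. This only covers numbers, not the algebra of expressions with variables that the paragraph above asks for.

```python
from fractions import Fraction


def inverse(value):
    """Return value^H, i.e. the multiplicative inverse, after testing on zero."""
    value = Fraction(value)
    if value == 0:
        raise ZeroDivisionError("the expression evaluates to zero, so it has no inverse")
    return 1 / value


print(inverse(2))           # 1/2
try:
    inverse(4 + 2 - 6)      # zero in hidden format
except ZeroDivisionError as error:
    print("rejected:", error)
```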

Calculus uses the limit around the difference quotient to prevent division by zero. But the real question might rather be whether we can isolate a factor. When we can isolate that factor without division that requires a limit, then we hopefully have a simpler exposition. Polynomials are a good place to start this enquiry.

Shifting to rings without division ?

The real numbers form a “field” and when we drop the idea of division, then we get a “ring“. Above 2 x = 6 might also be solved in a ring without division. For we can do:

2 x – 2 3 = 6 – 2 3

2 (x – 3) = 0

2 = 0    or    x – 3 = 0

We again use that 2 ≠ 0. Thus x = 3.

This example doesn’t show a material difference w.r.t. the assumption of division by 2. We also used that 6 can be factored and that 2 was a common factor. Perhaps this is the more relevant notion. Whatever the case, it doesn’t seem to be so useful to leave the realm of the real numbers.

Properties of polynomials

Our setup has a polynomial p[x] with focus of attention at x = a with point {a, b} = {a, p[a]}. When we regard (x – a) as a factor, then we get a “quotient” q[x] and a “remainder” r[x].

p[x] = (x – a) q[x] + r[x]

It is a nontrivial issue that q and r are polynomials again (proof of polynomial division algorithm, or proofwiki). These proofs don’t use limits but assume that the divisor is nonzero. Thus we might be making a circular argument when we use that q and r are polynomials to argue that limits aren’t needed. Examples can be given of polynomial long division. Such examples tend not to mention explicitly that the divisor cannot be zero. Nevertheless, let us proceed with what we have.

Since (x – a) has degree 1, the remainder must be a constant, and thus be equal to p[a]. Thus the “core equation” is:

p[x] = (x – a) q[x] + p[a]      …  (* core)

p[x] – p[a] = (x – a) q[x]

At x = a we get 0 = 0 q[a], whence we are at a loss about how to isolate q[x] or q[a].

When we have defined derivatives via other ways, then we can check that the derivative of (*) is:

p'[x] = q[x] + (x – a) q'[x]

p'[a] = q[a]

We can also rewrite (*) so that it indeed looks like a difference quotient.

q[x] = (p[x] – p[a]) (x – a)^H       …. (** slope = tan[θ], see Spiegel’s diagram)

We cannot divide by (x – a) for x = a, for this factor would be zero.

PM. In the world of limits, we could define the derivative of p at a by taking the Limit[x → a, q[x]]. This generates again (Spiegel’s diagram):

q[a] = tan[α]

But our issue is that we want to avoid limits.

Incline

The incline of the polynomial at point {a, b} = {a, p[a]} is the line through that point with the same slope as the polynomial.

y – p[a] = s (x – a)    …  (*** incline)

The difference between polynomial and incline might be called the error. Thus:

error = p[x] – y = (p[x] – p[a]) – (y – p[a])

= (x – a) q[x] – s (x – a)

= (x – a) (q[x] – s)

When we take s = q[a] then:

error = p[x] – y = (x – a) (q[x] – q[a])

Key question

A key question becomes: can we isolate q[x] by some method ? We already have (**), but this format  contains the problematic division. Is there another way to isolate q ? There appear to be three ways. Likely these ways are essentially the same but emphasize different aspects.

Method 1. Dynamic quotient

The dynamic quotient manipulates the domain and relies on algebraic simplification. Instead of H we use D, with y x^D = y // x.

q[x] = (p[x] – p[a]) (x – a)^D

means: we first take x ≠ a,

then take D = H, so that this is normal division again,

then simplify,

and then declare the result also valid for x = a.

The idea was presented in ALOE 2007 while COTP 2011 is a proof of concept. COTP shows that it works for polynomials, trigonometry, exponentials and recovered exponents (logarithms). For polynomials it is shown by means of recursion.

Looking at this from the current perspective of the polynomial division algorithm, we can say that the method also works because division of a polynomial of degree n > 0 by a polynomial of degree m = 1 generates a neat polynomial of degree n – m. Thus we can isolate q[x] indeed. Since q[x] is polynomial, substitution of x = a provides no problem.

The condition on manipulating the domain nicely plugs the hole in the polynomial division algorithm. It is actually necessary to prevent circularity.
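For polynomials the four steps of the dynamic quotient can be sketched in Python. The sketch assumes one particular simplification, namely (x^n – a^n) / (x – a) = x^(n-1) + a x^(n-2) + … + a^(n-1) for x ≠ a, applied term by term; the function names are hypothetical and this is not the recursion used in COTP. The simplified expression is then also evaluated at x = a.

```python
def dynamic_quotient(coeffs, a):
    """Coefficients of q[x] = (p[x] - p[a]) // (x - a), lowest power first.

    coeffs[n] is the coefficient of x^n. Each term c_n (x^n - a^n) is simplified
    for x != a into c_n (x^(n-1) + a x^(n-2) + ... + a^(n-1)), and the simplified
    polynomial is then declared valid for x = a as well.
    """
    degree = len(coeffs) - 1
    q = [0] * max(degree, 0)
    for n in range(1, degree + 1):
        for k in range(n):
            q[n - 1 - k] += coeffs[n] * a ** k   # term c_n * a^k * x^(n-1-k)
    return q


def evaluate(coeffs, x):
    """Evaluate a polynomial given by coefficients, lowest power first."""
    return sum(c * x ** n for n, c in enumerate(coeffs))


q = dynamic_quotient([0, 0, 0, 1], 2)   # p[x] = x^3 at a = 2
print(q, evaluate(q, 2))                # [4, 2, 1] 12, i.e. q[x] = x^2 + 2 x + 4 and q[2] = 12
```

No division is executed anywhere in the code; the condition x ≠ a only enters through the simplification rule that justifies the term-by-term rewrite.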

Method 2. Incline

Via Descartes (and Suzuki’s article above) we understand that perpendicular to the incline (tangent) there is a line on which there is a circle that touches the incline too. This implies that (x a) must be a double root of the polynomial.

We may consider p[x] / (x – a)^2 and determine the remainder v[x]. The line y = v[x] is then the incline, i.e. the tangent of the polynomial at point {a, p[a]}. It is relatively easy to determine the slope of this line, and then we have q[a].

Check the Wikipedia example. In Mathematica we indeed get PolynomialRemainder[x^3 – 12 x^2 – 42, (x – 1)^2, x] = -21 x – 32. At a = 1, q[a] = -21.
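For readers without Mathematica, the same remainder can be found with plain long division. This is a minimal sketch, not Mathematica's implementation; the helper name poly_divmod and the convention of listing coefficients from the highest power down are assumptions made for this illustration.

```python
from fractions import Fraction

def poly_divmod(num, den):
    """Long division of polynomials given as coefficient lists,
    highest power first. Returns (quotient, remainder)."""
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    quotient = []
    while len(num) >= len(den):
        factor = num[0] / den[0]
        quotient.append(factor)
        # Subtract factor * den, aligned at the leading term.
        for i, d in enumerate(den):
            num[i] -= factor * d
        num.pop(0)  # the leading term is now zero
    return quotient, num

p = [1, -12, 0, -42]      # x^3 - 12 x^2 - 42
divisor = [1, -2, 1]      # (x - 1)^2 = x^2 - 2 x + 1
quotient, remainder = poly_divmod(p, divisor)
slope = remainder[0]      # coefficient of x in the remainder v[x]
print(quotient, remainder, slope)
```

The remainder list holds the coefficients of v[x], so its first entry is the slope of the incline.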

This method assumes “algebraic ways” to separate quotient and remainder. We can find the slope for polynomials without using the limit for the derivative. Potentially the same theory is required for the simplification used in the dynamic quotient.

Remarkably, the method presumes x ≠ a, and still derives q[a]. I cannot avoid the impression that this method still has a conceptual hole.

Addendum 2017-01-11: By now we have identified these methods to isolate a factor “algebraically”:

1. Look at the form (powers) and coefficients. This is basically Ruffini’s rule, see below. Michael Range works directly with coefficients.
2. Dynamic quotient that relies on the algebra of expressions.
3. Divide away nonzero factors so that only the problematic factor remains that we need to isolate. (This however is a version of the dynamic quotient, so why not apply it directly ?)

An example of the latter is p[x] = x^3 – 6 x^2 + 11 x – 6. Trial and error or a graph indicates that there are zeros at 1 and 2. Assuming that those points don’t apply, we can isolate p[x] / ((x – 1) (x – 2)) = (x – 3) by means of long division. Thus we have identified the separate factors, and the total is p[x] = (x – 1) (x – 2) (x – 3).

Check also that “division” is repeated subtraction, whence the method is fairly “algebraic” by itself too.

Addendum 2016-12-26: However, check the next weblog entry.

PM 1. General method to find the slope

The traditional method is to use the derivative p'[x] = 3 x^2 – 24 x, find slope p'[1] = -21, and construct the line y = -21 (x – 1) + p[1]. This method remains didactically preferable since it applies to all functions.

PM 2. Double root in error too

If p[x] = 0 has solution x = a, then the latter is called a root, and we can factor p[x] = (x – a) q[x] with remainder zero.

For example, p[x] – p[a] = 0 has solution x = a. Thus p[x] – p[a] = (x – a) q[x] with remainder zero.

Also q[x] – q[a] = 0 has solution x = a. Thus q[x] – q[a] = (x – a) u[x] with remainder zero.

Thus the error has a double root.

error = p[x] – y = (x – a)^2 u[x]

Unfortunately, this insight only allows us to check a given line y = s x + c, for then we can eliminate y.
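Such a check of a given line might look as follows. This is a sketch for the Wikipedia example only: the line y = -21 x – 32 is taken from the text, while the factor u[x] = x – 10 was computed by hand for this illustration, and poly_mul is a hypothetical helper.

```python
def poly_mul(f, g):
    """Multiply two polynomials given as coefficient lists, highest power first."""
    product = [0] * (len(f) + len(g) - 1)
    for i, fc in enumerate(f):
        for j, gc in enumerate(g):
            product[i + j] += fc * gc
    return product

p = [1, -12, 0, -42]        # p[x] = x^3 - 12 x^2 - 42
incline = [0, 0, -21, -32]  # y = -21 x - 32, padded to the length of p
error = [pc - yc for pc, yc in zip(p, incline)]  # p[x] - y

double_root = poly_mul([1, -1], [1, -1])  # (x - 1)^2
u = [1, -10]                              # u[x] = x - 10 (hand-derived)
factored = poly_mul(double_root, u)       # (x - 1)^2 u[x]
print(error, factored)
```

Both lists come out equal, so for this line the error indeed carries the factor (x – 1)^2.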

Method 3. Ruffini’s Rule

See above for the summary of Ruffini’s Rule and the links. For the application below you might want to become more familiar with it. Check why it works. Check how it works, or here.

The observation of the double root generates the idea of applying Ruffini’s Rule twice.

I don’t think that it would be very useful to teach this method in high school. Mathematics undergraduates and teachers had better know about its existence, but that is all. The method might be at the core of efficient computer programmes, but human beings had better deal with computer algebra at the higher level of the interface.

The assumption that x ≠ a goes without saying, but it remains useful to say it, because at some stage we still use q[a], and we had better be able to explain the paradox.

Application of Ruffini’s Rule to the derivative

Let us use the example of Ruffini’s Rule at MathWorld to determine the incline (tangent) to their example polynomial 3 x^3 – 6 x + 2, at x = 2. They already did most of the work, and we only include the derivative.

The first round of application gives us p[a] = p[2] = 14, namely the constant following from MathWorld.

A second round of application gives the slope, q[a] = 30.

2 |  3    6    6
  |       6   24
  |  3   12   30

Using the traditional method, the derivative is p'[x] = 9 x^2 – 6, with p'[2] = 30.

The incline (tangent) in both cases is y = 30 (x – 2) + 14 = 30 x – 46.
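The two rounds can also be reproduced in a few lines of code. This is a sketch under the assumption that coefficients are listed from the highest power down; the function name ruffini is chosen here for illustration and does not come from MathWorld.

```python
def ruffini(coeffs, a):
    """One round of Ruffini's Rule: divide by (x - a).

    coeffs are listed from the highest power down.
    Returns (quotient coefficients, remainder).
    """
    row = []
    carry = 0
    for c in coeffs:
        carry = carry * a + c  # multiply the running value by a, add next coefficient
        row.append(carry)
    return row[:-1], row[-1]

p = [3, 0, -6, 2]          # 3 x^3 - 6 x + 2
a = 2
q, value = ruffini(p, a)   # first round: q[x] and remainder p[a]
_, slope = ruffini(q, a)   # second round: remainder is q[a]
intercept = value - slope * a
print(q, value, slope, intercept)
```

The first round returns the coefficients 3, 6, 6 that head the second table, and the second round ends in the slope.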

The major conceptual issue

The major conceptual issue is: while s is the slope of a line, and we take s = q[a], why would we call q[a] the slope of the polynomial at x = a ? Where is the element of “inclination” ? We might have just a formula of a line, without the notion of slope that fits the function. In other words, q[a] is just a number and no concept.

The key question w.r.t. this issue of the limit – and whether division causes a limit – does not quite concern Ruffini’s Rule but the definition of slope, first for the line itself, secondly now for the incline of a function. We represent the incline of a function with a line, but only because it has the property of having a slope and angle with the horizontal axis.

The only reason to speak about an incline is the recognition that above equation (**) generates a slope. We are only interested in q[a] = tan[α] since this is the special case at the point x = a itself.

It is only after this notion of having a slope has been established, that Ruffini’s Rule comes into play. It focuses on “factoring as synthetic division” since that is how it has been designed. There is nothing in Ruffini’s Rule that clarifies what the calculation is about. It is an algorithm, no more.

Thus, for the argument that q[a] provides the slope at x = a, we still need the reasoning that first x ≠ a, then find a general expression q[x] and only then set x = a.

And this is what the algebraic approach to the derivative was designed to accomplish.

Addendum 2016-12-26: See the next weblog entry for another approach to the notion of the incline (tangency).

Ruffini’s Rule corroborates that the method works, but that it works had already been shown. However, it is likely a mark of mathematics that all these approaches are actually quite related. In that perspective, the algebraic approach to the derivative supplements the application of Ruffini’s Rule to clarify what it does.

Obviously, mathematicians have been working in this manner for ages, but implicitly. It really helps to state explicitly that the domain of a function can be manipulated around (supposed) singularities. The method can be generalised as

f'[x] = {Δf (Δx)^D, then set Δx = 0} = {Δf // Δx, then set Δx = 0}

It also has been shown to work for trigonometry and the exponential function.
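For polynomials, the generalised rule can be sketched mechanically. The following is an illustration under assumptions, not the proof given in COTP: the function name dynamic_derivative is hypothetical, and here the coefficients are listed from the lowest power up, so that coeffs[k] belongs to x^k. The difference Δf is expanded with the binomial theorem, the factor Δx is divided away on the domain Δx ≠ 0, and only then is Δx set to 0.

```python
from math import comb

def dynamic_derivative(coeffs):
    """coeffs[k] is the coefficient of x^k; returns the derivative's coefficients."""
    # Δf = f[x + Δx] - f[x], stored as {(i, j): c} meaning c * x^i * Δx^j.
    delta_f = {}
    for k, c in enumerate(coeffs):
        for j in range(1, k + 1):  # the j = 0 term cancels against f[x]
            key = (k - j, j)
            delta_f[key] = delta_f.get(key, 0) + c * comb(k, j)
    # Every term has at least one factor Δx, so for Δx != 0 the division
    # simplifies to lowering the power of Δx by one.
    quotient = {(i, j - 1): c for (i, j), c in delta_f.items()}
    # Extend the domain again and set Δx = 0: only the Δx^0 terms survive.
    result = [0] * max(len(coeffs) - 1, 1)
    for (i, j), c in quotient.items():
        if j == 0:
            result[i] += c
    return result

derivative = dynamic_derivative([2, -6, 0, 3])  # 3 x^3 - 6 x + 2
print(derivative)
```

For the MathWorld polynomial the result corresponds to 9 x^2 – 6, in agreement with the traditional derivative.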