Ruffini’s Rule and meaning of division

We continue the earlier discussion on (1) differentials and (2) polynomials. There is also this earlier discussion about (static or dynamic) division.

At issue is: Can we avoid the use of limits when determining the derivative of a polynomial ?

A sub-issue is: Can we avoid division that requires a limit ?

We use the term incline instead of tangent (line), since this line can also cross a function and not just touch it.

We use H = -1, so that we can write x x^H = x^H x = 1 for x ≠ 0. Check that x^H = 1 / x, and that the use of H is much more effective and efficient. The use of 1 / x is superfluous since students must learn about exponents anyway.

Ruffini’s Rule

Ruffini’s Rule is a method not only to factor polynomials but also to isolate the factors. A generalised version is called “synthetic division” for the reason that it isn’t actually division. On wikipedia, Ruffini’s Rule is called “Horner’s Method“. On mathworld, the label “Horner’s Method” is used for something else but related again. My suggestion is to stick to mathworld.

Thus, the issue at hand would seem to have been answered by Ruffini’s Rule already. When we can avoid division then we don’t need a limit around it. However, our discussion is about whether this really answers our question and whether we really understand the answer.

Historical note

I thank Peter Harremoës for informing me about both Ruffini’s Rule and some neat properties that we will see below. His lecture note in Danish is here. Surprisingly for me, he traced the history back to Descartes. Following this further, we can find this paper by John Suzuki, who identifies two key contributions by Jan Hudde in Amsterdam 1657-1658. Looking into my copy of Boyer “The history of the calculus” now, page 186, I must admit that this didn’t register with me when I read it originally, as it registers now. We see the tug and push of history with various authors and influences, and thus we should be cautious about claiming who did what when. Suzuki’s statement remains an eye-opener.

“We examine the evolution of the lost calculus from its beginnings in the work of Descartes and its subsequent development by Hudde, and end with the intriguing possibility that nearly every problem of calculus, including the problems of tangents, optimization, curvature, and quadrature, could have been solved using algorithms entirely free from the limit concept.” (John Suzuki)

Apparently Newton dropped the algebra because it didn’t work on trigonometry and such, but with modern set theory we can show that the algebraic approach to the derivative works there too. For the discussion below: check that limits can be avoided.

Division is also a way to isolate factors

When we have 2 x = 6, then we can determine 2 x = 2 3, and recognize the common factor 2. By the human eye, we can see that x = 3 and then we have isolated the factor 3. But in mathematics, we must follow procedures as if we were a computer programme. Hence, we have the procedure of eliminating 2, which is called division:

2^H 2 x = 2^H 2 3

x = 3

The latter example abuses the property that 2 is nonzero. We must actually check that the divisor is nonzero. If we don’t check then we get:

4 x = 9 x

4 x x^H = 9 x x^H

4 = 9

Checking for zero is not as simple as it seems. Expressions with only numbers might also contain a zero in hidden format, as for example (4 + 2 – 6)^H. Thus it would seem to be an essential part of mathematics to develop a sound theory for the algebra of expressions and the testing on zero.

Calculus uses the limit around the difference quotient to prevent division by zero. But the real question might rather be whether we can isolate a factor. When we can isolate that factor without division that requires a limit, then we hopefully have a simpler exposition. Polynomials are a good place to start this enquiry.

Shifting to rings without division ?

The real numbers form a “field” and when we drop the idea of division, then we get a “ring“. Above 2 x = 6 might also be solved in a ring without division. For we can do:

2 x – 2 3 = 6 – 2 3

2 (x – 3) = 0

2 = 0    or    x – 3 = 0

We again use that 2 ≠ 0. Thus x = 3.

This example doesn’t show a material difference w.r.t. the assumption of division by 2. We also used that 6 can be factored and that 2 was a common factor. Perhaps this is the more relevant notion. Whatever the case, it doesn’t seem to be so useful to leave the realm of the real numbers.

Properties of polynomials

Our setup has a polynomial p[x] with the focus of attention at x = a, with point {a, b} = {a, p[a]}. When we regard (x – a) as a factor, then we get a “quotient” q[x] and a “remainder” r[x].

p[x] = (x – a) q[x] + r[x]

It is a nontrivial issue that q and r are polynomials again (proof of polynomial division algorithm, or proofwiki). These proofs don’t use limits but assume that the divisor is nonzero. Thus we might be making a circular argument when we use that q and r are polynomials to argue that limits aren’t needed. Examples can be given of polynomial long division. Such examples tend not to mention explicitly that the divisor cannot be zero. Nevertheless, let us proceed with what we have.

Since (x – a) has degree 1, the remainder must be a constant, and substituting x = a shows that it equals p[a]. Thus the “core equation” is:

p[x] = (x – a) q[x] + p[a]      …  (* core)

p[x] – p[a] = (x – a) q[x]

At x = a we get 0 = 0 q[a], whence we are at a loss about how to isolate q[x] or q[a].

When we have defined derivatives via other ways, then we can check that the derivative of (*) is:

p'[x] = q[x] + (x – a) q'[x]

p'[a] = q[a]

We can also rewrite (*) so that it indeed looks like a difference quotient.

q[x] = (p[x] – p[a]) (x – a)^H       …. (** slope = tan[θ], see Spiegel’s diagram)

We cannot divide by (x – a) for x = a, for this factor would be zero.

PM. In the world of limits, we could define the derivative of p at a by taking the Limit[x → a, q[x]]. This generates again (Spiegel’s diagram):

q[a] = tan[α]

But our issue is that we want to avoid limits.

Incline

The incline of the polynomial at point {a, b} = {a, p[a]} is the line through that point with the same slope s as the polynomial has there.

y – p[a] = s (x – a)    …  (*** incline)

The difference between polynomial and incline might be called the error. Thus:

error = p[x] – y = (p[x] – p[a]) – (y – p[a])

= (x – a) q[x] – s (x – a)

= (x – a) (q[x] – s)

When we take s = q[a] then:

error = p[x] – y = (x – a) (q[x] – q[a])
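As a quick numerical check of the core equation and of the factored error, here is a minimal Python sketch. It assumes the MathWorld example polynomial 3 x^3 – 6 x + 2 at a = 2, with the quotient q[x] = 3 x^2 + 6 x + 6 entered by hand. The function names are only illustrative and are not part of the original discussion.

```python
# Illustrative names; p, q and a follow the MathWorld example 3 x^3 - 6 x + 2.
def p(x):
    return 3 * x**3 - 6 * x + 2


def q(x):
    # Quotient of p[x] - p[a] by (x - a) for a = 2, entered by hand (assumption).
    return 3 * x**2 + 6 * x + 6


a = 2
s = q(a)  # candidate slope of the incline


def incline(x):
    # The line y = p[a] + s (x - a).
    return p(a) + s * (x - a)


for x in range(-5, 6):
    # Core equation: p[x] = (x - a) q[x] + p[a]; no division is involved.
    assert p(x) == (x - a) * q(x) + p(a)
    # The error between polynomial and incline factors as (x - a) (q[x] - q[a]).
    assert p(x) - incline(x) == (x - a) * (q(x) - q(a))
```

The check includes x = a itself, where both sides are simply zero. It confirms the identities but does not by itself explain why q[a] deserves to be called a slope.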

Key question

A key question becomes: can we isolate q[x] by some method ? We already have (**), but this format  contains the problematic division. Is there another way to isolate q ? There appear to be three ways. Likely these ways are essentially the same but emphasize different aspects.

Method 1. Dynamic quotient

The dynamic quotient manipulates the domain and relies on algebraic simplification. Instead of H we use D, with y x^D = y // x.

q[x] = (p[x] – p[a]) (x – a)^D

means: we first take x ≠ a,

then take D = H, so that this is normal division again,

then simplify,

and then declare the result also valid for x = a.

The idea was presented in ALOE 2007 while COTP 2011 is a proof of concept. COTP shows that it works for polynomials, trigonometry, exponentials and recovered exponents (logarithms). For polynomials it is shown by means of recursion.

Looking at this from the current perspective of the polynomial division algorithm, we can say that the method also works because division of a polynomial of degree n > 0 by a polynomial of degree m = 1 generates a neat polynomial of degree n – m. Thus we can isolate q[x] indeed. Since q[x] is a polynomial, substitution of x = a provides no problem.

The condition on manipulating the domain nicely plugs the hole in the polynomial division algorithm. It is actually necessary to prevent circularity.
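The four steps of the dynamic quotient can be mimicked in a small Python sketch. This is only an illustration under assumptions: the polynomial x^3 – 12 x^2 – 42 and the point a = 1 are chosen as the example, the simplification is done by hand rather than by a computer algebra system, and the names `simplified_quotient` and `slope_at_a` are hypothetical.

```python
from fractions import Fraction


def p(x):
    return x**3 - 12 * x**2 - 42


def simplified_quotient(x, a):
    # Hand-simplified form of (p[x] - p[a]) / (x - a), derived for x != a:
    # (x^3 - a^3) / (x - a) = x^2 + a x + a^2 and (x^2 - a^2) / (x - a) = x + a.
    return (x**2 + a * x + a**2) - 12 * (x + a)


a = Fraction(1)

# Steps 1 to 3: on the restricted domain x != a, ordinary division applies
# and agrees with the simplified expression.
for x in [Fraction(-2), Fraction(1, 2), Fraction(3), Fraction(7, 3)]:
    assert (p(x) - p(a)) / (x - a) == simplified_quotient(x, a)

# Step 4: the simplified expression is declared valid at x = a as well.
slope_at_a = simplified_quotient(a, a)
```

Exact fractions are used so that the comparison in the loop is not blurred by rounding. The division itself is never evaluated at x = a; only the simplified form is.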

Method 2. Incline

Via Descartes (and Suzuki’s article above) we understand that perpendicular to the incline (tangent) there is a line on which there is a circle that touches the incline too. This implies that (x – a) must be a double root of the polynomial.

We may consider p[x] / (x – a)^2 and determine the remainder v[x]. The line y = v[x] then is the incline, or the tangent of the polynomial at point {a, p[a]}. It is relatively easy to determine the slope of this line, and then we have q[a].

Check the wikipedia example. In Mathematica we get PolynomialRemainder[x^3 – 12 x^2 – 42, (x – 1)^2, x] = -21 x – 32 indeed. At a = 1, q[a] = -21.
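The remainder can also be reproduced without Mathematica. The following is a minimal Python sketch of polynomial long division, assuming a divisor with leading coefficient 1 so that integer arithmetic stays exact. The helper name `poly_divmod` is hypothetical.

```python
def poly_divmod(num, den):
    """Long division of polynomials, coefficients listed highest power first.

    Assumes den has leading coefficient 1 (as (x - a)^2 does).
    Returns (quotient coefficients, remainder coefficients).
    """
    num = list(num)
    quotient = []
    while len(num) >= len(den):
        factor = num[0]              # next quotient coefficient
        quotient.append(factor)
        for i, d in enumerate(den):  # subtract factor * den, aligned at the top
            num[i] -= factor * d
        num.pop(0)                   # the leading term is now zero
    return quotient, num


p = [1, -12, 0, -42]       # x^3 - 12 x^2 - 42
divisor = [1, -2, 1]       # (x - 1)^2 = x^2 - 2 x + 1
quotient, remainder = poly_divmod(p, divisor)
slope = remainder[0]       # coefficient of x in the remainder v[x]
```

Here `remainder` holds the coefficients of the line v[x], so its first entry is the slope q[a] at a = 1.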

This method assumes “algebraic ways” to separate quotient and remainder. We can find the slope for polynomials without using the limit for the derivative. Potentially the same theory is required for the simplification used in the dynamic quotient.

Remarkably, the method presumes x ≠ a, and still derives q[a]. I cannot avoid the impression that this method still has a conceptual hole.

Addendum 2017-01-11: By now we have identified these methods to isolate a factor “algebraically”:

  1. Look at the form (powers) and coefficients. This is basically Ruffini’s rule, see below. Michael Range works directly with coefficients.
  2. Dynamic quotient that relies on the algebra of expressions.
  3. Divide away nonzero factors so that only the problematic factor remains that we need to isolate. (This however is a version of the dynamic quotient, so why not apply it directly ?)

An example of the latter is p[x] = x^3 – 6 x^2 + 11 x – 6. Trial and error or a graph indicates that zeros are at 1 and 2. Assuming that those points don’t apply we can isolate p[x] / ((x – 1) (x – 2)) = (x – 3) by means of long division. Subsequently we have identified the separate factors, and the total is p[x] = (x – 1) (x – 2) (x – 3).

Check also that “division” is repeated subtraction, whence the method is fairly “algebraic” by itself too.

Addendum 2016-12-26: However, check the next weblog entry.

PM 1. General method to find the slope

The traditional method is to use the derivative p'[x] = 3 x^2 – 24 x, find slope p'[1] = -21, and construct the line y = -21 (x – 1) + p[1]. This method remains didactically preferable since it applies to all functions.

PM 2. Double root in error too

If p[x] = 0 has solution x = a, then the latter is called a root, and we can factor p[x] = (x – a) q[x] with remainder zero.

For example, p[x] – p[a] = 0 has solution x = a. Thus p[x] – p[a] = (x – a) q[x] with remainder zero.

Also q[x] – q[a] = 0 has solution x = a. Thus q[x] – q[a] = (x – a) u[x] with remainder zero.

Thus the error has a double root.

error = p[x] – y = (x – a)^2 u[x]

Unfortunately, this insight only allows us to check a given line y = s x + c, for then we can eliminate y.
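Such a check of a given line can be sketched in Python: subtract the line from the polynomial and test whether (x – a) can be split off twice with remainder zero. This is an illustration only, assuming a polynomial of degree at least 2 with coefficients listed from the highest power down. The names `ruffini` and `touches_at` are hypothetical.

```python
def ruffini(coeffs, a):
    """One round of Ruffini's Rule: divide by (x - a).

    coeffs are listed from the highest power down to the constant.
    Returns (quotient coefficients, remainder).
    """
    row = [coeffs[0]]
    for c in coeffs[1:]:
        row.append(c + a * row[-1])
    return row[:-1], row[-1]


def touches_at(p, s, c, a):
    """Check whether y = s x + c is the incline of p at x = a,
    by testing that the error p[x] - y has a double root at a.
    Assumes p has degree at least 2."""
    error = list(p)
    error[-2] -= s                 # subtract s x
    error[-1] -= c                 # subtract the constant c
    first_quotient, r1 = ruffini(error, a)
    _, r2 = ruffini(first_quotient, a)
    return r1 == 0 and r2 == 0


p = [1, -12, 0, -42]                      # x^3 - 12 x^2 - 42
is_incline = touches_at(p, -21, -32, 1)   # the line y = -21 x - 32
is_secant = touches_at(p, -20, -33, 1)    # passes through {1, -53}, wrong slope
```

The second line passes through the same point but only gives a single root, so the check rejects it.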

Method 3. Ruffini’s Rule

See above for the summary of Ruffini’s Rule and the links. For the application below you might want to become more familiar with it. Check why it works. Check how it works, or here.

The observation of the double root generates the idea of applying Ruffini’s Rule twice.

I don’t think that it would be so useful to teach this method in highschool. Mathematics undergraduates and teachers better know about its existence, but that is all. The method might be at the core of efficient computer programmes, but human beings better deal with computer algebra at the higher level of interface.

The assumption that x ≠ a goes without saying, but it remains useful to say it, because at some stage we still use q[a], and we better be able to explain the paradox.

Application of Ruffini’s Rule to the derivative

Let us use the example of Ruffini’s Rule at MathWorld  to determine the incline (tangent) to their example polynomial 3 x^3 – 6 x + 2, at x = 2. They already did most of the work, and we only include the derivative.

The first round of application gives us p[a] = p[2] = 14, namely the constant following from MathWorld.

A second round of application gives the slope, q[a] = 30.

2 |  3    6    6
  |       6   24
     3   12   30

Using the traditional method, the derivative is p'[x] = 9 x^2 – 6, with p'[2] = 30.

The incline (tangent) in both cases is y = 30 (x – 2) + 14 = 30 x – 46.
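The two rounds can be reproduced with a short Python sketch of Ruffini's Rule. It assumes coefficients listed from the highest power down to the constant, and the helper name `ruffini` is hypothetical.

```python
def ruffini(coeffs, a):
    """One round of Ruffini's Rule: divide by (x - a).

    Bring down the leading coefficient, then repeatedly multiply the
    last entry by a and add the next coefficient.
    Returns (quotient coefficients, remainder).
    """
    row = [coeffs[0]]
    for c in coeffs[1:]:
        row.append(c + a * row[-1])
    return row[:-1], row[-1]


p = [3, 0, -6, 2]                 # 3 x^3 - 6 x + 2
a = 2

q, p_at_a = ruffini(p, a)         # first round: quotient q[x] and p[a]
u, q_at_a = ruffini(q, a)         # second round: q[a], the slope

# Incline y = q[a] (x - a) + p[a], written as y = slope x + intercept.
slope = q_at_a
intercept = p_at_a - q_at_a * a
```

The second call works on the quotient row of the first, which is the same bookkeeping as the table above. Throughout, a is only ever used as a multiplier, so no division occurs.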

The major conceptual issue

The major conceptual issue is: while s is the slope of a line, and we take s = q[a], why would we call q[a] the slope of the polynomial at x = a ? Where is the element of “inclination” ? We might have just a formula of a line, without the notion of slope that fits the function. In other words, q[a] is just a number and no concept.

The key question w.r.t. this issue of the limit – and whether division causes a limit – is not quite w.r.t. Ruffini’s Rule but with the definition of slope, first for the line itself, secondly now for the incline of  a function. We represent the incline of a function with a line, but only because it has the property of having a slope and angle with the horizontal axis.

The only reason to speak about an incline is the recognition that above equation (**) generates a slope. We are only interested in q[a] = tan[α] since this is the special case at the point x = a itself.

It is only after this notion of having a slope has been established, that Ruffini’s Rule comes into play. It focuses on “factoring as synthetic division” since that is how it has been designed. There is nothing in Ruffini’s Rule that clarifies what the calculation is about. It is an algorithm, no more.

Thus, for the argument that q[a] provides the slope at x = a, we still need the reasoning that first x ≠ a, then find a general expression q[x], and only then set x = a.

And this is what the algebraic approach to the derivative was designed to accomplish.

Addendum 2016-12-26: See the next weblog entry for another approach to the notion of the incline (tangency).

Ruffini’s Rule corroborates that the method works, but that it works had already been shown. However, it is likely a mark of mathematics that all these approaches are actually quite related. In that perspective, the algebraic approach to the derivative supplements the application of Ruffini’s Rule to clarify what it does.

Obviously, mathematicians have been working in this manner for ages, but implicitly. It really helps to state explicitly that the domain of a function can be manipulated around (supposed) singularities. The method can be generalised as

f'[x] = {Δf (Δx)^D, then set Δx = 0} = {Δf // Δx, then set Δx = 0}

It also has been shown to work for trigonometry and the exponential function.
