# Benutzer:Georg-Johann/Mathematik

A Julia Set (in white) visualized using CCA

## Visualising Julia sets

Julia sets and Fatou sets can be beautiful. Generating them by computer graphics can be hard.

In the remainder of this page, we work out a method which can be used to visualise the Julia set of a complex function ƒ. The advantage is that you don't need to know attractors of the iteration

${\displaystyle z\mapsto f(z)}$

The generated images will be smooth in the Fatou part.

### Overview: Imaging methods

Note that there exist other approaches to coloring complex dynamical systems, such as:

#### Inverse Iteration Method (IIM)

Compute the preimages under ƒ, i.e. compute the reverse orbit. Because inversion turns the stability of the fixed points from attractive to repelling and vice versa, one just chooses a complex number and looks where it goes under the inverse iteration. The trouble is that the results will not be uniformly distributed in ${\displaystyle J}$ and that you have to compute the inverse of ƒ.

#### Coloring by speed of attraction (CSA)

Taking a value from the lattice of points to color, perform iterations until the iterate is close to an attractor. Color the point according to the number of iterations needed to bring it close enough to the attractor.

This method is commonly used to visualize Julia sets of polynomials and Julia sets that are attached to Newton's famous method for finding the zeros of a function. Polynomials of degree > 1 always have infinity as a super-attractive fixed point. The rational function that occurs in Newton's method always has the function's roots as attractive fixed points. However, in both cases there may be other attractors, which – moreover – need not consist of just one point.

#### Escape Time Algorithm (ETA)

If ∞ is an attractor, i.e. a fixed point of the process, then color the point according to the number of iterations – the time – it takes until the point escapes towards ∞. If the point does not escape during the maximum number of iterations, the point is colored as belonging to the Julia set or to the basin of some other attractor. This method works for polynomials. The most prominent Julia sets are the ones for z ↦ z2+c where c is an element of the Mandelbrot set or not far away from the Mandelbrot set. If you see a picture of such a Julia set, it is likely that ETA has been used to produce it.

#### Cauchy Convergence Algorithm (CCA)

In the remainder of this page, I will present a different approach whose idea is basically the same as that of the Escape Time Algorithm. However, no basin of attraction needs to be known in advance, and the different basins of attraction can be separated and colored differently. The approach uses the notion of Cauchy convergence. Instead of observing the orbit of a point, this method observes how the distance of two nearby points z and z+ε evolves as these two values are iterated. If the difference tends to 0, then the point heads for an attractor. If the difference does not approach 0, then the point is close to (or a part of) the Julia set.

##### The Metric

Let ${\displaystyle \varrho }$ be a canonical projection of the compactified complex plane onto the Riemann sphere S2:

${\displaystyle \varrho :{\overline {\mathbb {C} }}=\mathbb {C} \cup \{\infty \}\rightarrow S_{2}}$

This gives us a metric d: As distance between two points in the complex plane we take their distance on the sphere, i.e. the length of the orthodrome. This means that the metric is bounded by π and even the distance to ∞ (which is now the north pole) is finite.

In order to compute the distance between two points z and w we rotate the sphere S2 in such a way that

1. w maps to 0
2. z maps to the positive real axis

After these transformations the distance can be computed quite easily. The rotation can be accomplished by a sequence of isometric Möbius transformations. All in all, we get

${\displaystyle t_{w}:z\mapsto {\begin{cases}|z|\;,&{\text{ if }}w=0\\{\textstyle {\frac {1}{|z|}}}\;,&{\text{ if }}w=\infty \\{\textstyle {\frac {1}{|w|}}}\;,&{\text{ if }}z=\infty \\&\\{\textstyle {\frac {1}{|w|}}}\cdot \left|{\frac {z{\overline {w}}-|w|^{2}}{z{\overline {w}}+1}}\right|\;,&{\text{ else}}\end{cases}}}$

which is left as an exercise to the reader. The bar denotes complex conjugation. The metric is then

${\displaystyle d(z,w)=\,\!2\arctan(t_{w}(z))}$

The nice feature introduced by d is that sequences that formerly diverged to infinity now converge towards infinity, i.e. towards the north pole of S2.
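As a quick sanity check, the case distinction for ${\displaystyle t_{w}}$ and the metric d can be sketched in Python (a minimal sketch; the names t_w, d, and the INF sentinel for the point at infinity are mine):

```python
import math

INF = complex(math.inf, 0)   # a stand-in for the point at infinity on the sphere

def t_w(z, w):
    """The map t_w from above: rotate w to 0, then take the modulus of z's image."""
    if w == 0:
        return abs(z)
    if w == INF:
        return math.inf if z == 0 else (0.0 if z == INF else 1.0 / abs(z))
    if z == INF:
        return 1.0 / abs(w)
    zw = z * w.conjugate()
    den = zw + 1
    if den == 0:                       # z and w are antipodal points on the sphere
        return math.inf
    return abs((zw - abs(w) ** 2) / den) / abs(w)

def d(z, w):
    """Spherical metric: orthodrome length on the Riemann sphere, at most pi."""
    return 2.0 * math.atan(t_w(z, w))
```

Note that math.atan(math.inf) evaluates to π/2, so antipodal points (e.g. 0 and ∞) come out at distance π, as expected for a metric bounded by π.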

##### Stability of Orbits

Recall the definition of the Julia set for a contraction mapping ƒ. The definition implies some facts on the stability of the iteration

${\displaystyle z\mapsto f(z)\mapsto f^{2}(z)\mapsto f^{3}(z)\mapsto \cdots }$

where ƒn denotes the n-th iterative of ƒ:

${\displaystyle f^{\,n}=f\circ f\circ \cdots \circ f}$

The set

${\displaystyle \{f^{\,n}(z)\}_{n\in \mathbb {N} _{0}}}$

is called the orbit of z (under ƒ).

The orbits of two points z and w behave similarly − in some sense − if z and w lie close together and belong to the Fatou set Fƒ, which is the complement of the Julia set Jƒ of ƒ. If z is an element of Jƒ then z and w will behave quite differently, even if w itself is an element of Jƒ.

To get a notion of the stability of an orbit, we set

${\displaystyle \Sigma _{n}(z)=\Sigma _{n}(z,\varepsilon )=\sum _{k\leqslant n}d\,(f^{k}(z),f^{k}(z+\varepsilon ))}$

for a small ε and with the metric d from above. This means that we take two points which are close together, and then we sum up their distances as ƒ makes them jump around on the Riemann sphere. Note that for any fixed ε the sum can diverge for large n even if z is a Fatou point.

However, we can use Σn to measure how close a point is to Jƒ: the larger the sum is, the more unstable the iteration is and the closer the point lies to the Julia set of ƒ.

To distinguish points of (or close to) the Julia set from points in the Fatou set, we need a criterion. To get it, we compute all the Σ-values for the points that we want to color, i.e. for all points in a lattice Γ. After computing these values, we do a little bit of statistics to get the expected value E and the standard deviation σ for the set of Σ-values: Let

{\displaystyle {\begin{aligned}E_{n}&=E_{n}(\Gamma ,\varepsilon )=E\left\{\log(\delta +\Sigma _{n}(z,\varepsilon ))\right\}_{\!z\,\in \,\Gamma }\\\sigma _{n}&={\sqrt {D_{n}}}\\\end{aligned}}}

Recall that

{\displaystyle {\begin{aligned}EX&={\textstyle {\frac {1}{|X|}}}\sum \limits _{x\in X}x\\D\!X&=EX^{2}-(EX)^{2}\\\end{aligned}}}

It turns out that the values Σ are spread widely over several scales. Therefore, we do not use Σ directly. Instead, we use the logarithm of Σ. The value δ is just a small constant that keeps the logarithm's input from being zero.

##### Coloring

Now, we have all we need to color a point:

1. choose
1. a lattice ${\displaystyle \Gamma }$ of points ${\displaystyle z}$ to color
2. the number ${\displaystyle n}$ of iterations to perform
3. a small ${\displaystyle \varepsilon }$
2. for all points, compute ${\displaystyle \Sigma _{n}(z,\varepsilon )}$
3. compute ${\displaystyle E_{n}}$ and ${\displaystyle \sigma _{n}}$
4. compute
${\displaystyle J(z)=\,\ln(\delta +\Sigma _{n}(z))-K}$

for some constant ${\displaystyle K}$. Because ${\displaystyle K}$ will be used to separate points that belong to the Julia set (${\displaystyle J(z)>0}$) from points that do not (${\displaystyle J(z)<0}$), reasonable values for ${\displaystyle K}$ are greater than ${\displaystyle E_{n}}$. Try settings like ${\displaystyle K=E_{n}+2\sigma _{n}}$ or ${\displaystyle K=E_{n}+\sigma _{n}}$ or the like. If ${\displaystyle J(z)>0}$ then we color ${\displaystyle z}$ as belonging to the Julia set. If ${\displaystyle J(z)<0}$ we can use that value to shade the Fatou set. If we know some attractor, we can check whether ${\displaystyle f^{n}(z)}$ is close to it and use that information, too.

To map values to the valid ranges for saturation and brightness we use the function ${\displaystyle h}$ from section helper function h.
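The four steps above can be sketched as follows. This is a minimal, unoptimized Python sketch: the names sphere_dist, sigma, cca_values and all parameter defaults are my choices, and orbits are simply clamped at a large cap instead of handling ∞ exactly (once both points are that far out, they sit essentially at the north pole anyway).

```python
import math

def sphere_dist(z, w):
    """Chordal-angle metric d from "The Metric" section: 2*atan(t_w(z))."""
    aw = abs(w)
    if aw == 0.0:
        return 2.0 * math.atan(abs(z))
    zw = z * w.conjugate()
    num = abs(zw - aw * aw)
    den = abs(zw + 1) * aw
    return math.pi if den == 0.0 else 2.0 * math.atan(num / den)

def sigma(f, z, n, eps=1e-6, cap=1e50):
    """Sigma_n(z, eps): summed spherical distances of two coupled orbits."""
    a, b, total = z, z + eps, 0.0
    for _ in range(n):
        a, b = f(a), f(b)
        if abs(a) > cap or abs(b) > cap:
            break          # clamp: both points are essentially at the north pole
        total += sphere_dist(a, b)
    return total

def cca_values(f, lattice, n=40, eps=1e-6, delta=1e-30, spread=2.0):
    """Steps 2-4: Sigma_n for every lattice point, then E_n, sigma_n and J(z)."""
    logs = [math.log(delta + sigma(f, z, n, eps)) for z in lattice]
    e = sum(logs) / len(logs)                       # E_n
    var = sum(x * x for x in logs) / len(logs) - e * e
    k = e + spread * math.sqrt(max(var, 0.0))       # K = E_n + spread * sigma_n
    return [x - k for x in logs]                    # J(z) > 0: near the Julia set

# example: f(z) = z^2 - 1 on a coarse lattice
f = lambda z: z * z - 1
lattice = [complex(x, y) / 10 for x in range(-20, 21) for y in range(-20, 21)]
j_values = cca_values(f, lattice)
```

Mapping each J(z) to a pixel color is then a matter of thresholding at 0 and shading the negative values.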

##### Modifications

The computation of ${\displaystyle E_{n}}$ takes a lot of time. The visualisation process needs two passes:

1. compute ${\displaystyle E_{n}}$ from all ${\displaystyle \Sigma _{n}(z)}$
2. color the points using ${\displaystyle E_{n}}$ and recomputed ${\displaystyle \Sigma _{n}(z)}$

Alternatively

1. compute and store all ${\displaystyle \Sigma _{n}(z)}$
2. compute ${\displaystyle E_{n}}$ and color the points

Another approach looks like this:

Find the smallest ${\displaystyle n}$ so that ${\displaystyle d_{k}(z,z+\varepsilon )=d(f^{k}(z),f^{k}(z+\varepsilon ))}$ is below a given bound for all ${\displaystyle k\geq n}$. If no such ${\displaystyle n}$ can be found, then assume ${\displaystyle z}$ to be an element of the Julia set. Otherwise, use ${\displaystyle n}$ to color the Fatou set.

This algorithm is a variant of the escape time algorithm (ETA). Note that in ETA the point does not really escape (at least if we are on the sphere), it just converges to ∞. This approach is similar. However, we don't need to know an attractive fixed point of ƒ.

So far, I haven't tried the modified version. One disadvantage may be that the Fatou set will no longer appear smoothly colored. Moreover, I am not sure if this modification is really an advantage, because the iteration must still be run up to a given maximum number of iterations. Note that even if ${\displaystyle d_{n}}$ is below the bound for some ${\displaystyle n}$, the distance ${\displaystyle d_{k}}$ can rise again. I cannot say if this effect is crucial or can be neglected...

##### Gallery

${\displaystyle N}$ denotes the Newton operator

${\displaystyle N:f\mapsto {\mathit {id}}-{\frac {f}{f'}}}$

#### Using Critical Points Absorption (CPA)

The previous method yields neat, smooth colorings and requires the least knowledge about the dynamics of the process. However, it is quite time consuming. The following approach is an extension of the escape time algorithm (ETA) for polynomials.

Let ƒ be a polynomial of degree d > 1 over C. Such a polynomial has at most d critical points: infinity and the at most d–1 zeroes of ƒ′. It is well known that each attractor of the process z→ƒ(z) absorbs at least one critical point. Suppose zK is a critical point; then ƒn(zK) comes arbitrarily close to one of the attractive cycles of ƒ.

The process for a quadratic polynomial ƒ(z) = z2 + c is the simplest case: The critical points are 0 and ∞. As ∞ absorbs itself, only 0 is left, and we have the following cases. M denotes the Mandelbrot set.

${\displaystyle c\notin M}$
Easy case. 0 is absorbed by ∞, and J(ƒ) consists just of Cantor dust. Escape time to ∞ can be used to color all points.
${\displaystyle c\in M\setminus \partial M}$
If c is an element of the interior of M, then 0 will be absorbed by a (super) attractive cycle. Compute
${\displaystyle w=w(c,n)=f^{n}(0)}$
for a sufficiently large n. As w (basically) only depends on c, it can be precomputed before starting the visualization of J(ƒ). Notice that w is an element of a cycle that might have more than one element, i.e. w is only unique modulo that cycle.
Coloring for a point z in C is then as easy as ETA:
• If z is absorbed by ∞, use escape time to color z.
• If ƒn(z) comes close to w, i.e. if |ƒn(z) – w| < ε for the first time, use that n to color z.
• If ƒn(z) neither comes close to ∞ nor comes close to w, then color z as part of J(ƒ). Only a few z will fall into this category.
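The CPA cases for ƒ(z) = z2 + c can be sketched as follows. This is a rough Python sketch: it assumes c lies in the interior of M so that iterating the critical point 0 actually settles onto an attractive cycle, and all function names and tolerances are my choices.

```python
def attractive_cycle(c, n_transient=1000, eps=1e-9):
    """Iterate the critical point 0 until it settles (numerically) onto the
    attractive cycle; assumes c is in the interior of the Mandelbrot set."""
    f = lambda u: u * u + c
    w = 0j
    for _ in range(n_transient):
        w = f(w)
    cycle, u = [w], f(w)
    while abs(u - w) > eps:        # collect cycle elements until we come back
        cycle.append(u)
        u = f(u)
    return cycle

def cpa_classify(c, z, cycle, n_max=200, eps=1e-6, bailout=1e6):
    """('escape', n), ('cycle', n), or ('julia', None) -- the three cases above."""
    f = lambda u: u * u + c
    for n in range(n_max):
        if abs(z) > bailout:
            return ('escape', n)
        if any(abs(z - w) < eps for w in cycle):
            return ('cycle', n)
        z = f(z)
    return ('julia', None)
```

For c = −1 the precomputed cycle is the superattractive 2-cycle {0, −1}, and the returned n values can be fed directly into an ETA-style palette.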

## Visualising complex functions

Color space in HSV-projection

Suppose we want to visualise a complex valued function like

{\displaystyle {\begin{aligned}f:\mathbb {C} &\;\to \mathbb {C} \\z&\;\mapsto w=f(z)\\\end{aligned}}}

In order to color ${\displaystyle w}$ we decompose it into its absolute value ${\displaystyle |w|}$ and its argument ${\displaystyle \arg(w)}$.

Then, we assign the color

${\displaystyle \mathrm {HSV} \left({\tfrac {1}{2\pi }}\arg(w),\;1-g_{a_{s},b_{s}}(|w|),\;g_{a_{v},b_{v}}(|w|)\right)}$

to the point representing ${\displaystyle z}$. In this HSV color space, all values are in the range from 0 to 1. The first component (hue) of the HSV color depends only on the argument of ${\displaystyle w}$, and the second and third components (saturation and value) depend only on the absolute value of ${\displaystyle w}$. We use a transformation ${\displaystyle g_{a,b}}$ on ${\displaystyle |w|}$ to map it into the interval ${\displaystyle [0,1]}$. For ${\displaystyle g}$ see section helper functions.
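A minimal sketch of this coloring, assuming the choices t = tanh and u(y) = (y − 1/y)/2 from the helper-function sections (the sharpness parameters b_s, b_v and the function names are my own):

```python
import cmath
import colorsys
import math

def g(x, a, b):
    """g_{a,b} built from t = tanh and u(y) = (y - 1/y)/2; g(a) = 1/2, g'(a) = b."""
    if x == 0.0:
        return 0.0
    return 0.5 + 0.5 * math.tanh(2.0 * b * a * 0.5 * (x / a - a / x))

def domain_color(w, a_s=5.0, b_s=0.5, a_v=1.0, b_v=0.5):
    """HSV(arg(w)/2pi, 1 - g_{a_s,b_s}(|w|), g_{a_v,b_v}(|w|)) as an RGB triple."""
    hue = (cmath.phase(w) / (2.0 * math.pi)) % 1.0
    r = abs(w)
    return colorsys.hsv_to_rgb(hue, 1.0 - g(r, a_s, b_s), g(r, a_v, b_v))
```

With these parameters, zero maps to black (value 0) and very large |w| maps to white (saturation 0, value 1), matching the scheme described below.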

### Examples

The values indexed s (saturation) control the transition saturated colors→white (resp. gray scale), i.e. they distinguish intermediate values from infinity. The values indexed v (value) control the transition black→bright, i.e. zero→nonzero. Parameter a controls where the transition takes place: a is just the radius of the circle dividing the two regions (dark/bright, saturated/gray, etc.). Parameter b controls how sharp the transition is: b small = soft, b large = sharp.

The following images all show the range [-10,10]×[-10,10]${\displaystyle i}$ and use ${\displaystyle a_{s}=5}$ (radius of rainbow) and ${\displaystyle a_{v}=1}$ (radius of black disc).

In the image with swapped meanings of S and V zero is printed in white and infinity in black.

## Helper functions

### Helper function t

${\displaystyle t\,}$

Helper function ${\displaystyle t}$ is a smooth transition from −1 to 1 that satisfies ${\displaystyle t'(0)=1}$ and ${\displaystyle t(x)=-t(-x)}$. There are many choices for it. Some of them are

{\displaystyle {\begin{aligned}t:\mathbb {R} &\to (-1,1)\\x&\mapsto \tanh \,x\\&\mapsto {\tfrac {2}{\pi }}\arctan \left({\tfrac {\pi }{2}}x\right)\\&\mapsto {\tfrac {2}{\pi }}\operatorname {gd} \left({\tfrac {\pi }{2}}x\right)\\&\mapsto {\frac {x}{\sqrt {1+x^{2}}}}\\&\mapsto {\frac {x}{1+|x|}}\\\end{aligned}}}

with gd denoting the gudermannian function.

### Helper function h

${\displaystyle h_{a}\,}$

Helper function ${\displaystyle h_{a}}$ is almost the same as ${\displaystyle t}$, but it maps to ${\displaystyle (0,1)}$ and we can adjust the speed of the transition by the parameter ${\displaystyle a}$.

{\displaystyle {\begin{aligned}h_{a}:\mathbb {R} &\to (0,1)\\x\;&\mapsto {\tfrac {1}{2}}+{\tfrac {1}{2}}t(2ax)\end{aligned}}}

with ${\displaystyle a>0}$. Then ${\displaystyle h_{a}}$ is symmetric about the point (0, 1/2) and

{\displaystyle {\begin{alignedat}{2}h_{a}&(0)&&={\tfrac {1}{2}}\\h'_{a}&(0)&&=a\\h_{a}&(-\infty )&&=0^{+}\\h_{a}&(\infty )&&=1^{-}\\\end{alignedat}}}

Negative values are mapped to values between 0 and 1/2. Positive values are mapped to values between 1/2 and 1. The parameter ${\displaystyle a}$ controls how fast the transition will be.

If we want a falling function, we can use the symmetry

${\displaystyle h_{a}(x)\,=1-h_{a}(-x)=1-h_{-a}(x)}$

i.e. we negate ${\displaystyle x}$ or ${\displaystyle a}$.
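The functions ${\displaystyle t}$ and ${\displaystyle h_{a}}$ can be sketched as follows, picking t(x) = x/(1 + |x|) from the list above (the function names are mine):

```python
def t(x):
    """One admissible transition: odd, maps R into (-1, 1), with t'(0) = 1."""
    return x / (1.0 + abs(x))

def h(x, a):
    """h_a(x) = 1/2 + t(2*a*x)/2: maps R to (0, 1) with h(0) = 1/2, h'(0) = a."""
    return 0.5 + 0.5 * t(2.0 * a * x)

def h_falling(x, a):
    """Falling variant via the symmetry h_a(x) = 1 - h_a(-x)."""
    return 1.0 - h(-x, a)
```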

### Helper function g

${\displaystyle g_{a,b}\,}$

This function maps the nonnegative numbers to the interval ${\displaystyle [0,1)}$.

{\displaystyle {\begin{aligned}g_{a,b}:\mathbb {R} ^{\geqslant \,0}&\to [0,1)\\x\;&\mapsto {\tfrac {1}{2}}+{\tfrac {1}{2}}\,t(2b\cdot a\cdot u(x/a))\end{aligned}}}

for some function ${\displaystyle u}$ that is defined below. If ${\displaystyle u}$ is appropriately chosen then for ${\displaystyle g}$ the following holds

${\displaystyle {\begin{array}{lll}g_{a,b}(0)&=&0\\g_{a,b}(\infty )&=&1\\g_{a,b}(a)&=&1/2\\g'_{a,b}(a)&=&b\\g'_{a,b}(x)&>&0{\text{ for }}x>0\end{array}}}$

This means that ${\displaystyle g_{a,b}}$

• grows continuously from 0 to 1 as ${\displaystyle x}$ grows from ${\displaystyle 0}$ to ${\displaystyle \infty }$
• crosses the middle between 0 and 1 at a point we can control by specifying parameter ${\displaystyle a}$
• passes the point ${\displaystyle (a,1/2)}$ at a speed we can control by specifying parameter ${\displaystyle b}$

We are left with determining the function ${\displaystyle u}$ on ${\displaystyle \mathbb {R} ^{>0}}$.

${\displaystyle u\,}$

${\displaystyle u}$ must satisfy

${\displaystyle {\begin{array}{lll}u(0^{+})&=&-\infty \\u(\infty )&=&\infty \\u(1)&=&0\\u'(1)&=&1\end{array}}}$

For ${\displaystyle u}$ we set

${\displaystyle u(x)={\tfrac {1}{2}}\left(x-{\tfrac {1}{x}}\right)}$

Another choice for ${\displaystyle u}$ might be ${\displaystyle u(x)=\ln(x)}$. Note that ${\displaystyle u(x)=-u(1/x)}$.
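A small sketch of ${\displaystyle g_{a,b}}$ with t = tanh and the u above, checking the listed properties numerically (the names are mine):

```python
import math

def g(x, a, b):
    """g_{a,b}(x) = 1/2 + t(2*b*a*u(x/a))/2 with t = tanh, u(y) = (y - 1/y)/2."""
    if x == 0.0:
        return 0.0                 # limit value: u(0+) = -inf, t(-inf) = -1
    u = 0.5 * (x / a - a / x)
    return 0.5 + 0.5 * math.tanh(2.0 * b * a * u)
```

One can verify g(0) = 0, g(a) = 1/2, g′(a) = b, and strict monotonicity directly from this form.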

### Helper function w

${\displaystyle w_{a}\,}$

This function maps the nonnegative numbers to the interval ${\displaystyle [0,1)}$.

{\displaystyle {\begin{aligned}w_{a}:\mathbb {R} ^{\geqslant \,0}&\to [0,1)\\x\;&\mapsto t(ax)\end{aligned}}}

By ${\displaystyle a}$ we can control its slope at the origin:

${\displaystyle w'(0)=\,a}$

## Bézier-Curves

Suppose we have a smooth function

{\displaystyle {\begin{aligned}f:[t_{0},t_{1}]&\;\to \;\mathbb {R} ^{2}\\t&\;\mapsto \;(x(t),\,y(t))\\\end{aligned}}}

and like to draw an approximation of it using quadratic Béziers. Obviously, the two end points lie on ƒ, and the control point lies at the intersection — for simplicity, we assume it exists — of the two tangents through the bounding points. In other words, we have to determine the intersection of the two lines

${\displaystyle g_{n}:\textstyle {\binom {x_{n}}{y_{n}}}+\lambda _{n}\cdot \textstyle {\binom {{\dot {x}}_{n}}{{\dot {y}}_{n}}}}$

with canonical abbreviations like

${\displaystyle {\dot {x}}_{n}={\tfrac {dx}{dt}}(t_{n})}$

This leads to a determining equation for the λ's

${\displaystyle {\binom {x_{0}-x_{1}}{y_{0}-y_{1}}}={\binom {-{\dot {x}}_{0}\quad {\dot {x}}_{1}}{-{\dot {y}}_{0}\quad {\dot {y}}_{1}}}{\binom {\lambda _{0}}{\lambda _{1}}}}$

whose solution is

${\displaystyle {\binom {\lambda _{0}}{\lambda _{1}}}={\frac {1}{{\dot {x}}_{1}{\dot {y}}_{0}-{\dot {x}}_{0}{\dot {y}}_{1}}}{\binom {{\dot {y}}_{1}\quad -{\dot {x}}_{1}}{{\dot {y}}_{0}\quad -{\dot {x}}_{0}}}{\binom {x_{0}-x_{1}}{y_{0}-y_{1}}}}$

This gives us the intersection, i.e. the control point, by evaluating one (or both) of the g's at the end point(s).

### Cubic

Suppose we have a smooth function

{\displaystyle {\begin{aligned}f:[t_{0},t_{1}]&\;\to \;\mathbb {R} ^{2}\\t&\;\mapsto \;(x(t),\,y(t))\\\end{aligned}}}

and like to draw an approximation of it using cubic Béziers

{\displaystyle {\begin{aligned}b(t)&=(1-t)^{3}P_{0}+3(1-t)^{2}tP_{1}+3(1-t)t^{2}P_{2}+t^{3}P_{3}\\&=P_{0}+3(P_{1}-P_{0})t+3(P_{2}-2P_{1}+P_{0})t^{2}+(P_{3}-3P_{2}+3P_{1}-P_{0})t^{3}\\&=P_{3}+3(P_{2}-P_{3})(1-t)+3(P_{1}-2P_{2}+P_{3})(1-t)^{2}+(P_{0}-3P_{1}+3P_{2}-P_{3})(1-t)^{3}\end{aligned}}}

From the last two representations we immediately see the values of the derivatives of b at the end points. We want the first and second derivatives at the end points to point in the same direction as the derivatives of ƒ do. Thus

This method applied to the standard parametrization (cos t, sin t) of a quarter of the unit circle (in black):
We get the coordinates (1, 1/2) and (1/2, 1) for the control points (red). The resulting bézier (orange) does not approximate the circle as well as possible.
{\displaystyle {\begin{aligned}\alpha \cdot {\dot {f}}_{0}&=P_{1}-P_{0}\\\beta \cdot {\dot {f}}_{1}&=P_{3}-P_{2}\\\gamma \cdot {\ddot {f}}_{0}&=P_{0}-2P_{1}+P_{2}\\\delta \cdot {\ddot {f}}_{1}&=P_{1}-2P_{2}+P_{3}\\\end{aligned}}}

and together with ƒ0=P0 and ƒ1=P3 we get the linear system

${\displaystyle {\begin{pmatrix}-2{\dot {x}}_{0}&-{\dot {x}}_{1}&-{\ddot {x}}_{0}&0\\-2{\dot {y}}_{0}&-{\dot {y}}_{1}&-{\ddot {y}}_{0}&0\\{\dot {x}}_{0}&2{\dot {x}}_{1}&0&-{\ddot {x}}_{1}\\{\dot {y}}_{0}&2{\dot {y}}_{1}&0&-{\ddot {y}}_{1}\end{pmatrix}}\cdot {\begin{pmatrix}\alpha \\\beta \\\gamma \\\delta \end{pmatrix}}={\begin{pmatrix}x_{0}-x_{1}\\y_{0}-y_{1}\\x_{1}-x_{0}\\y_{1}-y_{0}\end{pmatrix}}}$

Note that in the special case of x(t) = t we get

${\displaystyle \alpha =\beta ={\tfrac {1}{3}}(t_{1}-t_{0})}$

i.e. the x-coordinates of the control points P1 and P2 divide the interval [t0,t1] with ratios 1:2 resp. 2:1. This is quite astonishing because in order to get the control points we do not have to evaluate second derivatives of ƒ. This is due to properties of Bernstein polynomials.

To solve the above system we use standard techniques such as the adjugate.
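For illustration, the 4×4 system can also be solved numerically; the quarter-circle example below reproduces the control points (1, 1/2) and (1/2, 1) from the figure caption (a sketch assuming NumPy; the function name is my choice):

```python
import numpy as np

def cubic_controls(f0, f1, d0, d1, dd0, dd1):
    """Solve the 4x4 system above for (alpha, beta, gamma, delta) and return
    the four control points P0..P3 of the cubic bezier."""
    M = np.array([
        [-2.0 * d0[0], -d1[0],       -dd0[0],  0.0],
        [-2.0 * d0[1], -d1[1],       -dd0[1],  0.0],
        [       d0[0],  2.0 * d1[0],  0.0,    -dd1[0]],
        [       d0[1],  2.0 * d1[1],  0.0,    -dd1[1]],
    ])
    rhs = np.array([f0[0] - f1[0], f0[1] - f1[1], f1[0] - f0[0], f1[1] - f0[1]])
    alpha, beta, gamma, delta = np.linalg.solve(M, rhs)
    P0, P3 = np.asarray(f0, float), np.asarray(f1, float)
    P1 = P0 + alpha * np.asarray(d0, float)   # P1 = f0 + alpha * f'(t0)
    P2 = P3 - beta * np.asarray(d1, float)    # P2 = f1 - beta  * f'(t1)
    return P0, P1, P2, P3

# quarter circle (cos t, sin t), t in [0, pi/2]
P0, P1, P2, P3 = cubic_controls((1, 0), (0, 1), (0, 1), (-1, 0), (-1, 0), (0, -1))
```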

Also note that if both second derivatives of x or both second derivatives of y happen to vanish the above system won't have full rank, i.e. the corresponding matrix won't be invertible. However, it's sufficient to determine α and β to get the control points and for vanishing second derivatives we get the less complicated systems

${\displaystyle {\begin{pmatrix}-2{\dot {x}}_{0}&-{\dot {x}}_{1}\\{\dot {x}}_{0}&2{\dot {x}}_{1}\end{pmatrix}}\cdot {\binom {\alpha }{\beta }}={\binom {x_{0}-x_{1}}{x_{1}-x_{0}}}\qquad {\text{ if }}{\ddot {x}}_{0,1}=0}$

resp.

${\displaystyle {\begin{pmatrix}-2{\dot {y}}_{0}&-{\dot {y}}_{1}\\{\dot {y}}_{0}&2{\dot {y}}_{1}\end{pmatrix}}\cdot {\binom {\alpha }{\beta }}={\binom {y_{0}-y_{1}}{y_{1}-y_{0}}}\qquad {\text{ if }}{\ddot {y}}_{0,1}=0}$

Solving this yields

${\displaystyle {\binom {\alpha }{\beta }}={\frac {1}{3}}(x_{1}-x_{0}){\binom {1/{\dot {x}}_{0}}{1/{\dot {x}}_{1}}}\qquad {\text{ if }}{\ddot {x}}_{0,1}=0}$

and ditto for y.

As you can see in the image above, there is some room for improvement, and therefore we work out a second approach. Again, we set P0=ƒ0, P3=ƒ1 and constrain the two control points P1 resp. P2 to lie on the tangent through the end point next to it. This leaves us again with a two dimensional space to search in. Instead of imposing properties on the second derivatives, we now simply force the bézier to meet the curve ƒ in a third point Q, i.e. we set

${\displaystyle Q=f_{t}=f((1-t)\cdot t_{0}+t\cdot t_{1})\quad {\stackrel {.}{=}}\quad b_{[P_{0},P_{1},P_{2},P_{3}]}(t)}$

for some t in (0,1) and b denoting the bézier. The condition on the end and control points again reads as

This method applied to the standard parametrization (cos t, sin t) of a quarter of the unit circle (in black):
We get the coordinates (1, ω) and (ω, 1) for the control points (red) with
${\displaystyle {\scriptstyle \omega ={\frac {4}{3}}({\sqrt {2}}-1)}}$
The resulting bézier approximation (orange) is fine. There is no visual difference between the circle and the bézier. The additional point Q is indicated halfway along the arc.
{\displaystyle {\begin{aligned}P_{0}&=f_{0}\\P_{1}&=f_{0}+\alpha \cdot {\dot {f}}_{0}\\P_{2}&=f_{1}-\beta \cdot {\dot {f}}_{1}\\P_{3}&=f_{1}\\\end{aligned}}}

Putting this together with the condition on Q we get

{\displaystyle {\begin{aligned}&{\binom {x_{t}-x_{0}}{y_{t}-y_{0}}}+(2t^{3}-3t^{2}){\binom {x_{1}-x_{0}}{y_{1}-y_{0}}}\\&\qquad \qquad =3t(1-t){\binom {(1-t){\dot {x}}_{0}\quad -t{\dot {x}}_{1}}{(1-t){\dot {y}}_{0}\quad -t{\dot {y}}_{1}}}{\binom {\alpha }{\beta }}\end{aligned}}}

Note that we can use more than one point Q. Suppose we like to use n points Qi to guide the bézier. Each Qi will add two more lines in the above linear system, i.e. the system will look like

${\displaystyle w=M\!\cdot \!{\tbinom {\alpha }{\beta }}}$

with a 2n-dimensional vector w and a 2n×2 matrix M. In general, this system is overdetermined and thus has no solution. Therefore, we solve the 2-dimensional system

${\displaystyle M^{\top }w=M^{\top }\!M\!\cdot \!{\tbinom {\alpha }{\beta }}}$

instead, which yields a least-squares solution for α and β.

We make the above system a little more explicit for the case of one additional point at t = 1/2. The linear system then reduces to

${\displaystyle {\binom {\alpha }{\beta }}={\frac {4}{3}}\,{\frac {1}{{\dot {x}}_{1}{\dot {y}}_{0}-{\dot {x}}_{0}{\dot {y}}_{1}}}{\binom {{\dot {y}}_{1}\quad -{\dot {x}}_{1}}{{\dot {y}}_{0}\quad -{\dot {x}}_{0}}}{\binom {x_{0}+x_{1}-2x_{1/2}}{y_{0}+y_{1}-2y_{1/2}}}}$

Note that the same matrix already occurred in the computation for quadratic bézier curves above; however, the matrix now gets multiplied by a vector describing the curvature, whereas in the quadratic case it gets multiplied by a vector describing the direction.

In the special case of a linear function x we get

${\displaystyle \alpha =\beta ={\frac {4}{3}}\cdot {\frac {y_{0}+y_{1}-2y_{1/2}}{{\dot {y}}_{1}-{\dot {y}}_{0}}}}$
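The closed form for one guide point at t = 1/2 can be sketched as follows; applied to the quarter circle it reproduces ω = (4/3)(√2 − 1) from the figure caption (the function name is mine):

```python
import math

def midpoint_controls(f, df, t0, t1):
    """alpha, beta from the closed form above (one guide point at t = 1/2),
    plus the inner control points P1, P2 of the cubic bezier."""
    x0, y0 = f(t0)
    x1, y1 = f(t1)
    xm, ym = f(0.5 * (t0 + t1))
    dx0, dy0 = df(t0)
    dx1, dy1 = df(t1)
    det = dx1 * dy0 - dx0 * dy1
    vx, vy = x0 + x1 - 2.0 * xm, y0 + y1 - 2.0 * ym   # the "curvature" vector
    alpha = (4.0 / 3.0) * (dy1 * vx - dx1 * vy) / det
    beta = (4.0 / 3.0) * (dy0 * vx - dx0 * vy) / det
    p1 = (x0 + alpha * dx0, y0 + alpha * dy0)
    p2 = (x1 - beta * dx1, y1 - beta * dy1)
    return alpha, beta, p1, p2

# quarter circle (cos t, sin t), t in [0, pi/2]
alpha, beta, p1, p2 = midpoint_controls(
    lambda t: (math.cos(t), math.sin(t)),
    lambda t: (-math.sin(t), math.cos(t)),
    0.0, math.pi / 2)
```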

#### noname

Suppose we have a smooth function

{\displaystyle {\begin{aligned}f:[t_{0},t_{3}]&\;\to \;\mathbb {R} ^{2}\\t&\;\mapsto \;(x(t),\,y(t))\\\end{aligned}}}

from the reals to the plane and like to draw an approximation of it using cubic Béziers.

Let u be a second function with fixed points at t0 and t3 which is smooth and monotone. Then the composition ƒ o u will yield exactly the same plot for any such u. Applying a function u means that drawing the curve at different, arbitrarily increasing and decreasing speeds does not change the way the plot of the curve looks. That is nice from the plotter's point of view. However, this generates some difficulties when we try to approximate the curve, which eventually turns out to be a playing field for calculus of variations that will lead to formulas and partial differential equations much too complex for practical considerations. So we look for some characteristics that are invariant under reparameterisations u.

The derivative of ƒ o u is the velocity[1]

${\displaystyle v={\frac {\mathrm {d} }{\mathrm {d} t}}(f\circ u)={\dot {f}}\circ u\cdot {\dot {u}}}$

This formula is the rationale why the first derivative of the bézier curve shall point in the same direction as the first derivative of ƒ: the direction is independent of the parametrisation u. The velocity v always points in the same direction, no matter how fast we drive along ƒ. This is no longer true for the acceleration

${\displaystyle a={\dot {v}}={\frac {\mathrm {d} ^{2}}{\mathrm {d} t^{2}}}(f\circ u)={\dot {f}}\circ u\cdot {\ddot {u}}\,+\,{\dot {u}}^{2}\!\cdot \!{\ddot {f}}\circ u}$

In the remainder of the essay we will only use properties of the curve at its end points. Thus, before we proceed, let's use the fact that u has fixed points at the t-values that yield the end points in order to simplify the formulas for v and ${\displaystyle a}$. The formulas then read

${\displaystyle v={\dot {f}}\cdot {\dot {u}}\quad {\text{ and }}\quad a={\dot {f}}\cdot {\ddot {u}}\,+\,{\dot {u}}^{2}\!\cdot \!{\ddot {f}}}$

provided we evaluate these quantities at one of the end points.

We can think of ${\displaystyle a}$ as being composed of two orthogonal components: one in the direction of motion that speeds up or slows down the pencil, and one perpendicular to v which leads to a change in direction. The projection of ${\displaystyle a}$ onto the direction normal to v is parallel to[2]

{\displaystyle {\begin{aligned}\langle a,v^{\bot }\rangle \cdot v^{\bot }&=\langle a,{\tbinom {~{\dot {y}}}{-{\dot {x}}}}\rangle \cdot {\tbinom {~{\dot {y}}}{-{\dot {x}}}}\\&={\binom {{\ddot {x}}\,{\dot {y}}^{2}\,-\,{\ddot {y}}\,{\dot {x}}\,{\dot {y}}}{{\ddot {y}}\,{\dot {x}}^{2}\,-\,{\ddot {x}}\,{\dot {y}}\,{\dot {x}}}}\cdot {\dot {u}}^{4}\\&\parallel {\binom {{\ddot {x}}\,{\dot {y}}^{2}\,-\,{\ddot {y}}\,{\dot {x}}\,{\dot {y}}}{{\ddot {y}}\,{\dot {x}}^{2}\,-\,{\ddot {x}}\,{\dot {y}}\,{\dot {x}}}}\end{aligned}}}
Two quantities that are invariant under such reparameterisations are the arc length and the total turning angle:

${\displaystyle L(a,b)=\int _{a}^{b}\!{\sqrt {{\dot {x}}^{2}+{\dot {y}}^{2}}}\,\mathrm {d} t}$
${\displaystyle K(a,b)=\int _{a}^{b}{\frac {{\dot {x}}\,{\ddot {y}}\,-\,{\dot {y}}\,{\ddot {x}}}{{\dot {x}}^{2}+{\dot {y}}^{2}}}\,\mathrm {d} t}$
1. In our notations composition of functions has higher priority than multiplication, so we omit parenthesis if appropriate.
2. we denote "parallel to" as ${\displaystyle \parallel }$

## Sphere-Equal Linear Transforms

Given a linear transform in n-dimensional Euclidean space:

{\displaystyle {\begin{aligned}A:\quad \mathbb {R} ^{n}&\;\to \;\mathbb {R} ^{n}\\x&\;\mapsto \;A\cdot x\end{aligned}}}

We call two linear transforms A and B sphere-equal ("sphärengleich") if they map the unit ball to the same set:

${\displaystyle A\sim B\quad \Leftrightarrow \quad A\cdot D_{n}=B\cdot D_{n}\qquad {\text{with}}\qquad D_{n}=\{x\in \mathbb {R} ^{n}\;/\;|x|\leqslant 1\}}$

Obviously, this ~ is an equivalence relation, and we look for a representative of each equivalence class. We observe that this relation preserves the spectral norm

${\displaystyle A\sim B\quad \Rightarrow \quad \|A\|_{2}=\|B\|_{2}}$

and that orthogonal matrices don't change the equivalence class:

${\displaystyle Q^{\top }Q=\mathrm {id} \quad \Rightarrow \quad A\sim A\cdot Q}$

for any A. The last line is immediately clear because orthogonal transformations map spheres to themselves. The preservation of spectral norm follows from the definition of spectral norm. Let

${\displaystyle A=U_{A}\cdot \Sigma _{A}\cdot V_{A}^{\top }}$

be the singular value decomposition (svd) of A. Then we have

${\displaystyle A\sim U_{A}\cdot \Sigma _{A}}$

and

${\displaystyle A\sim B\quad \Rightarrow \quad \Sigma _{A}=\Sigma _{B}}$

However, the converse is not true.
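A small numerical check of these claims (assuming NumPy): the representative U_A·Σ_A spans the same ellipsoid as A (equivalently, R·Rᵀ = A·Aᵀ), and right-multiplication by an orthogonal Q preserves the singular values:

```python
import numpy as np

def representative(A):
    """U_A @ Sigma_A, a representative of A's equivalence class."""
    U, s, Vt = np.linalg.svd(A)
    return U @ np.diag(s)

A = np.array([[2.0, 1.0],
              [0.0, 1.0]])
theta = 0.7                      # any rotation is orthogonal, so A ~ A @ Q
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
B = A @ Q
R = representative(A)
```

Since A = U Σ Vᵀ gives R = A·V with V orthogonal, R really lies in the same class as A.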

## Arcsin

arccos and its asymptote ${\displaystyle \scriptstyle {\sqrt {2(1-x)}}}$ in orange.

In order to approximate arcsin and arccos if the argument is close to 1 the following expansions might be useful:

{\displaystyle {\begin{aligned}\arccos(1-x)&={\sqrt {2x}}\cdot a(x)\\\arcsin(1-x)&={\frac {\pi }{2}}-{\sqrt {2x}}\cdot a(x)\end{aligned}}}

with a rational power series

{\displaystyle {\begin{aligned}a(x)&=1+{\frac {1}{12}}x+{\frac {3}{160}}x^{2}+{\frac {5}{896}}x^{3}+{\frac {35}{18432}}x^{4}+{\mathcal {O}}(x^{5})\\&=1+{\frac {1}{2}}\cdot {x \over 3\cdot 2}+{\frac {1\cdot 3}{2\cdot 4}}\cdot {x^{2} \over 5\cdot 2^{2}}+{\frac {1\cdot 3\cdot 5}{2\cdot 4\cdot 6}}\cdot {x^{3} \over 7\cdot 2^{3}}+\cdots \\&=\sum _{j=0}^{\infty }{\binom {2j}{j}}{\frac {x^{j}}{(2j+1)8^{j}}}\\&=\sum _{j=0}^{\infty }{\frac {(2j-1)!!}{(2j)!!}}{\frac {x^{j}}{(2j+1)\,2^{j}}}\end{aligned}}}

where !! denotes the double factorial. The radius of convergence of a is 2. We start by observing that

${\displaystyle \arccos(1-x)=2\arcsin {\bigl (}{\sqrt {x/2}}{\bigr )}}$

which can easily be verified by differentiation. It follows that

{\displaystyle {\begin{aligned}a(x)={\arccos(1-x) \over {\sqrt {2x}}}={\arcsin({\sqrt {x/2}}) \over {\sqrt {x/2}}}=\sum _{j=0}^{\infty }{\binom {2j}{j}}{\frac {x^{j}}{(2j+1)8^{j}}}\end{aligned}}}

Also note the following half-argument relations for –2 ≤ x ≤ 2:

{\displaystyle {\begin{aligned}2\arcsin(x/2)&=\operatorname {sgn} (x)\arccos(1-x^{2}/2)\\2\arccos(x/2)&=\pi -\operatorname {sgn} (x)\arccos(1-x^{2}/2)\end{aligned}}}
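The power series for a(x) can be summed with a simple term recurrence; the sketch below checks it against arccos and arcsin (the name a_series and the number of terms are my choices):

```python
import math

def a_series(x, terms=60):
    """Partial sum of a(x) = sum_j C(2j,j) x^j / ((2j+1) 8^j), valid for |x| < 2."""
    total, b = 0.0, 1.0            # b = C(2j,j) (x/8)^j, starting at j = 0
    for j in range(terms):
        total += b / (2 * j + 1)
        b *= x * (2 * j + 1) / (4.0 * (j + 1))   # C(2j+2,j+1)/C(2j,j) = 2(2j+1)/(j+1)
    return total
```

The consecutive-term ratio tends to x/2, so for x in [0, 1/2] the partial sums converge quickly.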

### N-th derivative

{\displaystyle {\begin{aligned}{\frac {d^{n}}{dx^{n}}}\arcsin(x)&\;=\;\sum _{j=0}^{\left\lfloor {\frac {n-1}{2}}\right\rfloor }{\binom {n-1}{2j}}(2j-1)!!\,(2n-3-2j)!!\,{\frac {x^{n-1-2j}}{(1-x^{2})^{n-j-1/2}}}\\{\frac {d^{n}}{dx^{n}}}\arccos(x)&\;=\;-{\frac {d^{n}}{dx^{n}}}\arcsin(x)\end{aligned}}}

Again, !! denotes the double factorial. Note that, by convention, (−1)!! = 1.
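The closed form can be checked against the well-known first few derivatives (a Python sketch; dfact and arcsin_deriv are my names):

```python
import math

def dfact(k):
    """Double factorial, with the convention (-1)!! = 1."""
    r = 1
    while k > 1:
        r *= k
        k -= 2
    return r

def arcsin_deriv(n, x):
    """The n-th derivative of arcsin at x via the closed form above."""
    s = 0.0
    for j in range((n - 1) // 2 + 1):
        s += (math.comb(n - 1, 2 * j) * dfact(2 * j - 1) * dfact(2 * n - 3 - 2 * j)
              * x ** (n - 1 - 2 * j) / (1.0 - x * x) ** (n - j - 0.5))
    return s
```

For n = 1, 2, 3 this reproduces 1/√(1−x²), x/(1−x²)^(3/2), and (1+2x²)/(1−x²)^(5/2).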

### Approximation

The relative error of ${\displaystyle x\cdot a(2x^{2})}$ against ${\displaystyle \arcsin(x)}$ in ${\displaystyle [0,1/2]}$ stays below 6·10−18.
The relative error of ${\displaystyle {\sqrt {2-2x}}\cdot a(1-x)}$ against ${\displaystyle \arccos(x)}$ in ${\displaystyle [1/2,1]}$ stays below 6·10−18.

For 64-bit IEEE double computation (53-bit mantissa) of arcsin and arccos we use the following approach:

• If 1/2 ≤ |x| ≤ 1, compute
${\displaystyle \arccos(|x|)={\sqrt {2-2|x|}}\,a(1-|x|)}$
• If |x| ≤ 1/2, compute
${\displaystyle \arcsin(|x|)=|x|\cdot a(2x^{2})}$

We have to evaluate a(x) for x in [0, 1/2] and use the following rational MiniMax approximation of order [5/4]:

${\displaystyle a(x)\approx p(x)/q(x)}$

with

p(x) =
+ 0.99999999999999999442491073135027586203
- 1.0352340338921976278427312087167692142 x
+ 0.35290206232981519813422591897720574012 x^2
- 0.043334831706416857056123518013656946650 x^3
+ 0.0012557428614630796315205218507940285622 x^4
+ 0.0000084705471128435769021718764878041684288 x^5

q(x) =
+ 1
- 1.1185673672255329236623716486696411533 x
+ 0.42736600959872448854098334016758333519 x^2
- 0.063555884849631716599421483898013782858 x^3
+ 0.0028820878185134035637440105959294542908 x^4


Then we use the symmetries

${\displaystyle \arccos(x)+\arcsin(x)=\pi /2}$
${\displaystyle \arccos(x)+\arccos(-x)=\pi }$
${\displaystyle \arcsin(x)+\arcsin(-x)=0}$

of arcsin and arccos to get the desired result(s).

In order to compute arsinh, evaluate a at negative arguments: arsinh(x) = x·a(−2x²).
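Putting the pieces together, a straightforward (and unoptimized) Python transcription of the scheme might look as follows; `my_asin`/`my_acos` are ad-hoc names:

```python
import math

# Coefficients of the [5/4] rational minimax approximation of
# a(x) = arccos(1-x)/sqrt(2x) on [0, 1/2], taken from the text above.
P = (0.99999999999999999442491073135027586203,
     -1.0352340338921976278427312087167692142,
     0.35290206232981519813422591897720574012,
     -0.043334831706416857056123518013656946650,
     0.0012557428614630796315205218507940285622,
     0.0000084705471128435769021718764878041684288)
Q = (1.0,
     -1.1185673672255329236623716486696411533,
     0.42736600959872448854098334016758333519,
     -0.063555884849631716599421483898013782858,
     0.0028820878185134035637440105959294542908)

def _poly(c, x):
    """Horner evaluation of c[0] + c[1] x + c[2] x^2 + ..."""
    r = 0.0
    for a in reversed(c):
        r = r * x + a
    return r

def _a(x):
    """Rational approximation of a(x) for x in [0, 1/2]."""
    return _poly(P, x) / _poly(Q, x)

def my_asin(x):
    if abs(x) <= 0.5:
        return x * _a(2.0 * x * x)           # arcsin(x) = x a(2x^2)
    s = math.copysign(1.0, x)
    # arccos(|x|) = sqrt(2-2|x|) a(1-|x|); then arcsin = pi/2 - arccos
    return s * (math.pi / 2.0 - math.sqrt(2.0 - 2.0 * abs(x)) * _a(1.0 - abs(x)))

def my_acos(x):
    return math.pi / 2.0 - my_asin(x)        # arccos(x) + arcsin(x) = pi/2
```

Both branches feed `_a` only with arguments in [0, 1/2], where the minimax approximation is valid.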

## Dobble (Spot It!)

Dobble is a card game about fast pattern recognition. Each card shows 8 different icons, and any two cards have exactly one icon in common. The task is to find this common icon as fast as possible.

This essay is about how such a deck of cards can be constructed, and it supplies some mathematical background. As Dobble is famous for that mathematical background, we won't go too deep into the theory; many articles found on the net address that background. A deck of cards consists of 55 cards, each showing 8 different icons out of a set of 57 icons. So how do we have to arrange these icons on the cards so that any two cards picked at random have exactly one icon in common?

The property

any two cards have exactly one icon in common

is reminiscent of a theorem in plane geometry:

any two different lines in the plane meet in exactly one point

Well, almost. In Euclidean geometry there are lines in the plane that don't meet, namely parallel lines. Before we go into details, let's summarize the geometric objects and how to associate to them a deck of cards, and how properties of the game arise from properties in geometry.

**The two different ways to identify objects of Dobble with objects in geometry**

| Projective plane of order Q | Deck of cards (cards = lines, icons = points) | Deck of cards (cards = points, icons = lines) |
|---|---|---|
| Q²+Q+1 lines | Q²+Q+1 cards | Q²+Q+1 icons |
| Q²+Q+1 points | Q²+Q+1 icons | Q²+Q+1 cards |
| Each line passes through Q+1 points | Each card shows Q+1 icons | Each icon is shown on Q+1 cards |
| Each point is incident on Q+1 lines | Each icon is shown on Q+1 cards | Each card shows Q+1 icons |
| Any two different lines meet in exactly one point | Any two different cards have exactly one icon in common | Any two different icons are shown together on exactly one card |
| Any two different points uniquely determine one line | Any two different icons are shown together on exactly one card | Any two different cards have exactly one icon in common |

The middle column lists the properties of Dobble. The right column lists the dual properties, which only apply to a complete deck of Dobble with 57 cards. As mentioned below, however, Dobble is incomplete as it comes with 55 cards only. Therefore, the dual properties do not hold for the Dobble you can buy.

In order to overcome the problem with parallel lines in Euclidean geometry, we switch to projective geometry, which doesn't come with that shortcoming. Whereas points in the Euclidean plane can be regarded as pairs of numbers (x,y), points in the projective plane are triples (x : y : z) such that not all three are equal to zero. In addition, we consider two triples P and P' as the same point if they are multiples of each other, i.e. if there is a number λ ≠ 0 such that

${\displaystyle (x:y:z)=(\lambda x':\lambda y':\lambda z')}$

A line g in the projective plane is given by a triple (gx : gy : gz) and a point P = (x : y : z) is incident on the line provided

${\displaystyle g_{x}x+g_{y}y+g_{z}z=0}$

This is similar to the Euclidean case where a point is incident on a line if

${\displaystyle g_{x}x+g_{y}y+g_{z}=0}$

but in projective space the additional symmetry in the formulae for the point-on-a-line relation has its counterpart in the additional symmetry of any two distinct lines meet in exactly one point. Notice that if z ≠ 0 we can divide the projective condition by z, and the outcome is basically the condition for Euclidean space. However, in the projective case there are also points with z = 0 which don't have a counterpart in Euclidean geometry. These points are sometimes called the horizon or points at infinity, but we don't need such fuzzy wording or any distinction of different classes of points to construct a deck of Dobble.

The second difference from ordinary geometry is that a deck of cards consists of only finitely many items: a finite number of icons, a finite number of cards, and a finite number of icons per card. This is handled by considering geometry over a finite field instead of geometry over the reals. A field is an entity which features addition and multiplication, both commutative and connected by the distributive law. The addition of any element can be undone, and the element that doesn't change the result of an addition is called zero and denoted as 0. The multiplication with any element except 0 can be undone, and the element that doesn't change the result of a multiplication is called one and denoted as 1.

So let's switch to a finite field F with Q elements. The first observation is that there are only finitely many points in the plane. There are Q³ triples of the form (x : y : z) with x, y and z in F. As not all three coordinates shall be zero, we are left with Q³−1 non-zero triples. Triples which are a multiple of each other are regarded as the same point, and because there are Q−1 non-zero values in F which can serve as the factor λ from above, we get the total number of points of the projective plane over the finite field F of order Q as

${\displaystyle {\frac {Q^{3}-1}{Q-1}}=Q^{2}+Q+1}$

This is also the number of lines in that plane because the lines are also represented as triples. In order to see how many points are incident on each line, let's enumerate the points as

${\displaystyle (0:0:1),\;(0:1:z){\text{ and }}(1:y:z)}$

If the x-coordinate is non-zero, we can divide all coordinates by x which gets us the points of the form (1 : y : z). If x is zero and y is non-zero, then divide by y to get the points of the form (0 : 1 : z). If both x and y are zero, that point can be represented as (0 : 0 : 1) because z must be non-zero then.

In order to compute which points are incident on which line, we could just do brute force and iterate over all lines and all points and test whether or not the relation from above is satisfied. But we can do better by working out explicit formulae. To that end we use a different relation to determine whether a point is incident on a line:

${\displaystyle g_{x}z+g_{y}y+g_{z}x=0}$

This merely permutes the points and does not change the incidence structure. The advantage is that in our enumeration of points and lines, the first coordinates, i.e. x and gx, are always 0 or 1, which will simplify the computation.

**Lines and the points incident on them**

| Line ${\displaystyle g=(g_{x}:g_{y}:g_{z})}$ | Constraint(s) on coordinates of ${\displaystyle P}$ | ${\displaystyle P=(x:y:z)}$ | Case |
|---|---|---|---|
| ${\displaystyle (0:0:1)}$ | ${\displaystyle x=0}$ | ${\displaystyle (0:0:1),\;(0:1:z)}$ | (1) |
| ${\displaystyle (0:1:g_{z})}$ | ${\displaystyle y+g_{z}x=0}$ | ${\displaystyle (0:0:1),\;(1:-g_{z}:z)}$ | (2) |
| ${\displaystyle (1:g_{y}:g_{z})}$ | ${\displaystyle z+g_{y}y+g_{z}x=0}$ | ${\displaystyle (0:1:-g_{y}),\;(1:y:-g_{z}-g_{y}y)}$ | (3) |

In either case, there are Q+1 points on each line, and it's easy to verify that any two distinct lines have exactly one point in common. Due to the duality of points and lines, each point is incident on Q+1 different lines.
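The enumeration above translates directly into a short program. The following Python sketch (the names `dobble_deck` and `verify` are ad hoc, and it assumes a prime order q so that arithmetic mod q gives the field) builds the deck and lets one verify the incidence claims:

```python
from itertools import combinations

def dobble_deck(q):
    """Deck from the projective plane over GF(q), q prime: icons are the
    q*q + q + 1 points, cards are the lines, enumerated as in the text."""
    points = [(0, 0, 1)] + [(0, 1, z) for z in range(q)] \
                         + [(1, y, z) for y in range(q) for z in range(q)]
    cards = []
    for gx, gy, gz in points:                   # lines use the same triples
        # incidence relation: gx*z + gy*y + gz*x == 0 (mod q)
        card = frozenset(i for i, (x, y, z) in enumerate(points)
                         if (gx * z + gy * y + gz * x) % q == 0)
        cards.append(card)
    return cards

def verify(deck):
    """Any two different cards share exactly one icon."""
    return all(len(c1 & c2) == 1 for c1, c2 in combinations(deck, 2))
```

For a prime power q that is not prime (q = 4, 8, 9, ...), plain arithmetic mod q is not a field, so this sketch would have to be extended with proper GF(q) arithmetic.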

### The case Q = 2: Fano Plane

*(Images 1 and 2: the Fano plane with its seven points (0:0:1), (0:1:0), (0:1:1), (1:0:0), (1:0:1), (1:1:0), (1:1:1).)*

Let's work out the simplest case of Q = 2, the Fano plane. The field F is the Galois field GF(2), the field with the 2 elements 0 and 1. We expect 2²+2+1 = 7 lines and hence also 7 points, each line passing through 2+1 = 3 points, and each point incident on 3 lines.

**Fano Plane**

| Case | Line | Points |
|---|---|---|
| (1) | (0:0:1) | (0:0:1), (0:1:0), (0:1:1) |
| (2) | (0:1:0) | (0:0:1), (1:0:0), (1:0:1) |
| (2) | (0:1:1) | (0:0:1), (1:1:0), (1:1:1) |
| (3) | (1:0:0) | (0:1:0), (1:0:0), (1:1:0) |
| (3) | (1:0:1) | (0:1:0), (1:0:1), (1:1:1) |
| (3) | (1:1:0) | (0:1:1), (1:0:0), (1:1:1) |
| (3) | (1:1:1) | (0:1:1), (1:0:1), (1:1:0) |

### The case Q = 7: Dobble

For Q = 7 we get Dobble: 57 icons on 57 cards, each card displaying 8 icons. But wait — a deck of Dobble consists of only 55 cards, not of 7²+7+1 = 57. Why that? Nobody knows! Two cards are "missing" and not contained in the game. These two missing cards would show the following combinations of icons:

• snowman, exclamation mark, dog, eye, light bulb, ladybug, skull, hammer
• snowman, question mark, gingerbread man, maple leaf, cactus, daisy, ice cube, dino

Due to these two missing cards, some of the statements won't hold for Dobble: For example, not all icons are present 8-fold in the entire game. In particular, the snowman is only present 6 times as it is the icon common to the two missing cards. And not every combination of two icons appears on some card: for example, there is no card showing both dog and eye, because this combination belongs to one of the missing cards.

### Beyond Dobble

A central object in the construction from above is the field F, which only exists if Q is a power of a prime number p, i.e. Q = pⁿ for some prime p and a natural number n ≥ 1. What happens if Q is not a prime power? We can use a rather axiomatic approach and define the projective plane of order Q to be an entity consisting of Q²+Q+1 points, Q²+Q+1 lines, each line passing through Q+1 points, each point incident on Q+1 lines, any two points uniquely determining a line, and any two lines having exactly one point in common.

For Q = 1, the construction from above still works provided we take the "1" in the formulae literally and set the free variable (z in cases (1) and (2), y in case (3)) to 0. That yields a game with 3 cards, each showing 2 icons out of a set of 3 icons. This works even though there is no field with one element.

For Q > 1, the Bruck–Ryser theorem adds some constraints on Q: If a projective plane of order Q exists and Q is 1 or 2 mod 4, then Q must be the sum of two squares. Hence, if Q is 1 or 2 mod 4 but not the sum of two squares, e.g. Q = 6, then no projective plane of order Q exists. However, there remain infinitely many numbers to which the theorem does not apply, the first one being 10, which is the only such case where the answer is known so far: a massive computer search showed that no projective plane of order 10 exists. The next greater value which is not a prime power and where the Bruck–Ryser theorem does not apply is Q = 12, with 13 icons per card and 157 cards. Taking a purely combinatorial approach we get

${\displaystyle {\binom {157}{13}}={\frac {157!}{(157-13)!\cdot 13!}}=3\,393\,796\,168\,826\,188\,475\approx 3.3\cdot 10^{18}}$

different 13-icon subsets (possible cards) which can be built out of the 157 icons, and for a complete game we have to pick 157 of these 3.3·10¹⁸ candidate cards in such a way that all the axioms are satisfied.
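The binomial coefficient above is easy to double-check with Python's `math.comb`:

```python
import math

# number of distinct 13-icon subsets of a set of 157 icons
n_subsets = math.comb(157, 13)
```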

## Linear Recurrence

Suppose the linear recurrence

${\displaystyle x_{n}=a_{1}x_{n-1}+a_{2}x_{n-2}+\cdots +a_{k}x_{n-k}=\sum _{j=1}^{k}a_{j}x_{n-j}}$

for ${\displaystyle n>k\geqslant 1}$ where ${\displaystyle x_{1}}$, ..., ${\displaystyle x_{k}}$ are given numbers.

We want to determine an explicit representation of ${\displaystyle x_{n}}$. To that end, write the recurrence as:

${\displaystyle \underbrace {\left({\begin{array}{l}x_{n\;\;\;}\\x_{n-1}\\\;\;\vdots \\x_{n-k+2}\\x_{n-k+1}\\\end{array}}\right)} _{\displaystyle {=:y_{n}}}=\underbrace {\begin{pmatrix}a_{1}&a_{2}&\cdots &a_{k-1}&a_{k}\\1&0&\cdots &0&0\\0&1&\cdots &0&0\\\vdots &&\ddots &&\vdots \\0&0&\cdots &1&0\\\end{pmatrix}} _{\displaystyle {=:A\in K^{k\times k}}}\cdot \underbrace {\left({\begin{array}{l}x_{n-1}\\x_{n-2}\\\;\;\vdots \\x_{n-k+1}\\x_{n-k}\\\end{array}}\right)} _{\displaystyle {=:y_{n-1}}}}$

so that it takes the form

${\displaystyle y_{n}=Ay_{n-1}=A^{n-k}y_{k}}$
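As a quick illustration of the matrix form (using the Fibonacci coefficients a₁ = a₂ = 1 treated in Example 1 below), one can apply the companion matrix step by step to the state vector; `companion_step` is an ad-hoc name:

```python
def companion_step(a, y):
    """One application of the companion matrix A to the state vector y:
    the new top entry is sum a_j * y_j, the remaining entries shift down."""
    return [sum(aj * yj for aj, yj in zip(a, y))] + y[:-1]

# Fibonacci: a = (1, 1), start state y_2 = (x_2, x_1) = (1, 1)
a = [1, 1]
y = [1, 1]
for _ in range(8):            # advance y_2 -> y_10
    y = companion_step(a, y)
# y is now (x_10, x_9)
```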

Now suppose ${\displaystyle A}$ has ${\displaystyle k}$ linearly independent eigenvectors ${\displaystyle v_{j}}$ (which is the case, for example, when all eigenvalues are distinct) and we know all of them, including the corresponding eigenvalues ${\displaystyle \lambda _{j}}$. Then we can write:

${\displaystyle y_{k}=\sum _{j=1}^{k}\beta _{j}v_{j}=V{\begin{pmatrix}\beta _{k}\\\vdots \\\beta _{1}\end{pmatrix}}={\begin{pmatrix}v_{k}&\cdots &v_{1}\end{pmatrix}}\cdot {\begin{pmatrix}\beta _{k}\\\vdots \\\beta _{1}\end{pmatrix}}}$

where the ${\displaystyle \beta _{j}}$ are scalars in the algebraic closure of ${\displaystyle K}$ and ${\displaystyle V}$ is a matrix with the eigenvectors of ${\displaystyle A}$ as columns. Hence:

${\displaystyle y_{n}=A^{n-k}y_{k}=A^{n-k}{\Big (}\sum _{j=1}^{k}\beta _{j}v_{j}{\Big )}=\sum _{j=1}^{k}\beta _{j}A^{n-k}v_{j}=\sum _{j=1}^{k}\beta _{j}\lambda _{j}^{n-k}v_{j}\qquad (1)}$

which leaves us with the computation of the ${\displaystyle \beta _{j}}$, the ${\displaystyle v_{j}}$ and the ${\displaystyle \lambda _{j}}$. Once we have determined the eigenvectors, we get the ${\displaystyle \beta _{j}}$ by means of:

${\displaystyle {\begin{pmatrix}\beta _{k}\\\vdots \\\beta _{1}\end{pmatrix}}=V^{-1}y_{k}}$

Expanding the determinant of ${\displaystyle A-\lambda E}$ along its top row, we find that all eigenvalues satisfy the characteristic equation

${\displaystyle \lambda ^{k}=\sum _{j=1}^{k}a_{j}\lambda ^{k-j}=a_{1}\lambda ^{k-1}+a_{2}\lambda ^{k-2}+\cdots +a_{k-1}\lambda +a_{k}}$

From this we easily see that the eigenvectors of ${\displaystyle A}$ are as below: applying ${\displaystyle A}$, the top row reproduces the characteristic equation and yields ${\displaystyle \lambda _{j}^{k}}$, while the remaining rows just shift the powers down one slot.

${\displaystyle v_{j}=\left({\begin{array}{l}\lambda _{j}^{k-1}\\\;\;\vdots \\\lambda _{j}^{2}\\\lambda _{j}\\1\\\end{array}}\right)}$

Due to (1), in order to get ${\displaystyle x_{n}}$ we take the top component of ${\displaystyle y_{n}}$ to get:

${\displaystyle x_{n}=\sum _{j=1}^{k}\beta _{j}\lambda _{j}^{n-k}\lambda _{j}^{k-1}=\sum _{j=1}^{k}\beta _{j}\lambda _{j}^{n-1}\qquad (2)}$

Thus we are finished: Depending on the ${\displaystyle a_{j}}$, the eigenvalues can be computed explicitly or by numerical methods. From the eigenvalues we get the matrix ${\displaystyle V}$, which we use to compute the coefficients ${\displaystyle \beta _{j}}$ from the starting values ${\displaystyle x_{1}}$ ... ${\displaystyle x_{k}}$, so that all unknowns in (2) are determined.

### Example 1

Let ${\displaystyle a_{1}=a_{2}=1}$ so that we get the Fibonacci sequence

${\displaystyle x_{n}=x_{n-1}+x_{n-2}}$

with the characteristic equation

${\displaystyle \lambda ^{2}=\lambda +1}$
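Solving this quadratic and feeding the two roots through the recipe above reproduces the familiar closed form (Binet's formula). A short numerical sketch, with `fib` as an ad-hoc name:

```python
import math

# roots of the characteristic equation lambda^2 = lambda + 1
lam1 = (1 + math.sqrt(5)) / 2
lam2 = (1 - math.sqrt(5)) / 2

# Solve V (beta_1, beta_2)^T = y_2 = (x_2, x_1) = (1, 1) by Cramer's rule,
# where the columns of V are the eigenvectors (lam_j, 1).
det = lam1 - lam2
beta1 = (1 - lam2) / det
beta2 = (lam1 - 1) / det

def fib(n):
    """x_n from formula (2): sum of beta_j * lam_j^(n-1)."""
    return round(beta1 * lam1 ** (n - 1) + beta2 * lam2 ** (n - 1))
```

The rounding only compensates for floating-point noise; the exact value of the sum is an integer.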