Best piecewise linear interpolation

I have a time series of data that I want to interpolate using a piecewise linear function. The lengths of the linear pieces should be adaptive such that the maximum error of the points in the interval does not exceed a predetermined value Max_R.

I guess the specification boils down to: Find a set of intervals such that the time series is approximately linear, within a tolerance Max_R, for each interval.

At the same time, I am willing to bastardize my specification enough so that the time consumption is linear, or at least strictly less than quadratic, in the number of points. I also don't want to use third-party software libraries: This should be a standalone program.

This is my first stab at the problem.

Code:
define L_0 = suitable power of 2
while we have not reached the end of the array:
    let L = L_0
    repeat:
        find the linear interpolation function for the next L points
        compute the max error = largest distance from a point to the line within this interval
        if error > Max_R, let L := L/2
    until error <= Max_R
    the current L points have been fitted; now move past them
end while
My algorithm has the drawback that it produces not the optimal intervals, but only intervals whose lengths are powers of two, 2^n, with a maximum interval length of L_0. For the time consumption I think a worst-case scenario is something like N*L_0*log(L_0), but you are welcome to correct me if I am wrong.
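For concreteness, here is a minimal runnable sketch of the pseudocode above. The helper names and the default L_0 are my own placeholders, and ordinary least squares is assumed as the line-fitting step:

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = a*x + b through the given points."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    return a, my - a * mx

def max_error(xs, ys, a, b):
    """Largest absolute vertical distance from a point to the line."""
    return max(abs(y - (a * x + b)) for x, y in zip(xs, ys))

def segment_halving(xs, ys, max_r, l0=64):
    """Greedy segmentation: try L_0 points, halve L until the fit is within max_r.
    Returns (start, stop, a, b) tuples with half-open ranges [start, stop)."""
    i, n, segments = 0, len(xs), []
    while i < n:
        L = min(l0, n - i)
        while True:
            a, b = linear_fit(xs[i:i+L], ys[i:i+L])
            # A fit through 1 or 2 points is always exact, so stop there at the latest.
            if L <= 2 or max_error(xs[i:i+L], ys[i:i+L], a, b) <= max_r:
                break
            L //= 2
        segments.append((i, i + L, a, b))
        i += L
    return segments
```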

And you are invited to suggest to me a better algorithm!

arildno

Homework Helper
Gold Member
Dearly Missed
Hmm, your algorithm (which to me seems perfectly okay) has one feature, concerning the linear regression computation on an interval, that I think can be simplified:
In YOUR case, you always start out with the longest, and therefore clunkiest, calculation, which is the one most likely to fail.
Wouldn't it reduce the actual computation to start out with the linear regression computation for the SMALLEST set and, if it fulfills your max-error condition, proceed to the next step (by doubling your L-value, rather than halving it as you do) and check that as well, retaining the largest L that fulfills the error condition?
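The doubling variant suggested here can be sketched the same way (helper names are again mine): start from a two-point fit, which is always exact, and keep doubling L while the least-squares fit stays within Max_R; commit the largest L that passed.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = a*x + b through the given points."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    return a, my - a * mx

def max_error(xs, ys, a, b):
    """Largest absolute vertical distance from a point to the line."""
    return max(abs(y - (a * x + b)) for x, y in zip(xs, ys))

def segment_doubling(xs, ys, max_r):
    """Greedy segmentation with doubling; returns (start, stop, a, b), half-open ranges."""
    i, n, segments = 0, len(xs), []
    while i < n:
        good = min(2, n - i)   # a fit through <= 2 points is always exact
        L = good * 2
        while i + L <= n:
            a, b = linear_fit(xs[i:i+L], ys[i:i+L])
            if max_error(xs[i:i+L], ys[i:i+L], a, b) > max_r:
                break
            good = L           # this size still passed; try twice as many points
            L *= 2
        a, b = linear_fit(xs[i:i+good], ys[i:i+good])
        segments.append((i, i + good, a, b))
        i += good
    return segments
```

Note that this still only explores power-of-two segment lengths, but it no longer needs an upper bound L_0.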

Good point - and quite obvious when you think of it. :D
That way I don't even have to retain the unwanted parameter L_0.
Thanks!

arildno

Homework Helper
Gold Member
Dearly Missed
Yes, I was just rethinking this, and found out you could skip that L_0 altogether.
Good luck with the programming!

I'm realizing that the correct answer depends very much on what kind of application one has in mind and how much accuracy one is willing to sacrifice.
I found a solution that works for *my* purposes, but strictly speaking it's not a fit anymore...
One increases the interval point by point, at each point updating the maximum and minimum allowable slope that the line segment can have in order for all points so far to lie within the tolerance. If the next point lies outside the corridor specified by these slopes, the interval is finished and a new one is commenced.
This algorithm is just O(N).
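My own reading of that description, as a runnable sketch: each segment is a line anchored at its first data point, and we maintain the interval [lo, hi] of slopes that keeps every point seen so far within Max_R. Consecutive segments share an x-value, but each new segment is anchored at the data value there, so the pieces can jump by up to Max_R at the joins (which is why it's "not a fit anymore"):

```python
def segment_corridor(xs, ys, max_r):
    """Return (start, end, slope) triples; each segment covers xs[start..end] inclusive."""
    segments = []
    i, n = 0, len(xs)
    while i < n - 1:
        x0, y0 = xs[i], ys[i]
        lo, hi = float("-inf"), float("inf")
        j = i + 1
        while j < n:
            dx = xs[j] - x0
            # Slopes that keep point j within max_r of the anchored line.
            new_lo = max(lo, (ys[j] - y0 - max_r) / dx)
            new_hi = min(hi, (ys[j] - y0 + max_r) / dx)
            if new_lo > new_hi:          # corridor empty: no slope covers point j too
                break
            lo, hi = new_lo, new_hi
            j += 1
        segments.append((i, j - 1, (lo + hi) / 2))   # any slope in [lo, hi] works
        if j == n:
            break
        i = j - 1                        # next segment starts at the last covered point
    return segments
```

Each point is examined a constant number of times, so this is O(N) as claimed.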

Don't know if this made sense, but I guess I'm just writing for an audience of one (myself) now... ;)

arildno

Homework Helper
Gold Member
Dearly Missed
What do you mean by "it's not a fit anymore"?
At each point, you have less than the maximal error allowed, and you get a continuous wiggly-line throughout your whole interval.

Wasn't that what you wanted?

I'm not sure if I understand your problem, do explain..

arildno

Homework Helper
Gold Member
Dearly Missed
You ARE using Lagrange multipliers here on your linear regression technique, aren't you, in order to fuse together your line segments?

arildno

Homework Helper
Gold Member
Dearly Missed
In case you've forgotten about the Lagrange multipliers, remember that once you've determined the best linear fit on an interval L_n, the last point (i.e., the COMPUTED value of "y", not the exact data point!!) MUST be held fixed when you do further linear regression along the rest of the domain. Otherwise, you will get discontinuous line pieces, not a continuous wiggly-line.

In order to achieve that, you need to tweak the regression technique with a Lagrange multiplier condition forcing the new line to fuse, at its starting point, with the old line segment's end point.

D H

Staff Emeritus
For the time consumption I think a worst-case scenario is something like N*L_0*log(L_0), but you are welcome to correct me if I am wrong.
A solution that's not the best but is nonetheless "good enough" can be found in polynomial time. Finding the best piecewise linear interpolation is, in general, an NP-hard problem. It's a segmentation problem, very similar to the knapsack problem. I don't know how far down the "best" rabbit hole you want to go, but if you do want to go that route, the Remez exchange algorithm may be of help.

arildno

Homework Helper
Gold Member
Dearly Missed
If you take this Lagrange-tweaked linear regression algorithm, there is yet another point of improvement you should consider:

Is there any reason at all to believe that the piecewise fit built starting from the first data point should be the SAME as the one built going backwards from the last data point?

Not at all!

If, in addition, you look at linear fit models initialized at an interior point, you'll get even more possible fits.

Which one to choose?

I would set up the following quality criterion:
1. Among those linear fits that remain below the local maximal error bound, you should first restrict your choice to those with the FEWEST interpolating line segments. Among those again, pick the one with the least total error.

Note that for interior points, you'll need to check BOTH the forwards-backwards expansion (FB) and the backwards-forwards expansion (BF).

An FB expansion from a point x_i first fixes the best linear fit with the points coming after x_i; Lagrange multiplier techniques are used to extend the process forwards, and also to generate best linear fits from x_i backwards to the preceding points. A BF expansion starts out in the opposite direction.

Thus, if you have N points in total, you'll have 2N-2 linear fit models to compare against each other against the quality criterion.

arildno

Homework Helper
Gold Member
Dearly Missed
If you are uncertain about how to proceed with the Lagrange multiplier scheme, I'll present it. I show you a "forwards" technique only; making it work backwards isn't really that difficult.
Suppose you have N data points in total, and you have reached the beginning of interval L, where you must fuse the end value from interval L-1 with the beginning value of interval L, and determine the correct coefficients for the line segment on L, namely $\alpha_{L}x+\beta_{L}$.
We call the local x's on L $x_{L,i}$, and the corresponding y-values $y_{L,i}$. We wish to make the maximal value of "i" on interval L as large as possible while never exceeding the maximal allowable error. Suppose we have reached a point where we want to investigate the behaviour of the linear regression up to, and including, i=m.
Now, the FIRST y-value on L is NOT the given data value; rather, it equals $y_{L,1}=\alpha_{L-1}x_{L-1,M}+\beta_{L-1}, \quad x_{L-1,M}=x_{L,1}$,
where M is the number of points contained in interval L-1.
----------------------
The condition that the y-value at the end point of interval L-1 coincides with the y-value at the beginning of L can be represented as:
$$G(\alpha_{L},\beta_{L})=(\alpha_{L}-\alpha_{L-1})x_{L,1}+\beta_{L}-\beta_{L-1}=0$$
Letting then F be:
$$F(\alpha_{L},\beta_{L})=\sum_{i=2}^{m}(\alpha_{L}x_{L,i}+\beta_{L}-y_{L,i})^{2}$$
(note start of summation on i=2, rather than at i=1)
we obtain the three equations to be solved, in the manner of Lagrange multipliers:
$$\frac{\partial{F}}{\partial\alpha_{L}}=\lambda \frac{\partial{G}}{\partial\alpha_{L}}$$
$$\frac{\partial{F}}{\partial\beta_{L}}=\lambda \frac{\partial{G}}{\partial\beta_{L}}$$
$$G=0$$
The Lagrange multiplier $\lambda$ is easily found from eq. 2, and $\beta_{L}$ is readily found from eq. 3.
Inserting these relations into eq. 1 will, with some slight manipulation, yield the correct value for $\alpha_{L}$.

Last edited:

arildno

Homework Helper
Gold Member
Dearly Missed
If I've done this correctly, I get (where I choose to sum from i=1):
$$\alpha_{L}=\frac{\sum_{i=1}^{m}(y_{L,i}-y_{L,1}) (x_{L,i}-x_{L,1})}{ \sum_{i=1}^{m}(x_{L,i}-x_{L,1})^{2}}$$
$$\beta_{L}=y_{L,1}-\alpha_{L}x_{L,1}$$
$$y_{L,1}\equiv\alpha_{L-1}x_{L,1}+\beta_{L-1}$$
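As a sanity check, the closed-form result above translates directly into code (the function name is mine). The anchor (x1, y1) is the previous segment's computed end value, per the derivation:

```python
def constrained_fit(xs, ys, x1, y1):
    """Least-squares line forced through the point (x1, y1); returns (alpha, beta)
    with alpha = sum (y_i - y1)(x_i - x1) / sum (x_i - x1)^2 and beta = y1 - alpha*x1."""
    num = sum((y - y1) * (x - x1) for x, y in zip(xs, ys))
    den = sum((x - x1) ** 2 for x in xs)
    alpha = num / den
    return alpha, y1 - alpha * x1
```

By construction, the returned line satisfies alpha*x1 + beta = y1 exactly, which is the continuity condition G = 0.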

D H

Staff Emeritus
If you are uncertain about how to proceed with the Lagrange multiplier scheme, I'll present it.
I see a number of issues with your approach, arildno.

One is that you haven't addressed the issue of how to size each of the segments. This is a hard problem, quite literally hard: it is NP-hard, provably so. It's essentially the knapsack problem. Another is that you're essentially doing a least-squares approach, but with the constraints that the solution hit the first point exactly and that the pieces join continuously. EmpaDoc did not specify continuity at the segment boundaries.

The biggest problem is the use of least squares. That is not what is wanted here. Least-squares regression finds the values of $a_L$ and $b_L$ that minimize $\sum_i (y_{L,i}- (a_L + b_Lx_{L,i}))^2$. This is a minimax function approximation problem; what is wanted are the values of $a_L$ and $b_L$ that minimize $\max_i |y_{L,i}- (a_L + b_Lx_{L,i})|$.

This is the approach taken in writing a math library implementation of some function (e.g., sin(x), erf(x)). After taking into account special constraints and geometry (e.g., $\sin(x) =x$ for sufficiently small values of x, and $\sin(x)=\sin(x \bmod 2\pi)$), the function is modeled by a piecewise sequence of readily calculable functions, each of which minimizes the maximum error within the segment at hand. The functions are not necessarily continuous at the segment boundaries. There's no need for continuity since the goal is to make the absolute error always within half an ULP (unit of least precision).
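To make the L2-versus-minimax distinction concrete, here is a small brute-force illustration (mine, not DH's method, and not a practical algorithm). For a line, the minimax optimum has three points with equal-magnitude, alternating residuals, two of which share a sign; those two points fix the slope, so trying all pairwise slopes (O(N^2)) finds the exact minimax line on a small data set:

```python
def least_squares_line(xs, ys):
    """Ordinary least-squares line y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def minimax_line(xs, ys):
    """Minimax line by trying every pairwise slope; returns (slope, intercept, max_abs_error)."""
    best = None
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            if xs[i] == xs[j]:
                continue
            slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
            resid = [y - slope * x for x, y in zip(xs, ys)]
            intercept = (max(resid) + min(resid)) / 2   # center the enclosing strip
            err = (max(resid) - min(resid)) / 2         # half the strip width
            if best is None or err < best[2]:
                best = (slope, intercept, err)
    return best

xs = [0, 1, 2, 3]
ys = [0, 0, 0, 6]
a, b = least_squares_line(xs, ys)
ls_err = max(abs(y - (a * x + b)) for x, y in zip(xs, ys))
s, c, mm_err = minimax_line(xs, ys)
# Here mm_err < ls_err: the minimax line accepts a larger RMS error in
# exchange for a strictly smaller worst-case error.
```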

arildno

Homework Helper
Gold Member
Dearly Missed
1. Well, from what I read, there was only a LOCAL maximal error bound that was to be fulfilled, NOT a global one:
"maximum error of the points in the interval does not exceed a predetermined value Max_R. "
It can be readily ascertained whether or not that holds for a least-squares fit on m points, and my procedure will churn out a suggestion that fulfills it (if the local error condition holds for m points, we then check whether a similar procedure for m+1 points holds as well).
2. I do recognize that a full optimization is NOT what my procedure yields, but I don't see that that was what the OP asked for, either.
3. It was NOT asked to MINIMIZE the error on the interval.

arildno

Homework Helper
Gold Member
Dearly Missed
"One is that you haven't addressed the issue of how to size each of the segments"
Yes, I have. Just increase m until the then (yet again) computed line segment fails the local error condition on at least one point. Keep the last one that worked.
-------------------------------------------------------------------------
It won't be a theoretical optimum (neither in terms of as few line segments possible, nor of those again, with the least global error) but it is a valid scheme, relative to the specifications given.

As for continuity, that will up to the OP to decide whether he wishes to include or not.


D H

Staff Emeritus
Well, from what I read, there was only a LOCAL maximal error bound that was to be fulfilled, NOT a global one:
"maximum error of the points in the interval does not exceed a predetermined value Max_R. "
They are one and the same thing, arildno.

The unstated overarching goal is to come up with an approximation g(x) of some target function f(x) such that g(x) is readily computable and such that the error in the approximation is always less than or equal to some predetermined value (in this case, Max_R). The reason for using a piecewise linear approach (or a piecewise polynomial, rational polynomial, or continued fraction approach) is that it satisfies the goal of being readily computable. Making the error in each segment less than or equal to this predetermined limit automatically means that the error will be less than or equal to this limit globally (i.e., over the domain of interest).

It can be readily ascertained whether or not that holds for a least-squares fit on m points, and my procedure will churn out a suggestion that fulfills it (if the local error condition holds for m points, we then check whether a similar procedure for m+1 points holds as well).
Yes, it can, but least squares is not optimal. It's not solving the problem at hand. The problem at hand is to come up with an approximation that minimizes the maximum deviation. As a customer of your function g(x), I don't care one bit what the root mean square error is. I care about the worst thing that can happen to me if I use your function. Using least squares almost inevitably means you are going to have to have more segments than are needed by a minimax approach. The title of this thread is "Best piecewise linear interpolation", and it's a minimax approach that is going to yield this magical "best".

Unfortunately, a minimax approach is in general an NP-hard problem. Fortunately, there are lots of polynomial time techniques that come a lot closer to this magical "best" than does least squares. This is particularly so for a linear approximation.

One way of looking at this is that least squares uses the $L_2$ norm while a minimax approach uses the $L_\infty$ norm. The nice thing about the $L_2$ norm is that it is so markedly well-behaved. This is what makes it so amenable to analysis. The $L_\infty$ norm is not well-behaved. It has discontinuities galore. Just because the $L_\infty$ norm is not well-behaved does not mean that one should ditch it. That's like the drunken man looking for his lost keys under the streetlight, even though he knows he lost them over where it's dark.

arildno

Homework Helper
Gold Member
Dearly Missed
"The problem at hand is to come up with an approximation that minimizes the maximum deviation."
No it's not.

"I have a time series of data that I want to interpolate using a piecewise linear function. The lengths of the linear pieces should be adaptive such that the maximum error of the points in the interval does not exceed a predetermined value Max_R. "
------------------------------------------------------------------------------------------------

There is NO demand here for error minimization, only that the local error should not exceed a predetermined value.
Those are two completely different things.
.............................................................
As for least squares, that was chosen as a handy line-segment generator, SEPARATE from the subsequent check of whether the local error bound is violated.
------------------
As for the "title" of a thread, I couldn't care less. It is the specifications given in the TEXT that matters, not OP's "title"

arildno

Homework Helper
Gold Member
Dearly Missed
Furthermore, you have not read OP's demand on computation time:

"At the same time, I am willing to bastardize my specification enough so that the time consumption is linear, or at least strictly less than quadratic, in the number of points."
---
My approach is easily such a bastardized algorithm; it is linear, or at most quadratic.

Thus, it satisfies the requirements on time consumption as well.

arildno

Homework Helper
Gold Member
Dearly Missed
" Using least squares almost inevitably means you are going to have to have more segments than are needed by a minimax approach."
In full agreement with that. And?
If you can come up with a minimax approach that is also of the same order of computations as my approach, I'm sure he'll appreciate it.
I certainly will!

arildno

Homework Helper
Gold Member
Dearly Missed
DH gives you very good advice if you really want to minimize the number of line segments but are willing to increase computation time somewhat.

Thus, you really need to prioritize whether "shortest possible computation time" should override "fewest line segments to be used".

D H

Staff Emeritus
An alternative to least squares is to use orthogonal polynomials, preferably Chebyshev polynomials. For most well-behaved functions, a Chebyshev polynomial approximation comes very close to that magical function that minimizes the maximum absolute deviation, much closer than does a least-squares regression. For this reason, a Chebyshev approximation is typically used as the initial step in a Remez-exchange-type algorithm. A Chebyshev approximation is computationally of the same order as a least-squares regression, so it's not an undue burden.
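A minimal sketch of that construction (function names are mine): interpolate f at the Chebyshev nodes and evaluate the resulting Chebyshev series with the Clenshaw recurrence. With n = 1 this yields a near-minimax line for a segment, which a Remez-type exchange could then refine:

```python
import math

def chebyshev_coeffs(f, lo, hi, n):
    """Chebyshev coefficients c_0..c_n of the degree-n interpolant of f on
    [lo, hi], using the n+1 Chebyshev nodes."""
    N = n + 1
    nodes = [math.cos(math.pi * (k + 0.5) / N) for k in range(N)]
    fvals = [f(0.5 * (hi - lo) * t + 0.5 * (hi + lo)) for t in nodes]
    coeffs = [2.0 / N * sum(fvals[k] * math.cos(math.pi * j * (k + 0.5) / N)
                            for k in range(N))
              for j in range(N)]
    coeffs[0] /= 2.0   # the constant term carries a factor 1/2 in the series
    return coeffs

def chebyshev_eval(coeffs, lo, hi, x):
    """Evaluate the Chebyshev series at x via the Clenshaw recurrence."""
    t = (2.0 * x - lo - hi) / (hi - lo)   # map [lo, hi] to [-1, 1]
    b1 = b2 = 0.0
    for c in reversed(coeffs[1:]):
        b1, b2 = 2.0 * t * b1 - b2 + c, b1
    return t * b1 - b2 + coeffs[0]
```

Since the interpolant of a degree-n polynomial at n+1 nodes is exact, low-degree polynomials are reproduced exactly; for other smooth functions the truncated series is close to the minimax approximation.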

arildno

Homework Helper
Gold Member
Dearly Missed
"An alternative to least squares is to use orthogonal polynomials, preferably Chebyshev polynomials. For most well-behaved functions, a Chebyshev polynomial approximation comes very close to that magical function that minimizes the maximum absolute deviation, much closer than does a least-squares regression. For this reason, a Chebyshev approximation is typically used as the initial step in a Remez-exchange-type algorithm. A Chebyshev approximation is computationally of the same order as a least-squares regression, so it's not an undue burden."
So, if I understand correctly:
First you make a higher-order polynomial interpolation of the data points with a nice class of polynomials, and then you reapproximate that segmented polynomial with a piecewise linear interpolation? Sounds cool!
Well, now I was actually thinking: "I can get the time consumption down to O(N) if I draw a line that passes close enough to all the points, without optimizing it at all." I don't think that can be done if I make an actual fit...

The cost-benefit analysis here depends, obviously, on my specific needs. In this case I am in fact making an animation; so no stringency is needed if it just looks good...

At the same time I am scientifically minded enough that I will probably try the more stringent fit that you so helpfully put up, arildno!

arildno

Homework Helper
Gold Member
Dearly Missed
Caring about the least-squares mean error, as I do in my approach, will make for an extremely jagged line relative to what can be gained through other techniques.
DH says, at one point, that the computational burden of those need not be much heavier than a least-squares approach, so that might be an alternative to look into.
