Best piecewise linear interpolation

I have a time series of data that I want to interpolate using a piecewise linear function. The lengths of the linear pieces should be adaptive such that the maximum error of the points in the interval does not exceed a predetermined value Max_R.

I guess the specification boils down to: Find a set of intervals such that the time series is approximately linear, within a tolerance Max_R, for each interval.

At the same time, I am willing to bastardize my specification enough so that the time consumption is linear, or at least strictly less than quadratic, in the number of points. I also don't want to use third-party software libraries: This should be a standalone program.

This is my first stab at the problem.

Code:
define L_0 = suitable power of 2
while we have not reached the end of the array:
    let L = L_0
    repeat:
        find the linear interpolation function for the next L points
        compute the max error = largest distance from a point to the line within this interval
        if error > Max_R, let L := L/2
    until error <= Max_R
    the current L points have been fitted; now move past them
end while
My algorithm has the drawback that it produces not the optimal intervals, but only intervals whose lengths are powers of two, 2^n, with a maximum interval length of L_0. For the time consumption I think a worst-case scenario is something like N*L_0*log(L_0), but you are welcome to correct me if I am wrong.
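For concreteness, here is a minimal runnable sketch of the pseudocode above. The helper names and the default L_0 are my own placeholders, and ordinary least squares is assumed as the line-fitting step:

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = a*x + b through the given points."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    return a, my - a * mx

def max_error(xs, ys, a, b):
    """Largest absolute vertical distance from a point to the line."""
    return max(abs(y - (a * x + b)) for x, y in zip(xs, ys))

def segment_halving(xs, ys, max_r, l0=64):
    """Greedy segmentation: try L_0 points, halve L until the fit is within max_r.
    Returns (start, stop, a, b) tuples with half-open ranges [start, stop)."""
    i, n, segments = 0, len(xs), []
    while i < n:
        L = min(l0, n - i)
        while True:
            a, b = linear_fit(xs[i:i+L], ys[i:i+L])
            # A fit through 1 or 2 points is always exact, so stop there at the latest.
            if L <= 2 or max_error(xs[i:i+L], ys[i:i+L], a, b) <= max_r:
                break
            L //= 2
        segments.append((i, i + L, a, b))
        i += L
    return segments
```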

And you are invited to suggest to me a better algorithm!

arildno

Homework Helper
Gold Member
Dearly Missed
Hmm, your algorithm (which to me seems perfectly okay) has one feature, concerning the linear regression computation on an interval, that I think can be simplified:
In YOUR case, you always start out with the longest, and therefore clunkiest, calculation, which is the one most likely to fail.
Wouldn't it reduce the actual computation to start out with the linear regression computation for the SMALLEST set and, if it fulfills your max-error condition, proceed to the next step (by doubling your L-value, rather than halving it as you do) and check that as well, retaining the largest L that fulfills the error condition?
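The doubling variant suggested here can be sketched the same way (helper names are again mine): start from a two-point fit, which is always exact, and keep doubling L while the least-squares fit stays within Max_R; commit the largest L that passed.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = a*x + b through the given points."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    return a, my - a * mx

def max_error(xs, ys, a, b):
    """Largest absolute vertical distance from a point to the line."""
    return max(abs(y - (a * x + b)) for x, y in zip(xs, ys))

def segment_doubling(xs, ys, max_r):
    """Greedy segmentation with doubling; returns (start, stop, a, b), half-open ranges."""
    i, n, segments = 0, len(xs), []
    while i < n:
        good = min(2, n - i)   # a fit through <= 2 points is always exact
        L = good * 2
        while i + L <= n:
            a, b = linear_fit(xs[i:i+L], ys[i:i+L])
            if max_error(xs[i:i+L], ys[i:i+L], a, b) > max_r:
                break
            good = L           # this size still passed; try twice as many points
            L *= 2
        a, b = linear_fit(xs[i:i+good], ys[i:i+good])
        segments.append((i, i + good, a, b))
        i += good
    return segments
```

Note that this still only explores power-of-two segment lengths, but it no longer needs an upper bound L_0.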

Good point - and quite obvious when you think of it. :D
That way I don't even have to retain the unwanted parameter L_0.
Thanks!

arildno

Homework Helper
Gold Member
Dearly Missed
Yes, I was just rethinking this, and found out you could skip that L_0 altogether.
Good luck with the programming!

I'm realizing that the correct answer depends very much on what kind of application one has in mind and how much accuracy one is willing to sacrifice.
I found a solution that works for *my* purposes, but strictly speaking it's not a fit anymore...
One increases the interval point by point, at each point updating the maximum and minimum allowable slope that the line segment can have in order for all points so far to lie within the tolerance. If the next point lies outside the corridor specified by these slopes, the interval is finished and a new one is commenced.
This algorithm is just O(N).
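My own reading of that description, as a runnable sketch: each segment is a line anchored at its first data point, and we maintain the interval [lo, hi] of slopes that keeps every point seen so far within Max_R. Consecutive segments share an x-value, but each new segment is anchored at the data value there, so the pieces can jump by up to Max_R at the joins (which is why it's "not a fit anymore"):

```python
def segment_corridor(xs, ys, max_r):
    """Return (start, end, slope) triples; each segment covers xs[start..end] inclusive."""
    segments = []
    i, n = 0, len(xs)
    while i < n - 1:
        x0, y0 = xs[i], ys[i]
        lo, hi = float("-inf"), float("inf")
        j = i + 1
        while j < n:
            dx = xs[j] - x0
            # Slopes that keep point j within max_r of the anchored line.
            new_lo = max(lo, (ys[j] - y0 - max_r) / dx)
            new_hi = min(hi, (ys[j] - y0 + max_r) / dx)
            if new_lo > new_hi:          # corridor empty: no slope covers point j too
                break
            lo, hi = new_lo, new_hi
            j += 1
        segments.append((i, j - 1, (lo + hi) / 2))   # any slope in [lo, hi] works
        if j == n:
            break
        i = j - 1                        # next segment starts at the last covered point
    return segments
```

Each point is examined a constant number of times, so this is O(N) as claimed.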

Don't know if this made sense, but I guess I'm just writing for an audience of one (myself) now... ;)

arildno

Homework Helper
Gold Member
Dearly Missed
What do you mean by "it's not a fit anymore"?
At each point, you have less than the maximal error allowed, and you get a continuous wiggly-line throughout your whole interval.

Wasn't that what you wanted?

I'm not sure if I understand your problem, do explain..

arildno

Homework Helper
Gold Member
Dearly Missed
You ARE using Lagrange multipliers here on your linear regression technique, aren't you, in order to fuse together your line segments?

arildno

Homework Helper
Gold Member
Dearly Missed
In case you've forgotten about the Lagrange multipliers, remember that once you've determined the best linear fit on an interval L_n, the last point (i.e., the COMPUTED value of "y", not the exact data point!!) MUST be held fixed when you do further linear regression along the rest of the domain. Otherwise, you will get discontinuous line pieces, not a continuous wiggly-line.

In order to achieve that, you need to tweak the regression technique with a Lagrange multiplier condition forcing the new line to fuse, at its starting point, with the old line segment's end point.

D H

Staff Emeritus
For the time consumption I think a worst-case scenario is something like N*L_0*log(L_0), but you are welcome to correct me if I am wrong.
A solution that's not the best but is nonetheless "good enough" can be found in polynomial time. Finding the best piecewise linear interpolation is, in general, an NP-hard problem. It's a segmentation problem, very similar to the knapsack problem. I don't know how far down the "best" rabbit hole you want to go, but if you do want to go that route, the Remez exchange algorithm may be of help.

arildno

Homework Helper
Gold Member
Dearly Missed
If you take this Lagrange-tweaked linear regression algorithm, there is yet another point of improvement you should consider:

Is there any reason at all to believe that the piecewise fit built starting from the first data point should be the SAME as the one built going backwards from the last data point?

Not at all!

If, in addition, you look at linear fit models initialized at an interior point, you'll get even more possible fits.

Which one to choose?

I would set up the following quality criterion:
1. Among those linear fits that remain below the local maximal error bound, you should first restrict your choice to those with the FEWEST interpolating line segments. Among those again, pick the one with the least total error.

Note that for interior points, you'll need to check BOTH the forwards-backwards expansion (FB) and the backwards-forwards expansion (BF).

An FB expansion from a point x_i first fixes the best linear fit with the points coming after x_i; Lagrange multiplier techniques are used to extend the process forwards, and also to generate best linear fits from x_i backwards to the preceding points. A BF expansion starts out in the opposite direction.

Thus, if you have N points in total, you'll have 2N-2 linear fit models to compare against each other against the quality criterion.

arildno

Homework Helper
Gold Member
Dearly Missed
If you are uncertain about how to proceed with the Lagrange multiplier scheme, I'll present it. I show you a "forwards" technique only; making it work backwards isn't really that difficult.
Suppose you have N data points in total, and you have reached the beginning of interval L, where you must fuse the end value from interval L-1 with the beginning value of interval L, and determine the correct coefficients for the line segment on L, namely $\alpha_{L}x+\beta_{L}$.
We call the local x's on L $x_{L,i}$, and the corresponding y-values $y_{L,i}$. We wish to make the maximal value of "i" on interval L as large as possible while never exceeding the maximal allowable error. Suppose we have reached a point where we want to investigate the behaviour of the linear regression up to, and including, i=m.
Now, the FIRST y-value on L is NOT the given data value; rather, it equals $y_{L,1}=\alpha_{L-1}x_{L-1,M}+\beta_{L-1}, \quad x_{L-1,M}=x_{L,1}$,
where M is the number of points contained in interval L-1.
----------------------
The condition that the y-value at the end point of interval L-1 coincides with the y-value at the beginning of L can be represented as:
$$G(\alpha_{L},\beta_{L})=(\alpha_{L}-\alpha_{L-1})x_{L,1}+\beta_{L}-\beta_{L-1}=0$$
Letting then F be:
$$F(\alpha_{L},\beta_{L})=\sum_{i=2}^{m}(\alpha_{L}x_{L,i}+\beta_{L}-y_{L,i})^{2}$$
(note start of summation on i=2, rather than at i=1)
we obtain the three equations to be solved, in the manner of Lagrange multipliers:
$$\frac{\partial{F}}{\partial\alpha_{L}}=\lambda \frac{\partial{G}}{\partial\alpha_{L}}$$
$$\frac{\partial{F}}{\partial\beta_{L}}=\lambda \frac{\partial{G}}{\partial\beta_{L}}$$
$$G=0$$
The Lagrange multiplier $\lambda$ is easily found from eq. 2, and $\beta_{L}$ is readily found from eq. 3.
Inserting these relations into eq. 1 will, with some slight manipulation, yield the correct value for $\alpha_{L}$.

Last edited:

arildno

Homework Helper
Gold Member
Dearly Missed
If I've done this correctly, I get (where I choose to sum from i=1):
$$\alpha_{L}=\frac{\sum_{i=1}^{m}(y_{L,i}-y_{L,1}) (x_{L,i}-x_{L,1})}{ \sum_{i=1}^{m}(x_{L,i}-x_{L,1})^{2}}$$
$$\beta_{L}=y_{L,1}-\alpha_{L}x_{L,1}$$
$$y_{L,1}\equiv\alpha_{L-1}x_{L,1}+\beta_{L-1}$$
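As a sanity check, the closed-form result above translates directly into code (the function name is mine). The anchor (x1, y1) is the previous segment's computed end value, per the derivation:

```python
def constrained_fit(xs, ys, x1, y1):
    """Least-squares line forced through the point (x1, y1); returns (alpha, beta)
    with alpha = sum (y_i - y1)(x_i - x1) / sum (x_i - x1)^2 and beta = y1 - alpha*x1."""
    num = sum((y - y1) * (x - x1) for x, y in zip(xs, ys))
    den = sum((x - x1) ** 2 for x in xs)
    alpha = num / den
    return alpha, y1 - alpha * x1
```

By construction, the returned line satisfies alpha*x1 + beta = y1 exactly, which is the continuity condition G = 0.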

D H

Staff Emeritus
If you are uncertain about how to proceed with the Lagrange multiplier scheme, I'll present it.
I see a number of issues with your approach, arildno.

One is that you haven't addressed the issue of how to size each of the segments. This is a hard problem, quite literally hard: it is NP-hard, provably so. It's essentially the knapsack problem. Another is that you're essentially doing a least-squares approach, but with the constraints that the solution hit the first point exactly and that the pieces join continuously. EmpaDoc did not specify continuity at the segment boundaries.

The biggest problem is the use of least squares. That is not what is wanted here. Least-squares regression finds the values of $a_L$ and $b_L$ that minimize $\sum_i (y_{L,i}- (a_L + b_Lx_{L,i}))^2$. This is a minimax function approximation problem; what is wanted are the values of $a_L$ and $b_L$ that minimize $\max_i |y_{L,i}- (a_L + b_Lx_{L,i})|$.

This is the approach taken in writing a math library implementation of some function (e.g., sin(x), erf(x)). After taking into account special constraints and geometry (e.g., $\sin(x) =x$ for sufficiently small values of x, and $\sin(x)=\sin(x \bmod 2\pi)$), the function is modeled by a piecewise sequence of readily calculable functions, each of which minimizes the maximum error within the segment at hand. The functions are not necessarily continuous at the segment boundaries. There's no need for continuity since the goal is to make the absolute error always within half an ULP (unit of least precision).
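To make the L2-versus-minimax distinction concrete, here is a small brute-force illustration (mine, not DH's method, and not a practical algorithm). For a line, the minimax optimum has three points with equal-magnitude, alternating residuals, two of which share a sign; those two points fix the slope, so trying all pairwise slopes (O(N^2)) finds the exact minimax line on a small data set:

```python
def least_squares_line(xs, ys):
    """Ordinary least-squares line y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def minimax_line(xs, ys):
    """Minimax line by trying every pairwise slope; returns (slope, intercept, max_abs_error)."""
    best = None
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            if xs[i] == xs[j]:
                continue
            slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
            resid = [y - slope * x for x, y in zip(xs, ys)]
            intercept = (max(resid) + min(resid)) / 2   # center the enclosing strip
            err = (max(resid) - min(resid)) / 2         # half the strip width
            if best is None or err < best[2]:
                best = (slope, intercept, err)
    return best

xs = [0, 1, 2, 3]
ys = [0, 0, 0, 6]
a, b = least_squares_line(xs, ys)
ls_err = max(abs(y - (a * x + b)) for x, y in zip(xs, ys))
s, c, mm_err = minimax_line(xs, ys)
# Here mm_err < ls_err: the minimax line accepts a larger RMS error in
# exchange for a strictly smaller worst-case error.
```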

arildno

Homework Helper
Gold Member
Dearly Missed
1. Well, from what I read, there was only a LOCAL maximal error bound that was to be fulfilled, NOT a global one:
"maximum error of the points in the interval does not exceed a predetermined value Max_R. "
It can be readily ascertained whether or not that holds for a least-squares fit on m points, and my procedure will churn out a suggestion that fulfills it (if the local error condition holds for m points, we then check whether a similar procedure for m+1 points holds as well).
2. I do recognize that a full optimization is NOT what my procedure yields, but I don't see that that was what the OP asked for, either.
3. It was NOT asked to MINIMIZE the error on the interval.

arildno

Homework Helper
Gold Member
Dearly Missed
"One is that you haven't addressed the issue of how to size each of the segments"
Yes, I have. Just increase m until the then (yet again) computed line segment fails the local error condition on at least one point. Keep the last one that worked.
-------------------------------------------------------------------------
It won't be a theoretical optimum (neither in terms of as few line segments possible, nor of those again, with the least global error) but it is a valid scheme, relative to the specifications given.

As for continuity, that will up to the OP to decide whether he wishes to include or not.


D H

Staff Emeritus
Well, from what I read, there was only a LOCAL maximal error bound that was to be fulfilled, NOT a global one:
"maximum error of the points in the interval does not exceed a predetermined value Max_R. "
They are one and the same thing, arildno.

The unstated overarching goal is to come up with an approximation g(x) of some target function f(x) such that g(x) is readily computable and such that the error in the approximation is always less than or equal to some predetermined value (in this case, Max_R). The reason for using a piecewise linear approach (or a piecewise polynomial, rational polynomial, or continued fraction approach) is that it satisfies the goal of being readily computable. Making the error in each segment less than or equal to this predetermined limit automatically means that the error will be less than or equal to this limit globally (i.e., over the domain of interest).

It can be readily ascertained whether or not that holds for a least-squares fit on m points, and my procedure will churn out a suggestion that fulfills it (if the local error condition holds for m points, we then check whether a similar procedure for m+1 points holds as well).
Yes, it can, but least squares is not optimal. It's not solving the problem at hand. The problem at hand is to come up with an approximation that minimizes the maximum deviation. As a customer of your function g(x), I don't care one bit what the root mean square error is. I care about the worst thing that can happen to me if I use your function. Using least squares almost inevitably means you are going to have to have more segments than are needed by a minimax approach. The title of this thread is "Best piecewise linear interpolation", and it's a minimax approach that is going to yield this magical "best".

Unfortunately, a minimax approach is in general an NP-hard problem. Fortunately, there are lots of polynomial time techniques that come a lot closer to this magical "best" than does least squares. This is particularly so for a linear approximation.

One way of looking at this is that least squares uses the $L_2$ norm while a minimax approach uses the $L_\infty$ norm. The nice thing about the $L_2$ norm is that it is so markedly well-behaved. This is what makes it so amenable to analysis. The $L_\infty$ norm is not well-behaved. It has discontinuities galore. Just because the $L_\infty$ norm is not well-behaved does not mean that one should ditch it. That's like the drunken man looking for his lost keys under the streetlight, even though he knows he lost them over where it's dark.

arildno

Homework Helper
Gold Member
Dearly Missed
"The problem at hand is to come up with an approximation that minimizes the maximum deviation."
No it's not.

"I have a time series of data that I want to interpolate using a piecewise linear function. The lengths of the linear pieces should be adaptive such that the maximum error of the points in the interval does not exceed a predetermined value Max_R. "
------------------------------------------------------------------------------------------------

There is NO demand here for error minimization, only that the local error should not exceed a predetermined value.
Those are two completely different things.
.............................................................
As for least squares, that was chosen as a handy line-segment generator, SEPARATE from the subsequent check of whether the local error bound is violated.
------------------
As for the "title" of a thread, I couldn't care less. It is the specifications given in the TEXT that matters, not OP's "title"

arildno

Homework Helper
Gold Member
Dearly Missed
Furthermore, you have not read OP's demand on computation time:

"At the same time, I am willing to bastardize my specification enough so that the time consumption is linear, or at least strictly less than quadratic, in the number of points."
---
My approach is easily such a bastardized algorithm; it is linear, or at most quadratic.

Thus, it satisfies the requirements on time consumption as well.

arildno

Homework Helper
Gold Member
Dearly Missed
" Using least squares almost inevitably means you are going to have to have more segments than are needed by a minimax approach."
In full agreement with that. And?
If you can come up with a minimax approach that is also of the same order of computations as my approach, I'm sure he'll appreciate it.
I certainly will!

arildno

Homework Helper
Gold Member
Dearly Missed
DH gives you very good advice if you really want to minimize the number of line segments but are willing to increase computation time somewhat.

Thus, you really need to prioritize whether "shortest possible computation time" should override "fewest line segments to be used".

D H

Staff Emeritus
An alternative to least squares is to use orthogonal polynomials, preferably Chebyshev polynomials. For most well-behaved functions, a Chebyshev polynomial approximation comes very close to that magical function that minimizes the maximum absolute deviation, much closer than does a least-squares regression. For this reason, a Chebyshev approximation is typically used as the initial step in a Remez-exchange-type algorithm. A Chebyshev approximation is computationally of the same order as a least-squares regression, so it's not an undue burden.
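A minimal sketch of that construction (function names are mine): interpolate f at the Chebyshev nodes and evaluate the resulting Chebyshev series with the Clenshaw recurrence. With n = 1 this yields a near-minimax line for a segment, which a Remez-type exchange could then refine:

```python
import math

def chebyshev_coeffs(f, lo, hi, n):
    """Chebyshev coefficients c_0..c_n of the degree-n interpolant of f on
    [lo, hi], using the n+1 Chebyshev nodes."""
    N = n + 1
    nodes = [math.cos(math.pi * (k + 0.5) / N) for k in range(N)]
    fvals = [f(0.5 * (hi - lo) * t + 0.5 * (hi + lo)) for t in nodes]
    coeffs = [2.0 / N * sum(fvals[k] * math.cos(math.pi * j * (k + 0.5) / N)
                            for k in range(N))
              for j in range(N)]
    coeffs[0] /= 2.0   # the constant term carries a factor 1/2 in the series
    return coeffs

def chebyshev_eval(coeffs, lo, hi, x):
    """Evaluate the Chebyshev series at x via the Clenshaw recurrence."""
    t = (2.0 * x - lo - hi) / (hi - lo)   # map [lo, hi] to [-1, 1]
    b1 = b2 = 0.0
    for c in reversed(coeffs[1:]):
        b1, b2 = 2.0 * t * b1 - b2 + c, b1
    return t * b1 - b2 + coeffs[0]
```

Since the interpolant of a degree-n polynomial at n+1 nodes is exact, low-degree polynomials are reproduced exactly; for other smooth functions the truncated series is close to the minimax approximation.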

arildno

Homework Helper
Gold Member
Dearly Missed
"An alternative to least squares is to use orthogonal polynomials, preferably Chebyshev polynomials. For most well-behaved functions, a Chebyshev polynomial approximation comes very close to that magical function that minimizes the maximum absolute deviation, much closer than does a least-squares regression. For this reason, a Chebyshev approximation is typically used as the initial step in a Remez-exchange-type algorithm. A Chebyshev approximation is computationally of the same order as a least-squares regression, so it's not an undue burden."
So, if I understand correctly:
First you make a higher-order polynomial interpolation of the data points with a nice class of polynomials, and then you reapproximate that segmented polynomial with a piecewise linear interpolation? Sounds cool!
Well, now I was actually thinking: "I can get the time consumption down to O(N) if I draw a line that passes close enough to all the points, without optimizing it at all." I don't think that can be done if I make an actual fit...

The cost-benefit analysis here depends, obviously, on my specific needs. In this case I am in fact making an animation; so no stringency is needed if it just looks good...

At the same time I am scientifically minded enough that I will probably try the more stringent fit that you so helpfully put up, arildno!

arildno

Homework Helper
Gold Member
Dearly Missed
Caring about the least-squares mean error, as I do in my approach, will make for an extremely jagged line relative to what can be gained through other techniques.
DH says, at one point, that the computational burden of those need not be much heavier than a least-squares approach, so that might be an alternative to look into.
