# Optimizing a parallel queue

1. Jan 11, 2012

### mentil

1. The problem statement, all variables and given/known data

You want to process x units of a product (at y processing plants), minimizing the total time spent processing all the units. Each place y_i has a queue of q_i units ahead of you and a processing rate of r_i units/minute. If you can simultaneously process units at the same time at different places, what is the minimum amount of time you can spend?

2. Relevant equations

3. The attempt at a solution
I solved the problem by formulating it like the following:
min(max((x>0)*(x+q)/m))

where x,q,m are vectors representing how quantities sent/queues/rates at each processing plant.
and with constraints

sum(x)=total units
x_i>=0

The (x>0) is just an indicator variable saying that if we go to processing plant y_i, then we will have to add in the time we have to wait in the queue.

I get what I think is the correct solution by using an evolutionary search algorithm, but I was wondering if anyone has any suggestions on how I might restructure the problem to be a linear programming problem or what algorithm is the right one to use (otherwise it seems like an np hard problem?). Does anyone know any similar types of problems or methods?

I hope this is the right area to post this...thanks!

2. Jan 11, 2012

### lanedance

not fully worked , but here's an idea

So the time spent if we allocate xi>=0 to plant y_i, then we will wait
$$t_i = \frac{q_i+x_i}{r_i}$$

Now making the assumption (actually a pretty big leap) that x is large enough that every plant will be used, and that we can treat x as continuous, then the minimum time will be achieved when all the plants finish at the same time. Otherwise we could shift product from a plant still going to one that has not yet finished.

This gives the following set of equations
$$x_i = r_it-q_i$$
$$\sum_i x_i = x$$

Which should be solvable for the x_i and t, solving for t gives
$$x_i = r_it-q_i$$
$$x = \sum_i x_i = \sum_i r_it-q_i = t (\sum_i r_i)-\sum_iq_i$$
$$t = \frac{x+\sum_i q_i}{\sum_i r_i}$$

Needs some more thought, but if any of the x_i come out as negative, this would imply the queue at the corresoonding y_i is too long and the most negative should be excluded, and the process could be repeated. However, the negtive x could be interpreted as shifted a portion of q_i to a different plant.

Last edited: Jan 11, 2012
3. Jan 11, 2012

### mentil

However, I don't think I can assume x is large enough so that every plant is used (given the specific problem, the actual solution has 1 plant not being used).

4. Jan 11, 2012

### lanedance

ok, well the given x_i should come out as negative, and you can rule it out and repeat... and its as easy as checking if $\frac{q_i}{r_i} > t$

the next problem would be converting from the continuous solution to the discretised solution

Last edited: Jan 11, 2012
5. Jan 11, 2012

### Ray Vickson

Let me re-phase your question. Say you have processors i = 1,2,...,n. Processor i has a delay d_i (d_i = r_i*q_i); that is, processor i is not available to you until time d_i has elapsed. So, if we have N units to process (my N = your x) and let x_i be the number of units processed on unit i, then the completion time is the maximum of the d_i + r_i*x_i for i = 1, 2, ..., n, but *only counting the i that you actually use*. You want to minimize this, subject to x_i >= 0 and sum x_i = N.

This can be written is a standard mixed-integer linear programming problem. The problem I have is that if I show you how to do it I will be doing the problem for you. Let me try to give a hint, and I hope not give too much away.

Let T = completion time (T >= 0). How is T related to d_1 + r_1*x_1 if processor 1 is used? How is related to d_2 + r_2*x_2 if processor 2 is used? And so on. So, let y_i be a binary variable, with y_i = 0 if processor i is not used and y_i = 1 if it is used. How can you model the situation that T >= d_1 + r_1*x_1 if y_1 = 1 and no relevant restriction on T if y_1 = 0? Take a large, positive number M > 0; for example, any M >= max(d_i) + max(r_i)*N will do. Suppose you write the consraint T >= d_1 + r_1*x_1 - M*(1-y_1). What do you get when y_1 = 1? What do you get when y_1 = 0? Of course, this is not the whole story: you don't want to have all y_i = 0, and you don't want to have x_i > 0 with y_i = 0. Can you see how to ensure that x_i > 0 only if y_i = 1?

There is also another, less systematic but more elementary method that I think will work (but have no fully-convincing proof). By re-labeling if necessary, suppose d_1 < d_2 < ... < d_n. Put (d_2-d_1)/r_1 units on Machine M1 (or, if this is > N, put all N units on M1). So, either we are done, or else we now have M1 and M2 both available at the same time. If we put x more units together on M1 and M2 and require the same finishing times, we need to put x_1 on M1 and x_2 on M2, with x1*r_1 = x2*r_2 and x_1 + x_2 = x. Do that, and increase x until the completion times on M1 and M2 reach d_3 or the job is done. If we reach d_3 we now have M1, M2 and M3 all available at the same time, and we load x more units on those three (keeping all three completion times equal) until we are done or reach d_4, etc. Intuitively, it seems that this procedure ought to produce the optimal schedule, but at the moment this needs proof.

RGV

Last edited: Jan 11, 2012
6. Jan 11, 2012

### mentil

I believe the constraint M*y>=x works.

Thank you for the help!

7. Jan 11, 2012

### mentil

I tried the following:

Minimize the objective function: max(T)

with constraints
T>=d_i+r_i*x_i-M*(1-y_i)
T>=0
sum(x_i) = total
x_i >=0
m*y_i<=x_i
T>=0
y_i binary

every time I run it through excel solver, however, I get a different solution. Does this formulation look ok? It makes sense to me...

8. Jan 11, 2012

### Ray Vickson

Your (x_i,y_i) constraint is backwards. You need x_i <= total*y_i (assuming total = given data).

BTW: if you think more about the second suggestion I made you should be able to develop a slick method that avoids binary variables, or even linear programming, completely.

RGV

9. Jan 11, 2012

### mentil

Thanks! That's actually what I meant.

I do think your second way is interesting, but I'm not sure it suits my purpose. This isn't a homework problem, but a sub-problem I needed to look at in order to get to a bigger problem. It helps framing it as a linear programming problem because now I can add more constraints and specifications.

One thing that I'm not sure about (maybe you can convince me otherwise). For your second solution, whether you decide to put something on a machine does not depend on the rate of processing. For instance let's say we have 2 units to process, 2 processing plants m1 m2 with weight times d1<d2 and rates r2>>r1 (for the sake of example, let's say d1=1 second, d2 = 2 seconds, r1= 1 job/second and r2=100 jobs/second), the ideal solution seems to be to put everything on m2, even though it starts later.

10. Jan 12, 2012

### Ray Vickson

Everything depends on what your objective is. I assumed you want to minimize finish time = completion time of last operation = first time the whole job is available for shipping, etc. If so, there is a simple starting result: if a group of machines is used in the optimal solution, then the completion time must be the same on all these machines. (If not, then one machine completes later than some of the others. You can lessen the finish time by moving some work off this latest machine and distributing it among the others. You can keep doing that until all completion times are equal.) I am, of course, assuming you allow fractional units to be processed, so that in your example, we need 1+x1 = 2 + x2/100 and x1+x2=2, giving x1 = 102/101, x2 = 100/101, and completion time T = 1 + x1 = 203/101 = 2.00990099 .... Your solution has completion time T2 = 2 + 2/100 = 202/100 = 2.02, which is just a bit more. Even if we restrict the x_i to integer values your solution is still not optimal, for putting x_1 = x_2 = 1 gives a finish time of T = 2 + 1/100 = 2.01. The two completion times are not equal any more, but it is still better to put some work on both machines.

RGV

Last edited: Jan 12, 2012
11. Jan 12, 2012

### lanedance

see for example post #2

12. Jan 12, 2012

### Ray Vickson

Here, at last, is the simple algorithm to compute the optimal solution. Again, assume d_1 < d_2 < ... < d_n, and let N = total amount of work (number of units) to be done. First, let W_k(T) = maximum work (number of units) that can be processed on machines 1--k and completing at time T (T >= d_k). Note that the optimal allocation is x_j = (T-d_j)/r_j for j = 1,...,k, and W_k(T) = sum x_j = A_k *T - B_k, where A_k = sum{1/r_i, i=1..k} and B_k = sum{d_i/r_i, i=1..k} (or, recursively: A_1 = 1/r_1, B_1 = d_1/r_1 and for j >= 2, A_j = 1/r_j + A_{j-1}, B_j = (d_j/r_j) + B_{j-1}). Suppose we have already determined that N > W_k(d_k) (so machines 1.. (k-1) are not enough); then, if W_k(d_{k+1}) >= N, stop: we need only machines 1,...,k. Solve for T from W_k(T) = N, then compute the x_j from the previous formulas. Else, if W_k(d_{k+1}) < N then let k --> k+1 and continue, or else if k = n then stop: we need all n machines, and can get the solution from W_n(T) = N, and x_j given by preceding formulas.

If not all d_j are distinct, for example, if n = 6 and d_1 = d_2 < d_3 = d_4 = d_5 < d_6 we need to look at W2(d_3), W_5(d_6) or else W_6(T), but the formulas are unchanged; the algorithm just needs some extra bells and whistles.

RGV