# Linear programming problem

• I
Hi everyone.
I am tackling a problem related to the purchase of items that are available from different vendors, and from what I've seen so far it seems to me that it may be solvable by linear programming methods (I'm using R).
However I'm not sure how to set up the problem to achieve the exact goal I have in mind.
Can you please advise me on this?

Suppose you search for a few items you need to purchase, let's call them "a", "b" and "c".
You find that they are available at different vendors, called "x", "y" and "z".
Not all objects are available at all vendors, so in the end you have a table like this:

Item,Vendor,Price
"a","x",1
"a","y",1.5
"b","y",2
"b","z",1
"c","x",2
"c","y",0.5

You want to buy each item once and only once, so if all that matters is minimising the total price for the items, this is pretty simple to solve. You will buy "a" from "x", "b" from "z" and "c" from "y".
I know how to solve this in R.

The problem is different, though.
The shipping cost is charged per total order per vendor, so selecting many vendors may minimise the cost of the items, but not the total cost we pay.
E.g. if each vendor charges 5 for shipping costs, the total for the above order will be 1+1+0.5+5+5+5 = 17.5 .

If, on the other hand, you buy both "a" and "c" from "y", and "b" from "z", you have 1.5+0.5+1+5+5=13.
The total price for the items is greater, but the shipping cost is reduced, and the total is also significantly reduced.

Do you think it's possible to solve the minimisation of the total cost (items+shipping) using linear programming? Or can you suggest any other technique/package please?

The way I solved the 'easy' part (minimisation of the total price for the items) was by R package 'lpSolve', where the 'x' vector to find is a binary vector (0's and 1's) having length equal to the number of rows of the table (in this case 6). The coefficient vector is the Price column. The model.matrix function on the Item column gives you the constraint matrix, and the product of 'x' by this matrix must give a vector of 1's (each item is bought only once).

However, if I wanted to include the shipping cost, I wouldn't know how to do.
I can't sum it to the price of each item, because it is constant with respect to the number of items bought from one vendor.
I could make another model.matrix on the Vendor column, and I know the product of 'x' by this gives me a vector of the number of products bought from each vendor.
Here is where I get stuck; no idea how this information can be used in the minimisation of the total price.

I considered running the 'easy' part, getting all the item-price-optimal solutions (because in more complex cases there _are_ multiple solutions), calculating the total price for each solution and sorting.
Trouble is, today I submitted to this protocol a matrix of 492 rows with 20 vendors. I asked lpSolve to give me at most 100 solutions, thinking 'surely there won't be as many as that, given all the constraints'. No such luck. What was I thinking? When you have 2492 possible vectors... I'm even amazed at how fast it ran!

Any help with this will be really appreciated.
Thanks!
L

## Answers and Replies

andrewkirk
Science Advisor
Homework Helper
Gold Member
The shipping cost is charged per total order per vendor
I'm not sure what this means, but I'll assume it means that for each vendor ##x## there is a flat cost ##D_x## for delivery from that vendor, regardless of the number of items purchased from them. Say there are ##n## vendors.

If so then you could append to your vector of binary variables a set of ##n## additional binary variables, one for each vendor, to indicate whether an order is bought from that one. Set the coefficients of each of these to the delivery fees. Then add to your set of constraints a set of ##n## additional constraints, one for each vendor ##x## specifying that the delivery variable for vendor ##x## is 1 iff at least one of the item-purchased variables for that vendor is 1.
I had a quick look at the spec for lpsolve and I'm pretty sure this approach could be implemented using that function.

Thank you @andrewkirk .
I'm not sure what this means, but I'll assume it means that for each vendor ##x## there is a flat cost ##D_x## for delivery from that vendor, regardless of the number of items purchased from them.
That's indeed what I meant. In fact, a further penalty comes from using too many different vendors, because a person must then email each vendor, evaluate the quotes, place the orders... I suppose this 'internal handling fee' can be easily included in the shipping cost for each vendor.
If so then you could append to your vector of binary variables a set of ##n## additional binary variables, one for each vendor, to indicate whether an order is bought from that one. Set the coefficients of each of these to the delivery fees. Then add to your set of constraints a set of ##n## additional constraints, one for each vendor ##x## specifying that the delivery variable for vendor ##x## is 1 iff at least one of the item-purchased variables for that vendor is 1.
I had a quick look at the spec for lpsolve and I'm pretty sure this approach could be implemented using that function.
I see what you mean. And that's where I got stuck. I could not find this option to specify 'iff' conditions.
Maybe that's because I am using the 'old' version of lpSolve for R. There is a newer package called 'lpSolveAPI', but I didn't install it so far, because it's far more complicated to use, it comes with a lot of caveats and warnings... I didn't want to mess this up from the start. http://lpsolve.sourceforge.net/5.5/R.htm
Perhaps the additional functions of the new package would allow me to specify 'iff' conditions. I need to study this in more detail.
Thanks
L

FactChecker
Science Advisor
Gold Member
Thank you @andrewkirk And that's where I got stuck. I could not find this option to specify 'iff' conditions.
It's not clear to me that this is a linear programming problem. The cost function includes 'iff' dependencies between the independent variables that may make it nonlinear.

PS. I guess you could introduce a variable for every possible combination of shipment packages, with the associated cost, and make a linear programming problem. That seems impractical.

Last edited:
Ouch, that's what I feared...
How does one solve nonlinear problems of this type? Is it possible in R? I'm only beginning to use it regularly, but it looks quite good to me.
I know Excel has got a nonlinear solver, but as far as I am aware it doesn't take more than a given, and rather small, number of binary coefficients.

On the other hand, I was thinking, I could run the lp routine first on the Vendor model matrix, i.e. minimise the number of vendors that cover the whole set of items I need to purchase, using as coefficient vector a function of the shipping cost and 'rank' of each vendor (because we have 'preferred', 'OK-ish' and 'blacklisted' vendors, to be honest). Of course in this case the constraint matrix would be >=1 (each item present at least once in the resulting set).
Once I have the set (hopefully only one) of vendors to use, I can filter the original table to include only these, and run lp again on the Item model matrix.

This probably won't give me all the time the least expensive solution, but in terms of the work we need to do it's probably good enough.

In fact, there is a related task that I think _is_ linear and can be solved by one run of lp.
In our business we often have to 'cherrypick' a set of distinct items from our collection. The items are stored in containers that we call 'plates'. It is very frequent that the same item is present in multiple plates, and each plate has many different items.
When a request for a set of n items comes, the people who manage the collection want to spend as little time as possible retrieving and opening plates (it's a long procedure), and the more plates they need to use, the more difficult and costly the procedure becomes.
The worst possible scenario is when each item is contained in one and only one plate. Then it's sure that n plates must be used.
Luckily, as I said it is often the case that multiple plates exist for a single item. So it is advantageous to use plates that contain most of the requested items together.

How do we minimise the number of plates to use, with the constraint of finding all the n items?
We have a routine that, given the set of n items, returns a table of all the plates where each item is present.
After some filtering (for quantity, format etc), the table looks like this:

Item,Plate
"a","00102"
"a","00250"
"b","01062"
"b","00102"
"c","00103"
"c","02209"

In this simple example it's clear that we can find all 3 items by taking only plates 00102 + either 00103 or 02209.
Not a big gain in absolute terms; but if we had 200 plates instead of 300, that _would_ be a big difference.

In general, if I take the above table and summarise it (e.g. by the 'table' function of R), I get the following:
"a" "b" "c"
"00102" 1 1 0
"00103" 0 0 1
"00250" 1 0 0
"01062" 0 1 0
"02209" 0 0 1

Then I can use this as the constraint matrix (transposed), a vector of 1's as the coefficients, and find the binary vector(s) that minimise the objective, while each constraint line must be >=1 (each item found at least once in the resulting set of plates).
I haven't tried this yet, but I'm pretty confident it should work.

Thank you all again for your input!
L

andrewkirk
Science Advisor
Homework Helper
Gold Member
that's where I got stuck. I could not find this option to specify 'iff' conditions.
There may be a neater way to do it, but I think the following may work.
Put two lines in const.mat for each vendor, with identical entries, which are all zeros except a 1 for each item offered by the vendor and '-n' for the delivery variable for that vendor, where n is the number of items offered. Set the const.rhs entries to zero for the first line and -n for the second. Then in const.dir set the entry for the first of these lines to '<=' and the second to '>'. That constrains the sum of 1 per item minus n if there's a delivery to be in the range -(n-1)...0, which I think is equivalent to what we want.

lavoisier
OK. I'm going to try this out in R using the data I mentioned.
The output will be a bit offset; I hope it's clear enough.

> om <- data.frame(Item=letters[c(1,1,2,2,3,3)],Vendor=letters[c(24,25,25,26,24,25)],Price=c(1,1.5,2,1,2,0.5))
> om

Item Vendor Price
1 a x 1.0
2 a y 1.5
3 b y 2.0
4 b z 1.0
5 c x 2.0
6 c y 0.5

> imm <- model.matrix(~Item+0,om)
> dimnames(imm)[[2]] <- c(letters[c(1,2,3)])
> imm

a b c
1 1 0 0
2 1 0 0
3 0 1 0
4 0 1 0
5 0 0 1
6 0 0 1
attr(,"assign")
[1] 1 1 1
attr(,"contrasts")
attr(,"contrasts")$Item [1] "contr.treatment" > vmm <- model.matrix(~Vendor+0,om) > dimnames(vmm)[[2]] <- c(letters[c(24,25,26)]) > vmm x y z 1 1 0 0 2 0 1 0 3 0 1 0 4 0 0 1 5 1 0 0 6 0 1 0 attr(,"assign") [1] 1 1 1 attr(,"contrasts") attr(,"contrasts")$Vendor
[1] "contr.treatment"

This would be my original formulation of the problem. I would use the transpose of imm as the constraint matrix, om$Price as the coefficient vector, a vector of 1's as the constraint rhs and "==" as the direction (I want each item only once). Now, going with your suggestion, I add a delivery line to the om, one for each vendor. > om1 <- rbind(om, data.frame(Item=rep("del",3),Vendor=c("x","y","z"),Price=rep(5,3))) > om1 Item Vendor Price 1 a x 1.0 2 a y 1.5 3 b y 2.0 4 b z 1.0 5 c x 2.0 6 c y 0.5 7 del x 5.0 8 del y 5.0 9 del z 5.0 > imm1 <- model.matrix(~Item+0,om1) > dimnames(imm1)[[2]] <- c(c(letters[c(1,2,3)]),"del") > imm1 a b c del 1 1 0 0 0 2 1 0 0 0 3 0 1 0 0 4 0 1 0 0 5 0 0 1 0 6 0 0 1 0 7 0 0 0 1 8 0 0 0 1 9 0 0 0 1 attr(,"assign") [1] 1 1 1 1 attr(,"contrasts") attr(,"contrasts")$Item
[1] "contr.treatment"

> vmm1 <- model.matrix(~Vendor+0,om1)
> dimnames(vmm1)[[2]] <- c(c(letters[c(24,25,26)]))
> vmm1

x y z
1 1 0 0
2 0 1 0
3 0 1 0
4 0 0 1
5 1 0 0
6 0 1 0
7 1 0 0
8 0 1 0
9 0 0 1
attr(,"assign")
[1] 1 1 1
attr(,"contrasts")
attr(,"contrasts")$Vendor [1] "contr.treatment" So now I'm a bit unsure what you meant I should do. Normally I would use transposed imm1 as the constraint matrix, and the last 3 elements of the result vector should be 1 iff the corresponding vendor is used, 0 otherwise, which didn't seem possible in lpSolve. If I understand correctly, you suggest that I use the information in vmm1 to add more constraint lines to imm1, in particular: > vt <- table(om1$Vendor)
> vt

x y z
3 4 2

> vmm2 <- vmm1
> vmm2[7:9,] <- vmm2[7:9,] * rep(-(vt-1),3)
> vmm2

x y z
1 1 0 0
2 0 1 0
3 0 1 0
4 0 0 1
5 1 0 0
6 0 1 0
7 -2 0 0
8 0 -3 0
9 0 0 -1
attr(,"assign")
[1] 1 1 1
attr(,"contrasts")
attr(,"contrasts")$Vendor [1] "contr.treatment" So the overall constraint matrix cm will be a combination of imm1 and vmm2. > cm <- cbind(imm1,vmm2[,rep(1:3,2)]) > cm a b c del x y z x y z 1 1 0 0 0 1 0 0 1 0 0 2 1 0 0 0 0 1 0 0 1 0 3 0 1 0 0 0 1 0 0 1 0 4 0 1 0 0 0 0 1 0 0 1 5 0 0 1 0 1 0 0 1 0 0 6 0 0 1 0 0 1 0 0 1 0 7 0 0 0 1 -2 0 0 -2 0 0 8 0 0 0 1 0 -3 0 0 -3 0 9 0 0 0 1 0 0 -1 0 0 -1 The constraint rhs crhs and direction cdir defined as you specified: > crhs <- c(rep(1,4),rep(0,3),-(vt-1)) > crhs x y z 1 1 1 1 0 0 0 -2 -3 -1 > cdir <- c(rep("==",3),">=",rep("<=",3),rep(">",3)) > cdir [1] "==" "==" "==" ">=" "<=" "<=" "<=" ">" ">" ">" And the coefficient vector cv: > cv <- om1$Price
> cv

[1] 1.0 1.5 2.0 1.0 2.0 0.5 5.0 5.0 5.0

Now I call lpSolve:

> lpout <- lp("min",cv,t(cm),cdir,crhs,all.bin=TRUE)
> lpout$solution [1] 0 1 1 0 0 1 0 1 0 > lpout Success: the objective function is 9 > om1[as.logical(lpout$solution),]
Item Vendor Price
2 a y 1.5
3 b y 2.0
6 c y 0.5
8 del y 5.0

So I should buy all items from y. I had actually missed this when I suggested my 'by eye' solution.

For comparison, this is what I would get by using the Item constraint matrix alone:

> lpout_o <- lp("min",om$Price,t(imm),rep("==",3),rep(1,3),all.bin=TRUE) > lpout_o$solution
[1] 1 0 0 1 0 1

> lpout_o

Success: the objective function is 2.5

> om[as.logical(lpout_o\$solution),]
Item Vendor Price
1 a x 1.0
4 b z 1.0
6 c y 0.5

To be honest, I don't understand how your method works exactly, @andrewkirk , in particular how the additional constraints make sure that the delivery costs are only charged for the selected supplier(s). But as long as it gives me the desired result, that's great!

Thanks!
L

Maybe I get it now.
By multiplying the solution vector, that has the form [items,delivery], by the additional column 'x' of the constraint matrix, we are summing the number of items bought from vendor 'x', minus the total number of items sold by 'x' multiplied by the delivery 'switch' (0 or 1).
So if the delivery cost from 'x' is present (1), the sum will be: number of items bought from 'x' - total number of items sold by 'x', which will necessarily be <=0 (we can't buy from 'x' more items than 'x' sells), and must also be >-n, otherwise we would be paying 'x' a delivery cost without buying anything from 'x'.
If the delivery cost from 'x' is not present (0), the sum will be: number of items bought from 'x', which, if we buy nothing from 'x' will be 0 as required, but if we do buy something from 'x' will be a positive number, i.e. outside the defined boundaries (-n,0].

This is quite something.
I suppose there are many similar problems where this technique can be applied.

Thank you again very much @andrewkirk !