The method is easiest to grasp by considering two cases in turn: first, constrained optimization with a single constraint, and second, constrained optimization with multiple constraints. In either case we are looking for an optimal solution to the problem: a set of values for the variables, which gives us the exact coordinates (an n-tuple) of the point of maximum or minimum.
So what exactly is a constraint? Simply put, it is a condition on the variables that restricts the set of points among which the optimal solution to the problem at hand may be sought.
Suppose f and g are two continuously differentiable functions of two variables, and C is a fixed constant value of g. Then we can write
g(x, y) = C
f is also called the objective function. At a stationary point, a small displacement (\delta x, \delta y) along the constraint curve leaves f unchanged to first order,

\frac{\partial f}{\partial x}\delta x + \frac{\partial f}{\partial y}\delta y = 0

and, since the displacement must stay on the curve g = C, it leaves g unchanged as well:

\frac{\partial g}{\partial x}\delta x + \frac{\partial g}{\partial y}\delta y = 0
The trick then is the introduction of a scalar quantity \lambda, which we appropriately call the Lagrange multiplier. We take the first equation above and add \lambda times the second equation to it, obtaining

\left(\frac{\partial f}{\partial x} + \lambda \frac{\partial g}{\partial x}\right) \delta x + \left(\frac{\partial f}{\partial y} + \lambda \frac{\partial g}{\partial y}\right) \delta y = 0
Now suppose that we choose the parameter \lambda such that the term in the first bracket on the left hand side is equal to zero, that is
\frac{\partial f}{\partial x} + \lambda \frac{\partial g}{\partial x} = 0
This automatically implies that the term in the second bracket is also zero, since \delta y is free to vary and need not be zero. This gives another condition, namely
\frac{\partial f}{\partial y} + \lambda \frac{\partial g}{\partial y} = 0
Now we have three (2 + 1) equations to solve for the optimal solution (x_{0}, y_{0}), which satisfies each of them simultaneously. The third equation, evidently, is the constraint g = C itself; we need to reintroduce it to ensure a particular solution and not an arbitrary one.
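To make this concrete, here is a minimal sketch in Python using SymPy. The objective f(x, y) = xy and constraint x + y = 10 are illustrative choices, not taken from the discussion above; the code simply builds the three equations just derived and solves them symbolically.

```python
import sympy as sp

# x, y are the variables; lam is the Lagrange multiplier.
x, y, lam = sp.symbols('x y lambda')

# Illustrative problem (an assumption, not from the text):
# optimize f(x, y) = x*y subject to g(x, y) = x + y = 10.
f = x * y
g = x + y
C = 10

# The three equations derived above:
#   df/dx + lambda * dg/dx = 0
#   df/dy + lambda * dg/dy = 0
#   g(x, y) = C
eqs = [
    sp.Eq(sp.diff(f, x) + lam * sp.diff(g, x), 0),
    sp.Eq(sp.diff(f, y) + lam * sp.diff(g, y), 0),
    sp.Eq(g, C),
]

print(sp.solve(eqs, [x, y, lam], dict=True))
# [{lambda: -5, x: 5, y: 5}]  -> stationary point at (5, 5)
```

Here the stationary point (5, 5) is in fact the maximum of xy on the line x + y = 10, consistent with the AM-GM inequality.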
More generally if
x_{k} \qquad \forall \qquad 1\leq k \leq n
are n unknowns and
g_{j}(x_{1},x_{2},x_{3},\ldots,x_{n}) = C_{j} \qquad \forall \qquad 1\leq j \leq m
are m constraints, then we get a system of (n + m) equations in the n unknowns and the m multipliers \lambda_{1}, \lambda_{2}, \ldots, \lambda_{m}:
\frac{\partial f}{\partial x_k} + \sum_{j=1}^{m} \lambda_j \frac{\partial g_j}{\partial x_k} = 0 \qquad \forall \qquad 1\leq k \leq n
g_{j}(x_{1},x_{2},x_{3},\ldots,x_{n}) = C_{j} \qquad \forall \qquad 1\leq j \leq m
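As a sketch of the multi-constraint case, the same recipe with n = 3 unknowns and m = 2 constraints; the particular f, g_1, g_2 below are again invented for illustration. The system has n + m = 5 equations in x_1, x_2, x_3, \lambda_1, \lambda_2.

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
l1, l2 = sp.symbols('lambda1 lambda2')

# Illustrative problem (an assumption, not from the text): minimize
# f = x1^2 + x2^2 + x3^2 subject to
# g1 = x1 + x2 + x3 = 1 and g2 = x1 - x2 = 0.
f = x1**2 + x2**2 + x3**2
xs = [x1, x2, x3]
gs = [x1 + x2 + x3, x1 - x2]
Cs = [1, 0]
lams = [l1, l2]

# n stationarity equations: df/dx_k + sum_j lambda_j * dg_j/dx_k = 0
eqs = [sp.Eq(sp.diff(f, xk) + sum(l * sp.diff(gj, xk) for l, gj in zip(lams, gs)), 0)
       for xk in xs]
# m constraint equations: g_j = C_j
eqs += [sp.Eq(gj, cj) for gj, cj in zip(gs, Cs)]

print(sp.solve(eqs, xs + lams, dict=True))
# [{x1: 1/3, x2: 1/3, x3: 1/3, lambda1: -2/3, lambda2: 0}]
```

The minimum lies at x_1 = x_2 = x_3 = 1/3. Note that \lambda_2 = 0: the optimum found using the first constraint alone already satisfies the second, so the second constraint does not alter the solution locally.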