# What is the closed-form solution using the ALS algorithm to optimize

1. Feb 24, 2016

### kevin2016

$$C \in \mathbb{R}^{m \times n}, X \in \mathbb{R}^{m \times n}, W \in \mathbb{R}^{m \times k}, H \in \mathbb{R}^{n \times k}, S \in \mathbb{R}^{m \times m}, P \in \mathbb{R}^{n \times n}$$

${S}$ and ${P}$ are similarity matrices (symmetric).

$\lambda$, $\alpha$ and $\beta$ are regularization parameters (scalars).

$\circ$ is the Hadamard (element-wise) product.

$$\min_{W, H}f(W, H)=\|C\circ(X-WH^T)\|^2_F+\lambda\|W\|^2_F +\lambda\|H\|^2_F +\alpha\|S-WW^T\|^2_F +\beta\|P-HH^T\|^2_F$$
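To make the objective concrete, here is a minimal NumPy sketch that evaluates $f(W, H)$ term by term. The function name `objective` and the argument order are illustrative choices, not something from the paper.

```python
import numpy as np

def objective(W, H, X, C, S, P, lam, alpha, beta):
    """Evaluate f(W, H) from the post; every norm is a squared Frobenius norm."""
    fit = np.sum((C * (X - W @ H.T)) ** 2)            # ||C o (X - W H^T)||_F^2
    ridge = lam * (np.sum(W ** 2) + np.sum(H ** 2))   # lambda (||W||_F^2 + ||H||_F^2)
    sim_w = alpha * np.sum((S - W @ W.T) ** 2)        # alpha ||S - W W^T||_F^2
    sim_h = beta * np.sum((P - H @ H.T) ** 2)         # beta ||P - H H^T||_F^2
    return fit + ridge + sim_w + sim_h
```

Here `*` is the element-wise (Hadamard) product and `@` is the matrix product, so the shapes must match the definitions above: `W` is $m \times k$, `H` is $n \times k$, `S` is $m \times m$ and `P` is $n \times n$.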

For the objective function ${f}$, I used the alternating least squares (ALS) algorithm: compute $\frac{\partial f}{\partial W}$ and $\frac{\partial f}{\partial H}$ and set both to zero ($\frac{\partial f}{\partial W} = 0$; $\frac{\partial f}{\partial H}=0$), so that I can obtain an analytical solution for both ${W}$ and ${H}$.

First, hold $H$ constant. The terms that depend only on $H$ then drop out, and I solve the following problem for $W$:

$$\min_{W}f(W, H)=\|C\circ(X-WH^T)\|^2_F+\lambda\|W\|^2_F + \alpha\|S-WW^T\|^2_F$$

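Finally, treating $C$ as a binary (0/1) weight matrix so that $C \circ C = C$, and using the symmetry of $S$, I get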

$$\frac{\partial f}{\partial W} = 2[C \circ (WH^T)]H - 2(C \circ X)H +2{\lambda}W + 4{\alpha}WW^TW - 4{\alpha}SW = 0$$, that is

$$[C \circ (WH^T)]H - (C \circ X)H +{\lambda}W + 2{\alpha}WW^TW - 2{\alpha}SW = 0 \quad (1)$$
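The gradient above can be sanity-checked numerically. The sketch below compares it with a central finite-difference approximation on random data. It assumes a symmetric $S$ and writes the data term with $C \circ C$, which equals $C$ when $C$ is a 0/1 mask; for a general weight matrix $C$ the $C \circ C$ form is the one to use. All names and test sizes are illustrative.

```python
import numpy as np

def f_W(W, H, X, C, S, lam, alpha):
    """Objective in W with H held fixed."""
    return (np.sum((C * (X - W @ H.T)) ** 2)
            + lam * np.sum(W ** 2)
            + alpha * np.sum((S - W @ W.T) ** 2))

def grad_W(W, H, X, C, S, lam, alpha):
    """Analytic gradient for symmetric S; C * C reduces to C for a 0/1 mask."""
    C2 = C * C
    return (2 * (C2 * (W @ H.T)) @ H - 2 * (C2 * X) @ H
            + 2 * lam * W + 4 * alpha * W @ W.T @ W - 4 * alpha * S @ W)

def numeric_grad(fun, W, eps=1e-6):
    """Central finite differences, one entry of W at a time."""
    G = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        E = np.zeros_like(W)
        E[idx] = eps
        G[idx] = (fun(W + E) - fun(W - E)) / (2 * eps)
    return G

rng = np.random.default_rng(0)
m, n, k = 4, 5, 2
X = rng.normal(size=(m, n))
C = (rng.random((m, n)) < 0.7).astype(float)   # random 0/1 mask
A = rng.normal(size=(m, m))
S = (A + A.T) / 2                              # symmetric similarity matrix
W = rng.normal(size=(m, k))
H = rng.normal(size=(n, k))
lam, alpha = 0.1, 0.3

g_analytic = grad_W(W, H, X, C, S, lam, alpha)
g_numeric = numeric_grad(lambda V: f_W(V, H, X, C, S, lam, alpha), W)
max_err = np.max(np.abs(g_analytic - g_numeric))
print(max_err)  # should be tiny if the analytic gradient is right
```

A small `max_err` supports the formula for $\frac{\partial f}{\partial W}$; it does not by itself say anything about whether equation $(1)$ has a closed-form solution.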

For equation $(1)$, I cannot find an analytical solution for $W$, because the term $WW^TW$ makes it cubic in $W$. Can you help me work out:

$$W=?$$

Thanks.

Last edited: Feb 24, 2016
2. Mar 1, 2016

### Greg Bernhardt

Thanks for the post! This is an automated courtesy bump. Sorry you aren't generating responses at the moment. Do you have any further information, come to any new conclusions or is it possible to reword the post?

3. Mar 1, 2016

### kevin2016

Hi Greg,

In fact, if someone can derive $\frac{\partial f}{\partial W_i}$, where $W_i$ is the $i$th row of $W$, that would also be fine.
Here is the original paper: "Collaborative matrix factorization with multiple similarities for predicting drug-target interactions" by Xiaodong Zheng and co-workers. In the paper, they give the result (that is, $a_i$), but I cannot prove it. Can anybody help?
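
One possible route to a row-wise update, offered as a sketch rather than as the paper's derivation: write $W_i$ and $H_j$ for the $i$th row of $W$ and the $j$th row of $H$ (both $1 \times k$), and assume $C$ is a 0/1 mask and $S$ is symmetric. Row $i$ of equation $(1)$ is then

$$\sum_{j} C_{ij}\,(W_i H_j^T)\,H_j - \sum_{j} C_{ij} X_{ij} H_j + \lambda W_i + 2\alpha \sum_{p} (W_i W_p^T)\,W_p - 2\alpha \sum_{p} S_{ip} W_p = 0$$

Since $(W_i H_j^T)H_j = W_i (H_j^T H_j)$ and $\sum_p (W_i W_p^T) W_p = W_i\,W^T W$, this can be regrouped as

$$W_i\Big(\sum_{j} C_{ij} H_j^T H_j + \lambda I_k + 2\alpha\, W^T W\Big) = \sum_{j} C_{ij} X_{ij} H_j + 2\alpha \sum_{p} S_{ip} W_p$$

This is still nonlinear, because $W^T W$ and $\sum_p S_{ip} W_p$ contain $W_i$ itself. If those two quantities are frozen at the current iterate, the equation becomes linear in $W_i$ and gives the update

$$W_i \leftarrow \Big(\sum_{j} C_{ij} X_{ij} H_j + 2\alpha \sum_{p} S_{ip} W_p\Big)\Big(\sum_{j} C_{ij} H_j^T H_j + \lambda I_k + 2\alpha\, W^T W\Big)^{-1}$$

The matrix being inverted is positive definite whenever $\lambda > 0$ and $\alpha \ge 0$. Any solution of $(1)$ is a fixed point of this update, but the update is a linearization, not an exact minimizer, and convergence is not guaranteed. It has not been checked here against the $a_i$ in the paper, whose constants may be arranged differently. A minimal NumPy sketch of one sweep, with the hypothetical name `update_rows_of_W`:

```python
import numpy as np

def update_rows_of_W(W, H, X, C, S, lam, alpha):
    """One sweep of the linearized row update; W^T W and S W use the old W."""
    k = W.shape[1]
    WtW = W.T @ W     # sum_p W_p^T W_p, frozen at the old iterate
    SW = S @ W        # row i is sum_p S_ip W_p, frozen at the old iterate
    W_new = np.empty_like(W)
    for i in range(W.shape[0]):
        c = C[i]      # 0/1 mask for row i of X
        A = (H * c[:, None]).T @ H + lam * np.eye(k) + 2 * alpha * WtW
        b = (c * X[i]) @ H + 2 * alpha * SW[i]
        # A is symmetric, so W_i A = b is the same system as A W_i^T = b^T.
        W_new[i] = np.linalg.solve(A, b)
    return W_new
```

With $\alpha = 0$ the frozen terms vanish and the sweep reduces to an ordinary weighted ridge-regression solve for each row, which is the exact ALS step in that case.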