I am training a mixture density network (MDN) to predict a mean and a standard deviation, using the per-sample loss

$$\log(\sigma(x_{in})) + \frac{(y_{real}-\mu(x_{in}))^2}{2\sigma(x_{in})^2}$$

where ##\sigma(x_{in})## and ##\mu(x_{in})## are predicted by the network as functions of the input. The network seems to train well, i.e. the loss goes down, and I am attaching below two histograms obtained after training the network and running it on new data. The first is a histogram of ##\frac{dy}{\mu(x_{in})}##, where ##dy = y_{real}-\mu(x_{in})##; the second shows ##\frac{dy}{\sigma(x_{in})}##. Based on these, the network seems to be doing pretty well (the data has Gaussian noise added to it). However, when I compute the inverse-variance weighted mean of ##dy## and the error on that mean, I get:

$$\frac{\sum_i{\frac{dy_i}{\sigma_i^2}}}{\sum_i{1/\sigma_i^2}} = -0.000172 $$

and

$$\sqrt{\frac{1}{\sum_i{1/\sigma_i^2}}} = 0.000003$$

where the sum runs over all the data points I test the MDN on. This means my predictions are biased by ##-0.000172##, which is tens of standard errors away from zero. I am not sure why that is the case: the MDN should easily notice the offset and simply add ##0.000172## to all the ##\mu## predictions. I tried training several MDNs with many different hyperparameters and I get the same kind of result, i.e. a biased mean (though not always by the same amount or in the same direction). Am I missing something or misinterpreting the results? Shouldn't the mean of my errors be consistent with zero, and shouldn't simply adding that bias (##0.000172## in this case) to the predictions solve the issue? Any insight would be really appreciated.
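For reference, here is a minimal NumPy sketch (the function names are my own, not from any library) of the loss term above and of the weighted-bias computation I am describing; on synthetic residuals that are unbiased by construction, the weighted mean comes out consistent with zero:

```python
import numpy as np

def gaussian_nll(y_real, mu, sigma):
    # Per-sample loss from above, up to an additive constant.
    return np.log(sigma) + (y_real - mu) ** 2 / (2.0 * sigma ** 2)

def weighted_bias(dy, sigma):
    # Inverse-variance weighted mean of the residuals and its standard error.
    w = 1.0 / sigma ** 2
    mean = np.sum(w * dy) / np.sum(w)
    err = np.sqrt(1.0 / np.sum(w))
    return mean, err

# Synthetic, unbiased residuals: dy_i ~ N(0, sigma_i^2).
rng = np.random.default_rng(0)
sigma = rng.uniform(0.5, 2.0, size=100_000)
dy = rng.normal(0.0, sigma)

mean, err = weighted_bias(dy, sigma)
print(f"weighted mean = {mean:.6f} +/- {err:.6f}")
```

Running `weighted_bias` on the real test-set residuals and predicted sigmas is exactly how I obtained the ##-0.000172 \pm 0.000003## quoted above.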