One pretty good general book is by Grimmett and Stirzaker. I have the second edition
https://www.amazon.com/dp/0198536658/?tag=pfamazon01-20
but a newer third edition is also available. It is not measure-theoretic, but the authors are careful in their discussions and keep an eye on applications. Most theorems have proofs, but some useful results that are too advanced are carefully stated without proof. It has a nice section on ergodic theorems, which connect time averages and ensemble averages. With your background I suspect this would be a reasonable choice.
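To make the "time averages vs. ensemble averages" idea concrete, here is a small numerical sketch. It is my own illustration, not something taken from Grimmett and Stirzaker: I assume a stationary Gaussian AR(1) process (which is ergodic in the mean), and the function name `simulate_ar1` and all parameter values are made up for the example.

```python
import numpy as np

def simulate_ar1(n_paths, n_steps, phi=0.8, mean=2.0, seed=0):
    """Simulate stationary AR(1) paths:
    x[t] - mean = phi * (x[t-1] - mean) + w[t], with w[t] ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    x = np.empty((n_paths, n_steps))
    # Start each path from the stationary distribution, variance 1 / (1 - phi^2).
    x[:, 0] = mean + rng.normal(scale=1.0 / np.sqrt(1.0 - phi**2), size=n_paths)
    for t in range(1, n_steps):
        x[:, t] = mean + phi * (x[:, t - 1] - mean) + rng.normal(size=n_paths)
    return x

paths = simulate_ar1(n_paths=1000, n_steps=2000)

# Time average: average along a single realization.
time_average = paths[0].mean()
# Ensemble average: average across realizations at one fixed time.
ensemble_average = paths[:, -1].mean()

print(time_average, ensemble_average)
```

Both numbers should land near the true mean of 2.0. The point of the ergodic theorems is to say when the first one (which is all you usually have in practice) is a legitimate stand-in for the second.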
I am an engineer and learned this material from that perspective, so I am most familiar with the engineering references. Papoulis' "Probability, Random Variables and Stochastic Processes" was the primary book that I used to learn about stochastic processes (
https://www.amazon.com/dp/0070484775/?tag=pfamazon01-20). It is pretty good but not the best organized. Here is what he says about second-order properties such as covariance functions:
"For the determination of the statistical properties of a stochastic process, knowledge of the [CDF] function ##F(x_1,x_2,\ldots,x_n; t_1, t_2, \ldots, t_n)## is required for every ##x_i##, ##t_i## and ##n##. However, for many applications, only certain averages are used, in particular, the expected value of ##x(t)## and of ##x^2(t)##. These quantities can be expressed in terms of the second-order properties of ##x(t)## defined as follows..."
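For reference, the usual second-order quantities for a real-valued process are the mean, the autocorrelation and the autocovariance. These are the standard textbook definitions written in my own notation (the symbols ##\eta##, ##R## and ##C## are my choice); they are not a continuation of the quote:

##\eta(t) = E\{x(t)\}##

##R(t_1,t_2) = E\{x(t_1)\,x(t_2)\}##

##C(t_1,t_2) = R(t_1,t_2) - \eta(t_1)\,\eta(t_2)##

In particular ##E\{x^2(t)\} = R(t,t)##, and the variance of ##x(t)## is ##C(t,t)##.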
Papoulis is much more applied than Grimmett and Stirzaker, with less theory and more practical examples, applications and motivation. He still has sections on ergodicity and other important theoretical aspects, though, and is careful in the way he treats the topic. He does have a section on Kalman filters which begins with the statement "In this section we extend the preceding results to nonstationary processes with causal data and we show that the results can be simplified if the noise is white and the signal is an ARMA process." He then derives the general form, states the practical difficulties, and then shows how the two stated assumptions imply some additional properties that allow the estimator to be simplified.
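To give a feel for what the simplified estimator looks like, here is a minimal sketch of a scalar Kalman filter. This is not Papoulis's derivation or notation: I assume the simplest ARMA case, an AR(1) signal ##s_t = a\,s_{t-1} + w_t## observed as ##y_t = s_t + v_t## with white noise ##v_t##, and the names `scalar_kalman`, `a`, `q`, `r`, `p0` are mine.

```python
import numpy as np

def scalar_kalman(y, a, q, r, p0):
    """Kalman filter for s[t] = a*s[t-1] + w[t], y[t] = s[t] + v[t].
    q = var(w), r = var(v), p0 = prior variance of s[0] (prior mean 0)."""
    s_hat, p = 0.0, p0
    estimates = np.empty(len(y))
    for t, obs in enumerate(y):
        if t > 0:
            # Predict: propagate the estimate and its error variance one step.
            s_hat = a * s_hat
            p = a * a * p + q
        # Update: blend prediction and observation using the Kalman gain.
        k = p / (p + r)
        s_hat = s_hat + k * (obs - s_hat)
        p = (1.0 - k) * p
        estimates[t] = s_hat
    return estimates

# Simulated data (all parameter values are arbitrary choices for illustration).
rng = np.random.default_rng(1)
a, q, r, n = 0.95, 0.1, 1.0, 5000
p0 = q / (1.0 - a**2)  # stationary variance of the AR(1) signal
s = np.empty(n)
s[0] = rng.normal(scale=np.sqrt(p0))
for t in range(1, n):
    s[t] = a * s[t - 1] + rng.normal(scale=np.sqrt(q))
y = s + rng.normal(scale=np.sqrt(r), size=n)

s_hat = scalar_kalman(y, a, q, r, p0)
mse_raw = np.mean((y - s) ** 2)         # error if you just trust the raw data
mse_filter = np.mean((s_hat - s) ** 2)  # error of the filtered estimate
print(mse_raw, mse_filter)
```

The recursion only needs the previous estimate and its error variance, which is the kind of simplification the white-noise and ARMA assumptions buy you. The filtered error should come out well below the raw observation error.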
There are many other engineering books similar to Papoulis, some better than others. Some people prefer the book by Stark and Woods (
https://www.amazon.com/dp/0137287917/?tag=pfamazon01-20).
I think the book by Hajek on the other page I linked in my previous post is also pretty good, again from an engineering perspective. It is basically at a level between Papoulis and Grimmett and Stirzaker (closer to the latter). The link has a legal download of an earlier version of what is now a published book (
https://www.amazon.com/dp/1107100127/?tag=pfamazon01-20)
Ideally you would have a library to look at these before purchasing - not sure how practical that is right now, though...
Jason