On the contrary, using the term "correlation" as shorthand for a specific type of linear statistical relationship is an abuse of terminology, although a convenient one if you are primarily using linear statistics. Correlation technically means any statistical relationship. Mutual information is a good measure here because it is one of the purest measures of statistical association: it is zero only when the variables are independent, so any statistical relationship at all shows up as nonzero mutual information.
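As a minimal sketch of the difference between the two notions (a toy example of my own, with hypothetical helper names `pearson` and `mutual_information`), take X uniform on {-1, 0, 1} and Y = X^2. Y is completely determined by X, yet the linear (Pearson) correlation is zero, while the mutual information is clearly positive. Everything is computed exactly from the joint distribution, so no sampling or estimation is involved.

```python
import math
from collections import defaultdict


def pearson(joint):
    """Pearson correlation computed from a joint pmf {(x, y): probability}."""
    ex = sum(p * x for (x, y), p in joint.items())
    ey = sum(p * y for (x, y), p in joint.items())
    cov = sum(p * (x - ex) * (y - ey) for (x, y), p in joint.items())
    vx = sum(p * (x - ex) ** 2 for (x, y), p in joint.items())
    vy = sum(p * (y - ey) ** 2 for (x, y), p in joint.items())
    return cov / math.sqrt(vx * vy)


def mutual_information(joint):
    """Mutual information I(X;Y) in bits from a joint pmf {(x, y): probability}."""
    px = defaultdict(float)
    py = defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p  # marginal of X
        py[y] += p  # marginal of Y
    return sum(
        p * math.log2(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0
    )


# Toy distribution (an assumption for illustration): X uniform on {-1, 0, 1}, Y = X^2.
joint = {(x, x * x): 1 / 3 for x in (-1, 0, 1)}

r = pearson(joint)              # zero, up to floating-point error
mi = mutual_information(joint)  # about 0.918 bits, which equals H(Y) here

print(f"Pearson r = {r:.3f}")
print(f"Mutual information = {mi:.3f} bits")
```

The dependence here is as strong as it gets (Y is a function of X), but it is symmetric around zero, so a purely linear measure misses it entirely.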
In the context of the saying, mutual information is also a good measure, because a statistical association doesn't imply causality, whether you're talking about correlation in the purest sense or about linear correlation. If you want to discuss the converse (does causality imply correlation?), I think it would be misleading and less interesting to use a narrow, restricted measure of correlation. Then again, because the word "correlation" is used so imprecisely in certain fields, it might be better just to ask whether causality implies statistical association. The answer then comes down to whether the process is stationary, or more generally whether the setting is restricted properly, so that applying statistics is meaningful and the core statistical assumptions can be made.
Likewise, in the context of all of the recent questions about causality and correlation, one should assume the broad definition of correlation (any statistical relationship); otherwise the questions are trivial, somewhat arbitrarily restrictive, and uninteresting.