Collectives™ on Stack Overflow
Find centralized, trusted content and collaborate around the technologies you use most.
Learn more about Collectives
Teams
Q&A for work
Connect and share knowledge within a single location that is structured and easy to search.
Learn more about Teams
In a Python script I'm writing I am simulating multivariate normal random vectors with the expression
np.random.multivariate_normal(np.zeros(dim_obs), y_cov)
My script runs, but generates the following warning:
RuntimeWarning: covariance is not positive-semidefinite.
Also the little debug print statements I throw in there print False
most of the time
print( np.all(np.linalg.eigvals(y_cov) > 0) )
Why is this throwing false positives? My y_cov
is positive semi-definite because it is (sorry about the lack of TeX markup) B x x'B' + y y' where the B is a matrix, and the others are random vectors with each element positive.
In this particular run B is actually just a ones vector of size 9. Can I just ignore this warning? From the documentation:
Note that the covariance matrix must be positive semidefinite (a.k.a. nonnegative-definite). Otherwise, the behavior of this method is undefined and backwards compatibility is not guaranteed.
Edit:
here's a runnable thing altogether. Thanks for the tip @user2357112.
import numpy as np
num_factors = 1
dim_obs = 9
u = np.random.normal(size = num_factors)
v = np.random.normal(size = dim_obs)
y_cov = np.dot(np.ones((9,1)), np.exp(u.reshape((num_factors,1))/2))
y_cov = np.dot(y_cov, np.exp(u.reshape((1,num_factors))/2)) #transpose
y_cov = np.dot(y_cov, np.transpose(np.ones((9,1))))
y_cov += np.dot(np.exp( v.reshape((dim_obs,1)) / 2),
np.exp( v.reshape((1,dim_obs)) / 2))
print( np.random.multivariate_normal(np.zeros(dim_obs), y_cov) )
print( np.all(np.linalg.eigvals(y_cov) > 0) )
print( np.linalg.eigvals(y_cov) )
–
–
Theoretically, your matrix is positive semidefinite, with several eigenvalues being exactly zero. But the computations with floating point numbers introduce truncation errors which result in some of those eigenvalues being very small but negative; hence, the matrix is not positive semidefinite.
For the time being, it looks like the warning may be ignored; but NumPy documentation says that the behavior in non-psd case is undefined, so I would not want to rely on this. A way to correct for the floating point errors is to add a tiny multiple of the identity matrix to y_cov
. For example, like this:
min_eig = np.min(np.real(np.linalg.eigvals(y_cov)))
if min_eig < 0:
y_cov -= 10*min_eig * np.eye(*y_cov.shape)
Adding a fixed multiple of identity, like 1e-12, would work for all reasonable size matrices and still wouldn't matter for the results.
For completeness, a simpler way to reproduce the issue:
import numpy as np
x = np.random.normal(size=(5,))
y = np.outer(x, x)
z = np.random.multivariate_normal(np.zeros(5), y)
This throws the same warning (with high probability).
–
A more efficient way to generate the Gaussian samples in your case, which is also immune to the numerical issues identified by @zaq, is to observe that a multivariate, zero mean Gaussian random vector with covariance matrix equal to a*a.T + b*b.T
(a
, b
: column vectors) is equal in distribution to the random vector a*w1 + b*w2
where w1
and w2
are independent Gaussian scalar random variables of zero mean and variance 1
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.