## Partially-Observed "Singular Value Decomposition"

Singular Value Decomposition (SVD) is a fantastic linear algebra operation that factorizes a matrix: for a matrix M, it returns U, S, V such that M = U * S * V'. It can be viewed as a change of basis between the row space and the column space of M. The main limitation is that standard SVD requires a completely-observed matrix M, i.e., every entry of M must be known.
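As a quick sanity check of the fully-observed case (the matrix here is just a made-up example), NumPy's `np.linalg.svd` returns exactly this factorization:

```python
import numpy as np

# A small fully-observed matrix: every entry is known.
M = np.array([[3.0, 1.0, 2.0],
              [1.0, 4.0, 0.0]])

# Thin SVD: M = U @ diag(S) @ V', with orthonormal U and V.
U, S, Vt = np.linalg.svd(M, full_matrices=False)
M_rec = U @ np.diag(S) @ Vt

assert np.allclose(M, M_rec)  # reconstruction is exact (up to float error)
```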

SVD has been shown to work surprisingly well for collaborative filtering applications such as the Netflix Prize. In such problems, however, the matrix is usually *partially-observed*: some entries of M are unknown. The "simple" workaround has been to perform gradient descent on U and V to minimize the objective function:

f = \sum_{i=1}^N \sum_{j \in N(i)} (u_i * v_j' - M_{ij})^2 + \lambda ( \sum_i ||u_i||^2 + \sum_j ||v_j||^2 )

where N(i) is the set of observed column indices in row i. For each observed entry M_{ij}, this breaks down to the stochastic gradient updates:

\delta_{ij} = M_{ij} - u_i * v_j'

u_{ik} = u_{ik} + \alpha( \delta_{ij} v_{jk} - \lambda u_{ik} )

v_{jk} = v_{jk} + \alpha( \delta_{ij} u_{ik} - \lambda v_{jk} )

Since the updates touch only the known entries of M, the requirement that M be completely observed simply disappears.
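The updates above fit into a short stochastic-gradient routine. A minimal NumPy sketch (the function name, hyperparameters, and toy data are my own choices for illustration):

```python
import numpy as np

def sgd_factorize(observed, n_rows, n_cols, k=2, alpha=0.02,
                  lam=0.02, epochs=2000, seed=0):
    """Fit M ~= U @ V.T from a list of observed (i, j, M_ij) triples."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n_rows, k))
    V = 0.1 * rng.standard_normal((n_cols, k))
    for _ in range(epochs):
        for i, j, m_ij in observed:
            delta = m_ij - U[i] @ V[j]   # delta_ij = M_ij - u_i * v_j'
            u_old = U[i].copy()          # use the pre-update u_i for v_j
            U[i] += alpha * (delta * V[j] - lam * U[i])
            V[j] += alpha * (delta * u_old - lam * V[j])
    return U, V

# Rank-1 matrix M_ij = (i+1)(j+1), with entry (2, 2) held out as "unobserved".
observed = [(i, j, float((i + 1) * (j + 1)))
            for i in range(3) for j in range(3) if (i, j) != (2, 2)]
U, V = sgd_factorize(observed, n_rows=3, n_cols=3)
print(U[2] @ V[2])  # prediction for the held-out entry; the true value is 9.0
```

In practice `k`, `alpha`, and `lam` are tuned on held-out entries, and the observed values are often mean-centered before fitting.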

I am hoping to use this simple technique for data mining in other, similar problems where a "collaborative filtering" structure is present but tedious to compute, e.g., medical data where you have tons of clinical measurements and information about patients, together with a set of diagnoses. Given a set of patient information and clinical measurements, can you predict the diagnosis? Casting this information into a partially-observed matrix M is fairly simple, and I am having moderate success. Nevertheless, I would prefer Bayesian belief networks, which could incorporate prior knowledge into a structured model.
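To make the "casting" step concrete, here is one hypothetical layout (column names and values invented for illustration): patients as rows, with clinical measurements and diagnosis indicators as columns, and NaN marking the unknown entries:

```python
import numpy as np

# Hypothetical patient matrix: measurement columns first, then
# 0/1 diagnosis columns; np.nan marks unobserved entries.
columns = ["age", "systolic_bp", "dx_diabetes", "dx_hypertension"]
M = np.array([
    [54.0, 140.0, np.nan, 1.0],     # diabetes status unknown
    [61.0, np.nan, 1.0,   np.nan],  # one measurement and one diagnosis missing
    [47.0, 118.0, 0.0,    0.0],
])

# Collect the observed (i, j, value) triples that a gradient-descent
# factorization would consume.
observed = [(i, j, M[i, j])
            for i in range(M.shape[0]) for j in range(M.shape[1])
            if not np.isnan(M[i, j])]
print(len(observed))  # 9 of the 12 entries are observed
```

Note that columns on very different scales (ages vs. 0/1 diagnosis flags) should be normalized before factorizing, or the squared error will be dominated by the large-valued measurements.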

## Reader Comments