Prediction of orthologous protein interactions
using a set cover approach
A goal of contemporary proteome research is the
elucidation of the protein-protein interactions in the cell. Based on currently
available protein-protein interaction and domain data of S. cerevisiae, we introduce a novel
method, Maximum Specificity Set Cover (MSSC), to predict protein-protein
interactions. The algorithm has two stages. First, we select high-quality
protein-protein interactions that participate in topological motifs, based on a
clustering measure. Second, we use MSSC to assign probabilities to domain
pairs. We also modify MSSC to allow more than one domain from each protein
to be responsible for a protein-protein interaction. This approach
allows us to predict previously unknown protein-protein interactions with
higher sensitivity and specificity than other approaches.
We find that the predicted interaction network preserves the characteristics of
the initial web of protein-protein interactions. We also observe high levels of
coexpression among putative interactions. Finally, we extend
our method to infer protein-protein interactions in multicellular
organisms for which interaction data do not currently exist. Starting from
predictions in yeast, we find a set of orthologous
interactions in A. thaliana, C. elegans, D. melanogaster, M. musculus, and H.
sapiens.
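
To make the set cover idea concrete, the following is a minimal sketch of a greedy cover over domain pairs. It is not the MSSC algorithm itself, whose details are not given here. It assumes that each protein is described by a set of domains, that the "specificity" of a domain pair is the fraction of protein pairs containing it that are known to interact, and that this fraction serves as the probability assigned to the pair. The function name greedy_specific_cover and the toy proteins and domains are hypothetical.

```python
from itertools import product


def greedy_specific_cover(interactions, domains):
    """Greedy set cover over domain pairs (illustrative sketch, not MSSC itself).

    interactions: iterable of (protein, protein) pairs known to interact.
    domains: dict mapping each protein to its set of domains.
    Returns {domain_pair: specificity} in the order the pairs were selected.
    """
    def domain_pairs(p, q):
        # Unordered domain pairs that could explain an interaction between p and q.
        return {tuple(sorted(dp)) for dp in product(domains[p], domains[q])}

    proteins = sorted(domains)
    positives = {tuple(sorted(pair)) for pair in interactions}

    # For every domain pair, record all protein pairs (interacting or not)
    # in which it occurs. Self-pairs are ignored for simplicity.
    occurrences = {}
    for i, p in enumerate(proteins):
        for q in proteins[i + 1:]:
            for dp in domain_pairs(p, q):
                occurrences.setdefault(dp, set()).add((p, q))

    # Assumed definition: specificity = interacting occurrences / all occurrences.
    specificity = {
        dp: len(pairs & positives) / len(pairs)
        for dp, pairs in occurrences.items()
    }

    # Only interactions that contain at least one domain pair can be covered.
    uncovered = {
        pair for pairs in occurrences.values() for pair in pairs
    } & positives

    selected = {}
    while uncovered:
        # Prefer the most specific domain pair; break ties by coverage gain.
        best = max(
            (dp for dp in sorted(occurrences) if occurrences[dp] & uncovered),
            key=lambda dp: (specificity[dp], len(occurrences[dp] & uncovered)),
        )
        selected[best] = specificity[best]
        uncovered -= occurrences[best]
    return selected


# Toy input: four hypothetical proteins and two known interactions.
toy_domains = {"P1": {"A"}, "P2": {"B"}, "P3": {"B", "C"}, "P4": {"B"}}
toy_interactions = [("P1", "P2"), ("P1", "P3")]
cover = greedy_specific_cover(toy_interactions, toy_domains)
```

In this toy input, the domain pair (A, C) occurs only in an interacting protein pair, so it is selected first; (A, B) also occurs in the non-interacting pair P1-P4 and therefore receives a lower score.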