Article
Explicit Representation of Protein Activity States Significantly Improves Causal Discovery of Protein Phosphorylation Networks
BMC Bioinformatics
  • Jinling Liu, Missouri University of Science and Technology
  • Xiaojun Ma
  • Gregory F. Cooper
  • Xinghua Lu
Abstract

Background: Protein phosphorylation networks play an important role in cell signaling. In these networks, phosphorylation of a protein kinase usually leads to its activation, and the activated kinase in turn phosphorylates its downstream target proteins. A phosphorylation network is therefore essentially a causal network, which can be learned by causal inference algorithms. Prior efforts have applied such algorithms to data measuring protein phosphorylation levels, assuming that phosphorylation levels represent protein activity states. However, the phosphorylation status of a kinase does not always reflect its activity state, because interventions such as inhibitors or mutations can directly affect its activity state without changing its phosphorylation status. Thus, when cellular systems are subjected to extensive perturbations, the statistical relationships between the phosphorylation states of proteins may be disrupted, making it difficult to reconstruct the true protein phosphorylation network. Here, we describe a novel framework to address this challenge.
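
The decoupling described in the Background can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the authors' framework or the InferA algorithm: the binary variables, the 50% phosphorylation rate, the 5% noise level, and the function names `simulate` and `pearson` are all hypothetical choices made for illustration. It assumes an inhibitor sets the kinase's activity to zero without changing its measured phosphorylation.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n, inhibited_fraction, noise=0.05, seed=0):
    """Toy model (assumption): kinase phosphorylation -> activity -> target.

    An inhibitor forces the kinase activity to 0 but leaves its measured
    phosphorylation unchanged. The target's phosphorylation copies the
    kinase *activity*, flipped with probability `noise`.
    """
    rng = random.Random(seed)
    phos_kinase, activity, phos_target = [], [], []
    for _ in range(n):
        p = 1 if rng.random() < 0.5 else 0           # measured kinase phosphorylation
        inhibited = rng.random() < inhibited_fraction
        a = 0 if inhibited else p                    # hidden activity state
        t = a if rng.random() >= noise else 1 - a    # noisy target phosphorylation
        phos_kinase.append(p)
        activity.append(a)
        phos_target.append(t)
    return phos_kinase, activity, phos_target


for fraction in (0.0, 0.5):
    pk, act, pt = simulate(5000, fraction)
    print(
        f"inhibited fraction {fraction}: "
        f"corr(kinase phos, target phos) = {pearson(pk, pt):.2f}, "
        f"corr(kinase activity, target phos) = {pearson(act, pt):.2f}"
    )
```

In this toy setting, the correlation between the kinase's phosphorylation and its target's phosphorylation weakens once half of the samples are inhibited, while the correlation between the hidden activity state and the target stays high. That gap is the motivation for modeling activity as its own variable.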

Results: We have developed a causal discovery framework that explicitly represents the activity state of each protein kinase as an unmeasured variable, together with a novel algorithm called “InferA” that infers the protein activity states by incorporating protein phosphorylation levels, pharmacological interventions, and prior knowledge. We applied our framework to simulated datasets and to a real-world dataset. The simulation experiments demonstrated that explicitly representing the activity states of protein kinases makes it possible to represent the impact of interventions effectively, which enabled our framework to accurately recover the ground-truth causal network. Results from the real-world dataset showed that the explicit representation of protein activity states allowed InferA to integrate prior knowledge in an effective, data-driven way, which in turn led to the recovery of a phosphorylation network that is more consistent with experimental results.

Conclusions: Explicit representation of the protein activity states by our novel framework significantly enhances causal discovery of protein phosphorylation networks.

Meeting Name
18th Asia Pacific Bioinformatics Conference, APBC 2020 (2020: Aug. 18-20, Seoul, South Korea)
Department(s)
Biological Sciences
Second Department
Engineering Management and Systems Engineering
Research Center/Lab(s)
Intelligent Systems Center
Comments

This research was supported by the National Library of Medicine, training grant 5T15LM007059–32, by the National Human Genome Research Institute, grant U54HG008540 via the trans-NIH Big Data to Knowledge (BD2K) Initiative, and by the Pennsylvania Department of Health, grant 4100070287. Publication costs are funded by U54HG008540, R01LM012011, and NLM training grant 5T15LM007059–32.

Keywords and Phrases
  • Causal inference
  • Protein kinase activity state
  • Protein phosphorylation networks
  • Cancer signaling pathways
Document Type
Article - Conference proceedings
Document Version
Final Version
File Type
text
Language(s)
English
Rights
© 2020 The Authors, All rights reserved.
Creative Commons Licensing
Creative Commons Attribution 4.0
Publication Date
20 Aug 2020
PubMed ID
32938361
Citation Information
Jinling Liu, Xiaojun Ma, Gregory F. Cooper and Xinghua Lu. "Explicit Representation of Protein Activity States Significantly Improves Causal Discovery of Protein Phosphorylation Networks" BMC Bioinformatics Vol. 21 Iss. Suppl 13 (2020) ISSN: 1471-2105
Available at: http://works.bepress.com/jinling-liu/7/