Biological graph dissimilarity characterization using graph theory

Playing this video requires the latest flash player from Adobe.

Download link (right click and 'save-as') for playing in VLC or other compatible player.

Recording Details

PIRSA Number: 


Many biological data sets and relationships can be modeled as graphs. Understanding how structure of these graphs relates to biological function is essential for understanding underlining mechanisms of disease and for aiding drug discoveries. Vertices of biological graphs represent individual entities such as genes and proteins. Edges represent the relationship between two cellular components such as physical and functional interactions. A challenging problem in the post-genomic era is graph comparisons as they are large typed complex and evolving. Comparing graph structures helps to gain insights into the underlying signaling mechanisms and treatments for complex diseases. With technological advancement biological data will continue to grow and so will the size and complexity of graphs.Large graph comparisons are computationally intensive as they involve the subgraph isomorphism problem which is NP-complete. Therefore graph comparison algorithms need to be efficient scalable and be able to systematically capture biologically meaningful graph structure differences. Efficient graph comparison algorithms are necessary for many types of biological graphs e.g. protein-protein interaction drug-target microRNA-gene gene-regulatory and co-expression graphs. Furthermore graph comparison algorithms are extremely useful for many applications such as comparing graphs characterizing different diseases representing different cancer subtypes or different drug treatment responses. There are two main categories of graph properties used for comparing biological graphs global graph properties and local graph properties. Global graph properties study the overall graph while local graph properties focus on local structures of the graph. Our objective is to develop an efficient scalable graph comparison algorithm such that graph structure differences between any two states can be obtained systematically. We achieve the objective in two steps. First we propose an algorithm such that graph structure differences are systematically obtained and verified that the differences are biologically meaningful. Then we develop a heuristic to improve upon the proposed algorithm in the first step in terms of efficiency and scalability. While our approaches are generic we apply it on non-small cell lung cancer data sets. The non-small cell lung cancer datasets are used to construct normal and tumor co-expression graphs. Global graphs properties do not contain the detail needed to capture the structural characteristics of biological graphs thus we used a local property graphlets. Graphlets are all non-isomorphic connected induced graphs on a specific number of vertices. By definition graphlets have the ability to capture all the local structures on a certain number of vertices. Results showed that our graphlet approach returns graph structure differences between normal and tumor conditions that correspond to biological knowledge. We then introduce a heuristic to identify areas that are likely to be different between the normal and tumor graph and perform graph comparisons on the identified areas only. The heuristic was able to achieve interesting results that were successfully validated in vitro.