Outline of topics



Network Construction: Generate PPI networks from flexible query options

  • Query a protein to study a single protein of interest. Gene names, UniProt accession numbers and other database identifiers (Ensembl Gene, Ensembl Protein, NCBI Gene, UniGene, etc.) are supported.
    If the "extend search" option is checked, the system also searches for and shows interactions among the proteins that interact with the query protein.

  • Query a list of proteins to study a set of proteins relevant to your research.

  • Query a list of interactions to quickly check whether homemade interactions (for example, ones generated by yeast two-hybrid screening) are already publicly known or novel. If "extended search" is selected, the result also shows all publicly known interactions between proteins in the list (for example, interactions between preys), in addition to the query interactions.

  • Query two lists of proteins to study two biologically related lists of proteins and see whether the relationship is also reflected at the protein interaction level. For example, input a list of up-regulated genes and a list of down-regulated genes from a microarray experiment.
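The "known or novel" check for a list of query interactions can be pictured as a set lookup. The following is only a minimal sketch, not PINA's implementation: the function names and the protein identifiers are hypothetical, and it assumes an interaction is an unordered pair of proteins.

```python
def normalize(a, b):
    """Treat an interaction as an unordered pair, so A-B equals B-A."""
    return frozenset((a, b))


def classify_interactions(query_pairs, known_pairs):
    """Split query interactions into publicly known and novel ones."""
    known = {normalize(a, b) for a, b in known_pairs}
    already_known, novel = [], []
    for a, b in query_pairs:
        if normalize(a, b) in known:
            already_known.append((a, b))
        else:
            novel.append((a, b))
    return already_known, novel


# Hypothetical identifiers, for illustration only.
known_pairs = [("P1", "P2"), ("P2", "P3")]
query_pairs = [("P2", "P1"), ("P1", "P4")]
already_known, novel = classify_interactions(query_pairs, known_pairs)
```

Here the pair ("P2", "P1") is reported as known because it matches ("P1", "P2") regardless of order, while ("P1", "P4") is reported as novel.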

Network Filter: Obtain a more reliable network by filtering on different criteria.


Network Analysis: Gain insight into the network.

  • Network Function analysis identifies enriched GO terms in the PPI network by comparing GO term frequencies in the given network against the background distribution, i.e. the distribution of GO terms across the whole organism. GO is structured as a hierarchical directed acyclic graph (DAG), which is taken into account when counting the number of annotated proteins: a protein is considered associated with a GO term if it is annotated with the term itself or with a child of the term.

  • Network Topology gives an overview of the topological features of the interaction network, including diameter, degree distribution, shortest path distribution and clustering coefficient.
    • Path: In a protein-protein interaction network, nodes represent proteins and edges represent interactions. A path between two nodes is a sequence of nodes in which each node has an edge to the next.
    • Shortest path: the path with the fewest edges from one node to another in the network.
    • Diameter: the maximum shortest-path length over all pairs of distinct nodes in the graph.
    • Degree distribution: the proportion of nodes in the graph that have a specified number of edges.
    • Shortest path distribution: the proportion of shortest paths in the graph that have a specified length.
    • Clustering coefficient: measures how well connected the neighbors of a node are. The value is 1 when the neighbors are fully connected to each other and 0 when none of them are connected. See formal definition.

  • Topologically Important Proteins analysis applies centrality measures to identify topologically important proteins in the interaction network. Four centrality measures (eigenvector, betweenness, closeness and degree centrality) are implemented in PINA to determine the relative importance of a node (protein) within the graph (interaction network). See formal definition.

  • Common Interacting Proteins analysis identifies proteins that interact with at least two of the query proteins in the network.
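The topology definitions above can be made concrete on a toy network. This is a minimal sketch under stated assumptions, not PINA's code: the graph is undirected and unweighted, self-interactions are absent, the diameter is taken over reachable pairs only, and all function names are hypothetical. The text does not say whether query proteins themselves may be reported as common interactors, so the sketch does not treat them specially.

```python
from collections import deque


def build_graph(edges):
    """Adjacency sets for an undirected network (no self-interactions assumed)."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path_lengths(graph, source):
    """Breadth-first search: number of edges from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def diameter(graph):
    """Longest shortest path over all reachable pairs (graph must be non-empty)."""
    return max(d for n in graph for d in shortest_path_lengths(graph, n).values())


def degree_distribution(graph):
    """Proportion of nodes having each degree."""
    counts = {}
    for node in graph:
        degree = len(graph[node])
        counts[degree] = counts.get(degree, 0) + 1
    return {degree: c / len(graph) for degree, c in counts.items()}


def clustering_coefficient(graph, node):
    """Fraction of neighbor pairs of `node` that are themselves connected."""
    neighbors = list(graph[node])
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neighbors[j] in graph[neighbors[i]]
    )
    return 2.0 * links / (k * (k - 1))


def common_interactors(graph, query):
    """Proteins that interact with at least two of the query proteins."""
    query = set(query)
    return {n for n in graph if len(graph[n] & query) >= 2}


# Toy network with hypothetical protein names.
graph = build_graph([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])
```

In this toy network, A's two neighbors (B and C) interact with each other, so A has clustering coefficient 1, whereas only one of the three neighbor pairs of C is connected.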

Network Visualization: An interactive environment to view and edit the network

  • The browser needs JVM 1.5 or later installed to launch an applet that supports viewing, dragging and zooming of the network.
  • Network analysis can also be performed in the applet, and the proteins/interactions in the result can be highlighted with different colors.
  • The network can be expanded by right-clicking a protein (Mac users need to press and hold the Apple command key, then left-click) and selecting the "expand" item in the pop-up menu, which adds the interacting proteins of the selected protein to the network. The expansion can be undone by right-clicking the same protein and selecting the "collapse" item in the pop-up menu.
  • A protein can be highlighted by entering its gene name or UniProt AC in the input box at the bottom right, which is useful for quickly locating a protein in a large network.
  • A picture of the network can be saved to the local disk in JPG or PNG format at a specified zoom level.

Network Download

  • The interaction network can be downloaded to the local disk in GraphML, MITAB, SIF or PINA tab-delimited format.
  • GraphML: an XML format for graph representation. Node elements describe the gene name, protein name, UniProt AC and GO terms of proteins; edge elements describe interactions between proteins. See example file.
  • MITAB (PSI-MI tab-delimited format): columns are explained by the header line in the example file. The file can be opened in Excel by selecting tab as the delimiter.
  • SIF (Simple Interaction File): this format can be imported into Cytoscape directly. The disadvantage is that annotation is not included.
  • PINA tab-delimited format: columns 1 to 3 describe one interacting protein (UniProt AC, UniProt keywords); columns 4 to 6 give the corresponding information for the other interacting protein; the remaining columns are the interaction IDs in the source databases. See example file. The file can be opened in Excel by selecting tab as the delimiter.
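To illustrate why SIF carries no annotation, the sketch below converts MITAB-style rows into SIF lines. It rests on assumptions that should be checked against the example files: that the download follows the standard PSI-MITAB layout, in which the first two columns hold the interactor identifiers as "database:identifier", and that header lines start with "#". The function name and the use of "pp" as the relationship type are illustrative choices, not something the text specifies.

```python
import csv
import io


def mitab_to_sif(mitab_handle, sif_handle, relation="pp"):
    """Write one 'A <relation> B' line per MITAB row, dropping all other columns."""
    reader = csv.reader(mitab_handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip empty or truncated rows and the header line (assumed to start with '#').
        if len(row) < 2 or row[0].startswith("#"):
            continue
        # Assumed identifier form 'database:identifier'; keep only the identifier.
        protein_a = row[0].split(":", 1)[-1]
        protein_b = row[1].split(":", 1)[-1]
        sif_handle.write(f"{protein_a}\t{relation}\t{protein_b}\n")


# Toy input with a header line and a single interaction row.
mitab_text = (
    "#ID(s) interactor A\tID(s) interactor B\n"
    "uniprotkb:P04637\tuniprotkb:Q00987\n"
)
sif_buffer = io.StringIO()
mitab_to_sif(io.StringIO(mitab_text), sif_buffer)
sif_text = sif_buffer.getvalue()
```

Only the two identifiers and the relationship type survive the conversion, which is the trade-off noted for SIF above.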

Interactome Modules: Network modules generated from PINA Interactomes

  • Module Collection is a set of network modules identified from PINA interactomes using a specific clustering algorithm and parameter setting. Details can be viewed via the module collection links on the search result page.

  • Module Annotation gives a brief overview of the functions of Interactome modules using public knowledge, including Gene Ontology, KEGG pathways and PFAM domains. On the search result page, only the top 3 terms from each annotation source are shown; clicking "view annotation details" shows the full list.

  • Search modules to find predefined Interactome modules that contain the query genes.

  • Identify enriched modules to find Interactome modules that are statistically enriched in the query genes, using the hypergeometric test.
    • Sample Number: There are two numbers in this column. The first one is the number of query proteins found in this module; the second one is the total number of query proteins.
    • Background Number: There are two numbers in this column. The first one is the total number of proteins in this module; the second one is the total number of proteins of one species with known interactions in PINA.
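The four numbers in the Sample Number and Background Number columns are exactly the inputs a hypergeometric test needs. The sketch below shows one common form of the test, the upper-tail probability of seeing at least the observed number of query proteins in a module. Whether PINA uses this exact tail, or applies any multiple-testing correction, is not stated in the text and is an assumption here; the function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(sample_hits, sample_size, module_size, background_size):
    """P(X >= sample_hits) when sample_size proteins are drawn without replacement
    from background_size proteins, of which module_size belong to the module.

    sample_hits     -- first Sample Number (query proteins found in the module)
    sample_size     -- second Sample Number (total query proteins)
    module_size     -- first Background Number (proteins in the module)
    background_size -- second Background Number (proteins with known interactions)
    """
    upper = min(sample_size, module_size)
    total = comb(background_size, sample_size)
    tail = sum(
        comb(module_size, k) * comb(background_size - module_size, sample_size - k)
        for k in range(sample_hits, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 3 of 5 query proteins fall in a module of 10 proteins,
# against a background of 100 proteins.
p_value = hypergeom_enrichment_p(3, 5, 10, 100)
```

A small p-value means that finding this many query proteins in the module would be unlikely if the query were a random draw from the background.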

User Space: Save your queries, your own data and analysis results

  • Registered users (registration is free) can save PPI networks generated from their queries, or from the output of the analysis tools, on the server for further analysis.
  • Users can manually remove interaction entries from a saved network.
  • Users can upload homemade protein-protein interactions to expand the public network.

  • Login: when a user logs in with the "Remember me on this computer" checkbox checked, the account is remembered on that computer for one month unless the user clicks the "log out" button in the top-right corner.

  • Network operation produces a new network from two existing networks using one of the following operations.
    • Union generates a network containing all proteins and interactions in the two networks.
    • Subtraction generates a network containing the proteins and interactions that belong only to the destination network.
    • Intersection generates a network containing the proteins and interactions common to the two networks.
    • Difference generates a network containing the proteins and interactions that are not common to the two networks.
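The four network operations correspond closely to standard set operations. The sketch below is a simplification, not PINA's implementation: a network is represented only by its set of interactions (unordered protein pairs), proteins are derived from the resulting interactions, and all function names are hypothetical.

```python
def as_edge_set(pairs):
    """Represent a network as a set of unordered protein pairs."""
    return {frozenset(pair) for pair in pairs}


def union(net_a, net_b):
    """All interactions found in either network."""
    return net_a | net_b


def subtraction(destination, other):
    """Interactions that belong only to the destination network."""
    return destination - other


def intersection(net_a, net_b):
    """Interactions common to both networks."""
    return net_a & net_b


def difference(net_a, net_b):
    """Interactions found in exactly one of the two networks."""
    return net_a ^ net_b


def proteins(network):
    """Proteins taking part in at least one interaction of the network."""
    return set().union(*network)


# Toy networks with hypothetical protein names.
net1 = as_edge_set([("A", "B"), ("B", "C")])
net2 = as_edge_set([("C", "B"), ("C", "D")])
```

With these toy networks, the shared interaction B-C is the intersection, A-B is what remains after subtracting net2 from net1, and A-B together with C-D form the difference.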

User Community: Share comments/networks with other users

  • Registered users (registration is free) can write comments on either public interaction entries or user-uploaded novel entries.
  • Users' comments can be used as criteria for network filtering.
  • Registered users can share comments/networks in "user space" with other users of the system.
  • Tip: users can comment on all interactions of a network at once by clicking the "Comment all interactions" link below the Logout button.

Data Integration: Non-redundant and more complete

  • Integrate data from 6 public protein-protein interaction databases to obtain a more complete dataset.
  • Identify records of the same interaction in different databases to build a non-redundant dataset.
  • Use BioMart and UniProt to annotate every protein with the same high-quality information, because some of the original records have limited annotation.
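The idea of building a non-redundant dataset can be sketched as grouping source records by the pair of proteins they describe. This is only an illustration under assumptions: that protein identifiers have already been mapped to a common namespace such as UniProt AC, and that two records describe the same interaction when they involve the same unordered pair. The real matching rules are not described in the text, and the database names and IDs below are placeholders.

```python
def merge_sources(records):
    """Group (source_db, interaction_id, protein_a, protein_b) records by protein pair.

    Returns a dict mapping each unordered protein pair to the list of
    (source_db, interaction_id) entries that report it.
    """
    merged = {}
    for source_db, interaction_id, protein_a, protein_b in records:
        key = frozenset((protein_a, protein_b))
        merged.setdefault(key, []).append((source_db, interaction_id))
    return merged


# Placeholder records: "db1" and "db2" stand in for source databases.
records = [
    ("db1", "x1", "P1", "P2"),
    ("db2", "y7", "P2", "P1"),
    ("db1", "x2", "P2", "P3"),
]
merged = merge_sources(records)
```

Each merged entry keeps the interaction IDs from every source database that reported it, which matches the source-database ID columns of the PINA tab-delimited download.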

PINA4MS: Combining protein expression and interaction data

  • Click here for an introduction to PINA4MS and the tutorial.