Sangam is a collection of user interfaces for studying the following eScience question: Do different stressors activate different brain circuits and genes? An answer to this question helps clinicians and drug manufacturers develop better treatments and drugs for stress disorders. Here, we describe one graphical interface of Sangam that integrates different data sources to facilitate navigation between a particular sequence of DNA (i.e., a gene), the RNA message it encodes, and the proteins for which it serves as a blueprint. A neuroscientist can use the Sangam interface to search for a particular gene, protein, or RNA in all the publicly available data sources and tie this information (using Proteus) to a private database stored in a Neuroscholar lab notebook. In the following, we elaborate on the graphical user interface and its interaction with Proteus.
Sangam has a user-friendly interface that lets scientists navigate data. It is web-ready, so a scientist can use the system without downloading and installing software.
A scientist may manipulate the touch graph, whose nodes depict DNA, RNA, and Protein along with their attributes. The look and feel of the graph can be altered using three modes: Zoom, Rotate, and Locality.
All the attributes of DNA, RNA, and Protein are broadly classified as Function, Structure, and Expression. To search for a particular attribute or detail, the scientist right-clicks on the parent node and selects “Expand node”. For example, doing so on the “Structure” node of DNA expands the Structure of DNA. The same concept applies to all other nodes, such as “Function” and “Expression”.
Individual attributes or a parent node (equivalent to selecting all of its child nodes) can be selected for retrieval. This is the same as identifying the target list of a SQL “select” statement. The qualification list of the query is specified by the “Term” and “Species” text boxes. The specified term for the requested species is retrieved from the KEGG Web Service (http://www.genome.jp/kegg/). The result is a list of objects, which is shown to the user. The user picks those of interest, and the system applies the target list of the query to provide the relevant information. We now describe these two steps in turn.
The following window shows the query’s target list as specified by a user. The user has specified “CRH” as the term of interest for the species “rno” (the KEGG code for Rattus norvegicus, the rat).
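Before walking through the two steps, the SQL analogy can be made concrete with a minimal Python sketch. It is not Sangam's implementation: the `Query` class, the `resolve_target_list` helper, and the leaf attribute names are hypothetical and chosen only for illustration. Only the three categories and the “Term” and “Species” fields come from the description above.

```python
from dataclasses import dataclass

# Hypothetical attribute tree: the three categories come from the text,
# the leaf attribute names are invented for illustration.
ATTRIBUTE_TREE = {
    "Function": ["pathway", "orthology"],
    "Structure": ["sequence", "length"],
    "Expression": ["tissue"],
}


@dataclass
class Query:
    target_list: list  # attributes to retrieve (the SQL "select" list)
    term: str          # qualification: the search term
    species: str       # qualification: the species code


def resolve_target_list(selected):
    """Expand parent nodes into their children; keep leaf attributes as-is."""
    resolved = []
    for name in selected:
        # A parent node stands for all of its child attributes.
        children = ATTRIBUTE_TREE.get(name, [name])
        for child in children:
            if child not in resolved:
                resolved.append(child)
    return resolved


# Selecting the parent "Structure" plus the single attribute "pathway".
query = Query(resolve_target_list(["Structure", "pathway"]), term="CRH", species="rno")
print(query)
```

Selecting a parent node and selecting all of its children produce the same target list, which mirrors the equivalence stated above.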
Once the scientist hits the query button (at the bottom of the form), Sangam invokes the Proteus Web Service with an XML plan that invokes several Web Services internal to Proteus as well as the KEGG Web Services. The logical query tree is as follows:
A description of this plan is as follows. Proteus sends the initial input to the Iterator operator, which distributes the query across all the available instances of the KEGG Web Service. The Iunion operator combines the output from these instances and forwards it to the Proteus coordinator. The resulting objects are sent back to Sangam.
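The data flow of this plan can be sketched in Python. This is an illustration under stated assumptions, not Proteus code: the functions `iterator` and `iunion` are hypothetical stand-ins for the operators of the same name, each instance is assumed to receive the same query, and the union is assumed to drop duplicate objects. The two instance functions and the identifiers they return are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def iterator(query, instances):
    """Send the same query to every service instance and collect each result list."""
    with ThreadPoolExecutor(max_workers=len(instances)) as pool:
        # pool.map preserves the order of `instances` in its results.
        return list(pool.map(lambda instance: instance(query), instances))


def iunion(result_lists):
    """Merge the per-instance result lists, dropping duplicate objects (an assumption)."""
    merged, seen = [], set()
    for results in result_lists:
        for obj in results:
            if obj not in seen:
                seen.add(obj)
                merged.append(obj)
    return merged


# Stand-ins for two KEGG service instances; the identifiers are made up.
def instance_a(query):
    return [f"{query['species']}:0001", f"{query['species']}:0002"]


def instance_b(query):
    return [f"{query['species']}:0002", f"{query['species']}:0003"]


objects = iunion(iterator({"term": "CRH", "species": "rno"}, [instance_a, instance_b]))
print(objects)
```

The merged list plays the role of the objects that the coordinator returns to Sangam.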
Click here for an XML representation of this query
When Sangam finds multiple matching records for a given term in any one of DNA, RNA, and Protein, it displays them to the scientist, who can select a subset of the list for the detailed search. This is shown below, where a scientist has checked the objects of interest.
After selecting the desired matching records, the scientist hits the “Query Again” button for the detailed search. This generates a plan similar to the previous one, with one key difference: the Iunion operator is followed by a Project operator (specific to Proteus), which retrieves the attributes of interest specified by the scientist. (We show neither the logical query tree nor the XML plan, to avoid repeating information.) Proteus executes this plan and returns the complete requested information (formatted in XML) to Sangam’s user interface for display; see below.
The user can hit the “Query Again” button again to repeat the procedure. To initiate a fresh query, the scientist uses the “New Query” button.
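The Project step of the detailed search can be sketched in Python as follows. This is a hedged illustration rather than the Proteus operator itself: the `project` function, the record layout, and all attribute names and values are hypothetical. It shows only the idea of keeping the attributes named in the target list.

```python
def project(records, target_list):
    """Keep only the attributes named in target_list for each record."""
    return [
        {attr: record[attr] for attr in target_list if attr in record}
        for record in records
    ]


# Made-up records standing in for the output of the Iunion operator.
records = [
    {"id": "obj-1", "sequence": "ATGC", "length": 4, "pathway": "p1"},
    {"id": "obj-2", "sequence": "GGCA", "length": 4},
]

print(project(records, ["id", "sequence", "pathway"]))
```

Attributes that a record lacks are skipped rather than reported as errors; that choice is an assumption made to keep the sketch short.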