DRAGON Database: Tutorials
The Pevsner Laboratory
This is a beta-test version of the DRAGON database web site. As such, data from four databases (Unigene, Swissprot, Pfam and KEGG) are presently available. We will soon be adding data from a number of other databases, including OMIM, Interpro, Transfac, Homologene, The Biological and Biochemical Image Database and a number of yeast databases. Presently, GenBank accession numbers from human, rat and mouse will work in DRAGON. You can view the Logs, Tips & Bugs page to read about known bugs (please also email us if you find any others). Thank you for visiting DRAGON.
Lessons and Examples
Examples
Overview of DRAGON
Explanation of Filtering and Filtering Records
Instructions for Annotating Data
Instructions for Searching Databases
Instructions for Comparing Arrays
Analysis of microarray data subsequent to using DRAGON with the DRAGON View suite of information visualization tools
| Section | Tool | Example question |
| Dragon Database | Annotate | "I would like to analyze my microarray data in relation to specific biological characteristics. How do I add that information to all of my microarray genes simultaneously?" |
| Dragon Database | Search | "I would like to know whether keratin genes have ever been found to be expressed in brain tissue." |
| Dragon Database | Compare | "I am interested in cytokines and would like to perform a microarray experiment. Which microarray platform has the cytokines I am interested in?" |
| Dragon Database | Learn | "I don't yet understand what DRAGON is, why I want to use it or how it can help in my microarray data analysis." |
| Dragon Database | Download | "I know how to use databasing systems and I want to use DRAGON on my own computer." |
| Dragon Database | Order | "I know how to use databasing systems and I want to use DRAGON on my own computer but I have a really slow connection to the internet." |
| Dragon View | View | "I used Dragon to derive information about my genes but now I want to be able to view relationships between my expression data and the biological characteristics of my genes." |
| Dragon Map | Explore | "I am interested in how human genes are distributed across tissues and chromosomes and how those genes are shared between different tissues." |
| Other | Contact | "You misspelled something." or "My query didn't work, why?" or "Dragon is really helpful!" |
Explanation of Filtering and Filtering Records
Several GenBank numbers (GB#'s) in the UniGene flat-files are included in multiple UniGene entries. This can cause mismatching of GenBank numbers with Swissprot numbers in DRAGON. Therefore, we implement a filter for duplicate GB#'s in DRAGON. This filter simply deletes any GB# that is found in more than one UniGene entry. When a UniGene flat-file is parsed, we keep a record of each of the GB#'s that have been deleted. Below is an archive of these filtering records. At the very bottom of each record, the number of GB#'s deleted and the total number of GB#'s analyzed are listed.
| Date of Parsing | Filtering Record |
| 10/19/2000 | Filtering_Record.txt |
| 1/26/2001 | Filtering_Record.txt |
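The duplicate filter described above can be sketched in a few lines of Python. This is only an illustration and not DRAGON's actual parser: the flat file is assumed to be already parsed into a mapping from UniGene cluster ID to its GenBank numbers, the function name `filter_duplicate_gb_numbers` is hypothetical, the IDs are made up, and "total analyzed" is assumed to mean the number of distinct GB#'s seen.

```python
from collections import Counter


def filter_duplicate_gb_numbers(clusters):
    """Drop every GenBank number that appears in more than one UniGene entry.

    `clusters` maps a UniGene cluster ID to a list of GenBank numbers
    (an assumed, already-parsed form of the flat file).
    Returns (filtered_clusters, deleted, total_analyzed).
    """
    # Count in how many distinct clusters each GenBank number occurs.
    occurrences = Counter()
    for gb_numbers in clusters.values():
        occurrences.update(set(gb_numbers))

    # Any number seen in two or more clusters is ambiguous and is deleted.
    deleted = sorted(gb for gb, n in occurrences.items() if n > 1)
    deleted_set = set(deleted)

    filtered = {
        cluster_id: [gb for gb in gb_numbers if gb not in deleted_set]
        for cluster_id, gb_numbers in clusters.items()
    }
    # Assumption: "total analyzed" counts distinct GenBank numbers.
    return filtered, deleted, len(occurrences)


# Made-up cluster IDs and accession numbers, for illustration only.
example = {
    "Hs.1": ["A00001", "A00002"],
    "Hs.2": ["A00002", "A00003"],
    "Hs.3": ["A00004"],
}
filtered, deleted, total = filter_duplicate_gb_numbers(example)
print(f"{len(deleted)} GB#'s deleted out of {total} analyzed")
```

The list `deleted` plays the role of a filtering record, and the final line mirrors the summary found at the bottom of each record.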
Instructions for Annotating Data
1) Paste a delimited text file (tab-delimited
is the default) into the "Data Entry" field, or upload your file.
Click here to see an example of the type of
tab-delimited text file you might paste into the field below. Or you can view
the example data sets provided on the Annotate page to get an idea of the general
form your data should take.
2) This text file should contain at least two columns. One column should be your expression data (e.g. differential ratio values or absolute intensity values; either works for annotation, although the tools in DRAGON View are geared toward analysis of ratio data). The other column should be the GenBank accession numbers that were provided with your microarray data. Presently, DRAGON can only take GenBank numbers. (NOTE: some microarray data sets are provided with proprietary accession numbers. In these cases, the company that produced the microarray should also provide on its web site a table that gives the GenBank accession number corresponding to each of its proprietary accession numbers. If this is the case, then you need to integrate the GenBank accession numbers with your data before you can use DRAGON.)
3) Long lists, and searches that request more types of information, will take more time and may take so long that your internet browser (e.g. Netscape or Internet Explorer) decides that no data was returned and times out. If this happens, try the email option. DRAGON will email you your results as soon as they are ready.
4) You need to tell DRAGON which column contains the Genbank accession
numbers by typing the number for the column into the "Column number containing
GenBank numbers" text box. As you might expect, your farthest left column
would be number 1.
5) Next, define what sorts of information you would like DRAGON to append
to your information by checking specific checkboxes from the "Unigene Info:",
"Swissprot Info:" and "PFAM Info:" sections of the table.
You can choose information from more than one database. For example, you could
check all the check boxes and DRAGON would give you all the information it has
about each of your microarray genes.
6) Click the "Submit Gene List" button.
7) DRAGON will add the information that you selected with the checkboxes as new columns appended to the end of the table that you provided. You have to choose the type of output file that you would like DRAGON to provide.
8) Tip: If you wish to annotate your data with types of information that categorize multiple genes (e.g. Pfam numbers, Swissprot keywords or KEGG numbers), then it is best to annotate your data with only one of these types of information at a time.
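If your data came with proprietary accession numbers (see the NOTE in step 2), the merge with the vendor's lookup table can be done with a short script before pasting. The following is a minimal sketch under stated assumptions: the function names, the `PROBE_*` identifiers and the third data row are hypothetical, while the two accession numbers and ratios are taken from the example table in this tutorial.

```python
import csv
import io


def add_genbank_column(rows, id_to_genbank, id_column=1):
    """Prepend a GenBank accession column to each data row.

    `rows` is a list of lists (one per microarray spot), `id_to_genbank`
    maps a proprietary ID to a GenBank accession number, and `id_column`
    is 1-based, matching how DRAGON numbers columns.
    Rows whose ID has no known GenBank number are skipped.
    """
    merged = []
    for row in rows:
        genbank = id_to_genbank.get(row[id_column - 1])
        if genbank is not None:
            merged.append([genbank] + row)
    return merged


def to_tab_delimited(rows):
    """Serialize rows as tab-delimited text, ready to paste or upload."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical vendor IDs mapped to GenBank accession numbers.
vendor_table = {"PROBE_1": "AL036211", "PROBE_2": "NM_001797"}
# PROBE_3 is invented and has no mapping, so it is dropped.
data = [["PROBE_1", "2.8"], ["PROBE_2", "2.4"], ["PROBE_3", "1.1"]]

text = to_tab_delimited(add_genbank_column(data, vendor_table))
print(text)
```

With this layout the GenBank numbers end up in the farthest left column, so you would enter 1 in the "Column number containing GenBank numbers" box.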
An example of a tab-delimited text file that you could paste into DRAGON.
| Genbank Accession Number | Ratio (Cy3/Cy5) | Cy3 Intensity | Cy5 Intensity | Gene Name |
| AL036211 | 2.8 | 2423 | 800 | lumican |
| NM_001797 | 2.4 | 867 | 323 | cadherin 11 (OB-cadherin, osteoblast) |
| NM_000700 | 2.3 | 795 | 320 | annexin A1 |
| AW157548 | 2.2 | 5193 | 2128 | insulin-like growth factor binding protein 5 |
This is an example of the type of file you could enter into the text field above. The easiest way to generate a tab-delimited text file (if you don't already have one) is to paste your data into a spreadsheet program (e.g. Microsoft Excel) and then "Save as..." a "Tab-delimited text file", which should have .txt as its file extension (if you are using a PC). The Genbank Accession Number (RED TEXT) is what DRAGON will use to append other sorts of information to your table. You will need to define which column contains Genbank Accession numbers after you have pasted your data into the text field. The type of expression data (GREEN TEXT) that you have will obviously vary depending upon the type of microarray that you have used. For example, if you have used a radioactivity-based system, then you won't have Cy3 and Cy5 intensity data. The key to the type of expression data that you enter into DRAGON is that it is sufficient for a full analysis of your microarray experiment in relation to the types of information you derive from DRAGON. Finally, you can add other sorts of information to your table, such as the gene names (BLUE TEXT) you were provided in your microarray data set. (NOTE: If nothing is being returned in your searches or you are getting strange errors, try removing any extraneous information columns, such as names. It is possible that particular characters and white spaces in your names could be altering your search.)
A sample dataset you can use on the annotation page.
You can use this tab-delimited
text file which contains a list of Genbank numbers to experiment with the
different features of DRAGON. Click on the link for the file, select all contents
of the file, copy and paste into the Data Entry field on the annotation
page. Annotate the genes as you wish.
One thing that you will encounter if you annotate this list with certain types of data is that some of the genes are repeated a number of times. This happens because such a gene is associated with more than one value of a criterion you have chosen. For example, one gene and its associated protein can have numerous keyword definitions. Therefore, DRAGON repeats the gene on numerous lines and provides a different keyword on each line. This way, you can use the keywords to sort your data in a spreadsheet program such as MS Excel. If a gene had two keywords associated with it and you sorted your whole list by keywords, that gene would be in two different places on your list, each time clustered with all the other genes associated with that keyword. Furthermore, you can take your output and plug it into the suite of DRAGON View information visualization tools.
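To make the repeated rows concrete, here is a small Python sketch of the same idea: one output row per gene and keyword pair, followed by a sort on the keyword column. The gene identifiers, ratios and keywords are invented placeholders, and the function name `expand_by_keyword` is hypothetical; DRAGON's own output format may differ.

```python
def expand_by_keyword(genes):
    """Emit one row per (gene, keyword) pair, so a gene with several
    keywords is repeated on several lines."""
    rows = []
    for accession, ratio, keywords in genes:
        for keyword in keywords:
            rows.append((accession, ratio, keyword))
    return rows


# Placeholder genes and keywords, for illustration only.
genes = [
    ("GB_A", 2.4, ["Cell adhesion", "Transmembrane"]),
    ("GB_B", 2.3, ["Calcium"]),
    ("GB_C", 2.8, ["Cell adhesion"]),
]

rows = expand_by_keyword(genes)

# Sorting by keyword clusters genes that share a keyword, just as sorting
# the keyword column in a spreadsheet would.
rows_by_keyword = sorted(rows, key=lambda row: row[2])
for accession, ratio, keyword in rows_by_keyword:
    print(f"{keyword}\t{accession}\t{ratio}")
```

After sorting, GB_A appears twice: once next to GB_C under "Cell adhesion" and once under "Transmembrane".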
Instructions for Searching Databases
1) First, choose the database that you want to search by clicking the radio button associated with one of the databases. At the moment you can't search more than one database simultaneously.
2) Next, you can type any characteristic that you are interested in into
the text boxes at the right of the table.
3) Then, you can check off any characteristic that you want to have sent
back to you at the left of the table.
4) Then click the "Submit Query" button.
5) DRAGON will search for genes or proteins only when you have entered
a characteristic and have checked its corresponding box. DRAGON will also provide
any other information that you have checked but not provided with a search term.
An example of a DRAGON search.
For example, you select "Unigene:" by clicking the radio button to the left of it. Then you type "keratin" into the name field at the right of the table and you check the checkbox to the left of "Find gene by name:". Then, even though you only want to search for keratins, you would like to know the chromosomal cytoband location of each keratin you find. Therefore, you check the checkbox to the left of "Find gene by cytoband:" but you don't enter anything into the text box at the right of the table. DRAGON will only search for keratins, but it will provide all information it has about the cytoband location of each keratin it finds.
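The rule in step 5 (a field is searched only when it has both a search term and a checked box, while a checked box without a term just adds an output column) can be modeled roughly as follows. This is a sketch of the logic, not DRAGON's code: the function name, the record layout, the case-insensitive substring matching and the choice to return nothing when no field is both checked and filled in are all assumptions, and the records are placeholders.

```python
def run_search(records, checked_fields, search_terms):
    """Return the checked fields of every record matching all active terms.

    `checked_fields` lists the fields whose checkbox is ticked.
    `search_terms` maps a field to the text typed next to it. A term is
    used for filtering only if its field is also checked; checked fields
    without a term are simply returned as extra output columns.
    """
    active = {
        field: term.lower()
        for field, term in search_terms.items()
        if term and field in checked_fields
    }
    if not active:
        # Assumption: with no usable search term there is nothing to search for.
        return []

    results = []
    for record in records:
        matches = all(
            term in record.get(field, "").lower() for field, term in active.items()
        )
        if matches:
            results.append({field: record.get(field, "") for field in checked_fields})
    return results


# Placeholder records; names and cytoband values are not real annotations.
records = [
    {"name": "keratin (example A)", "cytoband": "band-1"},
    {"name": "actin (example B)", "cytoband": "band-2"},
    {"name": "Keratin (example C)", "cytoband": "band-3"},
]

hits = run_search(records, ["name", "cytoband"], {"name": "keratin", "cytoband": ""})
for hit in hits:
    print(hit["name"], hit["cytoband"])
```

Here "name" is both checked and filled in, so it filters the records; "cytoband" is checked but empty, so it only appears in the output.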
Instructions for Comparing Arrays
The array comparison feature of DRAGON is under development and will be available shortly.
Analysis of microarray data subsequent to using DRAGON with the DRAGON View suite of information visualization tools.
Introduction
The DRAGON View tools are being developed as a companion to the DRAGON Database Annotation feature. Once you have annotated your data, you may want to visualize whether families of related genes are all regulated in a similar manner or whether genes in the same cellular pathway are all differentially regulated. The DRAGON View tools (DRAGON Families, DRAGON Order and DRAGON Paths) are being designed to address these needs.
DRAGON Families
DRAGON Families integrates two pieces of information that you provide. The first is the ratio expression data, which you derive from fluorescence-based Cy3/Cy5 microarray experiments. The second is the "type" information, which you derive from the DRAGON database Annotate tool. Type information can be many things. Presently, Pfam numbers, Swissprot keywords and KEGG numbers are the most useful type information that DRAGON provides. The Instructions on the DRAGON Families page will guide you through the process of entering data. View the example data sets if you have questions about what your data should look like. NOTE: Unlike with the Annotate tool, it is best to use comma-delimited text files with the DRAGON View tools.
The biggest confusion with DRAGON Families comes with
the output. What exactly does it mean? Here is a diagram and brief explanation.

Each box in the diagram above represents one gene (as defined by the Unigene database). If you click on any box, you will be hyperlinked to the Unigene cluster that corresponds to that gene. The color of each box represents the expression value (X) of that gene.
Dark Red = 0 < X < 2.0
Bright Red = X >= 2.0
Dark Green = 0 > X > -2.0
Bright Green = X <= -2.0
The black text after each row of boxes indicates the type shared by all of the genes in that group. It is important to realize that the boxes on any given row are ALL OF THE GENES IN YOUR DATA THAT HAVE THAT TYPE. Therefore, large numbers of genes that are in the same group and are all up- or down-regulated are potentially interesting. (We are currently implementing statistical tests for the significance of this type of result that will be available as part of your output shortly.) If you correctly indicated the type information you are using while inputting your data, then clicking on the hyperlink should take you to the proper description of that type. Finally, the blue number in parentheses is the average ratio expression value for all of the genes in that group.
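The color legend and the per-group average can be expressed as a short Python sketch. The thresholds come from the legend above; everything else (the function names, the input layout, the gene and type labels and the values) is hypothetical, and the legend does not say how X = 0 is drawn, so the sketch returns `None` for it.

```python
from collections import defaultdict


def box_color(x):
    """Map an expression value X to the box color given in the legend."""
    if x >= 2.0:
        return "bright red"
    if x > 0:
        return "dark red"
    if x <= -2.0:
        return "bright green"
    if x < 0:
        return "dark green"
    return None  # X == 0 is not covered by the legend


def summarize_families(rows):
    """Group (gene, type, value) rows by type.

    Returns {type: (list_of_box_colors, average_value)}, i.e. the row of
    boxes and the number shown in parentheses for each type.
    """
    groups = defaultdict(list)
    for _gene, type_name, value in rows:
        groups[type_name].append(value)
    return {
        type_name: ([box_color(v) for v in values], sum(values) / len(values))
        for type_name, values in groups.items()
    }


# Placeholder genes, types and expression values.
rows = [
    ("gene1", "Type A", 2.5),
    ("gene2", "Type A", 1.5),
    ("gene3", "Type B", -3.0),
]
summary = summarize_families(rows)
for type_name, (colors, mean) in summary.items():
    print(type_name, colors, f"({mean})")
```

A row in which every box has the same color family, such as all red, corresponds to the potentially interesting groups mentioned above.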
DRAGON Order
DRAGON Order is another information visualization tool which attempts to get at the same question addressed by DRAGON Families but from a different angle. The main difference between the two tools is that DRAGON Order requires pre-sorted data in order to work correctly. Here is a diagram and description of what the output is telling you.
[Diagram: DRAGON Order output, with + and - signs at the top]
Each row of yellow lines in the picture above represents the entire list of genes that you entered. Each row is defined by a type written in white letters to the right. Each yellow line in a row marks the position of a gene that belongs to the type defining that row. So, for example, the first row in the picture above is "Transmembrane." The "transmembrane" keyword is a rather broad category; therefore, there are a large number of yellow lines in this row. Wherever there is a yellow line in the row, a gene encoding a protein with a transmembrane domain is present.
The key is that, because you sorted your list of genes by their ratio expression values before you entered them into DRAGON Order, the position of each yellow line indicates the expression level of that gene (the + (up-regulated) and - (down-regulated) signs at the top of the picture indicate the expression levels across the data). Therefore, an even distribution of yellow lines across the whole row means that there is no significant co-expression of a set of genes in that group. However, a cluster of yellow lines at either the far left or the far right of any given row is interesting because it means that a set of related genes are all up- or down-regulated. For example, four of the five "Cell Adhesion" genes are clustered to the left of the row.
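The layout can be imitated in plain text with a few lines of Python. This is only a sketch of the idea: the function name `order_rows`, the text rendering and the sample values are hypothetical, and placing up-regulated genes on the left is an assumption made for this example.

```python
def order_rows(genes):
    """Render one text row per type for a list of genes.

    `genes` is a list of (ratio, set_of_types) tuples. The list is sorted
    by ratio, highest first, so up-regulated genes land on the left
    (an assumption about orientation). In each row, '|' marks a gene that
    has the row's type and '.' marks a gene that does not.
    """
    ordered = sorted(genes, key=lambda gene: gene[0], reverse=True)
    type_names = sorted({t for _, types in ordered for t in types})
    return {
        name: "".join("|" if name in types else "." for _, types in ordered)
        for name in type_names
    }


# Placeholder ratios and type assignments, deliberately given unsorted.
genes = [
    (-0.5, {"Transmembrane"}),
    (3.0, {"Cell Adhesion"}),
    (-2.5, {"Transmembrane"}),
    (2.5, {"Cell Adhesion", "Transmembrane"}),
    (1.0, set()),
    (-1.5, set()),
]

rows = order_rows(genes)
for name, marks in rows.items():
    print(f"{marks}  {name}")
```

In this toy data the "Cell Adhesion" marks bunch together at one end of the row, while the "Transmembrane" marks are spread across it, which is the contrast the tool is meant to reveal.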
NOTE: One important note about DRAGON Order
is that its output can be quite large. Therefore, entering smaller lists of
genes is a good idea and I have disabled the upload feature for DRAGON Order
to limit the size of the gene lists entered.
DRAGON Paths
DRAGON Paths is still under heavy development. However, the concept is relatively straightforward. The Kyoto Encyclopedia of Genes and Genomes (KEGG) database diagrams cellular pathways. DRAGON Paths takes data that include expression values and LocusLink gene identifiers and maps the location and expression value of your genes onto any of the KEGG cellular pathway diagrams. Here is an example of a DRAGON Paths output.
Each green box in the diagram represents a human protein (see the KEGG web site for more information on the configuration of the KEGG pathway diagrams). The numbers in the boxes are the EC numbers for the proteins. A red or green circle is placed in the upper left corner of the box for each protein that is found in your data. The color of the circle indicates expression level: red means up-regulated, green means down-regulated. Each green box is hyperlinked to the LocusLink entry for that protein.
Copyright 2000 Kennedy Krieger Institute