Homeobox Genes DataBase
Data Mining
Data mining, or Knowledge Discovery
in Databases, is the nontrivial extraction of implicit, previously
unknown, and potentially useful information from data, or the search for
relationships and global patterns that exist in large databases.
The data mining process has been broken down into
selection, processing, transformation, data mining, interpretation, and
evaluation. The final two components should involve visual and global representations
that give a gestalt view of the data [Klevecz, 1999].
A gene-expression database should ideally provide
comparisons of spatial regions and patterns, and this implies that data
must be mapped into a common spatial and temporal framework [Bard et al.,
1998]. New visualisation techniques are essential for this.
Data Mining in HOX Pro
Cluster analysis of promoter regions.
Homeobox genes contain conserved sites other than the homeoboxes
themselves, including sites in non-coding regions. Since the number
of these 300-800 bp promoter regions is comparable
to the number of sequenced homeoboxes, phylogenetic analysis of
these regions is as promising as that of the 183-bp homeoboxes. Cluster
analysis revealed several gene groups on the basis of
sequence conservation in the promoter regions. Two compact groups are
clearly seen: homologs of the Drosophila Deformed gene (Dfd) and
homologs of the Sex combs reduced gene (Scr).
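To make the idea of grouping promoter regions by sequence conservation concrete, here is a minimal sketch. It is not the method used in HOX Pro, which this page does not specify. It assumes an alignment-free distance (Jaccard distance on shared k-mers) and simple single-linkage grouping under a cutoff. The function names, the k-mer length, and the threshold are all hypothetical choices for illustration.

```python
from itertools import combinations


def kmer_set(seq, k=4):
    """Return the set of overlapping k-mers in a DNA sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(a, b, k=4):
    """1 - shared/total k-mers; 0.0 means identical k-mer content."""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    union = ka | kb
    if not union:
        return 0.0
    return 1.0 - len(ka & kb) / len(union)


def single_linkage_clusters(seqs, threshold, k=4):
    """Group sequences linked by any pairwise distance below threshold.

    seqs maps a name to a sequence. Returns a sorted list of sorted
    name lists. The threshold is an assumption to be tuned on real data.
    """
    parent = {name: name for name in seqs}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if jaccard_distance(seqs[a], seqs[b], k) < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())
```

Called on a dictionary of named promoter sequences, `single_linkage_clusters(promoters, threshold=0.5)` returns the groups as lists of names. For real 300-800 bp promoter regions, an alignment-based distance and a tree-building method would normally replace this shortcut.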
Similarity of Hox RA-Responsive Enhancers to Alu Sequences.
We collect cases of sequence similarity between documented Hox
retinoic acid response elements (RAREs) and known Alu motifs.
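One simple way to screen for such similarity is an ungapped sliding-window comparison of a short element against a longer sequence on both strands. The sketch below illustrates that idea only. It is not the procedure used to build this collection, and the function names are hypothetical. Real comparisons would normally use a proper alignment tool and a statistical significance measure.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def best_ungapped_match(query, target):
    """Slide query along target on both strands, without gaps.

    Returns (identity, offset, strand) for the best-scoring window,
    where identity is the fraction of matching positions, offset is
    the 0-based start in target, and strand is "+" or "-".
    """
    query, target = query.upper(), target.upper()
    if not query or len(query) > len(target):
        raise ValueError("query must be non-empty and no longer than target")
    best = (0.0, 0, "+")
    for strand, q in (("+", query), ("-", reverse_complement(query))):
        for offset in range(len(target) - len(q) + 1):
            window = target[offset:offset + len(q)]
            identity = sum(a == b for a, b in zip(q, window)) / len(q)
            if identity > best[0]:
                best = (identity, offset, strand)
    return best
```

Here `query` would be a documented RARE and `target` an Alu consensus or family member. Both must be supplied by the user, since no sequences are given in this text.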