Progress on Deletion Mapping

In a collaborative effort among ten project labs, selected ESTs are being mapped to chromosome locations defined by deletion-line breakpoints and other cytogenetic stocks. Each lab returns its mapping data to a central repository. Coordinators then review the data chromosome by chromosome to remove spurious and erroneous locations.

Deletion lines - A set of 101 deletion lines was selected for the mapping project. These provide an average of thirteen deletions per chromosome, resulting in bins averaging 10 cM. In addition, 45 cytogenetic lines, comprising 21 nullisomic-tetrasomic and 24 ditelosomic lines, are used to assist in the mapping process.

Mapping probes - The source of mapping probes is the Unigene Set, which consists of one member selected from each contig (the EST toward the 5' end of the contig is preferred) plus all singletons that failed to assemble into contigs. After undesirable unigenes such as retroelements, organellar sequences, and rRNA sequences were removed, the list of probe candidates was advanced to probe validation. Probes were validated first using the 5' sequence to confirm their original identity, and then using the 3' sequence to remove duplicates. Any two ESTs with sequence similarity higher than 90% over at least 100 bases are considered duplicates. The validated, nonduplicated ESTs were then sent to the ten mapping labs for mapping.
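The duplicate rule above can be illustrated with a minimal sketch. It assumes that pairwise alignment results are already available as (query, subject, percent identity, aligned length) records, and it treats percent identity as the measure of "sequence similarity". The names Hit, is_duplicate and remove_duplicates are hypothetical, and the greedy keep-first strategy is an assumption for illustration, not the project's documented procedure.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    """One pairwise alignment between two ESTs (hypothetical record layout)."""
    query: str
    subject: str
    percent_identity: float
    aligned_length: int


MIN_IDENTITY = 90.0  # similarity must be strictly higher than 90%
MIN_LENGTH = 100     # over at least 100 bases


def is_duplicate(hit: Hit) -> bool:
    """Apply the stated rule: >90% similarity over >=100 bases."""
    return (
        hit.query != hit.subject
        and hit.percent_identity > MIN_IDENTITY
        and hit.aligned_length >= MIN_LENGTH
    )


def remove_duplicates(est_ids: Iterable[str], hits: Iterable[Hit]) -> List[str]:
    """Keep the first EST of each duplicate pair (assumed greedy strategy)."""
    dup_pairs = {frozenset((h.query, h.subject)) for h in hits if is_duplicate(h)}
    kept: List[str] = []
    for est in est_ids:
        # Skip this EST if it duplicates one that has already been kept.
        if any(frozenset((est, k)) in dup_pairs for k in kept):
            continue
        kept.append(est)
    return kept
```

In this sketch a pair at exactly 90% similarity, or one aligned over fewer than 100 bases, is not treated as a duplicate, which follows the wording of the rule literally.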

An automated protocol was developed for large-scale probe screening and validation. The protocol uses a MySQL-based relational database that holds all relevant probe candidate information. Perl scripts were written to update the validation results in the database.
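The general shape of such a protocol might look like the sketch below. It is not the project's schema or code: it uses Python with an in-memory SQLite database as a stand-in for the MySQL database and Perl scripts described above, and the table name probe_candidate, its columns, and all function names are hypothetical.

```python
import sqlite3
from typing import Iterable, List


def create_probe_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical probe table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE probe_candidate (
            est_id             TEXT PRIMARY KEY,
            identity_confirmed INTEGER,  -- NULL until the 5' check is done
            is_duplicate       INTEGER   -- NULL until the 3' check is done
        )
        """
    )
    return conn


def add_candidates(conn: sqlite3.Connection, est_ids: Iterable[str]) -> None:
    """Load probe candidates that have not been validated yet."""
    conn.executemany(
        "INSERT INTO probe_candidate (est_id) VALUES (?)",
        [(est_id,) for est_id in est_ids],
    )
    conn.commit()


def record_validation(conn: sqlite3.Connection, est_id: str,
                      identity_confirmed: bool, is_duplicate: bool) -> int:
    """Store validation results for one EST; return the number of rows updated."""
    cur = conn.execute(
        "UPDATE probe_candidate "
        "SET identity_confirmed = ?, is_duplicate = ? WHERE est_id = ?",
        (int(identity_confirmed), int(is_duplicate), est_id),
    )
    conn.commit()
    return cur.rowcount


def validated_probes(conn: sqlite3.Connection) -> List[str]:
    """List ESTs that passed the identity check and are not duplicates."""
    rows = conn.execute(
        "SELECT est_id FROM probe_candidate "
        "WHERE identity_confirmed = 1 AND is_duplicate = 0 ORDER BY est_id"
    )
    return [row[0] for row in rows]
```

Keeping the two checks as separate columns mirrors the two validation steps (identity first, duplicates second) and leaves unvalidated candidates distinguishable from failed ones.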

Results - All mapping data are publicly accessible and have been confirmed by the mapping coordinators.

As of February 2, 2004, mapping results have been reported for 8241 ESTs. This number (less duplicate ESTs sent as controls) includes 7873 unique ESTs, of which 7027 have been mapped to a specific location. After full review by the mapping coordinators, 6426 ESTs (18,785 loci) are confirmed for publication.