Assemblies

A special working group (Olin Anderson, Travis Banks, David Marshall, Nick Tinker and Steve Wanamaker) has been formed to advise on and carry out the clustering. The group has a deadline of June 30, 2002, to produce a solution to the assembly problem. A separate mailing list serves the group and anyone who wants to follow its progress; to join, please email Dave Matthews (matthews@greengenes.cit.cornell.edu).

It is envisaged that the contigs will be updated with newly acquired sequences every 4-6 months. There are two ways to do this: perform a complete new assembly, or take the previous assembly and slot the new sequences into it. The latter approach has two advantages: the contig nomenclature remains the same from one assembly to the next, and it should require less computing power than a complete new assembly.
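
The incremental option can be illustrated with a minimal Python sketch. It is not the working group's method: the function name incremental_update, the find_contig callback (standing in for whatever sequence comparison decides that an EST belongs to a contig) and the toy EST identifiers are all hypothetical. The sketch only shows why contig names stay stable: existing contigs are extended in place and never renamed. It ignores harder cases, such as a new EST that bridges two existing contigs.

```python
from typing import Callable, Dict, List, Optional, Tuple


def incremental_update(
    contigs: Dict[str, List[str]],
    new_ests: List[str],
    find_contig: Callable[[str], Optional[str]],
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Slot new ESTs into existing contigs without renaming any contig.

    `find_contig` is a hypothetical callback that returns the name of the
    matching contig for an EST, or None when no contig matches.
    """
    # Copy the membership lists so the previous assembly is left untouched.
    updated = {name: list(members) for name, members in contigs.items()}
    unplaced: List[str] = []
    for est in new_ests:
        name = find_contig(est)
        if name is not None and name in updated:
            updated[name].append(est)
        else:
            # Unmatched ESTs are set aside; they would need to be clustered
            # among themselves (or kept as singletons) in a later step.
            unplaced.append(est)
    return updated, unplaced


# Toy usage with made-up identifiers.
previous = {"Contig1": ["est_a", "est_b"], "Contig2": ["est_c"]}
matches = {"est_d": "Contig1", "est_e": None}
updated, unplaced = incremental_update(previous, ["est_d", "est_e"], matches.get)
print(updated)
print(unplaced)
```

Because only the new ESTs are compared against the existing contigs, the amount of work grows with the size of the update rather than with the size of the whole collection, which is the second advantage mentioned above.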

Status:

Assembly #3, 12 Dec 2002
Date: Mon, 10 Feb 2003 11:58:25 -0500
From: "Travis Banks" <tbanks@agr.gc.ca>

number of ESTs - 409765
number of contigs - 39813
number of singletons - 50116
number of contigs with depth 8 or more - 9346
number of contigs with depth 20 or more - 3336

This build was done using uicluster2 to precluster and CAP3 to assemble.

===========
Assembly #2, 28 Oct 2002
Date: Mon, 28 Oct 2002 14:43:10 -0500
From: "Travis Banks" <tbanks@agr.gc.ca>

number of ESTs - 198695
number of contigs - 22135
number of singletons - 28225
number of contigs with depth 8 or more - 5604

This build was done at 90% identity and 40 basepair overlap.
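
As a rough illustration of what the two thresholds mean, the Python sketch below tests whether an overlap between two reads is at least 40 bases long and at least 90% identical. It is a simplification under stated assumptions: the function name overlap_passes is hypothetical, the two strings are assumed to be the already-aligned overlapping regions, and gaps are not handled, whereas real assemblers score gapped alignments.

```python
def overlap_passes(
    aligned_a: str,
    aligned_b: str,
    min_overlap: int = 40,
    min_identity_percent: float = 90.0,
) -> bool:
    """Return True if an ungapped overlap meets the length and identity cutoffs."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned regions must have the same length")
    length = len(aligned_a)
    if length < min_overlap:
        return False
    # Count positions where both reads carry the same base.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y)
    # Compare as matches * 100 >= percent * length to avoid a division.
    return matches * 100 >= min_identity_percent * length


# Toy usage: a 40-base overlap with 4 mismatches is exactly 90% identical.
read_a = "ACGT" * 10
read_b = "TCGT" * 4 + "ACGT" * 6
print(overlap_passes(read_a, read_b))
```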

===========
Assembly #1, 21 Jun 2002
Date: Fri, 21 Jun 2002 17:12:27 -0400
From: "Daryl Somers" <SomersD@agr.gc.ca>
Subject: Wheat EST assembly is done !!

number of ESTs - 186257
number of contigs - 23395
number of singletons - 26489
number of contigs with depth 7 or more - 5343

The assembly was done using Paracel software - clustering with PCP version
2.2.5, assembly with CAP4 2.2.5.  All parameters for clustering and
assembly were default values except for CAP4, which had an overlap length
of 40 and a percent identity of 95%.
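
The counts reported for the three assemblies can be compared with a little arithmetic. The Python sketch below uses only the figures listed above and assumes that the reported number of ESTs includes the singletons, so that ESTs placed in contigs = ESTs - singletons. That assumption is not stated in the reports, and the class and property names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssemblyStats:
    label: str
    ests: int
    contigs: int
    singletons: int

    @property
    def ests_in_contigs(self) -> int:
        # Assumes singletons are counted within the EST total.
        return self.ests - self.singletons

    @property
    def unique_sequences(self) -> int:
        # Contigs plus singletons: the size of the non-redundant set.
        return self.contigs + self.singletons

    @property
    def mean_ests_per_contig(self) -> float:
        return self.ests_in_contigs / self.contigs


ASSEMBLIES = [
    AssemblyStats("Assembly #1", ests=186257, contigs=23395, singletons=26489),
    AssemblyStats("Assembly #2", ests=198695, contigs=22135, singletons=28225),
    AssemblyStats("Assembly #3", ests=409765, contigs=39813, singletons=50116),
]

for a in ASSEMBLIES:
    print(
        f"{a.label}: {a.ests_in_contigs} ESTs in contigs, "
        f"{a.unique_sequences} unique sequences, "
        f"{a.mean_ests_per_contig:.2f} ESTs per contig on average"
    )
```

Note that the three builds used different software and cutoffs, so differences in these derived figures reflect the parameters as well as the growth of the EST collection.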

