NSF Wheat Genomics Project (DBI-9975989)
“The Structure and Function of the Expressed Portion of the Wheat Genomes”

Advisory Committee Report based on Project meeting held
8 January 2000, San Diego

The Advisory Committee (AC) is impressed with the overall enthusiasm and commitment of the investigators and the speed with which the project has started.

A. Scientific goals

1. Libraries.
The AC found strong competency in library creation but felt that every opportunity should be explored to access high-throughput methods established elsewhere, e.g., Dupont. It was recommended that Tim Close and Henry Nguyen contact Chris Hainey, at Dupont, to help with details of cloning procedure, vector strains, and normalization methods.

Choice of libraries appeared to be arbitrary. More clarity and focus (on reproductive biology) would be desirable. A library selection strategy should be in place.

The AC noted that measures were defined in the proposal for assessing library quality, but no numerical figures were presented. For example, the size of insert was mentioned, but no specific size was given. The measures should be defined at this stage.

It is unclear what criteria would be used to determine the depth of sequencing for each library. There should be defined criteria for this.
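As an illustration only, one possible form such a criterion could take is a novelty-rate cutoff: stop sequencing a library once the fraction of new ESTs falling into previously unseen clusters drops below a chosen threshold. The sketch below is not taken from the proposal. The function names, the window size, and the 0.2 threshold are all hypothetical, and it assumes each EST has already been assigned a cluster identifier by some upstream clustering step.

```python
def novelty_rate_by_window(cluster_ids, window=100):
    """Return (n_sequenced, fraction_new) for each complete window of ESTs.

    cluster_ids: cluster label of each EST, in the order sequenced
                 (assumed to come from an upstream clustering step).
    window:      number of ESTs per window (hypothetical default).
    """
    seen = set()
    rates = []
    new_in_window = 0
    for i, cid in enumerate(cluster_ids, start=1):
        if cid not in seen:
            # First time this cluster appears: count the EST as novel.
            seen.add(cid)
            new_in_window += 1
        if i % window == 0:
            rates.append((i, new_in_window / window))
            new_in_window = 0
    return rates


def should_stop(rates, min_novelty=0.2):
    """True when the most recent window fell below the novelty threshold.

    min_novelty is a hypothetical cutoff, not a value from the project.
    """
    return bool(rates) and rates[-1][1] < min_novelty
```

A project would need to choose the window and threshold per library, since normalized and non-normalized libraries are likely to show different redundancy curves.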

2. Sequencing.
Capacity for sequencing was impressive; three ABI3700s are now available. Consideration should be given to building in flexibility for using labs in addition to the primary Albany lab. This is a training issue as well, since it would expose more people to large-scale sequencing experience.

3. Mapping.
The AC felt that connection to biological goals should be integral to mapping goals. Prioritization of the list of target sequences should not be exclusively based on identification of singletons. The AC felt that the group should not underestimate the large number of sequenced clones that will be generated, so initial priorities (biologically based) would be beneficial.

It would be inappropriate to use a two-stage screening step in the mapping procedure that would require involvement of organizations external to this project to achieve mapping goals.

This is clearly an exciting program with immense potential. However, participants should not underestimate the need for strong coordination of mapping effort with particular emphasis on quality control and data management, storage, and retrieval. Such coordination should have priority.

4. Functional genomics.
This area is evolving rapidly. Although focus is now on microarrays, other technologies may emerge in the future. The AC suggests that investigators remain flexible to adoption of appropriate technology. It is of paramount importance to have good experimental systems in place to provide a biological context to any technology that is utilized.

5. Bioinformatics.
Knowing the competence of the investigators and their commitment to long-term bioinformatics projects, the AC feels that the importance of this aspect to this project is well understood by them. The scale of the project, however, will require either the development of new procedures or the integration of external resources into the process. These are both time-consuming and personnel-consuming, with the personnel being from different backgrounds (computing and biology). As with many NSF projects, the bioinformatics appears to be funded only to the level of a service function, which will make many of the exciting, but technically challenging, parts of the project difficult to accomplish.

The informatics aspects as described or inferred include the following: handling EST sequencing data from a large-scale project (100,000 ESTs); the annotation and deposit of this information into public repositories in a timely fashion; making the raw trace (or equivalent) data available via an FTP site; the development of EST contigs; the design of primers for mapping; maintaining the newly derived map information; integration of new map information with existing genetic and physical map data; the development of a relational database management system (RDBMS); the continued support of the existing ACeDB-based database; the integration of the two inherently different database designs through an application layer; and the development of a common, seamless interface to both information sets.

This seems to be a lot of work for the number of individuals involved. Moreover, the overall success of the project is tied to the solution of some of these problems. The recommendation is to focus primarily on the development of sequence handling capacity, in part by borrowing from existing NSF projects. Send representatives to laboratories doing this type of work, borrow software, do whatever is necessary to keep control of the data flow. The second focus must be on techniques of high-throughput contiging and primer design to support the mapping effort. While a relational model will certainly be of use to the community in the future, the development of such a model as an early priority will consume too many resources. Developing an integration layer between relational and object modeled databases is also a daunting task. It is felt that it might be better to focus on the utilization of the existing ACe models, at least until the high-throughput aspects are well in hand. The ACe models continue to serve the community well, and there is little to be gained early on by the conversion to a relational format.

6. Evolution.
Though no report was presented, the questions posed in Objective 6 of the proposal may need to be reviewed in light of new data on wheat genome organization.

B. Bioinformatics

The AC was impressed with the willingness of the project leads and PIs to address difficult bioinformatics issues. There is a clear understanding of the need to extend systems beyond the current capabilities, which is laudable. The AC recommends keeping the early focus on basic raw data handling needs: making raw data available to the community as trace (or equivalent) files, and using existing public and specialized crop databases to distribute derived information. This would seem to be sufficient.

C. Recruitment and diversity.

The AC was impressed with the overall strategy, enthusiasm, and ingenuity being adopted to recruit individuals from diverse backgrounds. They are conscious that aspects of the project are repetitive, so it is important to have strategies in place to keep staff motivated and interested. They think that N. Lapitan’s plan for side projects to hold interest of staff is good and desirable, but it needs careful management to ensure project goals are met.

Training.
Having individual groups involved in multiple aspects of the project will improve the training that students and staff receive. An important outcome of the project is acquisition of skills in high demand, and this will underpin future use of the information and knowledge that will emerge from this initiative.

D. Intellectual Property issues

The impact of this public program globally should not be underestimated and should be used to ensure that other projects make their information available in a timely fashion.

The AC welcomed enthusiasm for publications, but would encourage the group to establish authorship guidelines and to evaluate the guidelines from other large-scale projects.

The release path for information to emerge from this project appears to be well defined and appropriate.

E. Measures of impact

Information and knowledge management emerging from this program will be the major metric for progress. There should be close monitoring of contacts, requests, and utilization of information and materials to facilitate reporting and evaluation.

This work will provide numerous opportunities for quality publications, a platform for further grant applications, routes for international collaboration, and a catalyst for cooperation with the private sector. The need for coordination with activities in Europe and Australia was highlighted; such coordination would enhance the impact of the program.

F. Management structure

The management group has a strong track record of achievement and is highly respected in the scientific community. The AC did not perceive problems and was impressed with the tone of the first meeting and discussion and with the mechanisms to ensure uninhibited communication. The group’s decision-making process appears to be transparent.

This is an exciting program with great promise. The AC hopes these comments will be useful.