Open position in Informatics/Bioinformatics

Crop Development Centre
University of Saskatchewan
College of Agriculture and Bioresources

September 7, 2012

A two-year Engineer position in Informatics/Bioinformatics:

This position is funded by a research grant administered by the
University of Saskatchewan under the direction of Dr. C. Pozniak, Crop
Development Centre, College of Agriculture and Bioresources.

Background

In support of the international effort (IWGSC - http://www.wheatgenome.org/)
to obtain a reference sequence of the bread wheat genome and to provide
plant communities dealing with large and complex genomes with a
versatile, easy-to-use online automated tool for annotation, we have
developed the TriAnnot pipeline (http://www.clermont.inra.fr/triannot).
Its modular architecture allows for the annotation and masking of
transposable elements, the structural and functional annotation of
protein-coding genes with an evidence-based quality indexing, and for
identification of conserved non-coding sequences and molecular
markers. The TriAnnot pipeline is parallelized on a 712-CPU computing
cluster that can run a 1-Gb sequence annotation in 26 hours. It is
accessible through a web interface for small-scale analyses (< 100
sequences of 3 Mb max) or through a server for large-scale annotations
(thousands of scaffolds). The performance of TriAnnot was evaluated in
terms of sensitivity, specificity, and general fitness using curated
reference sequence sets from rice and wheat. In less than 8 h, TriAnnot
was able to predict more than 83% of the 3,748 CDS from rice chromosome 1
with a fitness of 67.4%. On a set of 12 reference Mb-sized contigs
from wheat chromosome 3B, TriAnnot predicted and annotated 93.3% of the
genes, of which 54% were perfectly identified in accordance with the
reference annotation. It also allowed the curation of 12 additional
genes based on new biological evidence, increasing the percentage of
perfect gene predictions to 63%. TriAnnot was designed specifically for
complex genomes like wheat, and systematically showed a higher fitness
than other annotation pipelines. TriAnnot is easily adaptable to the
annotation of other plant genomes, and should become a useful resource
for the annotation of large and complex genomes in the future.

However, there is a need for further improvements:
  - Build a virtual machine of the TriAnnot pipeline;
  - Improve the gene modeling process: make a definitive choice between
    two combiners, Eugene and Augustus;
  - Incorporate new, well trained ab initio gene prediction programs;
  - Develop tools to display splicing variants;
  - Implement a module for the discovery of SNP markers;
  - Develop and incorporate three new panels:
     o mapping the gene models on selected genome models;
     o mapping the gene models within a pre-calculated phylogeny tree;
     o mapping the gene models within known biological pathways;
  - Evaluate the possibility of developing a "bogas-like" interface for
    wheat manual curation (see
    http://bioinformatics.psb.ugent.be/webtools/bogas/).

Building a virtual machine is a high priority as the improved TriAnnot
pipeline will be installed as a mirror site to improve the informatics
infrastructure in Canada. To this end, we will develop a
collaboration with iPlant to install the TriAnnot virtual machine
(TriAnnot-VM) within the iPlant cyber infrastructure located in the
USA. Collaborations are already underway with the CNRS at Roscoff
(C. Caron) and the IBCP at Lyon (C. Blanchet) to work on these
aspects. The engineer hired for this position will be in contact with
these different centers to build the virtual machine.

The development of the "bogas-like" interface will be made in
close collaboration with the VIB in Ghent, Belgium.

Research Groups

The candidate will work under the direction of Dr. C. Pozniak,
University of Saskatchewan.

The successful candidate will spend the first year as a Visiting
Scientist in the group "Structure, Function and Evolution of the Wheat
Genome" led by C. Feuillet and located at the UMR INRA-UBP GDEC in
Clermont-Ferrand (France). She/he will further improve the TriAnnot
pipeline (see above) and build the TriAnnot-VM in direct collaboration
with P. Leroy, the research engineer in charge of the TriAnnot project
in the group. She/he will interact with researchers and graduate
students who use the pipeline. She/he will also interact with
other groups involved in genetics and bioinformatics, and with other
international laboratories developing similar activities.

In the second year of the project, the candidate will work at the Crop
Development Centre, University of Saskatchewan, Canada in the research
group led by Dr. Pozniak. She/he will install and test the TriAnnot-VM
within the informatics infrastructure available at the University of
Saskatchewan and use it to annotate the sequence of wheat chromosomes
being sequenced by that group. The second year of the project will
involve interaction with iPlant to install the TriAnnot-VM within the
iPlant cyber infrastructure in the USA. The successful candidate will
also assist with the writing and preparation of manuscripts for
publication.

Qualifications Required

The candidate must have a minimum of a Master's degree in computer
science or a related field with at least three years of bioinformatics
experience. The candidate should have a good knowledge of Perl, Python
and object-oriented programming. Previous experience with
object-oriented programming in Perl is recommended. The candidate must
be familiar with parallel programming (cluster) and cloud computing.
Research experience in plant genomics is desirable.

Applicants must show clear evidence of productivity, creativity and
independence; have excellent communication skills; be proficient in
both reading and writing English; and be able to work both
independently and within a team. Candidates must have demonstrated
report-writing and publication capabilities.

Salary: A competitive salary is available with full benefits (pension,
health & dental, life insurance). Salary will be commensurate with
experience.

The University of Saskatchewan is committed to employment equity and
applications from women, aboriginal peoples, visible minorities and
persons with disabilities are encouraged. All qualified individuals are
invited to apply, but Canadian citizens and permanent residents will be
given priority.

Contacts

Parties interested in the position should provide by email a letter of
application, Curriculum Vitae with a publication list, a brief statement
of research and interests, and the names and contact information for at
least three references, to:

Chris Barker
Genome Prairie
Senior Project Manager
Room 2E80, Agriculture Building
Department of Plant Sciences
University of Saskatchewan
Saskatoon, Saskatchewan
S7N 5A8
cbarker@genomeprairie.ca

Review of applications will begin September 15th, 2012 and continue
until the position is filled.  All applicants will be notified shortly
thereafter.

The preferred start date for the position is November 1st, 2012.