CDC Influenza Data Science Fellowship
*Applications will be reviewed on a rolling basis.
A research opportunity is currently available within the Influenza Division (ID), in the National Center for Immunization and Respiratory Diseases (NCIRD) at the Centers for Disease Control and Prevention (CDC) located in Atlanta, Georgia. ID performs domestic and international influenza surveillance, aids in influenza diagnosis, and is a leader in influenza virus characterization and control efforts.
The Influenza Division's Office of the Director (ID/OD) Informatics Group performs cross-cutting work spanning multiple areas, including the development of computational methods, data integration and enrichment, statistical analysis, laboratory automation, system management, special R&D projects, and technical consultation. The Informatics Group seeks to empower and strengthen informatics efforts throughout the division via innovation, collaboration, service, and the management of shared analytics resources. Additionally, the group serves as a forum for informatics knowledge exchange, helping to align division-wide computational efforts.
Under the guidance of a mentor, the participant will have opportunities to engage in a variety of data enrichment projects using techniques such as anomaly detection and/or feature selection. Statistical and evolutionary research projects are also available, such as augmenting systems for evolutionary group annotation, passage-mutation model refinement, and the high-throughput inference of virus reassortants. The participant will be trained to work with distributed database technologies, such as Apache Hive and Apache Impala, using Structured Query Language (SQL). There will also be opportunities to learn distributed scientific computation with Apache Spark or Univa Grid Engine. The participant will gain experience using Git and GitLab and will receive training to enhance programming skills. Finally, the participant will learn about influenza, its molecular classifications and antigenic characterization, as well as the data ecosystem necessary for a world-class influenza surveillance system.
The qualified candidate should have received a master's degree in one of the relevant fields. The degree must have been received within five years of the appointment start date.
- Skill in at least one programming or scripting language (C/C++, Java, Perl, Python, R, Scala)
- Working knowledge of Linux command-line operations (Bash, etc.)
- Training or experience in using Structured Query Language (SQL)
- Experience in one of the following is strongly preferred: (a) probability estimation, (b) machine learning, (c) inferential statistics, or (d) Bayesian graphical models
If you have questions, send an email to ORISE.CDC.NCIRD@orau.org. Please include the reference code for this opportunity (CDC-ID-2020-0016) in your email.