Data Quality Analyst – GEMINI
GEMINI (www.geminimedicine.ca) is a unique big data platform in the Canadian healthcare landscape that uses advanced methods and analytics to extract and standardize data captured in hospital electronic health records. GEMINI currently operates at more than 30 Ontario hospitals, where we have collected data on over 2 million hospital admissions. The GEMINI data platform supports the Ontario General Medicine Quality Improvement Network (GeMQIN), a provincial network led by Ontario Health to improve care for general medicine hospital patients. GEMINI is a collaborative data and analytics platform for all Ontario hospitals to accelerate research and quality improvement, leading to excellent hospital care.
We are looking to hire three Data Quality Analysts to join the GEMINI team and help develop a truly unique data platform in the Canadian healthcare landscape. The Data Quality Analysts will support and write reproducible code for GEMINI’s data pipeline, including conducting quality checks; formatting, validating, and standardizing data from multiple sources; supporting administrative tasks; performing descriptive analyses; and optimizing and retrieving data from our internal databases.
We are looking for candidates with excellent data processing skills, demonstrated experience working with large datasets and databases, programming skills, a strong understanding of the Canadian healthcare environment, and an aptitude for data visualization. You will be joining a dynamic, fun, and mission-driven team of clinicians, researchers, and quality improvement experts. Strong interpersonal and communication skills will be an important asset in this role.
This is a one-year contract position with the possibility of renewal.
Duties and Responsibilities:
Data Processing and Analysis (80%)
Conduct pre-processing checks on raw data collected from hospitals to align with GEMINI’s data reference model
Perform quality control checks following standard operating procedures to ensure data is complete and of high quality
Lead the data processing workflow
Pre-process raw data, including formatting and merging data from multiple sources, and assess overall data quality
Standardize and harmonize datasets to prepare them for loading into the database
Write HTML/PDF/Microsoft Word reports summarizing the analysis
Create and modify R code as required
Efficiently identify and correct syntax and programming logic errors in R code
Respond to ad-hoc requests from the GEMINI team
Design and maintain real-time ETL pipelines to extract, transform, and load streaming data from multiple sources
Process and transform healthcare data formats including HL7 messages and JSON files
Data Administration (20%)
Maintain and assist with writing/updating standard operating procedures and other documents to support data processing work
Engage with stakeholders to clarify data quality issues
Notify manager and data scientists of any issues and errors with code/analysis
Write documentation and contribute to project code repositories so that all work is reproducible
Attend code review meetings to discuss any problems/solutions to project code bases
Support GEMINI’s data operation needs
Explain the capabilities and limitations of databases and information to researchers and other stakeholders
Qualifications:
Bachelor’s degree in Computer Science, Statistics/Biostatistics, or a related discipline
At least 2 years of experience with R
At least one year of experience with SQL
At least one year of experience working in a Linux/Unix environment
Excellent attention to detail and proven ability to learn new skills
Experience working independently and as part of a team
Excellent organizational skills to manage multiple tasks in a timely manner
Demonstrated ability to adapt to and manage changing priorities
Excellent written and oral communication skills
Experience working with and manipulating large datasets including merging and analyzing data from multiple sources
Proficient with MS Office software (Word, Excel, PowerPoint, Outlook, etc.)
Experience with healthcare data is an asset
Experience with Git is an asset
Experience with cloud-based database platforms (AWS RDS/DynamoDB, Google Cloud SQL/Firestore, Azure SQL Database) is an asset
Unity Health Toronto is committed to creating an accessible and inclusive organization. We strive to provide a recruitment process that is barrier-free and in compliance with the Accessibility for Ontarians with Disabilities Act (AODA) and the Ontario Human Rights Code. We understand that you may require an accommodation at any stage of the recruitment process. When you are contacted, please inform the Talent Acquisition Specialist and we will work with you to meet your accommodation needs. We want to emphasize that all accommodation requests are handled with the utmost confidentiality, respecting your privacy and dignity.