Data Engineer II - Windreich Department of Artificial Intelligence & Human Health

Mount Sinai Health System
Full-time
On-site
New York, New York, United States
Artificial Intelligence (AI)
Description

The Windreich Department of Artificial Intelligence and Human Health (AIHH) at the Icahn School of Medicine at Mount Sinai leads a pioneering collaboration with the Charles Bronfman Institute for Personalized Medicine (CBIPM) and the Hasso Plattner Institute for Digital Health (HPIMS) to transform the future of health care. Together, these three entities form a unified ecosystem that integrates advanced AI-driven diagnostics, large-scale clinical research, and cutting-edge digital innovation to accelerate scientific discovery and deliver meaningful improvements in patient care and health systems worldwide.

We are looking for a Data Engineer to integrate healthcare data from various sources into a unified platform. Your primary focus will be helping to develop and manage ETL pipelines for our flagship data science platform.

Responsibilities
  • Design, develop, and maintain efficient and reliable ETL (Extract, Transform, Load) pipelines.
  • Integrate diverse data sources, ensuring high-quality and consistent data ingestion.
  • Perform data wrangling and preprocessing to prepare data for analysis and machine learning.
  • Collaborate with data scientists, software engineers, and other stakeholders to understand data requirements and deliver solutions.
  • Implement best practices for data management, including data governance, data quality, and data security.
  • Monitor and optimize data workflows to ensure performance and scalability.
  • Troubleshoot and resolve data-related issues promptly.
  • Work with healthcare data, including PHI (Protected Health Information).


Qualifications
  • Bachelor's degree in Computer Science or a related discipline; advanced degree preferred.
  • 4+ years of relevant professional development experience.
  • Strong programming skills in Python, with experience in libraries such as Pandas and NumPy.
  • Strong logical thinking and a solid understanding of software engineering fundamentals.
  • Knowledge of SQL and experience with relational databases.
  • Familiarity with data integration tools and technologies (e.g., Apache Airflow, Talend, Informatica) is a plus.
  • Excellent attention to detail and problem-solving skills.
  • Ability to learn quickly and adapt to new technologies and methodologies.
  • Strong communication and teamwork skills.
  • Must be willing to pivot with the needs of the organization and to learn and grow skill sets.
  • Must be able to understand user requirements from specifications provided by business or non-technical personnel.
  • Must be able to quickly develop competency with new technologies and software.

Additional Optional Requirements:

  • Experience building data ingestion workflows
  • Familiarity with cloud platforms such as Azure, GCP, or AWS
  • Experience working with healthcare data standards and technologies (e.g., DICOM, FHIR)
  • Experience with OHDSI/OMOP
  • Experience designing data models
  • Experience working with anonymization / de-identification technologies
