Job Description
Req Id:  39164
Job Title:  Senior Data Science Engineer
City:  West Lafayette
Job Description: 

Job Summary

The Network for Computational Nanotechnology (NCN) is charged with providing the computational, modeling, and data infrastructure to support the creation of digital twins (DTs) by the SMART USA Institute.  These services will support DTs for the Birck Nanotechnology Center (BNC) and other SMART USA partners in research, development, and education/workforce development (EWD) efforts.  The services involve collecting data at the source (e.g., BNC equipment and simulations), making the data AI-ready following FAIR principles, seamlessly connecting the data to AI models and visualizations to inform decision-making, and ultimately publishing, storing, and sharing the data.  Importantly, decision-making includes using real-time information from multiple sources to update digital twins and using their forecasting ability to provide feedback to experimentalists.  The work will leverage resources from Purdue's Rosen Center for Advanced Computing (RCAC) and nanoHUB.

The Senior Data Science Engineer will work with the NCN team to develop scalable data pipelines and modeling workflows, ensure data quality and governance, design data architectures, and implement ontologies and knowledge graphs.  The role will also prepare AI-ready datasets, optimize data workflows for analytics and dashboards, and support decision-making tools that leverage advanced computational methods.  The position will design, build, and maintain advanced data infrastructure supporting research, education, and workforce development initiatives.  This position will enable reliable data collection, storage, processing, and delivery for scientific, engineering, and AI/ML applications.  The position is expected to have a 3-year duration and be renewable.

What We're Looking For:

  • Bachelor's degree in Engineering, Computer Science, Physical Science, Data Science or a related field
  • Four or more years of experience in data engineering, database development, or data pipeline implementation, including:  building ETL/ELT pipelines that transform raw → cleaned → enriched data; database schema design and management for relational or non-relational systems; and implementing data quality, lineage, and governance frameworks
  • Consideration will be given to an equivalent combination of required education and related experience
  • Demonstrated experience with one or more relational database systems such as MySQL, PostgreSQL, or MS SQL Server
  • Strong programming skills in Python, SQL, and related data transformation tools
  • Experience with data architecture design, partitioning, indexing, and retention policies for performance and scalability
  • Familiarity with visualization/BI tools such as Tableau or Power BI
  • Ability to collaborate with domain experts to define metadata plans and integrate diverse data sources
  • Knowledge of data lifecycle management, archival processes, and data security principles
  • Ability to quickly understand new technology requirements and demonstrate skills learned
  • Excellent oral, written, and electronic communication skills, with strong analytical and troubleshooting abilities
  • Ability to multi-task on a variety of activities and work effectively on multiple deadline-driven tasks
  • Self-motivated with the ability to think and work independently
  • Demonstrated ability to work with others

What is Helpful:

  • Advanced degree in Engineering, Data Science, or Physical Sciences discipline
  • Experience designing and implementing ontologies, taxonomies, or knowledge graphs
  • Familiarity with AI/ML data preparation, including anomaly detection for process control data sets
  • Knowledge of FAIR (Findable, Accessible, Interoperable, Reusable) data principles
  • Background with cloud data platforms (AWS, Azure, GCP) or big data technologies (Spark, Hadoop)
  • Experience integrating agentic AI systems with controlled access to datasets and analytical tools
  • Ability to scope, evaluate and deploy commercial management or analytics solutions
  • Experience with large-scale scientific or engineering data workflows
  • Specialized skills such as big data technologies, dynamic web programming, or speculative/exploratory data-driven analysis

What We Want You To Know:

  • Purdue will not sponsor employment authorization for this position
  • A background check is required for employment in this position
  • FLSA: Exempt (Not eligible for overtime)
  • Retirement Eligibility: Defined Contribution Waiting Period
  • Purdue University is an EO/EA University. 
Posting Start Date:  10/7/25