Machine Learning Operations Engineer (Exp. with CI/CD, Kubernetes & Docker, container networking, OOD/OOP) (100% Remote) – Direct Hire or C2H
Location: Ann Arbor, MI 48106
Direct Hire - $130-$140K + 15% Bonus
Since our start in 2009, Conexess has established itself in 3 markets, employing more than 200 individuals nationwide. Operating in over 15 states, we serve a client base that ranges from Fortune 500/1000 companies to small and mid-sized companies, many of which use us exclusively because of our outstanding staffing track record.
Who We Are:
Conexess is a full-service staffing firm offering contract, contract-to-hire, and direct placements. We have a wide range of recruiting capabilities, extending from help desk technicians to CIOs. We are also capable of offering project-based work.
As a leader of innovation in the food and digital commerce space, our client is constantly testing new concepts, platforms, and technologies that drive outstanding consumer and employee experiences, which requires a disciplined data approach. This position will provide technical proficiency as a Machine Learning (ML) Operations Engineer focused on the development and integration of ML-enabled software.
The candidate should be an expert in Docker operations and containerization technologies, with knowledge of container orchestration tools such as Kubernetes, of optimizing Docker image construction for ML performance, and of CI/CD pipelines. They will also be responsible for managing and improving our ML application CI/CD pipelines to industry best practices, keeping up to date with the latest industry trends and technologies. This role functions as an internal resource within the Analytics & Insights (A&I) team, supporting Data Scientists in ML development work.
RESPONSIBILITIES AND DUTIES
(60%) Machine Learning Operations Support
- Work with internal clients and the Data Science team to solve problems using ML.
- Support ML software builds and optimize application containerization and orchestration with Docker and Kubernetes.
- Manage and improve existing automated CI/CD pipelines.
- Help perform load testing and tuning of AI/ML models and runtime components supporting our production workloads.
- Ensure ML microservices can access real-time data sources, designing options for data pipelines as well as for analytical packages from external vendors that the A&I data science team wants to use.
- Coordinate monitoring of ML/AI model execution performance and optimize reliability of ongoing processing.
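To give a flavor of the monitoring duty above, here is a minimal, stdlib-only Python sketch of tracking inference latency. All names (`LatencyMonitor`, `predict`) are hypothetical illustrations, not part of the client's actual stack:

```python
import time
import statistics
from collections import deque

class LatencyMonitor:
    """Rolling latency tracker for model inference calls (illustrative)."""

    def __init__(self, window: int = 1000):
        # keep only the most recent `window` samples
        self.samples = deque(maxlen=window)

    def record(self, seconds: float) -> None:
        self.samples.append(seconds)

    def timed(self, fn):
        """Decorator that records wall-clock latency of each call."""
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                self.record(time.perf_counter() - start)
        return wrapper

    def p95(self) -> float:
        """95th-percentile latency over the current window."""
        return statistics.quantiles(self.samples, n=20)[-1]

monitor = LatencyMonitor()

@monitor.timed
def predict(x: float) -> float:
    # stand-in for a real model inference call
    return x * 2.0

for i in range(100):
    predict(float(i))
print(f"p95 latency: {monitor.p95():.6f}s")
```

In practice this kind of signal would feed a production monitoring system rather than a print statement, but the shape of the work — instrument, aggregate, watch a percentile — is the same.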
(40%) Provide guidance and support / expand use of ML
- Work with internal clients and the Data Science team to solve problems
- Consult with the various data science teams to architect solutions, optimize software development, and deploy solutions into production environments.
- Work with development teams to determine requirements to ensure all tracking is in place for future analysis.
- Work as a bridge between data science and IT groups to support integration of AI/ML modeling into production environments.
- Create and execute test plans that help address questions from development teams and troubleshoot issues.
- Consult inter-departmentally on new product deployments and incremental improvements.
QUALIFICATIONS
- Master’s degree (or Bachelor’s degree with equivalent experience) in a quantitative science such as statistics, mathematics, computer science, engineering, etc.
- Must have the ability to work independently with minimal supervision.
- Participate in on-call rotations.
- Experience with continuous integration tools and continuous delivery pipelines (e.g., Jenkins, JFrog Artifactory)
- At least two years of experience with Kubernetes & Docker in a production environment.
- Experience with container networking on Docker.
- Experience with application deployment by using CI/CD.
- Experience with version control software and object-oriented design and coding (OOD/OOP)
- Experience with streaming data and real-time inference
- Comfortable creating and using APIs and integration into production environments
- Experience scaling a model to production load and monitoring
- Intermediate or Advanced Skills with at least one scripting and/or programming language, e.g. Python, R
- Experience operating Linux- and Windows-based systems.
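As a rough illustration of the API and real-time-inference skills listed above, the following self-contained Python sketch stands up a tiny prediction endpoint and calls it. Everything here (the `/predict` route, the `predict` function) is a hypothetical example, not the client's actual service:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

def predict(features):
    """Stand-in for a real model; returns a trivial additive score."""
    return sum(features)

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # read the JSON request body and score it
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"score": predict(payload["features"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence default per-request logging

# port 0 asks the OS for any free port
server = HTTPServer(("127.0.0.1", 0), PredictHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

req = Request(
    f"http://127.0.0.1:{server.server_port}/predict",
    data=json.dumps({"features": [1.0, 2.5]}).encode(),
    headers={"Content-Type": "application/json"},
)
with urlopen(req) as resp:
    result = json.loads(resp.read())
print(result)  # {'score': 3.5}
server.shutdown()
```

A production version of this would be a containerized microservice behind an API gateway, deployed through the CI/CD pipeline and scaled with Kubernetes; the sketch only shows the request/response contract.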
The following are a plus, but not required:
- Experience training models with at least one deep learning (neural networks) framework (TensorFlow, CNTK, Torch, Caffe, etc.)
- Experience working with Big Data technologies such as HDFS, Hive, Spark, S3 storage, etc.
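For the model-training experience mentioned above, real work would use a framework such as TensorFlow or PyTorch; purely as an illustration of the underlying idea, here is a minimal gradient-descent loop in plain Python fitting a single weight:

```python
def train(xs, ys, lr=0.01, epochs=200):
    """Fit y ≈ w * x by gradient descent on mean squared error."""
    w = 0.0  # single weight
    for _ in range(epochs):
        # gradient of MSE with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # true relationship: y = 2x
w = train(xs, ys)
print(round(w, 3))  # converges to roughly 2.0
```

Deep learning frameworks automate exactly this loop (differentiation, parameter updates, batching) at scale, which is why framework experience is listed as a plus.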