AI/ML Platform Engineer
Surge IT
Location
🇺🇸 United States
Type
Full-time
Salary
Undisclosed
Posted
2d ago
Job Description
We are seeking a hands-on Senior AI/ML Platform Engineer with 10+ years of IT experience and a strong track record of building, deploying, and operationalizing AI/ML systems. The ideal candidate is a doer who excels in implementing scalable, production-grade AI/ML solutions across cloud environments.
Core Requirements
• 10+ years of IT/engineering experience
• 3+ years of hands-on AI/ML development experience
• 4+ years working directly with AWS services (Lambda, EC2, S3, DynamoDB, IoT Core, API Gateway, Fargate/ECS)
• Proven experience deploying ML systems into production environments
• Strong coding skills and ability to build systems end-to-end
Key Skills
• Deep Learning frameworks: TensorFlow, PyTorch, Keras
• LLMs, prompt engineering, NLP pipelines
• Python and Java as primary languages; strong engineering fundamentals
• FastAPI and microservices for ML inference
• Infrastructure-as-Code (Terraform)
• Kubernetes and Docker for scalable ML workloads
• Distributed/cloud systems design with AWS
• Edge-to-cloud system integration experience
• Hands-on build experience (not just design/architecture)
Responsibilities
• Build and deploy production-grade ML/AI pipelines and services
• Develop LLM-powered and NLP-driven applications
• Write, optimize, and maintain high-quality Python-based ML code
• Implement scalable infrastructure using Terraform, AWS, and Kubernetes
• Build FastAPI-based inference services and cloud APIs
• Collaborate with cross-functional engineering teams to deliver high-impact systems
• Troubleshoot, optimize, and own systems end-to-end as a hands-on engineer
Preferred Skills
• Experience with distributed systems and microservices
• Strong understanding of ML model lifecycle, deployment patterns, and operational monitoring