Machine Learning Engineer, Specialist | Malvern, PA, USA
Vanguard Group, Inc.
Location
🇺🇸 Malvern, United States
Type
Full-time
Salary
Undisclosed
Posted
4d ago
Job Description
The successful candidate will:
• Bring strong proficiency and understanding of the AWS cloud platform and services, including (but not limited to) AWS SageMaker, AWS Lambda, Amazon S3, Step Functions, EMR, Glue, and other services supporting machine learning platforms.
• Demonstrate an excellent understanding of the machine learning development lifecycle, including data engineering, exploratory data analysis, modeling, and ML implementation and operations.
• Design and implement scalable machine learning solutions; develop predictive models using advanced deep learning and statistical techniques; collaborate with data science and engineering teams to integrate ML solutions; and perform rigorous model evaluation and optimization.
• Be proficient in software development and well-versed in developer tools such as Python, VS Code, and Jupyter Notebooks.
• Be passionate about advances in machine learning, with knowledge of supervised learning, reinforcement learning, deep learning, and GenAI.
• Demonstrate knowledge of AWS security practices, including IAM, S3 bucket policies, security groups, and VPCs.
• Understand best practices for model training, deployment, and operations, including hyperparameter optimization, model evaluation, and operationalizing ML solutions.
• Utilize popular Python frameworks such as TensorFlow, PySpark, PyTorch, and Pandas.
• Leverage software design patterns to develop modular, maintainable, and scalable code.
Responsibilities:
• Participate in end-to-end machine learning projects, from conception through deployment and ongoing support.
• Collaborate with data scientists to solve complex machine learning challenges, including supervised learning, reinforcement learning, deep learning, and GenAI.
• Work closely with methodology researchers to integrate insights and strategies that improve investor outcomes.
• Solve complex problems using multilayered datasets, enhance existing libraries, frameworks, and models, and collaborate with data analysts, data engineers, and architects to identify data distribution differences that affect model performance.
• Leverage data pipeline designs and support the development of data pipelines for model development.
• Use software tools to build data pipelines in distributed computing environments (e.g., PySpark, Glue ETL).
• Support the integration of model pipelines into production environments and develop an understanding of the SDLC for model production.
• Review pipeline designs, make data model changes as needed, and document and review design changes with data science teams.
• Support data discovery and automated ingestion for model development by analyzing raw data sources for data quality, applying business context, and addressing model development needs.
• Engage with internal stakeholders to understand business processes, develop hypotheses, structure requests, and translate