As a Principal Machine Learning Engineer at New Relic, you will play a crucial role in exploring new innovations and building proofs of concept to enhance our AI and machine learning solutions for optimizing and automating IT operations. You will be responsible for leading the evaluation of emerging technologies, identifying areas for improvement, and developing new features while ensuring the reliability and scalability of New Relic's AIOps platform.
What you'll do
Lead the exploration of emerging AI and machine learning technologies and develop proofs of concept to assess their potential impact on New Relic's AIOps platform.
Collaborate with cross-functional teams to gather requirements, prioritize features, and define technical solutions based on the latest innovations.
Design, develop, and maintain AIOps solutions that can monitor and analyze IT data in real time, predict incidents, and automate responses.
Develop and implement machine learning models for anomaly detection and root cause analysis, leveraging both supervised and unsupervised learning techniques.
Implement data processing pipelines that can handle petabytes of data in real time, leveraging technologies such as Apache Kafka, Apache Flink, and Apache Spark.
Develop and maintain custom integrations with third-party IT tools and systems to enhance the AIOps platform's capabilities.
Monitor the AIOps platform's performance and troubleshoot issues as they arise.
Contribute to the documentation and knowledge base to help other teams understand and use the AIOps platform effectively.
This role requires
A constantly evolving architecture, with hundreds of containerized services across multiple agile teams.
A hybrid datacenter/cloud environment ingesting over 200 petabytes of data per month, and accepting over 70 billion HTTP requests a day from our customers.
Java, Kafka, Istio, Kubernetes, MySQL, Go, and that's just the beginning of the technologies in our stack.
Bachelor's or Master's degree in Computer Science, Computer Engineering, Data Science, or a related field.
A minimum of 10 years of software or data engineering experience.
3+ years of experience designing and implementing AIOps solutions in a SaaS or cloud environment.
Extensive experience in LLMs, AI, and machine learning, with a deep understanding of relevant technologies and tools.
Demonstrated experience building AI proofs of concept.
Strong understanding of Observability and the challenges associated with monitoring complex distributed systems.
Experience with generative AI techniques, such as GANs and VAEs, and a track record of applying them in practical settings.
Passion for continuous learning and staying up-to-date with the latest trends and technologies in the AI and Observability industries.
Strong collaboration skills and ability to work effectively with cross-functional teams.
Proven track record of delivering results and meeting or exceeding business objectives.