Graphcore

Senior Machine Learning Engineer (Large Systems)

Posted 5 Days Ago
In-Office
3 Locations
Senior level
The Senior Machine Learning Engineer will develop and optimize AI models for Graphcore's hardware, focusing on performance at scale while collaborating with cross-functional teams.
The summary above was generated by AI
About Graphcore

Graphcore is one of the world’s leading innovators in Artificial Intelligence compute. 

It is developing hardware, software and systems infrastructure that will unlock the next generation of AI breakthroughs and power the widespread adoption of AI solutions across every industry. 

As part of the SoftBank Group, Graphcore is a member of an elite family of companies responsible for some of the world’s most transformative technologies. Together, they share a bold vision: to enable Artificial Super Intelligence and ensure its benefits are accessible to everyone.  

Graphcore’s teams are drawn from diverse backgrounds and bring a broad range of skills and perspectives. A melting pot of AI research specialists, silicon designers, software engineers and systems architects, Graphcore enjoys a culture of continuous learning and constant innovation. 

Job Summary

As a Senior Machine Learning Engineer in the Applied AI team at Graphcore, you will contribute to advancing AI technology by developing and optimising AI models tailored to our specialised hardware. You will work on large-scale systems where performance is critical to the success of our projects. Working closely with the Software Development and Research teams, you will play a critical role in identifying opportunities to innovate and differentiate Graphcore’s technology. We seek engineers with strong technical skills and an understanding of AI model implementation at scale who are eager to make a tangible impact in this rapidly evolving field.


The Team

The Applied AI team acts as a proxy for our customers: we need to understand the latest AI models, applications, and software to ensure that Graphcore’s technology works seamlessly with the AI ecosystem and at scale. We build reference applications, contribute to key software libraries (e.g. optimising kernels for efficiency on our hardware), and collaborate with the Research team to develop and publish novel ideas in domains such as efficient compute, model scaling, and distributed training and inference of AI models for multiple modalities and applications.
If you're excited about advancing the next generation of AI models on cutting-edge hardware, we’d love to hear from you!

Responsibilities and Duties

  • Implement the latest machine learning models and optimise them for performance and accuracy, scaling to thousands of accelerators.
  • Test and evaluate new internal software releases, provide feedback to software engineering teams, make necessary code fixes, and conduct code reviews.
  • Benchmark models and key ML techniques to identify performance bottlenecks and improve model efficiency.
  • Design and conduct experiments on novel AI methods, implement them and evaluate results.
  • Collaborate with Research, Software, and Product teams to define, build, and test Graphcore’s next generation of AI hardware.
  • Engage with the AI community and keep up with the latest developments in AI.

  

Candidate Profile

Essential:

  • Bachelor's, Master's or PhD degree, or equivalent experience, in Machine Learning, Computer Science, Maths, Data Science, or a related field.
  • Proficiency in deep learning frameworks like PyTorch/JAX.
  • Strong Python or C++ software development skills.
  • Expertise in deep learning from model training to optimisation and evaluation.
  • Experience in distributed training or inference of ML models across 64+ accelerators.
  • Capable of designing, executing and reporting from ML experiments.
  • Deep understanding of performance bottlenecks and how to overcome them.
  • Ability to move quickly in a dynamic environment.
  • Enjoyment of cross-functional work and collaborating with other teams.
  • Strong communicator - able to explain complex technical concepts to different audiences.

Desirable:

  • Experience in one or more of:
    • MLOps for Kubernetes-based clusters.
    • Building production systems with large language models.
    • Efficient computing based on low-precision arithmetic.
  • Experience writing C++/Triton/CUDA kernels for performance optimisation of ML models.
  • Familiarity with HPC systems and networking, including InfiniBand, NVLink and RoCE technologies.
  • Contributions to open-source projects or published research papers in relevant fields.
  • Knowledge of cloud computing platforms.
  • Keen to present, publish and deliver talks in the AI community.

Benefits

In addition to a competitive salary, Graphcore offers flexible working, a generous annual leave policy, private medical insurance and health cash plan, a dental plan, pension (matched up to 5%), life assurance and income protection. We have a generous parental leave policy and an employee assistance programme (which includes health, mental wellbeing, and bereavement support). We offer a range of healthy food and snacks at our central Bristol office and have our own barista bar!

We welcome people of different backgrounds and experiences; we’re committed to building an inclusive work environment that makes Graphcore a great home for everyone. We offer an equal opportunity process and understand that there are visible and invisible differences in all of us. We can provide a flexible approach to interview and encourage you to chat to us if you require any reasonable adjustments.

Applicants for this position must hold the right to work in the UK. Unfortunately, we are unable at this time to provide visa sponsorship or support for visa applications.

Top Skills

PyTorch, JAX, Python, C++
