Qureos

System Development Engineer, AGI Infrastructure

India

DESCRIPTION


The Artificial General Intelligence (AGI) team is looking for passionate, talented, and inventive engineers to play a pivotal role in the development and maintenance of industry-leading multi-modal and multi-lingual large language models (LLMs). The AGI team's mission is to leverage our hyper-scalable, general-purpose large model training and inference systems to develop and deploy cutting-edge sensory AI foundational models that revolutionize machine perception, interpretation, and interaction with humans and with the physical world.

We live the “Work Hard. Have Fun. Make History” value by keeping a strong focus on sharing learning experiences from the front line with the development teams, so the options for people on the team are vast. If you like mastering a domain and going deep, we need you. If you can juggle three tasks and coordinate with multiple people in the heat of an incident, we need you. If you love the benefits of process and methodical improvement, you will love it here. If you want to keep your head down, headphones on, and bash out code to support the team, we have a spot for you too.

You will be required to deeply understand technology landscapes and evaluate the use of new technologies. You will be influential within your team and work with peers and senior leaders to define and revise the standards for operational excellence across systems. You will consistently tackle abstract issues that span multiple functional areas and drive your team to push for improvements that can scale across other teams, services, and platforms.

Key job responsibilities
Identify performance bottlenecks in compute infrastructure and propose solutions to address them.
Mentor junior members of the team to deliver results.
Provide support for cluster and node management, ensuring smooth operation of GenAI infrastructure.
Participate in design and code reviews and identify bottlenecks.
Troubleshoot and research root causes thoroughly and fix defects.
Continuously improve and automate our cluster/capacity/maintenance upgrades.
Candidates should be well-versed in core AWS services, including EC2, Lambda, and EKS.
Candidates should be experienced in setting up and managing CI/CD pipelines using tools such as AWS CodePipeline, GitHub Actions, or similar platforms.
Familiarity with Infrastructure as Code (IaC) tools like AWS CloudFormation, Terraform, or the AWS CDK is a valuable asset. An understanding of networking concepts like VPCs, subnets, and security groups, as well as experience configuring load balancers and Route 53, is also desirable.
Candidates should have hands-on experience with Kubernetes.

BASIC QUALIFICATIONS

  • 3+ years of administrative experience in networking, storage systems, and operating systems, plus hands-on systems engineering experience
  • Experience programming with at least one modern language such as Python, Ruby, Golang, Java, C++, C#, Rust
  • Experience with Linux/Unix
  • Experience with CI/CD pipelines and build processes

PREFERRED QUALIFICATIONS

  • Experience with distributed systems at scale

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.
