Qureos

Cloud Network Engineer II

United States

The High Performance Computing and Artificial Intelligence (HPC and AI) team is focused on building the next-generation distributed artificial intelligence supercomputer. Our goal is to enable breakthroughs in artificial intelligence by delivering unmatched computational power, scalability, and reliability. We design and develop advanced infrastructure that supports high-performance model training at scale, laying the groundwork for innovations that expand the boundaries of what artificial intelligence can achieve.

We are seeking a Cloud Network Engineer II who is passionate about designing and developing the infrastructure that powers large-scale artificial intelligence and high-performance computing systems. In this role, you will contribute to the design, deployment, and operation of network infrastructure, automation workflows, observability frameworks, and performance optimization systems. These components are essential for achieving ultra-low latency, high throughput, and efficient data movement at petabyte scale in distributed workloads.

As a Cloud Network Engineer II on the HPC and AI Infrastructure team, you will work at the intersection of artificial intelligence supercomputing and large-scale networking. Your contributions will directly impact the reliability and performance of distributed clusters, leveraging high-speed fabrics such as Ethernet and InfiniBand, and accelerated compute platforms including NVIDIA and AMD graphics processing units. This is a unique opportunity to help build the network infrastructure that ensures speed, reliability, and high availability at exascale levels, while collaborating across hardware, infrastructure, and platform teams.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Responsibilities

  • Network Design & Implementation: Architect and deploy high-throughput, low-latency physical network topologies (e.g., Clos, FatTree) using technologies such as InfiniBand and Ethernet to support AI model training and HPC workloads.
  • Infrastructure Automation: Develop and maintain automation frameworks for provisioning, validating, and monitoring physical network infrastructure at scale, ensuring consistency and reliability across data centres.
  • Operational Readiness: Serve as a Designated Responsible Individual (DRI) for physical network systems—monitoring health, responding to incidents, performing root-cause analysis, and driving improvements in availability and observability.
  • Tooling & Instrumentation: Build and integrate tooling for telemetry, diagnostics, and performance tuning of physical network components, enabling real-time visibility into link health, congestion, and jitter.
  • Cross-Functional Collaboration: Partner with hardware engineering, data centre operations, and software-defined networking teams to ensure seamless integration of physical and logical network layers.
  • Documentation & Standards: Own the documentation of physical network designs, cabling standards, and deployment procedures. Lead design reviews and ensure alignment with compliance and safety standards.
  • Innovation & Research: Stay current with advancements in optical networking, high-speed interconnects, and AI/HPC fabric technologies. Evaluate and integrate emerging solutions to improve scalability, efficiency, and performance.

Qualifications

Required Qualifications:
  • Master's Degree in Electrical Engineering, Optical Engineering, Computer Science, Information Technology, or related field AND 1+ year(s) technical experience in network design, development, and automation
    • OR Bachelor's Degree in Electrical Engineering, Optical Engineering, Computer Science, Information Technology, or related field AND 2+ years technical experience in network design, development, and automation
    • OR equivalent experience.
  • 1+ year of experience designing, deploying, and supporting data center and backbone networks for distributed computing platforms such as artificial intelligence and machine learning clusters, high-performance computing systems, or hyperscale data centers
  • 1+ year of experience with network performance tuning (latency, jitter, throughput optimization) and hands-on experience with telemetry and observability tools for physical infrastructure
  • 1+ year of experience with optical networking, high-speed interconnects (e.g., InfiniBand, Ethernet, NVLink), and fabric orchestration in large-scale environments, as well as with network automation frameworks, structured cabling standards, and tools for link validation, diagnostics, and monitoring
Other Requirements:
  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
    • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.
Preferred Requirements:
  • Doctorate Degree in Electrical Engineering, Optical Engineering, Computer Science, Information Technology, or related field
    • OR Master's Degree in Electrical Engineering, Optical Engineering, Computer Science, Information Technology, or related field AND 3+ years technical experience in network design, development, and automation
    • OR Bachelor's Degree in Electrical Engineering, Optical Engineering, Computer Science, Information Technology, or related field AND 5+ years technical experience in network design, development, and automation
    • OR equivalent experience.
  • 1+ year(s) experience in high-performance networking environments, including:
    • Artificial Intelligence-specific networking technologies such as InfiniBand, Remote Direct Memory Access over Converged Ethernet (RoCE), and NVIDIA NVLink, including their physical deployment and performance characteristics
    • Artificial Intelligence accelerators (e.g., graphics processing units from NVIDIA or AMD, or tensor processing units) and their integration with physical networking infrastructure
    • Linux-based systems, including kernel-level networking, interface tuning, and low-level debugging of physical network issues
  • 1+ year(s) experience with telemetry and observability tools used to monitor physical network health, link performance, and congestion at scale
  • 8+ year(s) technical experience in network design, development, and automation, demonstrated by one of the following:
    • Direct experience in the field
    • 3+ years of experience with a Master’s Degree in one of the above fields
    • Practical experience or a Doctorate Degree in a related field
Cloud Network Engineering IC3 - The typical base pay range for this role across the U.S. is USD $100,600 - $199,000 per year. A different range applies to specific work locations within the San Francisco Bay Area and the New York City metropolitan area; the base pay range for this role in those locations is USD $131,400 - $215,400 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay

  • Microsoft will accept applications for the role until October 22, 2025.

#azurecorejobs
Microsoft is an equal opportunity employer. Consistent with applicable law, all qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.
