Infinite pl is a digital-led tech firm driven to become a digital logistics pioneer by harnessing the power of people, data, and platforms. We are enabled by in-house, external, network, and other investment capabilities, which we use to orchestrate and build innovative platforms that tackle complex problems within logistics and adjacent sectors.
Infinite pl’s mission is nothing short of a logistics revolution! We're here to enrich the experiences of governments, businesses, and residents around the world through cutting-edge digital solutions.
"We're not just players; we're game-changers."
-
We are seeking an experienced Operations Manager to lead the technical operations team responsible for running a mission-critical platform with a target availability of 99.99%. The role will ensure service stability, SLA achievement, ITSM process compliance, security and regulatory compliance, and continuous improvement across operations. The Operations Manager will own day-to-day service operations, major incident leadership, operational readiness, vendor coordination, and reporting to senior stakeholders.
-
Oversee stable operations for the platform against a 99.99% availability target.
-
Enforce and continuously improve ITSM processes (Incident, Problem, Change, Request, Release, Knowledge, CMDB).
-
Ensure SLA / SLO compliance, operational readiness, and performance reporting.
-
Maintain strong security posture and ensure adherence to applicable compliance requirements.
-
1) Service Operations Leadership
-
Oversee the platform operations team (NOC/Operations Engineers/SRE-like functions as applicable) to ensure reliable, secure, and high-performing services.
-
Maintain clear operating rhythms: daily ops reviews, weekly service health checks, monthly SLA reviews, and quarterly service improvement plans.
-
Drive on-call readiness, shift coverage, escalation paths, and decision-making during critical events.
-
2) ITSM Process Ownership & Compliance
-
Own and enforce ITSM processes end-to-end.
-
Audit operational adherence and drive corrective actions for non-compliance.
-
3) SLA, Availability, and Reliability Management
-
Ensure continuous tracking and achievement of SLA/SLO targets (availability, response time, resolution time, performance).
-
Manage availability and resilience practices: redundancy validation, capacity planning, proactive monitoring, and performance tuning.
-
Lead post-incident reviews and drive measurable improvements.
-
4) Security & Compliance
-
Partner with security teams to ensure:
-
Timely patching and remediation
-
Secure configuration baselines
-
Audit readiness and evidence collection
-
Incident response alignment and reporting
-
Enforce least privilege access and periodic access reviews.
-
5) Monitoring, Observability, and Operational Tooling
-
Ensure comprehensive monitoring and alerting coverage for infrastructure, applications, APIs, databases, integrations, and security events.
-
Ensure operational toolchain effectiveness (ITSM tool, monitoring, CI/CD visibility, CMDB, asset management).
-
6) Stakeholder & Vendor Management
-
Act as the primary operations interface for internal stakeholders and external partners/vendors.
-
Manage vendor SLAs and ensure effective collaboration for incident resolution, patching, upgrades, and service improvements.
-
Provide clear operational communications during incidents and planned maintenance.
-
7) Reporting & Governance
-
Produce weekly/monthly service reports including SLA performance, availability, incidents, trends, risks, and improvement actions.
-
Maintain an operational risk register and ensure mitigation plans are executed.
-
Present service health and improvement plans to leadership.
-
Bachelor’s degree in Computer Science, Information Systems, Engineering, or equivalent experience.
-
5+ years in IT operations / production support roles, with 2+ years leading teams for critical services.
-
Strong hands-on understanding of operating high-availability platforms (24/7 environments).
-
Proven experience implementing and running ITSM processes in production (ITIL-aligned).
-
Deep understanding of incident/problem/change management, operational readiness, and service governance.
-
Experience with cloud and modern platform operations (e.g., cloud infrastructure, APIs, containerized services) is preferred.
-
Ability to define, track, and improve operational KPIs and reliability metrics.
-
Strong stakeholder management, structured communication, and decision-making under pressure.
-
ITIL Foundation / ITIL Managing Professional (or equivalent ITSM certification)
-
ISO 27001 awareness/certification or security-related certifications
-
Cloud certification (GCP) is a plus
-
Full-time; includes on-call leadership and participation in major incident bridges as required.
Infinite pl - where innovation meets logistics, and the journey is Infinitely boundless!
Let's disrupt logistics together and explore infinite opportunities!
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.