The Digital Application Operations professional is accountable for the stability, availability, and performance of all digital applications through best‑in‑class proactive monitoring and structured operational controls. The role owns service availability and performance reporting, leads incident root cause analysis with strict remediation to prevent recurrence, and acts as a production gatekeeper to safeguard live environments. It continuously challenges sub‑optimal solutions, driving their transformation into sustainable, high‑performing services while managing digital partners and external stakeholders. The role also ensures timely closure of audit findings through strong governance and embraces automation and innovation (e.g., Agentic AI, RPA) to streamline and modernize operations.
Key Responsibilities:
Application Operations & Service Stability
- Ensure end‑to‑end operational stability, availability, and performance of digital applications (App, Web, APIs, CRM, ERP).
- Drive proactive service monitoring across application, database, infrastructure, and network layers using enterprise monitoring tools (e.g., Dynatrace, ManageEngine).
- Identify service degradation early, initiate corrective actions, and prevent impact to business services.
- Continuously assess operational health and highlight architectural or solution weaknesses impacting long‑term sustainability.
Monitoring, Observability & Continuous Improvement
- Establish best‑in‑class observability practices, defining meaningful service metrics, thresholds, and alerts aligned to business KPIs.
- Own continuous improvement initiatives to uplift monitoring maturity from reactive to predictive and proactive operations.
- Identify sub‑optimal or fragile solutions and drive their transformation into resilient, scalable, and high‑performing platforms.
- Promote operational excellence by embedding learnings from incidents, trends, and performance analysis into service enhancements.
Incident Management, RCA & Remediation
- Lead and coordinate response to major incidents, ensuring rapid containment, communication, and service restoration.
- Own post‑incident Root Cause Analysis (RCA), ensuring clear identification of root causes and contributing factors.
- Drive remediation actions to closure and enforce controls to eliminate repeated incidents.
- Track incident trends and systemic issues, escalating structural risks and recommending corrective measures.
Change, Release & Production Governance
- Act as the production gatekeeper for new changes, releases, and deployments to safeguard live environments.
- Establish and enforce well‑defined governance controls for production changes, including readiness validation and risk assessment.
- Oversee production deployments to ensure minimal disruption, rollback readiness, and service continuity.
- Continuously strengthen release and change controls to balance delivery velocity with operational stability.
Disaster Recovery, Resilience & Continuity Management
- Own Disaster Recovery (DR) plans, runbooks, and recovery procedures for all B2B services.
- Ensure DR strategies, RTOs, and RPOs are aligned with business criticality, regulatory expectations, and contractual obligations.
- Plan, coordinate, and execute regular DR tests, documenting results, gaps, and improvement actions.
- Ensure lessons learned from DR exercises and real incidents are embedded into operational processes and system designs.
- Maintain DR documentation and evidence to support audits, regulators, and risk committees.
Service Performance Reporting & Stakeholder Communication
- Own regular reporting on service availability, performance, incidents, and operational risks to management and business stakeholders.
- Translate technical operational metrics into clear, business‑relevant insights and actions.
- Maintain transparent communication during incidents and service issues, ensuring stakeholder confidence and alignment.
- Collaborate closely with digital, delivery, infrastructure, and security teams across the service lifecycle.
Vendor & External Stakeholder Management
- Manage operational relationships with external digital partners and technology vendors from a services perspective.
- Ensure vendor services meet contractual SLAs, quality expectations, and operational standards.
- Challenge vendors on root causes, remediation quality, and long‑term operational improvements.
- Align vendor delivery with the bureau’s operational, governance, and resilience expectations.
Audit, Risk & Compliance
- Own remediation of audit findings related to application operations, availability, and controls.
- Drive resolution with technical teams and vendors within agreed timelines.
- Transform operational processes by establishing preventive controls to avoid repeat audit observations.
- Maintain operational documentation, runbooks, escalation matrices, and evidentiary artifacts to support audits and compliance needs.
Innovation, Automation & Future‑Ready Operations
- Actively seek and drive adoption of innovative solutions to streamline and modernize application operations.
- Champion automation initiatives such as Agentic AI, AIOps, and RPA to reduce manual effort and improve responsiveness.
- Evaluate emerging operational technologies and lead their controlled adoption into production environments.
- Contribute to the evolution of operations toward intelligent, self‑healing, and scalable service models.
The responsibilities and duties outlined above are not exhaustive and may evolve over time. The line manager or senior management may assign additional tasks and responsibilities in line with organizational needs.
What We’re Looking For
Education:
- Bachelor’s degree in Computer Science or Engineering
- Master’s degree in Computer Science or Software Engineering
- Proficient in English
- Proficient in Arabic
- Proficiency in Russian, Turkish, and Central Asian languages is a plus
Experience:
- Minimum of 10 years of experience as a Software Engineer, System Engineer, Support Engineer, or Solution Architect
- Minimum of 5 years of managerial experience
- Good understanding of database systems (Azure Cosmos DB, Azure SQL Database)
- Good understanding of networking and firewalls (F5, Palo Alto, Fortinet, Azure Front Door, Azure Networks)
- Strong understanding of Azure cloud infrastructure (Azure IaaS, Azure Blob Storage, Azure Compute, Azure PaaS, Azure Web App, Azure Kubernetes Service (AKS), Azure Functions, Azure Logic Apps, Azure DevOps)
- Strong understanding of digital solutions, including iOS apps, Android apps, web/portal technologies, and API gateways (MuleSoft, Azure API Management)
- Good understanding of MS Dynamics CRM and ERP
Why Join Us?
- Opportunities for continuous learning, certifications, and professional development.
- Dedicated to building a diverse and inclusive workplace where everyone feels valued and empowered to be their authentic selves.
- Committed to building a stable, forward-thinking organization where innovation thrives.