About the Team

Our Operations function is the part of the Tech Team responsible for the reliability, performance, and security of the infrastructure and platforms that power GTI’s digital products. Spanning on-premises infrastructure, cloud environments, and managed services, the team ensures that GTI’s systems remain available, performant, and resilient across the UK and beyond.

Working closely with Product Engineers, Software Engineers, and wider Technology teams, the Operations team manages the databases, monitoring, networking, and cloud infrastructure that underpin GTI’s platforms. The team takes a proactive approach to observability, incident management, and continuous improvement, building the operational foundations that enable delivery teams to move with confidence and speed.

As hybrid cloud, AI, and automation continue to reshape infrastructure management, the team continuously evolves its tooling, processes, and capabilities to improve reliability, security, and efficiency across GTI’s growing platform estate.

About this Role

As an Operations Engineer, you will play a central role in managing and improving the databases, infrastructure and operational systems that keep GTI’s platforms running across on-premises and AWS cloud environments.

Initially, your main responsibility will be the administration and optimisation of GTI’s database estate, which spans MySQL, MSSQL, and PostgreSQL across both clustered on-premises deployments and AWS. Alongside this, you will manage cloud backups, monitoring infrastructure, and the broader operational environment, which includes Linux systems, Windows, and VMware virtualisation. Knowledge of AWS services would be beneficial, but an active interest in learning them is sufficient, as you will be trained in the role.

This is a hands-on technical role requiring solid database administration experience and good infrastructure engineering capability across Linux, Windows, scripting, backups, and monitoring. You will be required to respond to and take ownership of operational tasks and improvements across the stack, ensuring systems remain performant, secure, and well-monitored; this includes out-of-hours response to critical events. The role also requires out-of-hours work to update and patch critical systems, minimising impact on our customers.

The ideal candidate will be technically well-rounded, highly dependable, and genuinely motivated by the challenge of keeping complex, distributed systems running at their best. They will bring a proactive mindset to monitoring, incident response, and continuous improvement, and will be comfortable working across a broad range of infrastructure and cloud technologies.

Key Metrics of Success

High availability, security and performance of database systems
Reliable and tested backup coverage across all critical systems, with clear recovery objectives met
Comprehensive monitoring coverage across infrastructure, databases, and services
Stable, well-maintained VMware, Linux and Windows environments with low unplanned downtime
Proactive identification and resolution of infrastructure risks before they impact product availability

Database Reliability & Performance

Outcome: GTI’s database estate is reliable, performant, and secure across all environments.

High availability, security and performance maintained across MySQL, MSSQL, and PostgreSQL environments
Databases managed, monitored, and optimised
Database incidents identified and resolved quickly, minimising impact on product availability
Database configurations, schemas, and access controls kept secure and up to date

Backup & Disaster Recovery

Outcome: All critical systems and data are reliably backed up with recovery processes that are tested and trusted.

Backup coverage maintained across all critical databases and infrastructure components
Backup processes automated, monitored, and regularly validated through recovery testing
Clear recovery time and recovery point objectives defined and met for all critical systems
Backup and recovery documentation kept current and accessible

Monitoring & Observability

Outcome: Infrastructure, databases, and services are comprehensively monitored, enabling fast detection and response to issues.

Monitoring servers and services maintained and continuously improved
Monitoring coverage extended across new systems and services
Alerting thresholds and escalation paths kept accurate, reducing noise and improving signal quality
Operational visibility improved through dashboards, reporting, and proactive health checks

General Operations & Infrastructure Stability

Outcome: GTI’s Linux, Windows, VMware, and networking environments remain stable, secure, and well-maintained.

Linux and Windows systems patched, maintained, and performing reliably across the server estate
VMware virtualisation environment managed effectively, with capacity and performance kept in check

Innovation, Ownership & Continuous Improvement

Outcome: Operations Engineering continuously improves reliability, efficiency, and capability across GTI’s infrastructure estate.

Proactive identification of risks, inefficiencies, and opportunities for operational improvement
Automation of routine operational tasks to reduce manual effort and improve consistency
Infrastructure documentation kept accurate, comprehensive, and accessible to the wider team
Continuous development of technical skills and adoption of modern operational practices

Methodical, dependable, and highly attentive to detail
Proactive approach to monitoring, incident response, and continuous improvement
Comfortable managing a broad range of technologies across environments
Ownership mindset – takes responsibility for systems and follows through on problems
Comfortable working in fast-paced environments where priorities can shift
Genuine interest in cloud technologies, automation, and modern operational practices
Collaborative team player who contributes to a culture of reliability and continuous improvement

Work Experience, Knowledge & Skills

Database administration experience across MySQL, MSSQL, and PostgreSQL
Experience managing clustered database environments
Experience configuring and managing cloud backup solutions
Hands-on experience with monitoring platforms
Strong Linux systems administration skills
Experience with VMware virtualisation environments
Working knowledge of networking concepts including DNS, routing and firewalls
Knowledge of AWS services is beneficial but not essential; an active interest in learning them is sufficient, as you will be trained in the role
Experience working with scripting or automation tooling is advantageous
Familiarity with change management processes and operational documentation practices

Technologies you’ll be using

Ubuntu Linux
MySQL, MSSQL, PostgreSQL
Bash/Python scripting
VMware
Dell hardware
Cisco Meraki, pfSense
AWS Aurora/RDS, NLB/ALB, Route 53, DMS, SES, S3, Lambda, CloudFront
IaC: Puppet, Terraform, CloudFormation
Prometheus, Grafana, Loki, CheckMK
GitLab
MS Entra/AD
