The role is ideally suited to candidates who have at least 10 years of experience in modern enterprise IT infrastructure and are proficient in Agile principles and methodology.
Some scheduled out-of-hours work is required, along with availability to participate in an out-of-hours support rota.
Establishing automation of operations tasks
Implementation and customization of tools for application monitoring, performance, alerting & self-healing (e.g., Nagios, Zabbix, SolarWinds, AppDynamics)
Ensuring stable operations including backups, recovery & monitoring
Working with internal staff and 3rd party vendors to update and communicate environment maintenance schedules, refresh schedules, and planned outages
Working with internal staff and 3rd party vendors to troubleshoot and resolve system and application delivery issues
Working with globally distributed teams to ensure a 24 x 7 service
Ensuring infrastructure readiness for all environments (Testing, Production, Demo) based on business requirements
Application release deployments and upgrades (along with rollback) across all environments
Version control of software
Implementing Infrastructure as Code (IaC) where possible
Scripting and automating failure recovery processes
Continuous enhancement in application monitoring and self-healing capabilities
Assessing and documenting the existing architecture, highlighting any gaps or shortcomings identified
Working with the IT Operations Manager to define and document future non-functional requirements, including:
o Environments
o Performance
o Service response times
o Scalability
o Resiliency and redundancy including Database (replication, multi-instance, globalization etc.)
Acting as the technical touchpoint for all Level 2+ technology challenges, including those involving internal & external APIs
Job Requirements:
Traditional Unix/Linux hosting environment, preferably Ubuntu 20.
Cron, one or more programming/scripting languages
Web Services: Nginx, XML, Kafka (or similar messaging system)
Networking: Routers, NAT, DNS, Firewall rule sets
ELK stack / Papertrail or similar log management platform