Sr. Site Reliability Engineer
IT, Development Operations & Security (London-UK)
Our mission at Zynga is to connect the world through games by building games around core social experiences to deliver deep player engagement, organic acquisition and long-term retention. Our portfolio of games includes CSR Racing 2, FarmVille, Hit it Rich! Slots, Words With Friends and Zynga Poker.
This role is based at our NaturalMotion Games studio in London, renowned for making games that wow people. For the past three years NaturalMotion Studios has been named one of the Best Places to Work in the UK games industry by gamesindustry.biz.
Hiring Update: The safety of our candidates and team members is our top priority. During the COVID-19 pandemic, our workforce transitioned to working from home, with all interviewing and onboarding being conducted virtually until further notice.
As a Senior Site Reliability Engineer, you will be dedicated to strategizing, creating, and implementing reliability systems across our infrastructure. You will assist with day-to-day requests and offer expertise to junior engineers; special projects will also be a key part of this role. You will be expected to fix critical issues and respond to incidents on a rotating on-call basis.
Key Roles & Responsibilities
Monitoring & Incident Management:
- Improve the studio’s reliability through monitoring, rapid response, communication and coordination.
- Develop and manage the deployment architecture for the application, develop the monitoring architecture and implement monitoring agents, dashboards, escalations and alerts
- Routinely identify operational problems by observing and studying system architecture, functionality and performance results. Troubleshoot procedures with the studio architect, investigate surfaced issues and handle incidents
- Identify operational priorities by assessing operational objectives. Determine project objectives such as efficiency, cost savings, energy conservation, operator convenience, safety and environmental quality, and estimate relevance, time and costs
Development & Data Analysis:
- Develop operational solutions by defining, studying, estimating, and screening alternative solutions; calculating economics; determining impact on all systems
- Create new tools to facilitate automated monitoring of the studio’s operational environment
- Anticipate operational problems by studying operating targets, modes of operation, unit limitations; monitoring unit performance
- Improve operational quality results by studying, evaluating, and recommending process re-architecting; implementing changes; and contributing information and opinion to unit design and modification teams
- Provide operational management information by collecting, analysing, and summarizing operating and engineering data and trends
- Update job knowledge by participating in educational opportunities; reading professional publications; maintaining personal networks; participating in professional organisations
- Accomplish engineering and organisation mission by completing related results as needed
Required Skills & Experience
- Mastery of Linux systems and networking administration
- High-level understanding of Linux/Unix operating systems
- Strong systems engineering and troubleshooting skills
- Strong understanding of TCP/IP, SSL, DNS
- Ability to create and maintain technical documentation
- Good understanding of webserver configuration and management (Apache, Nginx)
- Knowledge of load balancing concepts
- Experience with service performance monitoring and automation
- Experience with systems and application security
- Ability to analyse and troubleshoot networking, performance, system and infrastructure issues using standard Linux/Unix tools
- Ability to administer networking firewalls
- Cloud management: AWS expertise (EC2, VPC, S3, RDS, Route 53 integration (DNS), CodeDeploy, IAM, ACM)
- Monitoring systems
  - Nagios, Sensu, Grafana, Munin, Check_MK, CloudWatch, and/or Datadog
  - Backend - Graphite, Prometheus, InfluxDB
  - Writing checks & scripts in various languages if needed
  - Log/Application level (Splunk, Elasticsearch, Apache)
- Ability to diagnose infrastructure as a whole
- Database fundamentals
  - Administer and maintain MySQL and other open source databases
  - Write and perform basic queries to evaluate database stability, integrity and performance
- Scripting: Shell scripting (Bash) and Python
- Configuration management
  - Chef, Ansible or Puppet
  - Provisioning - Packer, Terraform, CloudFormation
  - Containerisation - Docker Swarm, Kubernetes, or AWS ECS/EKS
  - CI/CD - Jenkins, AWS CI/CD
- Source code management (preferably Git or SVN)
Desirable Skills & Experience
- Basic knowledge of containers (e.g. Docker, Kubernetes)
- Python, PHP, Bash
- ITIL Standard Practices for Change Management
- NoSQL databases (Couchbase, MongoDB, etc.)
What do we offer?
- Competitive salary, discretionary annual bonus scheme and Zynga stock allowance
- Contributory pension scheme
- 25 days holiday, plus public holidays and Christmas shutdown
- Private medical care and healthcare cash plan
- Life insurance and critical illness insurance
- Discounted gym membership and free weekly yoga class
- Flexible working hours
- Free fruit, snacks & soft drinks provided daily, as well as free lunch on Fridays
- Annual season ticket loan and cycle to work scheme
- Summer and Christmas parties and Happy Hour every Friday
We are an equal opportunity employer and we are committed to building a diverse and talented workforce. We do not discriminate on the basis of race, sex, religion, colour, national origin, gender, gender identity, sexual orientation, age, marital status, veteran status, medical condition, disability, or any other class or characteristic protected by applicable law. We welcome job-seekers, players, employees, and partners from all backgrounds to join us!
We will consider all qualified job-seekers with criminal histories in a manner consistent with applicable law.
We are committed to providing reasonable accommodations to qualified individuals with physical or mental disabilities in order to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us at accommodationRequest@zynga.com to request an accommodation associated with your application for an open position.