Cloud Ops Manager
We're looking for an experienced Cloud Ops and Infrastructure manager to join our growing Dev & Cloud Ops group. The ideal candidate has hands-on experience developing operational tools, managing Production environments and supporting highly available, large-scale web applications. Most importantly, the right individual will be highly motivated, with a passion for delivering technical solutions in a fast-paced environment and automating everything possible. The ideal candidate also has experience managing multiple services on multiple cloud platforms (AWS, GCP, Azure and more), experience delivering, maintaining and externalising high SLAs, and a deep understanding of customer impact and care.
In this role you will:
- Work with cutting-edge technology in the cloud and hardware computing space
- Oversee and own our overall Production deployment, maintenance and enhancement processes and procedures, as well as availability, scalability and operability, while ensuring top-notch SLA tracking.
- Oversee a team of engineers focused on introducing new technologies and systems, deploying services to multiple cloud environments and regions, and pushing our Production excellence and offering to the next level.
- Solve technical problems, provide guidance to various teams (internal & external), and continually improve our systems, deployments, operations, and overall cloud activities.
- Collaborate in a DevOps environment, working closely with software developers, QA and E2E engineers, the Global SRE team, and our Global Support group.
- Lead with attention to detail, ensuring that major infrastructure projects are delivered on time and with urgency.
An ideal candidate would have:
- Fundamental knowledge of server and computer hardware and software
- Experience with cloud computing, virtualization and containers - multiple cloud vendors, Docker, K8S in Production and more
- Excellent problem-solving skills with a desire to take on responsibility
- Excellent written and verbal communication skills, with the ability to communicate technical issues to both technical and non-technical audiences
- Advanced networking knowledge - load balancers, firewalls, VPNs, TCP/IP - including troubleshooting and performance tuning
- Web/Application servers - Apache, Nginx, Tomcat and so on
- Monitoring systems and SLA tracking
- An everything-as-code approach, with at least 5 years of relevant work experience, including work on Linux systems using languages such as Java, Python and Bash
- 5+ years of experience managing a fast-paced production team, supporting infrastructure and application issues, deployments and maintenance
- Previous work experience in a fast-paced, rapidly growing start-up environment
- Experience using and administering software version control systems (SVN, Git, etc.)
- Familiarity with the Atlassian suite (Confluence, JIRA, etc.)
- Hands-on experience administering and supporting high-scale Production workloads
- Experience with automating systems maintenance at scale (thousands of servers)
- Solid understanding of current web and internet technologies such as Apache, Tomcat, Nginx, CDN, DNS, databases and so on
- Experience managing large-scale infrastructure with code, using tools such as Ansible and Terraform - a plus
- Experience with Helm Charts and deploying workloads into K8S production environments and systems - a great advantage