SRE

IBM

Operations

Bengaluru, Karnataka, India

Posted on Dec 21, 2024

Introduction
At IBM, work is more than a job – it’s a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you’ve never thought possible. Are you ready to lead in this new era of technology and solve some of the world’s most challenging problems? If so, let’s talk.

Your Role and Responsibilities

In this Site Reliability Engineer role, you will work closely with the entire IBM Cloud organization to maintain and operationally improve the IBM Cloud infrastructure. You will focus on the following key responsibilities:

Respond promptly to production issues and alerts 24×7
Execute changes in the production environment through automation
Implement and automate infrastructure solutions that support IBM Cloud products and services and reduce toil
Partner with other SRE teams and program managers to deliver mission-critical services to IBM Cloud
Build new tools to improve automated resolution of production issues
Monitor production systems and alerts
Support the compliance and security integrity of the environment
Continually improve systems and processes regarding automation and monitoring

Required Technical and Professional Expertise

Excellent written and verbal communication skills.
At least 3 years’ experience handling large production environments
Must be extremely comfortable using and navigating within a Linux environment
Ability to do low-level debugging and problem analysis by examining logs and running Unix commands
Must be proficient in writing and debugging scripts
3-5+ years of experience in virtualization technologies and automation/configuration management
- Automation and configuration management tools/solutions: Ansible, Python, Bash, Terraform, Go, etc. (at least one)
- Virtualization technologies: Citrix Xen Hypervisor (preferred), KVM (also preferred), libvirt, VMware vSphere, etc. (at least one)
- Monitoring technologies: Zabbix, Sysdig, Grafana, Nagios, Splunk, etc. (at least one)
Working knowledge of container technologies: Kubernetes, Docker, etc.
Flexibility to work in shifts to support production systems

Preferred Technical and Professional Expertise

Good experience with public cloud platforms and Kubernetes clusters, strong Linux skills for managing services across a microservices platform, and good SRE knowledge of cloud compute, storage, and network services
