IBM
At IBM Infrastructure & Technology, we design and operate the systems that keep the world running. From high-resiliency mainframes and hybrid cloud platforms to networking, automation, and site reliability, our teams ensure the performance, security, and scalability that clients and industries depend on every day. Working in Infrastructure & Technology means tackling complex challenges with curiosity and collaboration. You’ll work with diverse technologies and colleagues worldwide to deliver resilient, future-ready solutions that power innovation. With continuous learning, career growth, and a supportive culture, IBM provides the opportunities to build expertise and shape the infrastructure that drives progress.
As an Availability Manager for IBM Cloud, you will act as an Incident Commander, leading the real-time response to Critical Impacting Events (CIEs) affecting cloud services and engaging expertise from a variety of IBM areas (Infrastructure, Software, Consulting, etc.) and subcontractors/vendors. You will employ specific processes and tools to drive the response team to mitigate CIEs for IBM Cloud’s platforms and services.
Additionally, you will play the role of a Problem Manager. Throughout the root cause analysis (RCA) process lifecycle, you will provide formal executive-level post-mortem coaching feedback and Client Impact Report (CIR) updates to our Services. This requires extensive research ability, logical cause-and-effect analysis, and the tenacity to ask probing questions to uncover the underlying causes, trends, and weaknesses in the environment. We are a learning organization, meaning every situation gives us information that we will use to continually evolve our people, process, and technology/tooling strategy.
Your primary responsibilities will include:
• Lead and steer Incident Management driving to Service resolution.
o Perform situational appraisal, assess CIE severity/priority and appraise the extent of user impact with input from client-facing teams
o Mobilize and coordinate recovery efforts across necessary support functions, personnel and leadership to expedite end-to-end troubleshooting, fault domain isolation and urgent resolution
o Escalate and engage senior leaders to expedite handling and resolution
o Maintain a multi-tiered plan of action tracking time-bound deliverables and actions
o Maintain a heightened level of sensitivity for future / potential business impact and risk to customers
o Record and maintain incident record with recovery process noting areas of improvement
o Provide timely, concise and clear communications centered on clients, internal leadership and stakeholders
• Act as the Availability Manager focal point for a key customer account
o Acquire understanding of the customer’s business needs and challenges by engaging with them regularly and gathering.
o Develop advanced knowledge of the customer's contract specifics and customizations in order to adapt service delivery accordingly.
o Build a customer partnership and consultancy relationship, and engage directly on high-priority incidents.
• Train, coach, and review proper Problem Management with the problem owners.
o Identify areas of improvement for problem owners to target problem resolution and identify additional areas that could reduce the overall time to resolution.
o Utilize tooling and technical knowledge to assure services and components are designed and delivered to meet their availability targets.
o Provide a holistic view of the cloud environment and make recommendations to improve overall service.
o Identify and/or lead Service Improvement Programs (SIP) for chronic conditions
o Maintain focus on time-bound deliverables and actions
• Perform incident and alerting trend and pattern analysis, collaborating across teams to proactively detect and address emerging instability across the platform and services to drive holistic, preventative stability improvements.
• Focus on individual/team objectives and development of professional effectiveness.
• Lead strategic areas of importance to the service team.
• Recognized as incident and problem management thought leaders and subject matter experts
• Enterprise incident command and control.
• Understanding of industry methodologies (5 Whys Root Cause Methodology, Failure Modes and Effects Analysis, Kepner-Tregoe, etc.)
• Fundamental and/or working knowledge of Cloud technologies
• Knowledge of and experience working with any number of enterprise technologies including but not limited to Compute (Server/OS), Database, Network, Storage, Middleware, Perimeter Security (Firewall, VPN, Host/Application Security)
• Working knowledge of and experience with ServiceNow.
• ITIL V4 proficient.
Soft skills / abilities required for you to be successful in this role include:
• Critical Thinking, Problem Solving, Active Listening and Deductive Reasoning
• Leadership – Capacity, Capability and Competency (“Leaders inspire others to take action”)
• Command and Control presence
• Ability to “command the room” in a professional manner
• Ability and confidence to act decisively and take constructive feedback onboard
• Exercise influence over others across various levels of the organization (manage up, down and across)
• Ability to multi-task effectively and make sound judgments in a dynamic and high impact setting
• Capable of constructively challenging assumptions and information that does not reflect accurately on the situation at hand
• Excellent phone / video presence and written / verbal communication skills
• Strong relationship management and client-centric mindset
• Ability to work on-call rotations during business hours and, occasionally, on weekends, holidays and after hours.
• Fluent verbal and written communication in English
• Professional working proficiency in verbal and written French
• ITIL V4 certification
• Kepner-Tregoe certification
• Working knowledge of Financial Services
In a world where technology never stands still, we understand that dedication to our clients’ success, innovation that matters, and trust and personal responsibility in all our relationships live in what we do as IBMers as we strive to be the catalyst that makes the world work better.
Being an IBMer means you’ll be able to learn and develop yourself and your career. You’ll be encouraged to be courageous and experiment every day, all whilst having continuous trust and support in an environment where everyone can thrive, whatever their personal or professional background.
Our IBMers are growth-minded, always staying curious, open to feedback, and learning new information and skills to constantly transform themselves and our company. They are trusted to provide ongoing feedback to help other IBMers grow, and to collaborate with colleagues using a team-focused approach that includes different perspectives to drive exceptional outcomes for our customers. The courage our IBMers have to make critical decisions every day is essential to IBM becoming the catalyst for progress: they embrace challenges with the resources they have to hand, bring a can-do attitude, and strive for an outcome-focused approach in everything they do.
Are you ready to be an IBMer?
IBM’s greatest invention is the IBMer. We believe that through the application of intelligence, reason and science, we can improve business, society and the human condition, bringing the power of an open hybrid cloud and AI strategy to life for our clients and partners around the world.
Restlessly reinventing since 1911, we are not only one of the largest corporate organizations in the world but also one of the biggest technology and consulting employers, with many of the Fortune 500 companies relying on the IBM Cloud to run their business.
At IBM, we pride ourselves on being an early adopter of artificial intelligence, quantum computing and blockchain. Now it’s time for you to join us on our journey to being a responsible technology innovator and a force for good in the world.
IBM is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, genetics, pregnancy, disability, neurodivergence, age, or other characteristics protected by the applicable law. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.
For additional information about location requirements, please discuss with the recruiter following submission of your application.