Want to work in a dynamic environment with the latest cloud technologies? Want to learn Splunk from the inside and grow your career in exciting ways? Splunk is looking for self-starting individuals to be a part of the Splunk Incident Response Team (SIRT). The SIRT manages incidents that affect the availability and performance of the Splunk platform and products for our customers globally. The SIRT is an always-on, always-active team that makes sure each of our customers has an outstanding experience. We’re looking for an Incident Commander to join our team in supporting and supervising our ever-expanding product range.
As a member of the Splunk Incident Response Team, you will be responsible for leading a cohesive response to high-profile, customer-impacting incidents. In this role, you will be part of a team of global incident commanders responsible for managing high-priority incidents from initial triage through to post-incident review forums. This is a senior role at Splunk requiring an individual who can take charge in high-stress situations and give direction to both customer personnel and Splunk engineers to drive expeditious resolution of incidents. We are looking for a natural leader with proven knowledge of incident management frameworks, a demonstrable understanding of distributed systems environments, and the ability to communicate clearly and effectively to technical and business audiences.
Responsibilities:
- Use the Splunk Incident Management Process to restore normal service operations as quickly as possible, thus reducing the Mean Time to Repair on business operations
- Assemble and lead the response team using strong methodical troubleshooting techniques
- Capture and document key events and milestones during the lifecycle of the incident and communicate status accordingly to internal and external audiences as required
- Set clear incident resolution objectives (exit criteria) and timings
- Establish accurate expectations from response teams to ensure customer satisfaction throughout the process
- Supervise and manage incidents fully to ensure accurate information is captured
- Own Incident Commander responsibilities, contribute to post-incident reviews, and follow through with action plans assigned to you
- Coordinate with global peers to hand off active incidents using the follow-the-sun principle
- Maintain an eye for Continuous Service Improvement programs that drive more efficiency in People, Process, and Technology to improve the Customer Experience
Requirements:
This is an opportunity for candidates who have some incident management experience, are excited about the technology, and want to be part of a global team. You will progress your career in the Incident Management space with the support and tools to succeed.
- You have a bachelor’s degree and 2+ years of major incident response and management experience, or equivalent work experience
- You have a clear understanding of the ITIL Incident framework
- You can think outside the box and work on multiple tasks simultaneously while dynamically prioritizing based on changing conditions
- Ability to work cross-functionally and to influence and execute across groups
- You enjoy problem solving and analyzing global-scale distributed systems
- You have outstanding interpersonal and communication skills
- You remain calm and collected in stressful situations, such as a major service outage
- You are willing to work a 4x10 hour weekly shift model including weekends and holidays
- Negotiation, mediation, and conflict management skills
- Strong leadership skills