EY Outage Management Specialist in United States

Outage Management Specialist

Core Business Services

Requisition # UNI00BHE

Post Date May 08, 2018

Join our Core Business Services (CBS) team and you will help support the important business enablement functions that keep our organization running strong. As a CBS professional, you will work across teams to provide the knowledge, resources and tools that help EY deliver exceptional quality service to our clients, win in the marketplace and support EY’s growth and profitability. Major teams within CBS include Finance, Information Technology, Human Resources, Enterprise Support Services, Brand Marketing and Communications, Business Development, Knowledge and Risk Management.

With so many offerings, you have the opportunity to develop your career through a broad scope of engagements, mentoring and formal learning. That’s how we develop outstanding leaders who team to deliver on our promises to all of our stakeholders, and in so doing, play a critical role in building a better working world for our people, for our clients and for our communities. Sound interesting? Well, this is just the beginning. Because whenever you join, however long you stay, the exceptional EY experience lasts a lifetime.

Job Summary:

The Outage Specialist drives the restoration and recovery of high-priority, high-severity (P1Sx) incidents as a member of the Outage Management Center, which operates under the overall remit of Remote Support Services. The role manages, communicates and documents all restoration activities supporting P1Sx incidents. This includes escalation to other Information Technology (IT) teams, third-party vendors and other external teams for collaboration.

The position requires the consultant to be able to work Monday through Friday, 16:00 – 01:00 GMT.

The role is generally an individual contributor but carries high exposure within IT and as such influences management throughout IT Services, to guide and direct collaborative activities necessary to restore all services to Business As Usual (BAU). The role is generally guided by the Outage Shift Lead for Outage Management.

Essential Functions of the Job:

Drives the validation, declaration and remediation of P1Sx incidents assigned to the Outage Management Center:

  • Verifies the specific incident impact and, as required, the declaration of P1Sx in accordance with the Information Technology Infrastructure Library (ITIL) industry standards for major incident management within the IT infrastructure

  • Drives the major incident recovery process, identifying and initiating time-critical and complex remediation activities to restore impacted infrastructure services to BAU

  • Identifies, monitors and reinforces appropriate recovery actions within the agreed timelines of the Operational Level Agreements (OLAs) for P1Sx incidents

  • Escalates to other IT teams and external vendors as needed

  • Assembles appropriate parties and chairs the P1Sx chat and/or conference call, maintaining a sense of urgency throughout and acting as an escalation point to expedite service restoration

  • Regularly communicates outage status, from inception to completion, to all key stakeholders

  • Ensures appropriate updates are made to all knowledge repositories for changes or gaps identified during an outage

  • Coaches junior staff, providing guidance on their activities and constructive feedback as needed to maintain the highest quality standards in the team’s efforts

Analytical/Decision Making Responsibilities:

The role requires strong analytical acumen and a solution orientation to influence and guide coordinated efforts for P1Sx incidents within the IT infrastructure and to avoid potential financial and operational risks to IT and EY’s customers. The role calls for the critical thinking skills to take a systematic approach, consider all relevant data and make informed decisions regarding all aspects of the Outage Management function while driving quality in restorative activities. The role is expected to act independently, guided by the Priority and Severity criteria, to make timely and informed decisions aligned to the role’s remit or to escalate to management or others in IT as appropriate. The role’s insight and proposed actions should strive to be strategic, look beyond ‘stop gap’ measures and seek opportunities to drive improvements in the Outage Management model.

Knowledge and Skills Requirements:

  • Maintains advanced interpersonal and collaborative skills to engage strategically with peers and senior executives of the firm in cross-business discussions as part of incident recovery activities, and to build and maintain a solid network for effective teamwork and knowledge sharing. Uses these communication skills to challenge insightfully, to direct Major Incident Management and remediation processes, to propose credible options as solutions, and to position GCSM’s role in quality technical service management and IT’s infrastructure business support.

  • Projects well-defined consultative skills to conduct effective questioning, break down complex issues into core elements, formulate appropriate ideas and negotiate those ideas clearly and concisely to advance cooperative engagement by all levels of the organization, including senior and/or executive management.

  • Manifests a strong analytical and problem-solving ability to negotiate complex and conflicting issues in a timely manner, to handle ambiguity as well as multiple and shifting priorities, and to drive GCSM-related decisions that are both financially sound and operationally feasible.

  • Adapts communication style to the style of others. Develops rapport and remains calm under pressure. Builds strong relationships across all levels of a matrixed, geographically and culturally dispersed organization utilizing advanced oral and written English communication skills.

  • Maintains a broad knowledge of the regional and global infrastructures, application technology and IT’s overall technical operating environment to support proper recognition of major service issues and their impact, and to position improvement opportunities as part of GCSM’s technical support.

  • Possesses the proper time and incident management disciplines to direct and influence improvement initiatives and drive aligned activities throughout the incident life cycle for major incidents within the IT infrastructure.

  • Possesses advanced knowledge of Major Incident Management, and specifically of the Priority and Severity (PxSx) criteria, to recognize the impact and aligned criteria needed to scale an incident up to and including P1S1s (Major Incidents). Recognizes where criteria are met and leadership escalation is needed.

  • Possesses a solid working knowledge of the Information Technology Infrastructure Library (ITIL) to recognize appropriate aspects in the Incident, Problem, and Change processes and improvements. Supports the team in ITIL familiarity and/or certification for staff as needed.

  • Manifests appropriate leadership and influence management skills to guide aligned resources in major incident recovery activities, providing sound planning and the ability to coach others in technical processes and practices as part of individual or team mentoring as appropriate.

Supervision Responsibilities:

The role is an individual contributor but is expected to influence and direct ad hoc teams composed of cross-IT technical staff aligned to the recovery effort on a high-visibility incident. Aligned staff may be remote and/or working from home, which requires distance management skills across locations, cultures and time zones. The role itself is generally guided by the Outage Management Lead in GCSM.

Other Requirements:

The role may also require the periodic allocation of additional time on the job to support multiple demands and escalating issues or to accommodate teams or staff in other time zones.

Job Requirements:

o Degree in Computer Science or a related discipline

o 3-4 years’ experience in ITIL Major Incident Management and coordination in a large organization

Certification Requirements:

o ITIL Foundation level

Suggested Additional Certification:

o ITIL Intermediate level, preferably in Service Operation

o Kepner-Tregoe High Severity Incident Management

o Microsoft Certified Desktop Support Technician (MCDST) or similar technical qualification