Job Description
Position: Incident Manager
Description:
Engineer III - Major Incident and Event Manager:
Main Responsibilities:
- Undertakes immediate efforts to ensure effective and rapid response and restoration (Crisis/P1/P2).
- Advocates for Tier 2 and Tier 3 technical teams, and business units.
- Researches, identifies, and proposes viable solutions for the major incident process.
- Performs incident management functions per the Information Technology Infrastructure Library (ITIL) and serves as the incident owner throughout the lifecycle.
- Researches issues and escalations, convening escalation bridges with appropriate Tier 2 and Tier 3 groups as necessary.
- Develops, tracks, and presents key Incident Management metrics.
- Deconstructs major incidents to identify issue lifecycle versus root cause.
- Coordinates identification and resolution of major incidents with resolvers.
- Obtains and documents accurate updates on the work being done to resolve the outage.
- Documents and updates appropriate communications, phone portals, and service portals wherever applicable during an outage.
- Coordinates the logistics around and conducts related audits of major incidents, including sample selection, documentation, and communication of results.
- Ensures compliance with requirements, processes, and procedures. Ensures timely completion, management, and control of deliverables.
- Ensures conformance to and provides a high level of expertise on incident tool(s), knowledge management tool(s), and quality management tool(s), processes, and procedures.
- Performs as technical evaluator for support plans and Knowledge Articles for known issues. Reviews and makes recommendations of improvements to knowledge management documentation.
- Contributes analysis and documentation to Known Error Database.
- Interprets and implements incident standards and requirements.
- Adheres to and maintains high levels of expertise in all incident management support processes, procedures, and expectations established by management.
- Assists with the updating of SOPs, work instructions, checklists, and various other documents.
- Accountable for supporting the strategic planning and design of the Monitoring & Event Management framework.
- Ensures the integration, correlation, and consolidation of events across domains is standardized and centralized in the global event management platform (AIOps).
- Identifies opportunities for standardization and process improvement, with the goal of enhancing the customer experience.
- Proactively collaborates with all service owners (especially CX, Domains, and Managed Service Providers) to ensure that the event management framework meets the expectations of all key stakeholders.
- Proactively identifies training opportunities to execute on the organization’s overall goals.
- Meets or exceeds all Goals and Objectives and Service Level Targets.
- Provides input to senior team members regarding outage related actions/activities.
- Works on-call hours, which include 24/7 coverage per the SOPs.
Requirements:
Experience Requirements:
- 7+ years of experience in Critical Incident Management.
- 5 years of experience in ITIL Event Management.
- Demonstrated experience using ServiceNow ITSM (Incident, Major incident and Event Management) products.
- Practical experience designing, implementing, and supporting ITIL improvements.
Educational Background:
- Bachelor’s degree in Computer Science, Information Management, or similar technical field from an accredited institution required.
- Significant experience may be considered in lieu of degree: minimum of twelve (12) years of relevant work experience required.
Professional Certification:
- ITIL v3 Foundation or higher.
Knowledge/Understanding of:
- Major ITSM processes, including Critical Incident Management, Problem Management, Event Management, and Request Management.
- Current business practices and computing systems, IT development methodologies and operations.
- Program and project management and planning, process mapping.
- Healthcare issues, information systems, management issues, and current trends.
- Conceptualizing business strategies while implementing information systems and technology strategic direction.
Skills:
- Highly tenacious, combined with high stress resistance.
- Uses logic, methods, and tools to solve problems with effective solutions.
- Ability to coordinate and drive conference calls.
- Excellent organizational and time management skills.
- Displays basic Project and Problem Management skills and abilities.
- Ability to recognize errors and correct them to meet organizational standards.
- Ability to troubleshoot problems and work with other groups to find solutions.
- Extremely detail oriented.
- Ability to multitask and manage multiple events simultaneously.
- Proven ability to analyze and report on various levels of data and metrics.
- Ability to follow outlined processes and procedures.
- Ability to communicate effectively and diplomatically across all levels of the organization.
- Ability to follow verbal and written instructions.
- Ability to work independently with little supervision.
Abilities:
- Be a subject matter expert in a complex, fast-paced business environment.
- Present issues and challenges in senior management forums.
- Work with a team of professionals from various disciplines.
- Lead through times of change, disruption, and growth.
Benefits:
Commitment to Team Culture:
- Participates in the company Team Culture.
- Develops professional and effective working relationships built on respect and cooperation with Team Members, Members, and associates outside of the organization.
Other Potential Benefits:
- Opportunity to work with a cutting-edge global event management platform (AIOps).
- Professional growth and training opportunities.
- Dynamic, challenging, and collaborative environment.
- 24/7 schedule flexibility and variety in workday (on-call rotations).
Job Tags
Work experience placement, Immediate start