At CVS Health, we're building a world of health around every consumer and surrounding ourselves with dedicated colleagues who are passionate about transforming health care.
As the nation's leading health solutions company, we reach millions of Americans through our local presence, digital channels and more than 300,000 purpose-driven colleagues - caring for people where, when and how they choose in a way that is uniquely more connected, more convenient and more compassionate. And we do it all with heart, each and every day.
Position Summary
As CVS Health continues to grow, we are looking for experienced leaders to ensure our systems remain stable and performant as we scale. The Lead Director of Observability Engineering is a critical leadership role within Solutions Engineering and Infrastructure, reporting directly to the Executive Director of Observability and Performance. As part of the broader Enterprise Observability team, this position leads a team of engineers building next-generation Observability systems that enable effective collection, consolidation, and analysis of metrics, events, logs, and traces; generate performance and stability insights; and forecast system and environmental capacity needs. In addition to helping define the roadmap for the next 3-5 years, you will interact with many other managers and their teams at CVS Health who rely on the Observability ecosystem to deliver stable and scalable services to our customers.
Responsibilities
Program Development and Modernization: Develop a plan to rationalize and modernize observability platforms, delivering an efficient observability ecosystem that meets the unique needs of CVS Health. Spearhead technology enablement for the transition of services from numerous legacy platforms, improving operational visibility and predictability. This involves designing and implementing complex solutions to collect, process, and manage structured and unstructured data at massive scale, optimizing built and purchased platforms to ensure efficient and performant operations, and ensuring solutions align with the organization's goals.
Team Leadership and Mentoring: Provide guidance and leadership to the Observability Engineering team. This involves hiring and developing talent, mentoring and supporting team members, assigning tasks, and ensuring projects are on track. Foster collaboration and knowledge sharing within the team.
Architecture and Design: Help define the overall architecture of the Observability environment, including observability standards, data models, integrations, and security controls. Ensure our platforms are scalable, reliable, and aligned with best practices. Leverage open source and commercial software to deliver and maintain resilient, reliable, cost-effective platforms tailored to the needs of CVS Health.
Project Management: Engage executives, department heads, and IT teams to plan, execute, and oversee Observability projects. This includes planning, coordinating, and overseeing systems implementation and service transition projects, managing project timelines and resources, and confirming that deliverables are met within budget and scope. This also involves communicating project progress to stakeholders.
Stakeholder Management: Cultivate and maintain relationships with application owners, ensuring our observability standards, products, and services continue to meet their evolving needs. Establish partnerships to identify opportunities for automation, innovation, and operational excellence.
Required Qualifications
1. 10+ years of experience leading software development teams that develop and manage applications for IT operations, SRE, logging and/or observability, with at least 5 years in a leadership role within a large enterprise (Fortune 100).
2. 10+ years of experience designing, developing, and implementing observability systems for large-scale, distributed systems, encompassing legacy and modern technologies. Experience leading a major logging and observability platform migration. Demonstrable experience building custom monitoring solutions.
3. Proven experience building and implementing operational data models. Experience designing and deploying data lakes and data pipelines at massive scale in an enterprise environment. Experience with enterprise demand analysis, capacity planning, and performance engineering.
4. Deep knowledge of, and experience with, on-premises infrastructure, cloud infrastructure, and application architectures. Strong background in cloud-native technologies and architectures (e.g., Kubernetes, Docker, microservices) and an understanding of the unique challenges they pose to observability.
5. Proven experience developing automation solutions and workflows for deployment, event correlation, and incident remediation.
6. Experience and/or expertise with the following: