Resilience and Reliability Architect
Job Description
Site Reliability Engineering (SRE) Architect / Consultant (M)
Company Overview
Join EY and help shape the future with confidence. EY empowers professionals through globally connected teams, advanced technology, and a collaborative work culture focused on innovation, flexibility, and growth.
Role Summary
As an SRE Architect / Consultant, you will help enterprises modernize IT operations by integrating software engineering principles into service delivery. The role focuses on improving system reliability, resilience, scalability, automation, and operational efficiency while optimizing infrastructure costs.
Key Responsibilities
Define and implement SLA/SLO/SLI frameworks for products and services.
Design resilient and scalable system architectures across product lifecycles.
Develop and implement observability solutions to monitor reliability and performance.
Automate operational processes to reduce manual effort and improve efficiency.
Optimize IT infrastructure and operations costs using FinOps practices.
Assess organizational SRE maturity and create strategic roadmaps for improvement.
Troubleshoot performance, scalability, and availability issues.
Analyze thread dumps, heap dumps, and JVM performance metrics.
Collaborate with cross-functional teams to improve platform reliability and operational excellence.
Required Skills & Experience
12+ years of experience in software engineering, IT operations, or infrastructure reliability.
Hands-on experience with:
Java / J2EE
Apache HTTP Server or IBM HTTP Server
Tomcat / WebSphere
Oracle or similar RDBMS
Strong experience with CI/CD and IaC tools:
Azure DevOps
GitLab CI/CD
Jenkins
Terraform
AWS CloudFormation
Ansible
Expertise in cloud and container technologies:
AWS / Azure / GCP
Docker
Kubernetes
OpenShift
Experience with observability and monitoring tools:
Dynatrace
AppDynamics
Splunk
ELK Stack
Strong Linux (RHEL) administration and monitoring knowledge.
Experience with REST APIs, SOA, ESB, and microservices architecture.
Knowledge of Java performance tuning, JVM optimization, and application server tuning.
Familiarity with automation scripting languages such as Python.
Experience with Git, Jira, and Confluence.
Preferred Skills
Exposure to AI/ML and data analytics.
Experience with cloud-native architectures and Spring Boot.
Understanding of queuing models, threading, and system scalability concepts.
What EY Offers
Collaborative and inclusive work environment.
Hybrid work model with flexibility.
Learning and development opportunities.
Competitive compensation and benefits.
Global exposure and career growth opportunities.
Equal Opportunity Statement
EY is an equal opportunity employer committed to diversity, inclusion, and accessibility.