Join JPMorgan Chase and unleash your potential in defining the industry of site reliability. Rise to the challenge as a technology pioneer and trailblazer with your expertise.
As a Manager of Software and Site Reliability Engineering at JPMorgan Chase within the Risk Technology team, you oversee your team’s daily activities and help align priorities to the goals of the team. You work with key stakeholders and product owners to translate availability targets for products and services into service level indicators and service level objectives while enforcing error budgets. You build designs and leverage existing solutions to execute and deliver. You are a key influencer in your organization’s strategy and contribute to overall planning.
Job responsibilities
Proactively contributes to knowledge-sharing across the JPMorgan Chase technology community
Demonstrates personal and professional resilience while navigating difficult situations with composure and tact
Champions site reliability culture and practices and influences the adoption of a customer-centric site reliability mindset
Educates Product Owners, Dev Leads, and Developers on service level objectives, service level indicators, and error budget policies
Leads your application or platform stakeholders in establishing reasonable service level objectives and error budgets with your customers
Manages the day-to-day activities of your team and escalates issues to the necessary stakeholders in a timely fashion
Engages team members using relevant listening strategies and questioning techniques to help identify solutions and influence action plans
Promotes safety, fairness, respect, and trust across the firm and within your team
Contributes relevant technical knowledge to projects of moderate complexity
Required qualifications, capabilities, and skills
Formal training or certification in Software Engineering or Site Reliability Engineering concepts and 5+ years of applied experience
Deep proficiency in reliability, scalability, performance, security, enterprise system architecture, toil reduction, and other site reliability best practices
Demonstrated ability to drive the adoption of site reliability culture and practices within a team
Experience in leading small teams of technologists and managing projects
Ability to lead a small team in identifying roadblocks, running tests, and solving related problems
Fluency in at least one programming language (e.g., Python, Java Spring Boot, .NET)
Proficiency in software applications and technical processes within a technical discipline (e.g., cloud, artificial intelligence, machine learning, mobile)
Proficiency in continuous integration and continuous delivery tools (e.g., Jenkins, GitLab, Terraform)
Experience with containers and container orchestration (e.g., ECS, Kubernetes, Docker)
Experience with troubleshooting common networking technologies and issues
Experience in public cloud, preferably AWS (EKS, EMR, S3, SQS, Athena, RDS)
Preferred qualifications, capabilities, and skills
Ability to code and demonstrate data fluency