Own your opportunity to work alongside federal civilian agencies. Make an impact by providing services that help the government ensure the well-being of U.S. citizens.
Job Description
Cloud Developer Sr Advisor
GDIT is looking to hire a lead Site Reliability Engineer (SRE) to help take a cloud team to the next level. You will work with the government and other team members to identify and assist in enhancing the reliability of this agency's core cloud infrastructure.
As an SRE, you will act as an Account Manager for core AWS accounts, responsible for overseeing the services running in this agency's infrastructure AWS accounts.
HOW A SITE RELIABILITY ENGINEER WILL MAKE AN IMPACT
Develop a deep understanding of how systems interoperate within the infrastructure, including upstream and downstream dependencies.
Review all AWS infrastructure deployments to identify upstream and downstream impacts and ensure test processes fully validate features and integrations.
Ensure that monitoring, logging, and alerting for services running in core infrastructure accounts are properly configured and provide actionable information.
In collaboration with government stakeholders, develop and maintain a logging and monitoring strategy for the infrastructure platform.
Conduct and coordinate 5 Whys and other blameless post-mortem activities in the event of an incident.
Participate in continuous improvement activities, such as technical debt analysis, and contribute to the reliability standards and practices of the team.
Work with team DevOps engineers to improve the deployment process and introduce automated testing.
Audit resources in accounts under your responsibility; identify areas for improvement or technical debt and collaborate with program and government partners to prioritize.
Assist the cloud infrastructure team and other teams in troubleshooting wide-area integration issues.
Commit changes to our infrastructure codebase as necessary.
WHAT YOU’LL NEED TO SUCCEED:
Required Experience: 10+ years of AWS infrastructure design and deployment. 3+ years in an SRE role working with complex systems.
Required Technical Skills: IaC background including CDK or CloudFormation. Lead experience configuring and using logging and monitoring systems including CloudWatch, Splunk or Instana.
Required Skills and Abilities: Ability to analyze infrastructure dependencies. Experience overseeing infrastructure deployments including developing testing procedures. Strong communication skills. Ability to work with government stakeholders. Prior experience in a cross-cutting SRE role.
Preferred Skills: AWS Solutions Architect Professional or DevOps Engineer Professional Certification.
Location: Remote with on-site client meetings
GDIT IS YOUR PLACE:
Full-flex work week to own your priorities at work and at home
401(k) with company match
Comprehensive health and wellness packages
Internal mobility team dedicated to helping you own your career
Professional growth opportunities including paid education and certifications
Cutting-edge technology you can learn from