Oklahoma City, OK, USA
57 days ago
Sr. Site Reliability Engineer

Site reliability engineers are dedicated full-time to creating software tools, metrics, and processes that improve the reliability of applications, sites, and systems in production. The role is primarily responsible for ensuring the integrity, functionality, and reliability of applications and sites. Additionally, the Senior Site Reliability Engineer will mentor junior team members.

RESPONSIBILITIES

Architect solutions that proactively reduce or eliminate errors and incidents in production systems.
Review and approve software and processes created by junior site reliability engineers.
Review code and approve error logging and monitoring in new software development across all company-developed applications.
Take responsibility for removing, isolating, or remediating error, debug, warning, and other kinds of messages in existing logs to improve overall log content and usefulness.
Establish, implement, and track reliability metrics (MTTR, MTTD, MTBF).
Respond effectively to escalated site reliability issues at any time of day while on call.
Conduct regular research on best practices and new technology for monitoring, alerting, error tracking and detection, and application performance.
Mentor and guide junior site reliability engineers.