Bangalore
208 days ago
Principal Site Reliability Engineer
Who are we?
Smarsh empowers its customers to manage risk and unleash intelligence in their digital communications. Our growing community of over 6500 organizations in regulated industries counts on Smarsh every day to help them spot compliance, legal or reputational risks in 80+ communication channels before those risks become regulatory fines or headlines. Relentless innovation has fueled our journey to consistent leadership recognition from analysts like Gartner and Forrester, and our sustained, aggressive growth has landed Smarsh in the annual Inc. 5000 list of fastest-growing American companies since 2008.
About the team
Are you an SRE with excellent Observability, Containerization and Orchestration skills? As a Site Reliability Engineer (SRE) in the Smarsh SaaS Operations team, you'll be part of a team that measures and improves production performance and reliability through sustainable engineering practices for our suite of applications. Toil will be your number one enemy, observability your closest friend, and your mission will be to drive operational burden as close to zero as you can.
Responsibilities
- Responsible for technical direction at the platform solutions level. Is able to weigh the pros and cons of various solutions and credibly argue for the best path
- Work closely with Product Management and the rest of the engineering team to define features and their implementations with careful attention to quality, scalability, and maintainability
- Can break down complex technical solutions into abstractions that the rest of the team can understand
- Can investigate and solve complex bugs, performance, and scalability issues
- Collaborates with multiple agile teams to ensure their solutions integrate effectively
- Track work in the ticketing system (JIRA)
- Participate in Pull Request reviews. Provide and receive feedback to continuously improve.
- Other duties as assigned.
Desired skills & experience
- A minimum of 10 years of industry experience
- Master's in CS or equivalent
- Must have experience in Azure or AWS, either running a large-scale app there or migrating to Azure/AWS
- Experience operating Cloud Foundry in production environments
- Experience managing CI/CD systems (Concourse, Jenkins, TravisCI etc.)
- Experience deploying and/or operating the ELK stack
- Experience with container technologies and orchestration platforms (Docker, Kubernetes, Cloud Foundry)
- Experience working with monitoring and observability tools (we use Datadog and New Relic)
- Familiarity with PostgreSQL and MongoDB
- Background working in a multi-platform environment (Linux, Windows)
- Experience with running on a cloud platform, AWS preferred (S3, RDS, SQS)
- Familiarity with Agile/Scrum/Kanban methodologies
- Familiarity with programming/scripting languages (e.g. Python, Bash, PowerShell, Go, etc.)
Additional Skills
- Expert programming skills in relevant languages
- Exceptional analytical and problem-solving skills
- Strong communication and collaboration skills
- Deep understanding of modern software architecture
- Deep domain knowledge of the industry, platform, and existing processes
- Fault-tolerant design & maintenance
- Knowledge and understanding of modern software programming/engineering
- Product delivery lifecycle - requirement refinement through ops
About our culture
Smarsh hires lifelong learners with a passion for innovating with purpose, humility and humor. Collaboration is at the heart of everything we do. We work closely with the most popular communications platforms and the world’s leading cloud infrastructure platforms. We use the latest in AI/ML technology to help our customers break new ground at scale. We are a global organization that values diversity, and we believe that providing opportunities for everyone to be their authentic self is key to our success. Smarsh leadership, culture, and commitment to developing our people have all garnered Comparably.com Best Places to Work Awards. Come join us and find out what the best work of your career looks like.