Austin, TX, 78703, USA
2 days ago
Site Reliability Engineer - Performance Testing
The application window is expected to close on 3/20/25. The job posting may be removed earlier if the position is filled or if a sufficient number of applications are received.

Strong preference is for candidates to be located in Austin, TX, Bay Area Metro, Atlanta Metro, Ann Arbor, MI, Boston Metro, Fulton/DC Metro, and Research Triangle Park.

Meet the Team

Join us in making our services faster and more reliable! We foster a "kinder than necessary" culture, where collaboration, curiosity, and continuous learning drive our work. As part of this team, you'll contribute across the stack, from code and infrastructure to database optimization. We work closely with engineering teams to refine service architecture, guide performance testing, and provide the tools and insights needed to optimize systems. We manage performance test infrastructure and an in-house load testing tool, enabling teams to run meaningful tests, benchmark services, and prevent performance degradation. Partnering with SRE, we stress-test infrastructure and offer strategic recommendations to enhance reliability. Our mission is to promote a culture that values performance, ensuring our services stay efficient, scalable, and resilient, all while supporting each other with empathy and respect.

Your Impact

As a Site Reliability Engineer, you'll play a pivotal role in ensuring the reliability and scalability of our platform. You will be responsible for maintaining and enhancing our in-house load generation tool, managing our performance testing infrastructure, and collaborating closely with engineering and SRE teams. This role is exciting because you'll directly influence the performance of critical services by building testing frameworks, troubleshooting complex issues, and ensuring that we deliver high-quality, performant systems at scale.
You'll get hands-on with cutting-edge technologies like Kubernetes, AWS, and observability tools, while also shaping testing strategies that align with service architectures.

Key Responsibilities:

* Maintain and Enhance Load Generation Tools: Oversee the management and continual improvement of our internal load generation tool, ensuring it meets the needs of our performance testing efforts.
* Test Infrastructure Management: Manage and optimize our test infrastructure, built on Kubernetes (K8s) and EC2-based AWS deployments. Collaborate with other teams to ensure the infrastructure supports scalable, efficient testing.
* Performance Test Planning & Execution: Work directly with engineering teams to develop detailed performance test plans tailored to specific services. Ensure the execution of these tests, track their progress, and resolve any issues that arise.
* Tooling and Observability: Use observability tools like DataDog, OpenSearch, and Grafana to collect, analyze, and report on performance test metrics. Identify potential performance bottlenecks and work with teams to resolve them.
* Python Scripting & Automation: Write performance tests and automation scripts in Python to validate service performance and scalability. Ensure tests are robust, efficient, and provide valuable insights.
* Troubleshooting and Problem Resolution: Troubleshoot test failures, infrastructure issues, and performance bottlenecks in Kubernetes, EC2, and MySQL RDS environments. Ensure test environments are stable and performance testing runs smoothly.
* Collaboration & Partnerships: Partner with engineering teams to understand the architecture of services and develop test plans that align with their goals. Collaborate closely with SRE teams to performance test infrastructure components and ensure overall platform health.
* Performance Reporting: Identify, report, and analyze any performance-related issues encountered during tests. Provide clear and actionable recommendations to improve service performance.

Minimum Qualifications:

* 5+ years of experience in performance testing, SDT (Software Development in Test), or infrastructure management.
* Experience with Python for writing automated performance tests and tools.
* Experience with Kubernetes (K8s), EC2, and AWS resources for deploying and managing test environments.
* Professional work experience with MySQL RDS and cloud-based infrastructure, with a demonstrated ability to troubleshoot performance issues.
* Experience with Argo Workflows for orchestrating tests in Kubernetes and using observability tools like DataDog, OpenSearch, and Grafana.

Preferred Qualifications:

* SDT Experience: Strong background in the principles and practices of software development in test to build robust, scalable testing solutions.
* SRE Experience: Experience working with SRE teams on infrastructure and performance testing, contributing to overall system reliability and performance.
* Software Engineering Background: Solid understanding of software engineering principles to help integrate performance testing effectively within the broader development lifecycle.
* Experience with performance testing tools like JMeter, Gatling, or similar.
* Experience with CI/CD pipelines and integrating performance testing into continuous integration processes.
* Background in infrastructure or DevOps roles with expertise in cloud platforms like AWS and container orchestration tools.

#WeAreCisco

#WeAreCisco where every individual brings their unique skills and perspectives together to pursue our purpose of powering an inclusive future for all. Our passion is connection: we celebrate our employees' diverse set of backgrounds and focus on unlocking potential. Cisconians often experience one company, many careers where learning and development are encouraged and supported at every stage. Our technology, tools, and culture pioneered hybrid work trends, allowing all to not only give their best, but be their best.

We understand our outstanding opportunity to bring communities together and at the heart of that is our people. One-third of Cisconians collaborate in our 30 employee resource organizations, called Inclusive Communities, to connect, foster belonging, learn to be informed allies, and make a difference. Dedicated paid time off to volunteer (80 hours each year) allows us to give back to causes we are passionate about, and nearly 86% do!

Our purpose, driven by our people, is what makes us the worldwide leader in technology that powers the internet. Helping our customers reimagine their applications, secure their enterprise, transform their infrastructure, and meet their sustainability goals is what we do best. We ensure that every step we take is a step towards a more inclusive future for all. Take your next step and be you, with us!

Cisco is an Affirmative Action and Equal Opportunity Employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, gender, sexual orientation, national origin, genetic information, age, disability, veteran status, or any other legally protected basis. Cisco will consider for employment, on a case by case basis, qualified applicants with arrest and conviction records.