Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Develop architectures, standards, and methods for large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.
OCI Data Services has a portfolio of OLTP services, such as Oracle DB cloud infrastructure, a key-value NoSQL store, PostgreSQL, Cache, and the Aries Storage Platform for ADB-S. The key priorities include driving large customer adoption, delivering the critical features that customers need, rapidly building out data services in new regions, and pursuing strategic initiatives such as multi-cloud support and operational excellence to handle unexpected events. We need a Senior Principal SRE engineer (focused on cloud reliability) on the team.
Job Description:
Strong SRE (Software Reliability Engineer) skills with prior cloud SRE experience (or SDE with a strong passion for reliability); ability to work on multiple cloud services in parallel
10+ years’ experience delivering and operating large-scale, highly available cloud services
Strong knowledge of Java, Python, C++, C#, or a similar language
Proficient with data structures, algorithms, operating systems, distributed systems fundamentals, networking protocols (TCP/IP, HTTP), and standard network architectures
Deep understanding of databases, NoSQL systems, storage, and distributed persistence technologies
Strong understanding of Linux
Strong Root Cause Analysis (RCA), problem resolution, performance tuning, incident management, and incident resolution skills
Experience with Oracle Databases (or equivalent), building multi-tenant, virtualized infrastructure, and/or working on open source software

Career Level - IC5