Role Proficiency:
Resolve complex trouble tickets spanning different technologies, fine-tune infrastructure for optimum performance, and/or provide technical and people leadership (hierarchical or lateral).
Outcomes:
1) Mentor new team members in understanding customer infrastructure and processes
2) Perform ticket data analysis for incident reduction
3) Perform capacity planning based on increased demand
4) Perform root cause analysis to identify corrective and preventive actions after every major incident and escalation
5) Work on problem tickets to find permanent solutions to repeated issues
6) Review and approve roll-out and roll-back plans for change implementation, and ensure adherence to prevent unauthorized changes
7) Identify opportunities for continued service improvement and delivery excellence that contribute to cost and optimization benefits for the customer
Measures of Outcomes:
1) SLA adherence
2) Time-bound resolution of elevated tickets - OLA
3) Manage ticket backlog timelines - OLA
4) Adhere to defined process - number of NCs in internal/external audits
5) Number of KB articles created
6) Number of incidents and change tickets handled
7) Number of elevated tickets resolved
8) Number of successful change tickets
9) % completion of all mandatory training requirements
Outputs Expected:
Resolution:
Understand priority and severity based on ITIL practice. Resolve trouble tickets within the agreed resolution SLA
Troubleshooting:
Escalation/Elevation:
L2
L3 etc)
adhere to OLA. Elevate to next level
work on elevated tickets from L1
Tickets Backlog/Resolution:
manage ticket backlogs/last activity as per defined process. Resolve incidents and SRs within agreed timelines. Execute change tickets for infrastructure
Installation:
software and patches
Runbook/KB:
Collaboration:
resolve L1 tickets with help from respective tower. Collaborate with other team members for timely resolution of tickets. Actively participate in team/organization-wide initiatives. Co-ordinate with UST ISMS teams for resolving connectivity related issues
Stakeholder Management:
Strategic:
policy management and data retention management. Support definition of the IT strategy for the function’s relevant scope and be accountable for ensuring the strategy is tracked
benchmarked and updated for the area owned.
Process Adherence:
Process/efficiency Improvement:
including coordination of function specific tasks and close collaboration with Finance.
Process Implementation:
Compliance:
interface to local organization
mitigation of findings
etc.) Work closely with ISRM (Information Security Risk Management). Coordinate overall objective setting preparation and facilitate process in order to achieve consistent objective setting in function Job Description. Coordination Support for CSI across all services in CIS and beyond.
Training:
Performance Management:
track
report and seek continuous feedback from peers and manager. Set goals for team members and mentees and provide feedback. Assist new team members to understand the customer environment
day-to-day operations and people management
for example roster
transport and leaves. Prepare weekly/Monthly/Quarterly governance review slides.
Skill Examples:
1) Good communication skills (written, verbal and email etiquette) to interact with different teams and customers
2) Modify / create runbooks based on suggested changes from juniors or newly identified steps
3) Ability to work on and resolve an elevated server ticket
4) Networking:
   a. Troubleshooting skills in static and dynamic routing protocols
   b. Should be capable of running NetFlow analyzers in different product lines
5) Server:
   a. Skills in installing and configuring Active Directory, DNS, DHCP, DFS, IIS and patch management
   b. Excellent troubleshooting skills in various technologies like AD replication, DNS issues etc.
   c. Skills in managing high availability solutions like failover clustering, VMware clustering etc.
6) Storage and Backup:
   a. Ability to give recommendations to customers. Perform storage & backup enhancements. Perform change management.
   b. Skilled in core fabric technology, storage design and implementation. Hands-on experience with backup and storage command line interfaces
   c. Perform hardware upgrades, firmware upgrades, vulnerability remediation, storage & backup commissioning and de-commissioning, replication setup and management.
   d. Skilled in server, network and virtualization technologies. Integration of virtualization, storage and backup technologies
   e. Review the technical and architecture diagrams, and modify the SOPs and documentation based on business requirements.
   f. Ability to perform the ITSM functions for the storage & backup team and review the quality of the ITSM process followed by the team.
7) Cloud:
   a. Skilled in any one of the cloud technologies - AWS, Azure, GCP.
8) Tools:
   a. Skilled in administration and configuration of monitoring tools like CA UIM, SCOM, SolarWinds, Nagios, ServiceNow etc.
   b. Skilled in SQL scripting
   c. Skilled in building custom reports on availability and performance of IT infrastructure based on customer requirements
9) Monitoring:
   a. Skills in monitoring of infrastructure and application components
10) Database:
   a. Data modeling and database design. Database schema creation and management
   b. Identify data integrity violations so that only accurate and appropriate data is entered and maintained.
   c. Backup and recovery
   d. Web-specific tech expertise for e-Biz, Cloud etc. Examples of this type of technology include XML, CGI, Java, Ruby, firewalls, SSL and so on.
   e. Migrating database instances to new hardware and new versions of software, from on-premise to cloud-based databases and vice versa.
11) Quality Analysis:
   a. Ability to drive service excellence and continuous improvement within the framework defined by IT Operations
Knowledge Examples:
1) Good understanding of customer infrastructure and related CIs.
2) ITIL Foundation certification
3) Thorough hardware knowledge
4) Basic understanding of capacity planning
5) Basic understanding of storage and backup
6) Networking:
   a. Hands-on experience in routers, switches and firewalls
   b. Should have minimum knowledge of and hands-on experience with BGP
   c. Good understanding of load balancers and WAN optimizers
   d. Advanced backup and restore knowledge in backup tools
7) Server:
   a. Basic to intermediate PowerShell / Bash / Python scripting knowledge and demonstrated experience in script-based tasks
   b. Knowledge of AD group policy management, group policy tools and troubleshooting GPOs
   c. Basic AD object creation, DNS concepts, DHCP, DFS
   d. Knowledge of administering tools like SCCM and SCOM
8) Storage and Backup:
   a. Subject Matter Expert in any of the storage and backup technologies
9) Tools:
   a. Proficient in the understanding and troubleshooting of the Windows and Linux families of operating systems
10) Monitoring:
   a. Strong knowledge of ITIL processes and functions
11) Database:
   a. Knowledge of general database management
   b. Knowledge of OS, system and networking
Additional Comments:
We are looking for an experienced Service Owner – AWS Operation and Engineering to lead and manage enterprise AWS cloud infrastructure and services. This role is responsible for ensuring high availability, security, compliance, and cost optimization while aligning AWS operations with business and IT objectives, particularly in the BFSI (Banking, Financial Services, and Insurance) domain.
Key Responsibilities:
1. AWS Operations & Service Ownership:
Define and execute the AWS Operation & Engineering strategy in alignment with business needs.
Oversee the design, deployment, administration, and optimization of AWS environments.
Ensure high availability, reliability, and scalability of AWS workloads and services.
2. Security, Compliance & Governance:
Implement and maintain AWS security best practices, IAM, role-based access controls (RBAC), and compliance frameworks (PCI-DSS, ISO 27001, NIST, GDPR).
Work closely with security teams to ensure cloud security, monitoring, and threat mitigation.
Enforce data protection, encryption, and backup policies using AWS-native services.
3. AWS Automation & DevOps:
Drive automation using Terraform, CloudFormation, AWS CLI, and AWS CDK.
Manage CI/CD pipelines using AWS DevOps tools (CodePipeline, CodeBuild, CodeDeploy, CodeCommit).
Optimize AWS auto-scaling, infrastructure as code (IaC), and container orchestration (EKS, ECS, Fargate).
4. Networking & Hybrid Cloud Management:
Manage AWS VPC, Direct Connect, Transit Gateway, Route 53, and VPN solutions.
Support hybrid cloud connectivity between AWS and on-premises data centers.
5. Performance Monitoring & Cost Optimization:
Utilize AWS CloudWatch, CloudTrail, and X-Ray to monitor and optimize performance.
Implement AWS Cost Explorer, Budgets, and FinOps practices to optimize spending.
6. High Availability & Disaster Recovery (DR):
Develop and implement AWS DR solutions using AWS Backup, Route 53 Failover, and Multi-Region architectures.
Ensure business continuity planning (BCP) and DR testing for AWS workloads.
7. Incident & Problem Management:
Lead resolution of AWS-related incidents, perform root cause analysis (RCA), and drive continuous improvement.
Implement ITIL best practices for incident, problem, and change management.
8. Stakeholder & Vendor Management:
Collaborate with CIOs, cloud architects, security teams, and application owners to ensure smooth AWS operations.
Manage AWS service providers and vendors, ensuring SLAs are met.
Required Qualifications & Experience:
Education & Certification:
Bachelor’s/Master’s degree in Computer Science, Cloud Engineering, or a related field.
Preferred certifications:
AWS Certified Solutions Architect – Professional
AWS Certified DevOps Engineer – Professional
AWS Certified Security – Specialty
Experience:
12+ years of experience in IT infrastructure/cloud operations, with 5+ years in AWS cloud engineering and leadership roles.
Strong expertise in AWS IaaS/PaaS services, Kubernetes (EKS), and microservices architectures.
Deep knowledge of AWS networking (VPC, Load Balancers, AWS Firewall Manager, Transit Gateway, Route 53).
Experience with multi-cloud strategies and hybrid cloud architectures.
Strong understanding of SIEM solutions, SOC operations, and cybersecurity frameworks.
Technical Expertise:
Hands-on experience with AWS Lambda, API Gateway, Step Functions, EventBridge, and serverless architectures.
Proficiency in Infrastructure as Code (Terraform, CloudFormation, Ansible, or Pulumi).
Expertise in AWS Security Hub, GuardDuty, Macie, AWS WAF, and AWS Shield.
Experience managing AWS RDS (PostgreSQL, MySQL, SQL Server, Aurora), DynamoDB, and NoSQL databases.
Soft Skills & Leadership:
Strong team leadership, stakeholder engagement, and problem-solving skills.
Ability to drive automation, cost savings, and operational excellence.
Excellent communication skills to collaborate with technical and non-technical stakeholders.
Preferred Skills:
Knowledge of Zero Trust Architecture (ZTA) and Secure Access Service Edge (SASE).
Experience with AI-driven cloud optimization and predictive analytics.
Familiarity with ITSM tools (ServiceNow, Remedy) and DevSecOps methodologies.