EOS RPO

Senior System Operations Engineer - Application Support

Posted Apr 17, 2026
Project ID: R-533029
Location
Bangalore, Karnataka
Hours/week
45 hrs/week

In this role, you will:

  • Lead or participate in managing all installed systems and infrastructure within the Systems Operations functional area

  • Contribute to increasing system efficiency and reducing the human intervention time on related tasks

  • Review and analyze moderately complex operational support systems, application software, and system management tools to ensure the highest levels of systems and infrastructure availability

  • Work with vendors and other technical personnel for problem resolution

  • Lead the team to meet technical deliverables while leveraging a solid understanding of technical process controls or standards

  • Collaborate with vendors and other technical personnel to resolve technical issues and achieve the highest levels of systems and infrastructure availability

Required Qualifications:

  • 4+ years of Systems Engineering, Technology Architecture experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education

Desired Qualifications:

  • 4+ years in Production Support / SRE / DevOps / Platform Operations for business-critical applications.

  • Proven track record supporting 24x7 platforms with strict SLAs and high availability requirements.

  • Experience working in ITIL-aligned environments (Incident, Problem, Change).

  • Strong troubleshooting skills on Linux/Unix: system processes, CPU/memory, threads, disk, and network basics.

  • Working knowledge of application architectures: microservices, distributed systems, batch + online workloads.

  • Proficiency in log analysis and observability tools (e.g., Splunk/ELK, Grafana, Prometheus, AppDynamics, Dynatrace, or any equivalent).

  • Solid understanding of HTTP, TLS, DNS, load balancing, reverse proxy, and typical failure patterns (timeouts, 503/504, connection pool saturation).

  • Hands-on with databases (Oracle / Postgres / SQL Server etc.): query basics, locks, slow queries, connection pooling, indexing concepts.

  • Familiarity with messaging/streaming systems (Kafka/RabbitMQ) and troubleshooting lag/offset/consumer issues (good-to-have).

  • Ability to write scripts for automation in Python / Shell / PowerShell.

  • Comfortable with runbooks, automation tools, CI/CD basics, and reducing manual toil. Understanding of SLO/SLI, monitoring, alert tuning, and reliability best practices.

  • Strong incident handling skills: triage, mitigation, communication, and structured follow-through.

  • Knowledge of RCA techniques (5 Whys, fishbone, timeline-based analysis) and converting findings into preventive actions.

  • Experience with change management and release support; able to assess risk and enforce operational readiness.

  • Excellent written and verbal communication for stakeholder updates (technical + business-friendly). Ability to collaborate across Dev, QA, DBAs, Network, Cloud/Infra teams.

  • Calm under pressure, structured thinker, strong ownership. Bias for root-cause and prevention over repeated firefighting. High attention to detail and commitment to operational excellence.

