EOS RPO
Senior Software Engineer
What you will do
Architect Data and Application Systems at Scale: Design large-scale, distributed data architectures capable of processing massive datasets, integrating AI-driven workflows, and supporting both real-time and batch processing needs.
Drive System Design and Execution: Architect, develop, and deploy an application and data processing engine that enables autonomous orchestration of data pipelines, analytics, and business workflows.
Lead End-to-End Productionization: Own the full lifecycle, from building and containerizing to deploying and maintaining applications across environments using Kubernetes, Terraform, Spark, and CI/CD systems.
Define and Enforce Engineering Standards: Set best practices around code quality, observability, scalability, and operational readiness for AI and data systems.
Guide Data and Infrastructure Decisions: Lead design discussions for data modeling, system integrations, and infrastructure architecture on Google Cloud Platform (GCP) and related technologies.
Enable Observability, Traceability & Reliability: Establish end-to-end monitoring and observability frameworks (e.g., Prometheus, Grafana, OpenTelemetry) for data services, ensuring deep traceability of agentic interactions.
Collaborate Across Teams and Domains: Partner with Product and Data Science teams to align Gen AI initiatives with data engineering objectives, ensuring production reliability and scalability.
Mentor and Grow Engineers: Provide technical mentorship to senior and mid-level engineers, fostering a culture of innovation, ownership, and operational excellence.
Champion Operational Efficiency: Identify pain points in deployment, scaling, and monitoring, and eliminate them through automation. Drive continuous improvement in system reliability, performance, and developer productivity.
Own System-Level Decision Making: Evaluate new technologies, frameworks, and design patterns across LLMs and data infrastructure to guide the organization’s technical direction.
What you will bring
6+ years of experience in Software Engineering, with significant focus on Data systems, integration, and distributed architectures.
Deep expertise in Scala and distributed data processing frameworks such as Apache Spark.
Solid command of Scala or Java for high-performance backend workloads.
Expertise in Kubernetes (scaling, scheduling, service orchestration) and containerization practices.
Proficiency with Terraform for cloud infrastructure provisioning and management.
Deep understanding of CI/CD, automation, and cloud-native operations.
Working knowledge of GCP (BigQuery, Cloud Run, Vertex AI, GCS).
Strong experience designing for observability, monitoring, and traceability across microservices and AI agents.
Knowledge of Apache Airflow or an equivalent tool for data orchestration.
Familiarity with large language models (LLMs).
Experience with retrieval-augmented generation (RAG) systems.
Strategic mindset with the ability to balance technical depth and system-wide trade-offs.