EOS RPO
Senior Software Engineer
We are seeking a highly analytical and technically proficient Data Engineer to join our data platform team. In this role, you will be responsible for building high-performance data pipelines, optimizing complex transformations, and managing large-scale data movement between Oracle environments and modern big data ecosystems.
The ideal candidate is a Python and PySpark expert who can navigate the complexities of traditional relational databases while scaling data processing in a distributed environment.
### Key Responsibilities
- **Data Pipeline Engineering:** Design, develop, and deploy scalable ETL/ELT pipelines using Python and PySpark to process large, complex datasets.
- **Database Management:** Act as a subject matter expert for Oracle Database integrations, including schema design, stored procedure optimization, and high-volume data ingestion.
- **Advanced SQL Development:** Write and optimize complex SQL queries for data extraction, transformation, and analysis, ensuring peak performance across distributed systems.
- **ETL Orchestration:** Build and maintain robust data integration workflows, ensuring data quality, consistency, and reliability across the enterprise data warehouse.
- **Performance Tuning:** Conduct deep-dive performance tuning on Spark jobs and SQL queries to reduce processing time and optimize resource utilization.
- **Data Modeling:** Collaborate with data architects to design and implement efficient data models that support both operational and analytical requirements.
### Required Skills & Qualifications
- **Programming:** Expert-level proficiency in Python for data engineering (including pandas, NumPy, and scripting).
- **Big Data Processing:** Strong hands-on experience with PySpark (Spark SQL, DataFrames, and RDDs) in a production environment.
- **Database Mastery:** Deep expertise in Oracle Database (PL/SQL, performance tuning, and indexing) and general relational database management.
- **SQL:** Advanced SQL skills, including the ability to write complex analytical queries and window functions.
- **ETL/ELT Frameworks:** Strong understanding of data integration patterns, change data capture (CDC), and batch/real-time processing.
- **Tools:** Proficiency with version control (Git) and familiarity with orchestration tools (e.g., Airflow or Autosys).
- **Experience:** 5+ years of experience in Data Engineering or a similar backend-heavy data role.
- **Cloud Exposure:** Familiarity with cloud data platforms (Azure, AWS, or GCP) and how they integrate with on-premises Oracle systems.
- **Architecture:** Understanding of Data Lake and Data Warehouse design principles (Star/Snowflake schemas).
- **Problem Solving:** Ability to troubleshoot distributed processing issues and resolve data bottlenecks.