Bangalore, Karnataka, India
62 days ago
Data Engineer 3

Job Description & Responsibilities

- Develop data products, infrastructure, and data pipelines leveraging AWS and the Databricks/Snowflake data platform ecosystem, drawing on services such as Redshift, Kinesis, EMR, and Lambda and tools such as Glue, Apache Spark, and job schedulers.
- Design and support ETL / ELT / file movement of data using Databricks or Snowflake with PySpark, Python, and Spark SQL (a minimal sketch follows this list).
- Develop new data models and end-to-end data pipelines.
- Work with Leads and Architects to build robust, scalable data pipelines that ingest, transform, and analyse large volumes of structured and unstructured data from diverse data sources.
- Optimise pipelines for performance, reliability, and scalability.
- Contribute to initiatives that enhance data quality, governance, and security across the organisation, ensuring compliance with guidelines and industry best practices.
- Build innovative solutions for acquiring and enriching data from a variety of data sources.
- Conduct logical and physical database design, including key and indexing schemes and partitioning.
- Participate in building and testing business continuity and disaster recovery procedures per requirements.
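For illustration only (not part of the role description), here is a minimal PySpark ETL sketch of the kind of pipeline work described above, assuming a Databricks-style environment where Delta Lake is available; the storage path, column names, and table name are hypothetical placeholders:

```python
# Minimal PySpark ETL sketch: ingest raw JSON, cleanse, write a Delta table.
# The path, columns, and table name below are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Ingest: read semi-structured events from cloud storage (hypothetical path).
raw = spark.read.json("s3://example-bucket/raw/orders/")

# Transform: deduplicate, normalise types, and drop invalid rows.
orders = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_date", F.to_date("order_ts"))
       .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
       .filter(F.col("amount") > 0)
)

# Load: write a partitioned Delta table for downstream analytics.
(orders.write.format("delta")
       .mode("overwrite")
       .partitionBy("order_date")
       .saveAsTable("analytics.orders"))
```

In practice a job like this would be parameterised and run on a scheduler rather than ad hoc.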

What we are looking for

- Bachelor’s degree in Computer Science, Information Systems, Engineering, or a related field
- 6-8 years of experience in the Databricks/Snowflake data platform or Lakehouse ecosystem
- Prior experience in Apache Spark performance tuning and debugging
- Experience with a workflow scheduler, e.g. Kubeflow, Airflow, Oozie (a minimal scheduling sketch follows this list)
- SQL experience, e.g. Spark SQL, Impala, BigQuery, Presto/Trino, StarRocks
- Experience debugging and reasoning about production issues is desirable
- Analytical problem-solving capabilities and experience
- Strong knowledge of modern data warehouse technologies
- Extensive knowledge of cloud technologies such as AWS and GCP
- Excellent SQL and Python skills
- Experience in deploying and scheduling code bases in a data development environment
- Comfortable working alongside cross-functional teams, interacting with Product Managers, Infrastructure Engineers, and Data Architects
- Monitor system performance and troubleshoot any issues
- Perform incident investigation and diagnosis, and provide resolution
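As an illustrative sketch of the workflow-scheduling experience mentioned above, here is a minimal Airflow DAG that runs a Spark job daily; the DAG id, script path, and schedule are hypothetical:

```python
# Minimal Airflow DAG sketch: schedule a daily spark-submit run.
# The DAG id, script path, and schedule below are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_orders_etl",      # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,                  # do not backfill missed runs
) as dag:
    run_etl = BashOperator(
        task_id="run_spark_etl",
        # Submit the PySpark job; the master and script path are placeholders.
        bash_command="spark-submit --master yarn /opt/jobs/orders_etl.py",
    )
```

Equivalent workflows can be expressed in Kubeflow Pipelines or Oozie; Airflow is shown only because it is one of the schedulers the posting names.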

Preferred Skills

Delta Lake, Unity Catalog, PySpark, Spark SQL, Scala, a cloud services platform (e.g. GCP, Azure, or AWS), and Agile concepts
