Madrid
17 days ago
Senior Data Engineer
Edelman is a voice synonymous with trust, reimagining a future where the currency of communication is action. Our culture thrives on three promises: boldness is possibility, empathy is progress, and curiosity is momentum. 
At Edelman, we understand diversity, equity, inclusion and belonging (DEIB) transform our colleagues, our company, our clients, and our communities. We are in relentless pursuit of an equitable and inspiring workplace that is respectful of all, reflects and represents the world in which we live, and fosters trust, collaboration and belonging.
We are currently seeking a Senior Data Engineer with 5-7 years of experience. The ideal candidate will be able to work independently in an Agile environment and will have experience with cloud infrastructure and tools such as Apache Airflow, Databricks, dbt, and Snowflake. Familiarity with real-time data processing and AI implementation, including generative AI, is highly advantageous.
Why You'll Love Working With Us:
At Edelman, we believe in fostering a collaborative and open environment where every team member’s voice is valued. Our data engineering team thrives on innovation and embraces cutting-edge technologies to solve real-world challenges.
We are at an exciting point in our journey, leveraging Generative AI (GenAI), Large Language Models (LLMs), and advanced Retrieval-Augmented Generation (RAG) techniques to build intelligent, data-driven systems that deliver powerful PR insights. You'll also work on developing agentic workflows that autonomously orchestrate tasks, enabling scalable and dynamic solutions.
Our data stack is modern and efficient, designed to process large-scale information, automate analysis pipelines, and integrate seamlessly with AI-driven workflows. This is an excellent opportunity to make a significant impact on projects that push the boundaries of AI-powered insights and automation.
If you're passionate about building high-performance data systems, working with cutting-edge AI frameworks, and solving complex challenges in a supportive, forward-thinking environment, you'll thrive here!
Responsibilities:
- Design, build, and maintain scalable and robust data pipelines to support analytics and machine learning models, ensuring high data quality and reliability for both batch & real-time use cases.
- Design, maintain, and optimize data models and data structures in tools such as Snowflake and Databricks.
- Leverage Databricks and cloud-native solutions for big data processing, ensuring efficient management of Spark jobs and seamless integration with other data services.
- Utilize PySpark and/or Ray to build and scale distributed computing tasks, enhancing the performance of machine learning model training and inference processes.
- Monitor, troubleshoot, and resolve issues within data pipelines and infrastructure, implementing best practices for data engineering and continuous improvement.
- Integrate generative AI capabilities into data pipelines and workflows to support advanced use cases such as data augmentation, automated content generation, and natural language processing.
- Collaborate with machine learning engineers to optimize generative AI workflows, ensuring seamless deployment and scalability in production environments.
- Develop APIs and tools to enable internal teams to consume generative AI models and services efficiently.
- Stay informed about advancements in generative AI technologies and recommend their adoption to improve business processes and analytics capabilities.
- Diagrammatically document data engineering workflows and generative AI integrations.
- Collaborate with other Data Engineers, Product Owners, Software Developers, and Machine Learning Engineers to implement new product features by understanding their needs and delivering on time.
Qualifications:
- Minimum of 5 years of experience deploying enterprise-level scalable data engineering solutions.
- Strong examples of independently developed data pipelines end-to-end, from problem formulation and raw data to implementation, optimization, and results.
- Proven track record of building and managing scalable cloud-based infrastructure on AWS (incl. S3, DynamoDB, EMR).
- Experience implementing and managing AI model lifecycles in production, including generative AI models.
- Familiarity with tools like OpenAI API, Hugging Face Transformers, or equivalent platforms for generative AI.
- Strong experience using Apache Airflow (or equivalent), Snowflake, and Lucene-based search engines.
- Advanced SQL and Python knowledge with associated coding experience.
- Experience with Databricks (Delta format, Unity Catalog).
- Strong experience with DevOps practices for continuous integration and continuous delivery (CI/CD).
- Experience wrangling structured & unstructured file formats (Parquet, CSV, JSON).
- Understanding and implementation of best practices within ETL and ELT processes.
- Data quality best practice implementation using tools like Great Expectations.
- Real-time data processing experience using Apache Kafka (or equivalent) is advantageous.
- Knowledge of generative AI model architectures and their integration into scalable systems.
- Proven ability to work independently with minimal supervision.
- Takes initiative and is action-focused.
- Mentors and shares knowledge with junior team members.
- Strong ability to collaborate within cross-functional teams.
- Excellent communication skills with the ability to communicate with stakeholders across varying interest groups.
- Fluency in spoken and written English.
#LI-RT9
We are dedicated to building a diverse, inclusive, and authentic workplace, so if you’re excited about this role but your experience doesn’t perfectly align with every qualification, we encourage you to apply anyway. You may be just the right candidate for this or other roles.