Data Engineer (ES)

The Role

Hybrid Theory’s continued growth has led to a need for a Data Engineer to join the team in our Data Hub in Barcelona. A number of new projects are about to start or have just started – including a complete rewrite of our ETL and Machine Learning pipelines and a restructuring of our reporting system – and we need a passionate Data Engineer to contribute to them. This work will deliver enormous value to our product offering and position us as the best in our industry.

Our ETL gathers data from our own services and various third-party sources. This data is fed into our data lake and warehouse, where several processes extract value from it. There is an ongoing modernisation and rewriting effort in this area, mainly focused on Spark and SQL. To give a sense of scale, we consume around 25 billion event lines each month. These need to be validated, cleaned, matched with identifiers and then stored. As you can imagine, it is a data logistics and data management challenge, with plenty of room for learning and contributing as you grow into the role.
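To give a flavour of the validate → clean → match → store flow described above, here is a minimal sketch in plain Python. The field names, the identifier table and the helper functions are all hypothetical illustrations – the production pipeline does this at scale with Spark rather than a generator loop.

```python
# Hypothetical mapping from a third-party cookie id to our internal user id.
KNOWN_USERS = {"cookie-123": "user-42"}

def validate(event: dict) -> bool:
    """An event line must carry a timestamp and a cookie id to be usable."""
    return bool(event.get("ts")) and bool(event.get("cookie_id"))

def clean(event: dict) -> dict:
    """Normalise free-text fields before matching."""
    return {**event, "url": event.get("url", "").strip().lower()}

def match(event: dict) -> dict:
    """Attach our internal user identifier where we have one."""
    return {**event, "user_id": KNOWN_USERS.get(event["cookie_id"])}

def process(raw_events):
    """Yield validated, cleaned and matched events, ready to be stored."""
    for event in raw_events:
        if validate(event):
            yield match(clean(event))

events = [
    {"ts": 1700000000, "cookie_id": "cookie-123", "url": "  HTTPS://Example.com  "},
    {"ts": None, "cookie_id": "cookie-999"},  # dropped: missing timestamp
]
processed = list(process(events))
```

The same shape – a chain of small, individually testable transformations over a stream of records – carries over directly to Spark DataFrames.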

You will work with the lead data engineer and the rest of the data and engineering team in the areas mentioned above, helping to improve the design, optimisation and modelling. You will write production-level code in Python and SQL (with some Scala if you want to learn), interacting with our data lake (AWS S3, Databricks, Delta Lake), Kafka and our data warehouse (AWS Redshift) to clean, structure and populate data going into the DW and from the DW to the reporting subsystems. You will also work extensively with AWS – we are heavy users of their services. If you are just starting with any of these technologies, you will be mentored and trained.

What we need from you

  • Professional experience in a data environment, or experience as a software engineer with a desire to grow into the data space.
  • Some programming ability, ideally with an appreciation for testing and validation. Knowledge of Python (or Scala) would be a plus, but you will be trained in anything you don’t know.
  • Some knowledge of SQL (including the different types of joins) is a strong requirement. You will be trained in any additional use case (indexing, optimisation, columnar storage, windowing, key distribution…).
  • A basic understanding of Agile principles (we work using a full Scrum environment).
  • An interest in writing great code.
  • Openness to collaboration and a willingness to learn.
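As a small, self-contained taste of the SQL knowledge mentioned above – the difference between join types – here is a sketch using Python’s built-in sqlite3 module. The table and column names are hypothetical.

```python
import sqlite3

# In-memory database with two toy tables: campaigns and their click events.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE campaigns (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE clicks (campaign_id INTEGER, ts TEXT);
    INSERT INTO campaigns VALUES (1, 'spring'), (2, 'summer');
    INSERT INTO clicks VALUES (1, '2024-01-01');
""")

# INNER JOIN: only campaigns that actually have clicks appear.
inner = con.execute("""
    SELECT c.name, COUNT(k.ts)
    FROM campaigns c
    JOIN clicks k ON k.campaign_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

# LEFT JOIN: every campaign appears, with a count of 0 where no clicks exist.
left = con.execute("""
    SELECT c.name, COUNT(k.ts)
    FROM campaigns c
    LEFT JOIN clicks k ON k.campaign_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
```

Here `inner` contains only the campaign with clicks, while `left` keeps both campaigns and reports a zero count for the one without – the distinction that matters constantly when building reporting queries.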

What you’ll get from us

  • The chance to work with large-scale datasets.
  • Exposure to a wide variety of technologies: we always try to use the best tool for the job.
  • Experience with cloud-based technologies.
  • The opportunity to learn from a senior Agile team.
  • Room to grow into your responsibilities over time and become an expert.
  • Remote working, collaborating with our distributed team (mainly in Barcelona and London, but with offices in the US, Singapore and Australia).
  • Flexible working hours, fitness fund and private health insurance.

Hybrid Theory is an established company with the feel of a start-up. We like to maintain and cultivate our company culture: a flat structure, autonomy for employees, and a friendly, sociable team. Come and check us out!

About Hybrid Theory

At Hybrid Theory we use big data to help companies find new customers.

We do this by harnessing technology and talent to power data-driven advertising across the customer journey, through expert operational and data management capabilities, deep customer and industry intelligence, and flawless campaign activation.

We are a global business with colleagues worldwide serving the North America, EMEA and APAC regions. We are recognised for our capabilities through several award nominations and wins, including Best Trading Team at the Drum Marketing Awards 2020. Our biggest recognition, though, is our 90% retention rate, which has contributed to substantial yearly growth.

If you are interested, please contact data.careers at HybridTheory.com
with the reference “Hybrid Theory - Data Engineer”.
Keywords: Python, Scala, ETL, SQL, AWS, Kafka, Hadoop, Big Data, Agile, Scrum, Clean Code, TDD.