As one of the UK’s leading digital retailers, Sainsbury’s Argos generates vast amounts of data, and we are always looking for better ways to mine it for new insights that improve our understanding of our customers’ needs and optimise every aspect of our business.
The next stage of our journey is to use the latest Big Data tools from the Hadoop ecosystem to accelerate our ability to understand our ever-growing volumes of data.
To help with this journey, we are looking for a Senior Data Engineer with recent experience in Big Data development, as well as experience with traditional database and data warehousing environments.
The role requires someone who can provide design leadership for the existing Data Engineers: guiding their work, providing technical assistance where needed, and reviewing and verifying the quality of the delivered work.
The role also involves combining existing internal and external sources of data to construct higher-order data structures that allow specialist Reporting, Analytics and Machine Learning teams to generate actionable intelligence for the wider business.
As well as bringing technical skills, the ideal candidate is someone who can work with people throughout the company: collaborating with source system teams to fully understand the data being loaded, and building relationships with the consumers of the data to ensure that we are creating the right data structures for their needs.
The person can be based in either London or Milton Keynes, but must be able to travel occasionally to the other office, and will share support responsibility for the operational system.
Please note that this is not a Hadoop Administrator role.
Taking charge of the design and implementation of data structures being built within the Sainsbury’s Argos Big Data Platform
Organising the Data Engineers in the team as they load, cleanse, combine and remodel the data so it can be consumed by users across the business
Acting as Design Authority for development work and reviewing the work delivered by other team members
Being a Subject Matter Expert on data storage within a Hadoop Big Data Platform
Working with Source System owners to understand their data and to integrate it into the Big Data Platform
Working with data consumers (Machine Learning, Data Scientists and Reporting Analysts) to understand their requirements and to create the optimal data structures for their needs
Ensuring that security controls are always applied to the data, in compliance with all relevant security requirements
Balancing the needs of ad-hoc analysis with the scheduled processing of data for the wider business community
Providing thought leadership on all aspects of Big Data and constantly looking for new ways to enhance our data
The core Hadoop infrastructure, including HDFS, YARN, Hive, Impala, Sqoop and Flume. Exposure to the Cloudera stack is a bonus.
In-memory data processing with Spark, and distributed storage with HBase.
Using Apache Kafka as a streaming data source.
Programming experience in Python or Java and Linux Shell scripting.
Data Loading with ETL tools such as Talend or Informatica.
NoSQL products such as MongoDB.
Cloud computing with AWS, including working with S3 and EMR.
Traditional database development with one or more of the major databases: SQL Server, Oracle, Teradata or MySQL.
Data Warehousing and the creation of OLAP Dimensions and Facts from OLTP data sources.
Data cleansing, data enrichment and reconciliation with source systems.
Data modelling with experience in creating optimal data structures.
Query performance tuning.
An understanding of Data Security, Encryption and GDPR.
Working within an Agile development team.