About the Role -
As a Data Engineer on the Data Infrastructure team, you will build platforms and tools that process and analyze petabytes of data, and you will lead a robust team. You will work with technologies such as Apache Kafka, Apache Spark, Aerospike and Redshift to build scalable infrastructure that delivers recommendations to our users in real time.
The pace of our growth is incredible – if you want to tackle hard and interesting problems at scale, and create an impact within an entrepreneurial environment, join us!
Your Key Responsibilities -
You will work closely with Software Engineers and ML Engineers to build data infrastructure that serves the needs of multiple teams, systems and products
You will automate manual processes, optimize data delivery and build the infrastructure for extracting, transforming and loading data across a wide variety of use cases using SQL/Spark
You will build stream-processing pipelines and tools to support a wide variety of analytics and audit use cases
You will continuously evaluate relevant technologies, and influence and drive architecture and design discussions
You will work in a cross-functional team and collaborate with peers throughout the entire SDLC
What to Bring -
BE/B.Tech/BS/MS/PhD in Computer Science or a related field (ideal)
Minimum 4 years of work experience building data warehouse and BI systems
Strong Java skills
Experience with Go or Python (a plus)
Experience with Apache Spark, Hadoop, Redshift and Athena
Strong understanding of database and storage fundamentals
Experience with the AWS stack
Ability to create data-flow designs and write complex SQL/Spark-based transformations
Experience working on real-time streaming data pipelines using Spark Streaming or Storm