Data Engineer

Posted 4 months ago
Gurugram, Haryana
Application deadline closed.

Job Description

What You’ll Do:
• Build ETL and data-processing workflows for various projects
• Migrate Hadoop, MongoDB, and MySQL workloads to AWS (S3, Glue, EMR, Kinesis, Lambda, Athena, QuickSight)
• Build data marts for different business streams using Python, Spark, Unix, MySQL, and Hive
• Maintain the HDFS cluster and store data on HDFS
• Handle Hadoop administration (DataNode additions and removals, performance maintenance of Hadoop services such as Hive and Spark)
• Build data-quality tracking mechanisms and resolve disruptions in data ingestion and processing
• Address questions and concerns from data consumers (Data Science and Analytics teams)
• Build a data lake in AWS and near-real-time pipelines using Kinesis, Kafka, or Spark Streaming
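The ETL and data-quality work listed above can be sketched in miniature with standard-library Python. This is an illustrative sketch only: the CSV data, table, and column names are invented, and a real pipeline in this role would run on Spark/Glue against S3 or Hive rather than in-memory SQLite.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; a real extract would read from S3, Kinesis, or HDFS.
RAW_CSV = """order_id,amount,country
1,120.50,IN
2,99.99,US
3,15.00,IN
"""

def extract(raw: str) -> list:
    """Extract: parse raw CSV rows into dicts."""
    return list(csv.DictReader(io.StringIO(raw)))

def transform(rows: list) -> list:
    """Transform: cast types and keep only well-formed rows."""
    out = []
    for r in rows:
        try:
            out.append((int(r["order_id"]), float(r["amount"]), r["country"]))
        except (KeyError, ValueError):
            # A data-quality tracking mechanism would log/flag rejected rows here.
            continue
    return out

def load(rows: list, conn: sqlite3.Connection) -> None:
    """Load: write cleaned rows into a warehouse-style table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INT, amount REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
```

The extract/transform/load split mirrors how such workflows are usually structured, whatever the engine underneath.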

What Makes You A Great Fit:
• Experience with and detailed knowledge of data warehouse technical architectures, infrastructure components, ETL/ELT, and reporting/analytics tools
• Experience in data engineering (data warehousing, data marts, data modelling, facts, dimensions, star/snowflake schemas)
• Experience working with SQL and NoSQL databases and query engines such as MySQL, Hive, Athena, and MongoDB
• Good programming skills in Python, Java, or Scala (Python preferred)
• Experience building products in a cloud-based environment, especially AWS and services such as EC2, Lambda, S3, Glue, EMR, Kinesis, Athena, QuickSight, and CloudWatch
• Experience working with Spark and MapReduce
• Experience working with orchestration tools such as Airflow
• Experience maintaining a Hadoop cluster (Hortonworks preferred)
• In-depth knowledge of Linux/UNIX for processing large data sets
• Knowledge of optimisation techniques for Spark, Hive, and Athena
• Working knowledge of Kafka or Spark Streaming
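As an illustration of the star-schema modelling named in the requirements, here is a minimal, hypothetical fact/dimension pair queried with a typical mart-style aggregation. All table and column names are invented for this sketch, and SQLite stands in for whatever warehouse engine (Hive, Athena, MySQL) a real mart would use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Star schema in miniature: one dimension table, one fact table keyed to it.
conn.executescript("""
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT
);
CREATE TABLE fact_sales (
    sale_id     INTEGER PRIMARY KEY,
    product_key INTEGER REFERENCES dim_product(product_key),
    quantity    INTEGER,
    revenue     REAL
);
INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware');
INSERT INTO fact_sales  VALUES (10, 1, 2, 40.0), (11, 2, 1, 25.0), (12, 1, 3, 60.0);
""")

# Typical mart query: aggregate facts grouped by a dimension attribute.
rows = conn.execute("""
    SELECT d.product_name, SUM(f.revenue) AS total_revenue
    FROM fact_sales f
    JOIN dim_product d ON d.product_key = f.product_key
    GROUP BY d.product_name
    ORDER BY total_revenue DESC
""").fetchall()
```

The same fact-joins-dimension shape scales to snowflake schemas by normalising the dimension tables further.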