Gurugram, Haryana (+1 other)
What You’ll do:
• Deliver high-value, next-generation products on aggressive deadlines, writing high-quality, high-performance, maintainable code
• Manage ETL/ELT pipelines across various microservices
• Work on distributed/big-data systems to build, release, and maintain an always-on, scalable data processing and reporting platform
• Work on relational and NoSQL databases
• Build scalable architectures for data storage, transformation and analysis
What makes you a great fit:
• Experience writing PySpark jobs for data transformation (a minimal sketch is included after this list)
• Detailed knowledge of data warehouse technical architectures, ETL/ELT, reporting/analytics tools, and data security
• Experience leading data warehousing and analytics projects, including AWS technologies such as Redshift, S3, EC2, and Data Pipeline, as well as other big data technologies
• Exposure to at least one ETL tool
• Exposure to at least one reporting tool such as Redash, Tableau, or similar is a plus
• Familiarity with Linux/Unix scripting
• Expertise with tools in the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce, Sqoop, Storm, Spark, Kafka, YARN, Oozie, and ZooKeeper
• Solid experience building REST APIs, Java services, or Dockerized microservices
• Experience with data pipelines using Apache Kafka, Storm, Spark, AWS Lambda, or similar technologies (see the streaming sketch after this list)
• Experience working with terabyte-scale data sets using relational databases (RDBMS) and SQL
• Experience with Hadoop, MPP database platforms, and other NoSQL technologies (MongoDB, Cassandra) is a big plus
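The PySpark bullet above refers to day-to-day transformation work. Below is a minimal sketch of such a job; the dataset path and column names (event_date, user_id, revenue) are illustrative assumptions, not details from this posting.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Minimal PySpark transformation sketch; input path and columns are hypothetical.
spark = SparkSession.builder.appName("daily_revenue_rollup").getOrCreate()

events = spark.read.parquet("s3://example-bucket/events/")

daily_revenue = (
    events
    .filter(F.col("revenue").isNotNull())
    .groupBy("event_date", "user_id")
    .agg(F.sum("revenue").alias("total_revenue"))
)

# Partitioned output that a reporting tool (Redash, Tableau) could read downstream.
daily_revenue.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/marts/daily_revenue/"
)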
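For the streaming-pipeline bullet (Kafka plus Spark), a very small Spark Structured Streaming sketch is shown below; the broker address, topic name, and output paths are assumptions made for illustration.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Minimal Kafka-to-Spark Structured Streaming sketch; broker, topic, and paths are hypothetical.
spark = SparkSession.builder.appName("kafka_ingest_sketch").getOrCreate()

raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
    .option("subscribe", "events")                      # hypothetical topic
    .load()
)

# Kafka delivers key/value as binary; cast the value to a string for downstream parsing.
messages = raw.select(F.col("value").cast("string").alias("json_payload"))

query = (
    messages.writeStream
    .format("parquet")
    .option("path", "s3://example-bucket/raw/events/")          # hypothetical sink
    .option("checkpointLocation", "s3://example-bucket/chk/")   # hypothetical checkpoint
    .outputMode("append")
    .start()
)
query.awaitTermination()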