Bengaluru, Karnataka (+1 other)
The data engineer focuses on making data correct and accessible, and on building scalable systems to access and process it. Another major responsibility is helping AI/ML engineers write better code. Data engineers also build scalable, high-performance, data-intensive services.
– We have data pipelines processing aggregate and statistical data. Should we store this
in Redshift, in flat files in S3, or somewhere else?
– How should we structure our data pipelines?
– We need to track various data points to identify our customers in various locations,
including from different devices, and determine that two seemingly disparate users are
actually the same. How can we do this efficiently and effectively?
Your job is to understand what we’re trying to build, make informed choices on questions like these, and then get us on board.
The ideal candidate will be skilled at processing/querying medium-sized data sets, configuring/understanding medium-scale deploys of big data systems… (Spark, Dask, Kafka, etc.), and building custom tools when needed.
Example interview questions:
– Consider the query `SELECT * FROM foo INNER JOIN bar ON foo.x = bar.x WHERE foo.primary_key = ?`. What happens if you run this in Postgres? How does that differ if you run it in Redshift, or SparkSQL?
– Suppose we store a table in flat CSV files on S3. What kinds of jobs is this good for, and bad for? How is Parquet or BerkeleyDB different?
– What data structures are good for storing a graph, assuming the common query is finding a connected component?
On these questions, I’m primarily interested in computer science fundamentals. A good answer might be “a B-Tree, with keys structured as …”. A bad answer might be “use Neo4j, I don’t know how it works but it’s fast”.
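To illustrate the kind of fundamentals-first reasoning meant here, below is a minimal sketch of one possible approach to the connected-component question: a union-find (disjoint-set) structure with path compression and union by size. It is not the expected answer, only an example. The class name `DisjointSet`, its method names, and the sample node IDs are all hypothetical and chosen purely for illustration.

```python
class DisjointSet:
    """Union-find sketch (hypothetical names, for illustration only)."""

    def __init__(self):
        self.parent = {}  # node -> parent node; a root is its own parent
        self.size = {}    # root -> number of nodes in its component

    def find(self, x):
        # Register unseen nodes as their own singleton component.
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1
            return x
        # First pass: walk up to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: path compression, point every visited node at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        # Record an edge a-b by merging the two components.
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        # Union by size: attach the smaller tree under the larger one.
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]

    def connected(self, a, b):
        return self.find(a) == self.find(b)

    def component(self, x):
        # Linear scan over all known nodes; fine for a sketch.
        root = self.find(x)
        return {n for n in list(self.parent) if self.find(n) == root}


# Sample edges (made-up node IDs).
ds = DisjointSet()
for a, b in [("u1", "u2"), ("u2", "u3"), ("u4", "u5")]:
    ds.union(a, b)

print(ds.connected("u1", "u3"))
print(sorted(ds.component("u4")))
```

The trade-off worth discussing: this structure makes "are these two nodes in the same component?" nearly constant time (amortized) as edges arrive, but it supports only edge additions, not deletions, and it does not store the edges themselves. An adjacency list with BFS/DFS is the alternative when edges can be removed or when paths are needed.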
Note: This role is perfect for an engineer who wants to get into AI/ML. Your direct responsibilities will put you on the boundary of engineering and AI. If you want to break into the field, this is the role for you.
Simpl focuses on E-Commerce, Payments, Credit, and Finance Technology. The company has offices in Bengaluru and a mid-size team of 51-200 employees.
You can view their website at http://getsimpl.com or find them on Twitter, Facebook, LinkedIn, and Product Hunt.