Thursday, February 18, 2016

Data Science at Instacart

https://tech.instacart.com/data-science-at-instacart/

At Instacart, we are revolutionizing the way people buy groceries. We give busy professionals, parents and the elderly back the valuable time they would otherwise spend shopping. We also give flexible work opportunities to thousands of personal shoppers, and we extend the reach and sales volume for our hundreds of retail partners.
We work incredibly hard to make Instacart easy to use. Our site and app are intuitive – you fill your shopping cart, pick the hour you want delivery to occur in, and then the groceries are handed to you at your doorstep. But achieving this simplicity cost-effectively at scale requires an enormous investment in engineering and data science.

What are a few of the teams where data science plays a critical role at Instacart?

Fulfillment
At its core, Instacart is a real-time logistics platform. We are in the business of moving goods from A (a store) to B (your front door) as efficiently and predictably as we can. At any given time, for every market we operate in, we can have thousands of customers expecting delivery in the coming hours. We also can have thousands of shoppers actively working or waiting to be dispatched through our mobile shopper application.
Our fulfillment algorithm decides in real time how to route those shoppers to store locations to pick groceries and deliver them to customers’ doorsteps in as little as one hour. We re-compute this optimization every minute, because the world is constantly changing. We have to balance speed (some shoppers shop faster, some stores are less busy), efficiency (can we deliver multiple orders simultaneously), quality (does the customer get the exact groceries they want) and timeliness (is the order delivered within the hour it is due – no earlier, no later).
Example of near-optimal combinations of orders for our drivers to deliver (noise added to addresses to protect privacy)
Optimizing multiple objectives while routing thousands of shoppers every minute to fulfill millions of orders is a tremendous data science challenge.
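To make the trade-off concrete, here is a minimal sketch of how competing objectives can be folded into a single cost and used to pick assignments greedily. It is not our production algorithm: the `Candidate` fields, the `WEIGHTS` values and the greedy selection are all assumptions made for illustration, and a real system would use a proper optimizer that is re-run as conditions change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One hypothetical way to assign a batch of orders to a shopper."""
    shopper_id: str
    order_ids: tuple
    total_minutes: float          # estimated shop + drive time for the batch
    minutes_early_or_late: float  # summed deviation from the delivery windows
    expected_found_rate: float    # share of items expected to be found, 0..1


# Illustrative weights only (assumed); real trade-offs would be tuned.
WEIGHTS = {"time": 1.0, "timeliness": 5.0, "quality": 60.0}


def cost(c: Candidate) -> float:
    """Lower is better. Batching helps because time is counted per order."""
    per_order_minutes = c.total_minutes / len(c.order_ids)
    return (WEIGHTS["time"] * per_order_minutes
            + WEIGHTS["timeliness"] * c.minutes_early_or_late
            + WEIGHTS["quality"] * (1.0 - c.expected_found_rate))


def greedy_plan(candidates):
    """Take candidates in cost order, never reusing a shopper or an order."""
    plan, used_shoppers, used_orders = [], set(), set()
    for c in sorted(candidates, key=cost):
        if c.shopper_id in used_shoppers or used_orders & set(c.order_ids):
            continue
        plan.append(c)
        used_shoppers.add(c.shopper_id)
        used_orders.update(c.order_ids)
    return plan


if __name__ == "__main__":
    candidates = [
        Candidate("s1", ("o1", "o2"), 60.0, 0.0, 0.95),
        Candidate("s1", ("o1",), 40.0, 0.0, 0.95),
        Candidate("s2", ("o2",), 35.0, 2.0, 0.90),
        Candidate("s2", ("o3",), 45.0, 0.0, 1.00),
    ]
    for c in greedy_plan(candidates):
        print(c.shopper_id, c.order_ids, round(cost(c), 1))
```

A greedy pass like this is easy to reason about but can miss better global combinations, which is part of why the real problem is hard.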

Supply & Demand
Instacart operates a very dynamic and complex fulfillment marketplace. Our consumers place orders (demand) and our shoppers fulfill those orders (supply) in as little as an hour. If supply exceeds demand in a market, we lose money and reduce shopper happiness due to shoppers sitting idle. If instead demand exceeds supply in a market, we lose revenue and customers due to limited availability and busy pricing. Our shoppers work with us to make money, and so they will only be happy if they’re able to keep busy. On the other side, our customers change their lifestyles because of our product, and so we need to be there for them when they want us.
Jeremy and Morgane discussing demand forecasting
Balancing supply and demand requires sophisticated systems for forecasting customer and shopper behavior down to individual store locations by hour of day many days into the future. We then create staffing plans that blend multiple different labor role types to optimize our efficiency while ensuring high availability for our customers. This is made even more challenging by the many different queues we must manage across stores and division of labor. Then in real time, we have to estimate our capacity for orders every time a user visits our site or one of our apps, and then dynamically control availability and busy pricing to smooth demand and create the optimal customer experience.
These systems operate over multiple time horizons, have to solve for multiple competing objectives, and control for many erratic sources of variation (shopper behavior, weather, special events, etc.). We will always have huge opportunities to make improvements here.
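As a rough illustration of the two horizons described above, the sketch below pairs a very simple forecast (the historical mean for each store and hour of week) with a real-time availability decision based on forecast utilization. Everything in it is an assumption for the sake of the example: the function names, the `busy_threshold` of 0.85 and the idea of a single orders-per-shopper-hour rate are hypothetical, and real forecasting would account for weather, events and shopper behavior.

```python
from collections import defaultdict
from statistics import mean


def hour_of_week_forecast(history):
    """Average past orders per (store_id, hour_of_week).

    history: iterable of (store_id, hour_of_week, orders) tuples.
    """
    buckets = defaultdict(list)
    for store_id, hour_of_week, orders in history:
        buckets[(store_id, hour_of_week)].append(orders)
    return {key: mean(values) for key, values in buckets.items()}


def availability(forecast_orders, shopper_hours, orders_per_shopper_hour,
                 busy_threshold=0.85):
    """Map forecast utilization to a (hypothetical) availability state."""
    capacity = shopper_hours * orders_per_shopper_hour
    if capacity <= 0:
        return "unavailable"
    utilization = forecast_orders / capacity
    if utilization >= 1.0:
        return "unavailable"
    if utilization >= busy_threshold:
        return "busy_pricing"
    return "available"


if __name__ == "__main__":
    history = [("s1", 10, 20), ("s1", 10, 30), ("s1", 11, 8)]
    forecast = hour_of_week_forecast(history)
    print(availability(forecast[("s1", 10)], shopper_hours=10,
                       orders_per_shopper_hour=3))
```

Staffing plans then amount to choosing `shopper_hours` ahead of time so that utilization stays high without tipping into unavailability.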

Search & Personalization
Instacart isn’t just grocery delivery; we’re creating a better grocery shopping experience. A majority of grocery shopping is about finding the food on your list. In a traditional grocery store, the search engine is the customer’s two feet. At Instacart, it’s a massive algorithm that can mine billions of searches to ensure every product a customer wants is at their fingertips.
At a physical grocery store, you have to discover new products on your own. But at Instacart, we can curate the experience for you through personalization. What could be more personal than food? We have an intimate relationship with it every day – we put it in our bodies! As much as movie recommendations were critical to the success of Netflix, so too are product recommendations critical to Instacart.
The search team - Sharath, Vincent, Raj and Jon from left
Our consumers order large basket sizes of diverse foods over and over and over again from us. We have more density on our user behavior than any e-commerce company I have ever seen. We are just beginning to use that data to provide incredibly valuable personalized experiences for our users on Instacart in search, in product discovery and in suggestions we make to our users. We A/B test everything, and are thinking really hard about the long term impacts of our changes.
Through investments in search and personalization, Instacart has the opportunity to go beyond convenience in shopping online, and into a future where everyone finds more food they love faster.
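For readers less familiar with A/B testing, here is a minimal sketch of one common way to compare conversion rates between a control and a variant: a two-proportion z-test. It is a textbook method shown under simple assumptions (independent users, a single metric, a fixed sample size), not a description of our experimentation platform, and the function name and inputs are made up for this example.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a: conversions and users in control.
    conv_b / n_b: conversions and users in the variant.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A test like this only speaks to the short-term metric being measured; judging the long-term impact of a change takes longer-running holdouts and more careful thought.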

How does data science work at Instacart?

We have made the conscious decision to embed our data scientists into our product teams, side-by-side with their engineers, designers and product managers and reporting into the engineering leader for the team. So to answer this question, you first have to understand how engineering works at Instacart.
At Instacart, we place a high value on ownership, speed and ultimately shipping products that have a huge measurable impact. In engineering, we have organized to optimize for these values. We have many product teams, each of which has full-stack and/or mobile developers, designers, analysts, product managers and engineering leaders dedicated to it. Some teams are only 3 people, others are up to 10. Each team completely owns its ‘product’, defines its key metrics and sets its roadmap weekly.
We align all of these teams to a small (three or fewer) set of company-wide goals that are updated whenever they are achieved or exceeded. These company goals are concise, measurable and time-bound objectives set by our board and executive team that the entire company is committed to. We are obsessively focused on them, and are incredibly transparent about our status and progress on these goals – our CEO sends detailed notes on each weekly.
So every week, every product team answers the question “what can we do to have the biggest impact on our company’s goals this week?” They are then empowered to do whatever they need to within their product to achieve those goals. It’s their ideas, their creativity, their collaboration, their resourcefulness, and their hard work that really move the needle.
Jeremy presenting on visualization at a Friday engineering lunch
For technology companies, data science can either be an integral component of huge value creation, or an expensive and distracting hobby. Many factors determine the outcome, but how you organize your data scientists is one of the biggest. By embedding our data scientists into product teams, we’ve ensured that they are as integral a part of their teams as they can be. As the VP of data science, it’s my job to make sure that the data scientists stay connected, have the mentorship they need, and are having the biggest impact they can within their teams.
The data scientists have a tremendous amount of traction in this model. Their ideas can directly shape not only product innovation, but also data collection and infrastructure innovation to fuel future product ideas. They work directly with their team to bring their products the ‘last mile’ to production. This lets data scientists put new ideas into production within days of inception, and rapidly iterate on those ideas as they receive feedback from their consumers. This also gives data scientists a holistic view of their product, and helps to ensure they are optimizing for the right objectives as effectively as possible.

What are some areas Instacart is expecting to invest in data science in the near future?

Shoppers
Our shoppers are very important to our company. They shop for our customers in the stores, communicate with them live to resolve any issues, and bring the food to their doorstep thousands of times every hour. We can use data science to optimize how we on-board these shoppers and ensure they are successful. We can also optimize and personalize our shopper application to ensure our shoppers can do their jobs quickly and effectively.
Partners
In many companies, advertising is a necessary evil. At Instacart, we have been able to integrate advertising in a way that is a clear win for the advertiser, for the customer and for Instacart! Our Deals program lets consumer packaged goods companies offer discounts to our customers (they love them!). Ensuring that customers see the deals they would be most interested in, and that the advertisers get a high ROI for their spend, is a huge data science opportunity for Instacart.

What do you look for in Data Scientists when recruiting?

Our organizational structure works because we have amazing talent. You can’t move as fast as we do, with as much distributed ownership as we have, all while solving challenges like ours without the right people.
Bala, Mathieu and Sherin discussing batching (from left)
Our values form the cornerstone of our culture, and these in particular are key for hiring data scientists:
Customer Focus
“Everything we do is in service to our customers. We will work tirelessly to gain the trust of our customers, and to improve their lives. This is the first priority for everyone at Instacart.”
We seek to understand the problems we work on as holistically as we can, and to reason through the physics of the system and how our many constituents (consumers, shoppers, our partners) will experience the changes we drive. We look for candidates that naturally think about problems from a “first principles” basis for what is best for the end user.
Take Ownership
“We will take full ownership of our projects. We take pride in our work and relentlessly execute to get things completely finished.”
In data science, this means improving algorithms and analyzing data are never enough. We own the problem, the solution, the implementation and the measurement – along with everyone else on our team. Simply put, until the desired impact has been measured, our work isn’t done. We look for candidates who crave this opportunity for impact.
Sense of Urgency
“We work extremely fast to drive our projects to completion and we will not rest until they are done.”
Many data science teams think about impact in quarters, months or weeks. Our teams regularly iterate on hard problems in a matter of days – from R&D to implementation and measurement. We look for candidates with a bias towards action, and the fortitude to pursue aggressive goals relentlessly.
Highest Standards
“We put our heart and soul into the projects to deliver the highest quality work product. We only produce work that we are proud of.”
With ownership and a mandate for urgency comes a great responsibility – we must maintain the highest standards possible for the work we produce, as it has the potential to impact millions of consumers, thousands of shoppers and hundreds of retail partners. We look for exceptional candidates who can do amazing work, and are always seeking better ways – be they new algorithms, new processes or new implementations.
Humility
“We appreciate that great ideas can come from anywhere and we will be humble and open minded in considering the ideas of others.”
Many of our best data science ideas have come from Instacart employees in the field – working directly with our shoppers in our stores, or interacting directly with our customers. It is critically important that we keep our eyes wide open to these ideas, collaborate openly within our teams, and stay willing to question our biases and assumptions. We look for candidates who are conscious of their limitations, and always open to the ideas of others – wherever those ideas may come from.

What Data Science roles is Instacart recruiting for?

We are looking for data scientists with expertise in forecasting, predictive modeling, ads optimization, search and recommendations. We are also looking for operations research scientists with expertise in planning, logistics and real time control systems. Our team uses Python, R, SQL (Postgres & Redshift) and Spark extensively, so mastery of some of those tools and technologies is also helpful.
