Providing an accurate and reliable estimated time of arrival (ETA) for every Dasher @ DoorDash
From: DoorDash Engineering Blog
With more than 2 billion orders annually, our engineering challenge is to improve and maintain ETA accuracy at scale while handling a wide variety of delivery and merchant scenarios.
To address these challenges, we've developed a NextGen ETA Machine Learning (ML) system.
Using Deep Learning to Enhance Accuracy
Leveraging Probabilistic Models for More Accurate ETAs
Evaluating the Accuracy of a Probabilistic Forecast - Calibration and Accuracy
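One standard way to check the calibration of a probabilistic ETA forecast is to compare each predicted quantile level against the fraction of actual delivery times that fall at or below the predicted quantile value. The sketch below is a minimal illustration on synthetic numbers, not DoorDash's implementation; the quantile levels, predictions, and observed times are made up.

```python
import numpy as np

# Hypothetical data: for each delivery we have predicted quantiles of the ETA
# distribution and the actual observed delivery time (in minutes).
quantile_levels = np.array([0.1, 0.25, 0.5, 0.75, 0.9])
predicted_quantiles = np.array([
    [18, 21, 25, 30, 36],   # delivery 1
    [12, 14, 17, 20, 24],   # delivery 2
    [22, 26, 31, 37, 45],   # delivery 3
])
actual_minutes = np.array([27, 16, 33])

# A forecast is well calibrated if roughly a fraction q of the actual times
# fall at or below the predicted q-th quantile, for every level q.
empirical_coverage = (actual_minutes[:, None] <= predicted_quantiles).mean(axis=0)

for q, cov in zip(quantile_levels, empirical_coverage):
    print(f"quantile {q:.2f}: nominal {q:.2f}, empirical {cov:.2f}")
```

A well-calibrated forecast would show empirical coverage close to each nominal quantile level across a large sample of deliveries.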
Read More: DoorDash Engineering Blog
RAFT: Adapting Language Model to Domain Specific RAG
From: UC Berkeley, Microsoft Research
When integrating Large Language Models (LLMs) into various applications, it often becomes necessary to incorporate new information, such as domain-specific knowledge or proprietary data, through techniques like retrieval-augmented generation (RAG)-based prompting or fine-tuning.
Retrieval Augmented Fine Tuning (RAFT) is a straightforward and powerful fine-tuning recipe that enhances a model's performance when answering questions within specific domains in an "open-book" setting.
Analogy: How to prepare an LLM for an exam?
Retrieval Augmented Fine Tuning (RAFT) presents a novel recipe for preparing fine-tuning data that tailors models to a domain-specific open-book setting, equivalent to in-domain RAG.
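Concretely, the recipe builds training examples where a question is paired with a mix of relevant ("oracle") documents and irrelevant ("distractor") documents, and the target output is a chain-of-thought answer grounded in the oracle document; for a fraction of examples the oracle is deliberately left out so the model does not learn to depend solely on retrieval. The snippet below is a hedged sketch of assembling such an example; the function name, field names, and toy documents are illustrative, not the authors' code.

```python
import random

def build_raft_example(question, oracle_doc, distractor_docs, cot_answer,
                       p_include_oracle=0.8, num_distractors=3):
    """Assemble one RAFT-style fine-tuning example (illustrative sketch).

    With probability p_include_oracle the oracle (golden) document is kept in
    the context alongside distractors; otherwise only distractors are used.
    """
    docs = random.sample(distractor_docs, k=num_distractors)
    if random.random() < p_include_oracle:
        docs.append(oracle_doc)
    random.shuffle(docs)

    context = "\n\n".join(f"[Document {i + 1}]\n{d}" for i, d in enumerate(docs))
    prompt = f"{context}\n\nQuestion: {question}\nAnswer:"
    return {"prompt": prompt, "completion": cot_answer}

# Hypothetical usage with toy domain documents.
example = build_raft_example(
    question="Which command rebuilds the search index?",
    oracle_doc="Running `appctl reindex` rebuilds the search index from scratch.",
    distractor_docs=["Backup docs ...", "User management docs ...",
                     "Billing docs ...", "Theming docs ..."],
    cot_answer="The context says `appctl reindex` rebuilds the search index, "
               "so the answer is `appctl reindex`.",
)
print(example["prompt"][:300])
```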
Read More: UC Berkeley, Microsoft Research
Pollen-Vision: Unified interface for Zero-Shot vision models in robotics
From: Hugging Face Blog
This is a guest blog post by the Pollen Robotics team. We are the creators of Reachy, an open-source humanoid robot designed for manipulation in the real world.
Visual perception: enables robots to identify objects, recognize people, navigate spaces, and much more.
The pollen-vision library is a first step towards empowering our robots with the autonomy to grasp unknown objects. It is a carefully curated collection of vision models chosen for their direct applicability to robotics. Pollen-vision is designed for ease of installation and use, and is composed of independent modules that can be combined to create a 3D object detection pipeline that returns the position of objects in 3D space (x, y, z).
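The final step of such a pipeline, turning a 2D detection into an (x, y, z) position, typically combines the depth at the object's location with the camera intrinsics. The snippet below is a generic pinhole-camera deprojection sketch, not pollen-vision's own API; the pixel coordinates, depth, and intrinsics are placeholder values.

```python
import numpy as np

def pixel_to_3d(u, v, depth_m, fx, fy, cx, cy):
    """Deproject a pixel (u, v) with depth (in meters) to a 3D point in the
    camera frame using the pinhole camera model."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return np.array([x, y, depth_m])

# Hypothetical values: center of a detected bounding box and its median depth.
u, v = 412, 305          # bbox center in pixels
depth_m = 0.62           # depth from an RGB-D camera, in meters
fx, fy, cx, cy = 605.0, 605.0, 320.0, 240.0  # example camera intrinsics
print(pixel_to_3d(u, v, depth_m, fx, fy, cx, cy))
```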
The Core Models of Pollen-Vision
OWL-ViT (Open World Localization - Vision Transformer, by Google Research): This model performs text-conditioned zero-shot 2D object localization in RGB images and outputs bounding boxes (like YOLO); a minimal usage sketch follows the list.
Mobile SAM: A lightweight version of the Segment Anything Model (SAM) by Meta AI. SAM is a zero-shot image segmentation model that can be prompted with bounding boxes or points.
RAM (Recognize Anything Model by OPPO Research Institute): Designed for zero-shot image tagging, RAM can determine the presence of an object in an image based on textual descriptions, laying the groundwork for further analysis.
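To give a feel for the OWL-ViT step mentioned above, the snippet below runs text-conditioned zero-shot detection through the Hugging Face transformers pipeline. It illustrates the underlying model rather than pollen-vision's own wrapper; the image path and candidate labels are placeholders.

```python
from transformers import pipeline
from PIL import Image

# Zero-shot, text-conditioned 2D object detection with OWL-ViT.
detector = pipeline("zero-shot-object-detection", model="google/owlvit-base-patch32")

image = Image.open("workspace.jpg")  # placeholder image path
predictions = detector(image, candidate_labels=["mug", "remote control", "plush toy"])

for pred in predictions:
    box = pred["box"]  # dict with xmin, ymin, xmax, ymax in pixels
    print(f"{pred['label']}: score={pred['score']:.2f}, box={box}")
    # These boxes can then be used as prompts for a segmentation model
    # such as Mobile SAM to obtain per-object masks.
```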
Try pollen-vision
Wanna try pollen-vision? Check out our GitHub repository!