Providing an accurate and reliable estimated time of arrival (ETA) for every Dasher @ DoorDash
From: DoorDash Engineering Blog
With more than 2 billion orders annually, our engineering challenge is to improve and maintain ETA accuracy at scale while handling a wide variety of delivery and merchant scenarios.
To address these challenges, we've developed a NextGen ETA Machine Learning (ML) system.
Using Deep Learning to Enhance Accuracy
Leveraging Probabilistic Models for More Accurate ETAs
Evaluating the Accuracy of a Probabilistic Forecast - Calibration and Accuracy
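One standard way to check the calibration of a probabilistic ETA forecast is to compare each predicted quantile level against the fraction of actual delivery times that fall at or below the predicted quantile value. The sketch below is a minimal illustration on synthetic numbers, not DoorDash's implementation; the quantile levels, predictions, and observed times are made up.

```python
import numpy as np

# Hypothetical data: for each delivery we have predicted quantiles of the ETA
# distribution and the actual observed delivery time (in minutes).
quantile_levels = np.array([0.1, 0.25, 0.5, 0.75, 0.9])
predicted_quantiles = np.array([
    [18, 21, 25, 30, 36],   # delivery 1
    [12, 14, 17, 20, 24],   # delivery 2
    [22, 26, 31, 37, 45],   # delivery 3
])
actual_minutes = np.array([27, 16, 33])

# A forecast is well calibrated if roughly a fraction q of the actual times
# fall at or below the predicted q-th quantile, for every level q.
empirical_coverage = (actual_minutes[:, None] <= predicted_quantiles).mean(axis=0)

for q, cov in zip(quantile_levels, empirical_coverage):
    print(f"quantile {q:.2f}: nominal {q:.2f}, empirical {cov:.2f}")
```

A well-calibrated forecast would show empirical coverage close to each nominal quantile level across a large sample of deliveries.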
Read More: DoorDash Engineering Blog
RAFT: Adapting Language Model to Domain Specific RAG
From: UC Berkeley, Microsoft Research
When integrating Large Language Models (LLMs) into various applications, it often becomes necessary to incorporate new information, such as domain-specific knowledge or proprietary data, through techniques like retrieval-augmented generation (RAG)-based prompting or fine-tuning.
Retrieval Augmented Fine Tuning (RAFT) is a straightforward and powerful fine-tuning recipe that enhances a model's performance when answering questions within specific domains in an "open-book" setting.
Analogy: How to prepare an LLM for an exam?
Retrieval Augmented Fine Tuning (RAFT) presents a novel recipe for preparing fine-tuning data that tailors models to a domain-specific open-book setting, equivalent to in-domain RAG.
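Concretely, the recipe builds training examples where a question is paired with a mix of relevant ("oracle") documents and irrelevant ("distractor") documents, and the target output is a chain-of-thought answer grounded in the oracle document; for a fraction of examples the oracle is deliberately left out so the model does not learn to depend solely on retrieval. The snippet below is a hedged sketch of assembling such an example; the function name, field names, and toy documents are illustrative, not the authors' code.

```python
import random

def build_raft_example(question, oracle_doc, distractor_docs, cot_answer,
                       p_include_oracle=0.8, num_distractors=3):
    """Assemble one RAFT-style fine-tuning example (illustrative sketch).

    With probability p_include_oracle the oracle (golden) document is kept in
    the context alongside distractors; otherwise only distractors are used.
    """
    docs = random.sample(distractor_docs, k=num_distractors)
    if random.random() < p_include_oracle:
        docs.append(oracle_doc)
    random.shuffle(docs)

    context = "\n\n".join(f"[Document {i + 1}]\n{d}" for i, d in enumerate(docs))
    prompt = f"{context}\n\nQuestion: {question}\nAnswer:"
    return {"prompt": prompt, "completion": cot_answer}

# Hypothetical usage with toy domain documents.
example = build_raft_example(
    question="Which command rebuilds the search index?",
    oracle_doc="Running `appctl reindex` rebuilds the search index from scratch.",
    distractor_docs=["Backup docs ...", "User management docs ...",
                     "Billing docs ...", "Theming docs ..."],
    cot_answer="The context says `appctl reindex` rebuilds the search index, "
               "so the answer is `appctl reindex`.",
)
print(example["prompt"][:300])
```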
Read More: UC Berkeley, Microsoft Research
Pollen-Vision: Unified interface for Zero-Shot vision models in robotics
From: Hugging Face Blog
This is a guest blog post by the Pollen Robotics team. We are the creators of Reachy, an open-source humanoid robot designed for manipulation in the real world.
Visual perception: enables robots to identify objects, recognize people, navigate spaces, and much more.
The pollen-vision library is a first step towards empowering our robots with the autonomy to grasp unknown objects. It is a carefully curated collection of vision models chosen for their direct applicability to robotics. Pollen-vision is designed for ease of installation and use, and is composed of independent modules that can be combined to create a 3D object detection pipeline that returns the position of objects in 3D space (x, y, z).
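The final step of such a pipeline, turning a 2D detection into an (x, y, z) position, typically combines the depth at the object's location with the camera intrinsics. The snippet below is a generic pinhole-camera deprojection sketch, not pollen-vision's own API; the pixel coordinates, depth, and intrinsics are placeholder values.

```python
import numpy as np

def pixel_to_3d(u, v, depth_m, fx, fy, cx, cy):
    """Deproject a pixel (u, v) with depth (in meters) to a 3D point in the
    camera frame using the pinhole camera model."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return np.array([x, y, depth_m])

# Hypothetical values: center of a detected bounding box and its median depth.
u, v = 412, 305          # bbox center in pixels
depth_m = 0.62           # depth from an RGB-D camera, in meters
fx, fy, cx, cy = 605.0, 605.0, 320.0, 240.0  # example camera intrinsics
print(pixel_to_3d(u, v, depth_m, fx, fy, cx, cy))
```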
The Core Models of Pollen-Vision
OWL-ViT (Open World Localization - Vision Transformer, by Google Research): This model performs text-conditioned zero-shot 2D object localization in RGB images and outputs bounding boxes (like YOLO); a minimal usage sketch follows the list.
Mobile SAM: A lightweight version of the Segment Anything Model (SAM) by Meta AI. SAM is a zero-shot image segmentation model that can be prompted with bounding boxes or points.
RAM (Recognize Anything Model by OPPO Research Institute): Designed for zero-shot image tagging, RAM can determine the presence of an object in an image based on textual descriptions, laying the groundwork for further analysis.
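To give a feel for the OWL-ViT step mentioned above, the snippet below runs text-conditioned zero-shot detection through the Hugging Face transformers pipeline. It illustrates the underlying model rather than pollen-vision's own wrapper; the image path and candidate labels are placeholders.

```python
from transformers import pipeline
from PIL import Image

# Zero-shot, text-conditioned 2D object detection with OWL-ViT.
detector = pipeline("zero-shot-object-detection", model="google/owlvit-base-patch32")

image = Image.open("workspace.jpg")  # placeholder image path
predictions = detector(image, candidate_labels=["mug", "remote control", "plush toy"])

for pred in predictions:
    box = pred["box"]  # dict with xmin, ymin, xmax, ymax in pixels
    print(f"{pred['label']}: score={pred['score']:.2f}, box={box}")
    # These boxes can then be used as prompts for a segmentation model
    # such as Mobile SAM to obtain per-object masks.
```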
Try pollen-vision
Wanna try pollen-vision? Check out our GitHub repository!