Motivation of CLEAR Benchmark

A continual/lifelong learning benchmark capturing natural distribution shifts of Internet imagery over a decade

Why Continual Learning?

Most of the successes in today's vision and learning community have been achieved on static benchmarks that have not changed since their release:

[Figure: ImageNet (2010) and COCO (2015) are modern touchstones for visual recognition and detection algorithms, yet they do not model the temporal dynamics of the real world.]

In the real world, this "IID" assumption usually does not hold. Researchers have therefore invested effort in the field of continual (or incremental/lifelong) learning, aiming for learning systems that are more robust under distribution shifts.

Yet most existing works focus on combating the catastrophic forgetting of neural networks, a phenomenon commonly observed on popular continual learning benchmarks with extreme distribution shifts between tasks, such as Permuted-MNIST, Split-CIFAR, and Incremental-ImageNet.

[Figure: Popular continual learning benchmarks that do not align with practical applications.]

Built from existing vision datasets, these benchmarks usually contain synthetic distribution shifts, e.g., randomly shuffling pixels or splitting labels into disjoint subsets (see the sketch below). Instead, we posit that a more practical continual learning benchmark should reflect how the real world changes, such as when autonomous vehicles (AVs) move to a new city or encounter brand-new car models:

[Figure: Examples of real-world distribution shifts for AVs.]
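To make the contrast concrete, here is a minimal sketch of how such synthetic streams are typically constructed from a static dataset. This is an illustration, not any benchmark's reference implementation; the function names and the flattened-image layout are our own assumptions.

```python
import numpy as np

def make_permuted_tasks(images, labels, num_tasks, seed=0):
    """Permuted-MNIST-style stream: each task applies one fixed random
    pixel permutation to every (flattened) image."""
    rng = np.random.default_rng(seed)
    n_pixels = images.shape[1]  # images: (num_images, num_pixels)
    tasks = []
    for _ in range(num_tasks):
        perm = rng.permutation(n_pixels)  # a new permutation per task
        tasks.append((images[:, perm], labels))
    return tasks

def make_split_tasks(images, labels, classes_per_task=2):
    """Split-CIFAR-style stream: partition the label space into
    disjoint subsets, one subset of classes per task."""
    classes = np.unique(labels)
    tasks = []
    for start in range(0, len(classes), classes_per_task):
        subset = classes[start:start + classes_per_task]
        mask = np.isin(labels, subset)
        tasks.append((images[mask], labels[mask]))
    return tasks
```

Note that both constructions produce abrupt, artificial shifts: consecutive tasks share no smooth relationship, unlike the gradual changes seen in real-world imagery.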

Temporal Evolution of Visual Concepts

In the context of visual recognition, we observe that many visual concepts in Internet imagery evolve over time, i.e., there is a temporal evolution of visual concepts.

[Figure: The visual concept of "computer" naturally evolved from 2004 to 2014 as laptops became more popular than bulky desktops.]
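As an operational illustration, a stream that captures such evolution can be built by ordering timestamped imagery chronologically and splitting it into buckets. The sketch below is a hypothetical helper under that assumption; the record format and the `make_temporal_buckets` name are ours, not the CLEAR pipeline's.

```python
def make_temporal_buckets(records, num_buckets=10):
    """Sort timestamped image records chronologically and split them
    into equal-sized buckets; consecutive buckets then differ by the
    natural drift of the data over time, not by a synthetic transform."""
    ordered = sorted(records, key=lambda r: r["timestamp"])
    bucket_size = len(ordered) // num_buckets
    return [
        ordered[i * bucket_size:(i + 1) * bucket_size]
        for i in range(num_buckets)
    ]
```

A model can then be trained and evaluated bucket by bucket, which is the kind of temporally ordered stream the CLEAR benchmark provides over a decade of Internet imagery.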

Therefore, we propose the CLEAR benchmark, featuring such natural continual learning scenarios. We select dynamic visual concepts that are common in Internet image collections to form the label spaces of CLEAR-10 and CLEAR-100.

[Figure: Label space of CLEAR-10 and CLEAR-100.]

We will discuss next how we curate the CLEAR benchmark with an efficient visio-linguistic dataset curation approach, as well as some of the valuable assets made available to the vision & learning community.