The All-Seeing Project

This is the official implementation of the following papers:

The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World
The All-Seeing Project V2: Towards General Relation Comprehension of the Open World

The name "All-Seeing" is derived from "The All-Seeing Eye", which means having complete knowledge, awareness, or insight into all aspects of existence. The logo is the Millennium Puzzle, an artifact from the manga "Yu-Gi-Oh!".

News and Updates 🚀🚀🚀

July 01, 2024: All-Seeing Project v2 is accepted by ECCV 2024! Note that the model and data have already been released on Hugging Face.
Feb 28, 2024: All-Seeing Project v2 is out! Our ASMv2 achieves state-of-the-art performance across a variety of image-level and region-level tasks! See here for more details.
Feb 21, 2024: ASM, AS-Core, AS-10M, and AS-100M are released!
Jan 16, 2024: All-Seeing Project is accepted by ICLR 2024!
Aug 29, 2023: All-Seeing Model Demo is now available on OpenXLab!

Schedule

Release the ASMv2 model.
Release the AS-V2 dataset.
Release the ASM model.
Release the full version of AS-1B.
Release AS-Core, which is the human-verified subset of AS-1B.
Release AS-100M, which is the 100M subset of AS-1B.
Release AS-10M, which is the 10M subset of AS-1B.
Online demo, including dataset browser and ASM online demo.

Introduction

The All-Seeing Project [Paper][Model][Dataset][Code][Zhihu][Medium]

All-Seeing 1B (AS-1B) dataset: we propose a new large-scale dataset (AS-1B) for open-world panoptic visual recognition and understanding, using an economical semi-automatic data engine that combines the power of off-the-shelf vision/language models and human feedback.

All-Seeing Model (ASM): we develop a unified vision-language foundation model (ASM) for open-world panoptic visual recognition and understanding. Aligning with LLMs, our ASM supports versatile image-text retrieval and generation tasks, demonstrating impressive zero-shot capability.

The All-Seeing Project V2 [Paper][Model][Dataset][Code][Zhihu][Medium]

All-Seeing Dataset V2 (AS-V2): we propose a novel task, termed Relation Conversation (ReC), which unifies the formulation of text generation, object localization, and relation comprehension. Based on this unified formulation, we construct the AS-V2 dataset, which consists of 127K high-quality relation conversation samples, to unlock the ReC capability for Multi-modal Large Language Models (MLLMs).

All-Seeing Model v2 (ASMv2): we develop ASMv2, which integrates the Relation Conversation ability while maintaining powerful general capabilities. It is endowed with grounding and referring capabilities, exhibiting state-of-the-art performance on region-level tasks. Furthermore, this model can be naturally adapted to the Scene Graph Generation task in an open-ended manner.

Circular-based Relation Probing Evaluation (CRPE) benchmark: we construct CRPE, the first benchmark that covers all elements of the relation triplets (subject, predicate, object), providing a systematic platform for the evaluation of relation comprehension ability.
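
To make the idea of probing one element of a relation triplet more concrete, here is a minimal sketch in Python. It is not taken from the CRPE code. It assumes that "circular" means a multiple-choice question is asked once per rotation of its options and counts as correct only if every rotation is answered correctly; this README does not spell out the protocol. All names (`RelationTriplet`, `probe_predicate`, `circular_accuracy`, and the `predict` callback) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class RelationTriplet:
    """Hypothetical container for a (subject, predicate, object) relation."""
    subject: str
    predicate: str
    obj: str


@dataclass(frozen=True)
class MultipleChoiceQuestion:
    prompt: str
    options: Tuple[str, ...]
    answer_index: int  # position of the correct option in `options`


def probe_predicate(triplet: RelationTriplet,
                    distractors: Sequence[str]) -> MultipleChoiceQuestion:
    """Build a question that hides the predicate (illustrative wording only)."""
    prompt = f"What is the relation between the {triplet.subject} and the {triplet.obj}?"
    return MultipleChoiceQuestion(prompt, (triplet.predicate, *distractors), 0)


def rotations(question: MultipleChoiceQuestion) -> List[MultipleChoiceQuestion]:
    """Return one copy of the question per circular shift of its options."""
    n = len(question.options)
    rotated = []
    for shift in range(n):
        options = tuple(question.options[(i + shift) % n] for i in range(n))
        # The correct option moves from answer_index to (answer_index - shift) mod n.
        answer = (question.answer_index - shift) % n
        rotated.append(MultipleChoiceQuestion(question.prompt, options, answer))
    return rotated


def circular_correct(question: MultipleChoiceQuestion,
                     predict: Callable[[MultipleChoiceQuestion], int]) -> bool:
    """A question counts only if every rotation is answered correctly."""
    return all(predict(q) == q.answer_index for q in rotations(question))


def circular_accuracy(questions: Sequence[MultipleChoiceQuestion],
                      predict: Callable[[MultipleChoiceQuestion], int]) -> float:
    """Fraction of questions that pass the all-rotations check."""
    if not questions:
        return 0.0
    return sum(circular_correct(q, predict) for q in questions) / len(questions)
```

Under this assumption, a model that always picks the first option scores zero on any question with more than one option, because the correct answer sits in a different position in each rotation. Subject and object probes could be built the same way by hiding a different field of the triplet.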

License

This project is released under the Apache 2.0 license.

🖊️ Citation

If you find this project useful in your research, please consider citing:

@article{wang2023allseeing,
  title={The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World},
  author={Wang, Weiyun and Shi, Min and Li, Qingyun and Wang, Wenhai and Huang, Zhenhang and Xing, Linjie and Chen, Zhe and Li, Hao and Zhu, Xizhou and Cao, Zhiguo and others},
  journal={arXiv preprint arXiv:2308.01907},
  year={2023}
}
@article{wang2024allseeing_v2,
  title={The All-Seeing Project V2: Towards General Relation Comprehension of the Open World},
  author={Wang, Weiyun and Ren, Yiming and Luo, Haowen and Li, Tiantong and Yan, Chenxiang and Chen, Zhe and Wang, Wenhai and Li, Qingyun and Lu, Lewei and Zhu, Xizhou and others},
  journal={arXiv preprint arXiv:2402.19474},
  year={2024}
}
