Add evaluation code for multimodal understanding.

This commit is contained in:
Xiaokang Chen 2024-10-23 14:09:31 +08:00 committed by GitHub
parent b5a50686d6
commit 78aa8ad49d
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194

View File

@ -64,6 +64,8 @@ Janus is a novel autoregressive framework that unifies multimodal understanding
## 2. News
**2024.10.23**: Evaluation code for reproducing the multimodal understanding results from the paper has been added to VLMEvalKit. Please refer to [this link]( https://github.com/open-compass/VLMEvalKit/pull/541).
**2024.10.20**: (1) Fix a bug in [tokenizer_config.json](https://huggingface.co/deepseek-ai/Janus-1.3B/blob/main/tokenizer_config.json). The previous version caused classifier-free guidance to not function properly, resulting in relatively poor visual generation quality. (2) Release Gradio demo ([online demo](https://huggingface.co/spaces/deepseek-ai/Janus-1.3B) and [local](#gradio-demo)).
## 3. Model Download