MMLongBench-Doc Leaderboard
NeurIPS 2024 Datasets and Benchmarks Track (Spotlight)
MMLongBench-Doc is a long-context, multimodal benchmark designed to evaluate large multimodal models on complex document understanding tasks.
This leaderboard tracks the performance of various models on the MMLongBench-Doc benchmark, focusing on their ability to understand and process long documents with both text and visual elements.
You can evaluate your model on MMLongBench-Doc using either the official GitHub repo or VLMEvalKit. We provide the official evaluation results for GPT-4.1 and GPT-4o.
To add your own model to the leaderboard, please send an email to yubo001@e.ntu.edu.sg or zangyuhang@pjlab.org.cn, and we will help with the evaluation and update the leaderboard.
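For reference, a VLMEvalKit evaluation run looks roughly like the sketch below. The dataset identifier `MMLongBench_DOC` and the model name `GPT4o` are assumptions based on VLMEvalKit's naming conventions; check the VLMEvalKit documentation for the exact supported names.

```shell
# Sketch: evaluate a model on MMLongBench-Doc via VLMEvalKit.
# The dataset/model identifiers below are assumptions -- verify them
# against VLMEvalKit's supported lists before running.
git clone https://github.com/open-compass/VLMEvalKit.git
cd VLMEvalKit
pip install -e .

# API-based models (e.g. GPT-4o) need an OpenAI key.
export OPENAI_API_KEY=sk-...   # placeholder, use your own key
python run.py --data MMLongBench_DOC --model GPT4o --verbose
```

Results are written under the working directory per model and dataset, and the accuracy reported there corresponds to the ACC Score column in the table below.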
Leaderboard Statistics
- Total Models: 9
- Best Score: 49.7
- Lowest Score: 25.1
| Model | Release Date | HF Model | MoE | Parameters | Open Source | ACC Score |
|---|---|---|---|---|---|---|
| - | 2025-04 | - | ✅ | 45.9B activated (456B total) | ✅ | 49.7 |