news

Sep 30, 2024 AcFormer (Code) and DreamClear (Code) are accepted to NeurIPS 2024.
Sep 12, 2024 InfiMM-WebMath-40B is available on Hugging Face and arXiv.
Aug 08, 2024 Law of Vision Representation in MLLMs is released on ArXiv.
Mar 08, 2024 VITAR is released: Vision Transformer with Any Resolution
Mar 08, 2024 InfiMM-HD is ON: A Leap Forward in High-Resolution Multimodal Understanding.
Jan 18, 2024 InfiMM models are released on Hugging Face (InfiMM). This is another open-source reproduction based on the Flamingo architecture. We held the top position on the MMMU leaderboard at the time of our submission (Jan 1, 2024). :rocket:
Jan 18, 2024 Check out our findings on Visual Instruction Fine-tuning: COCO is “ALL” You Need for Visual Instruction Fine-tuning
Jan 11, 2024 Our survey on MLLM Reasoning is available online: A Comprehensive Survey on Emerging Trends in Multimodal Reasoning
Dec 04, 2023 InfiMM-Eval is ON: Complex Open-ended Reasoning Evaluation for MLLMs.