diff --git a/README.md b/README.md
index 88ac552..9ded2d5 100644
--- a/README.md
+++ b/README.md
@@ -11,8 +11,8 @@
-
-
+
+
@@ -37,6 +37,19 @@
---
+## Repository layout
+
+This **ThinkSound** GitHub repository hosts two related projects on separate branches:
+
+| Branch | Project | Documentation |
+|--------|---------|----------------|
+| **`master`** | **ThinkSound** (NeurIPS 2025) — unified Any2Audio generation with CoT-guided flow matching | This file: **`README.md`** |
+| **`prismaudio`** | **PrismAudio** — follow-up work (ICLR 2026) on video-to-audio with multi-dimensional CoT-RL | **`README.md`** on the [`prismaudio`](https://github.com/liuhuadai/ThinkSound/tree/prismaudio) branch |
+
+For **ThinkSound**, use branch **`master`** (this README). For **PrismAudio**, check out **`prismaudio`** and follow **`README.md`** there.
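+
+A minimal sketch of switching between the two projects locally (clone URL taken from the links above):
+
+```bash
+git clone https://github.com/liuhuadai/ThinkSound.git
+cd ThinkSound
+# master (ThinkSound) is checked out by default
+git checkout prismaudio   # switch to the PrismAudio branch
+git checkout master       # switch back to ThinkSound
+```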
+
+---
+
**ThinkSound** is a unified Any2Audio generation framework with flow matching guided by Chain-of-Thought (CoT) reasoning.
PyTorch implementation for multimodal audio generation and editing: generate or edit audio from video, text, and audio, powered by step-by-step reasoning from Multimodal Large Language Models (MLLMs).
@@ -45,10 +58,11 @@ PyTorch implementation for multimodal audio generation and editing: generate or
---
## 📰 News
-- **2026.01.26** 🎉 PrismAudio has been accepted to the **ICLR 2026 Main Conference**! We plan to release the project in February 2026.
-- **2025.11.25** 🔥[Online PrismAudio Demo](http://prismaudio-project.github.io/) is live - try it now!
-- **2025.11.25** 🔥[PrismAudio paper](https://arxiv.org/pdf/2511.18833) released on arXiv, the first multi-dimensional CoT-RL framework for Video-to-Audio Generation!
-- **2025.09.19** 🎉 ThinkSound has been accepted to the **NeurIPS 2025 Main Conference**!
+- **2026.03.24** 🔥 **PrismAudio** is released in the same repo on branch [`prismaudio`](https://github.com/liuhuadai/ThinkSound/tree/prismaudio) — see **`README.md`** there for setup and models.
+- **2026.01.26** 🎉 PrismAudio accepted to **ICLR 2026 Main Conference** (code/docs on `prismaudio`).
+- **2025.11.25** 🔥 [Online PrismAudio Demo](http://prismaudio-project.github.io/) is live.
+- **2025.11.25** 🔥 [PrismAudio paper](https://arxiv.org/pdf/2511.18833) on arXiv — multi-dimensional CoT-RL for video-to-audio.
+- **2025.09.19** 🎉 **ThinkSound** accepted to the **NeurIPS 2025 Main Conference**!
- **2025.09.01** Our AudioCoT dataset is now open-sourced and available on [Hugging Face](https://huggingface.co/datasets/liuhuadai/AudioCoT)!
- **2025.07.17** 🧠 Finetuning enabled: training and finetuning code is now publicly available, along with clear usage instructions to help you customize and extend ThinkSound with your own data.
- **2025.07.15** 📦 Simplified installation and usability: dependencies on PyPI for easy cross-platform setup; Windows `.bat` scripts automate environment creation and script running.
@@ -61,6 +75,19 @@ PyTorch implementation for multimodal audio generation and editing: generate or
---
+