OpenDCAI

Define the future of Data-centric AI together

👋 Welcome

✨We are dedicated to advancing research and open-source tools in Data-Centric Artificial Intelligence (DCAI).✨

🚀Our goal is to develop effective and efficient DCAI systems and algorithms that support and enhance the performance of AI models and applications.

🤝 Community

(Community QR code image)

Pinned

  1. DataFlow Public

    Easy data preparation with the latest LLM-based operators and pipelines.

    Python · 3.3k stars · 298 forks

  2. MyScaleDB Public

    Forked from OriginHubAI/MyScaleDB

    AI database for unified, scalable SQL + vector data management, search, and analytics.

    C++ · 42 stars · 1 fork

  3. DataFlex Public

    DataFlex is a data-centric training framework that enhances model performance by selecting the most influential samples, optimizing their weights, or adjusting their mixing ratios.

    Python · 451 stars · 48 forks

  4. Paper2Any Public

    Turn a paper, text, or topic into editable research figures, technical route diagrams, and presentation slides.

    Python · 2.2k stars · 153 forks

  5. AgentFlow Public

    The first unified agent data synthesis framework for custom agentic tasks, with an all-in-one environment.

    Python · 95 stars · 7 forks

  6. OpenWorldLib Public

    Unified codebase for advanced world models.

    Python · 720 stars · 37 forks

Repositories

Showing 10 of 36 repositories
  • One-Eval Public

    Agent-driven automated system for LLM evaluation.

    Python · 110 stars · Apache-2.0 · 19 forks · 5 open issues · 1 open pull request · Updated Apr 28, 2026
  • DataFlow-KG Public
    Python · 5 stars · Apache-2.0 · 1 fork · 0 open issues · 0 open pull requests · Updated Apr 28, 2026
  • Mycel Public

    Connect people, agents, and teams for the next era of human-AI collaboration.

    Python · 104 stars · MIT · 13 forks · 4 open issues · 3 open pull requests · Updated Apr 28, 2026
  • mycel-sdk Public
    Python · 0 stars · MIT · 0 forks · 1 open issue · 0 open pull requests · Updated Apr 28, 2026
  • OpenWorldLib Public

    Unified codebase for advanced world models.

    Python · 720 stars · Apache-2.0 · 37 forks · 6 open issues · 1 open pull request · Updated Apr 28, 2026
  • Open-NotebookLM Public

    An open-source implementation of NotebookLM.

    Python · 66 stars · Apache-2.0 · 16 forks · 5 open issues · 6 open pull requests · Updated Apr 28, 2026
  • Paper2Any Public

    Turn a paper, text, or topic into editable research figures, technical route diagrams, and presentation slides.

    Python · 2,214 stars · Apache-2.0 · 153 forks · 9 open issues · 4 open pull requests · Updated Apr 27, 2026
  • OpenPrism Public

    Open-source implementation of an AI-powered academic writing workspace inspired by OpenAI Prism, featuring LaTeX editing, PDF preview, and intelligent AI assistance.

    TypeScript · 271 stars · 26 forks · 3 open issues (1 issue needs help) · 4 open pull requests · Updated Apr 21, 2026
  • Flash-MinerU Public

    Ray-powered accelerator for MinerU that turns PDF → Markdown conversion into scalable, cluster-ready data infrastructure.

    Python · 50 stars · 7 forks · 3 open issues · 0 open pull requests · Updated Apr 20, 2026
  • DataFlex Public

    DataFlex is a data-centric training framework that enhances model performance by selecting the most influential samples, optimizing their weights, or adjusting their mixing ratios.

    Python · 451 stars · Apache-2.0 · 48 forks · 1 open issue · 0 open pull requests · Updated Apr 17, 2026
