The open-source SDK for AI evaluation, observability, and optimization
π Docs β’ π Website β’ π¬ Community β’ π― Dashboard
- What is Future AGI?
- Installation
- Authentication
- 30-Second Examples
- Quick Start
- How It Works
- Core Use Cases
- Real-World Use Cases
- Why Choose Future AGI?
- Supported Integrations
- Documentation
- Language Support
- Support & Community
- Contributing
- Testimonials
- Roadmap
- Troubleshooting & FAQ
Your agent passed every eval. Then it hallucinated a refund policy that doesn't exist. Future AGI gives you the tools to catch that β datasets, prompt versioning, knowledge bases, evaluations, and guardrails. One SDK, one feedback loop.
# Get started in 30 seconds
pip install futureagi
export FI_API_KEY="your_key"
export FI_SECRET_KEY="your_secret"π Get Free API Keys β’ View Live Demo β’ Read Quick Start Guide
- π― Evaluations β 50+ metrics, LLM-as-judge, and custom rubrics powered by the Critique AI agent
- β‘ Guardrails β Real-time safety checks with sub-100ms latency
- π Datasets β Programmatically create, version, and manage training and test datasets
- π¨ Prompt Workbench β Version control, A/B testing, and deployment labels for prompts
- π Knowledge Base β Document management and retrieval for RAG applications
- π Analytics β Model performance, token costs, and behavior insights
- π€ Simulate β Test your AI system against realistic scenarios before users hit it
- π Observability β OpenTelemetry-native tracing across 50+ frameworks
pip install futureaginpm install @future-agi/sdk
# or
pnpm add @future-agi/sdkRequirements: Python >= 3.10 | Node.js >= 14
Get your API credentials from the Future AGI Dashboard:
export FI_API_KEY="your_api_key"
export FI_SECRET_KEY="your_secret_key"Or set them programmatically:
import os
os.environ["FI_API_KEY"] = "your_api_key"
os.environ["FI_SECRET_KEY"] = "your_secret_key"Create and manage datasets with built-in evaluations:
from fi.datasets import Dataset
from fi.datasets.types import (
Cell, Column, DatasetConfig, DataTypeChoices,
ModelTypes, Row, SourceChoices
)
# Create a new dataset
config = DatasetConfig(name="qa_dataset", model_type=ModelTypes.GENERATIVE_LLM)
dataset = Dataset(dataset_config=config)
dataset = dataset.create()
# Define columns
columns = [
Column(name="user_query", data_type=DataTypeChoices.TEXT, source=SourceChoices.OTHERS),
Column(name="ai_response", data_type=DataTypeChoices.TEXT, source=SourceChoices.OTHERS),
Column(name="quality_score", data_type=DataTypeChoices.INTEGER, source=SourceChoices.OTHERS),
]
# Add data
rows = [
Row(order=1, cells=[
Cell(column_name="user_query", value="What is machine learning?"),
Cell(column_name="ai_response", value="Machine learning is a subset of AI..."),
Cell(column_name="quality_score", value=9),
]),
Row(order=2, cells=[
Cell(column_name="user_query", value="Explain quantum computing"),
Cell(column_name="ai_response", value="Quantum computing uses quantum bits..."),
Cell(column_name="quality_score", value=8),
]),
]
# Push data and run evaluations
dataset = dataset.add_columns(columns=columns)
dataset = dataset.add_rows(rows=rows)
# Add automated evaluation
dataset.add_evaluation(
name="factual_accuracy",
eval_template="is_factually_consistent",
model="gpt-4o-mini",
required_keys_to_column_names={
"input": "user_query",
"output": "ai_response",
"context": "user_query",
},
run=True
)
print("β Dataset created with automated evaluations")Version control and A/B test your prompts:
from fi.prompt import Prompt, PromptTemplate, ModelConfig
# Create a versioned prompt template
template = PromptTemplate(
name="customer_support",
messages=[
{"role": "system", "content": "You are a helpful customer support agent."},
{"role": "user", "content": "Help {{customer_name}} with {{issue_type}}."},
],
variable_names={"customer_name": ["Alice"], "issue_type": ["billing"]},
model_configuration=ModelConfig(model_name="gpt-4o-mini", temperature=0.7)
)
# Create and version the template
client = Prompt(template)
client.create() # Create v1
client.commit_current_version("Initial version", set_default=True)
# Assign deployment labels
client.assign_label("Production", version="v1")
# Compile with variables
compiled = client.compile(customer_name="Bob", issue_type="refund")
print(compiled)
# Output: [
# {"role": "system", "content": "You are a helpful customer support agent."},
# {"role": "user", "content": "Help Bob with refund."}
# ]A/B Testing Example:
import random
from openai import OpenAI
from fi.prompt import Prompt
# Fetch different variants (returns Prompt instances)
variant_a = Prompt.get_template_by_name("customer_support", label="variant-a")
variant_b = Prompt.get_template_by_name("customer_support", label="variant-b")
# Randomly select and use
selected = random.choice([variant_a, variant_b])
compiled = selected.compile(customer_name="Alice", issue_type="refund")
# Send to your LLM provider
openai = OpenAI(api_key="your_openai_key")
response = openai.chat.completions.create(model="gpt-4o", messages=compiled)
print(f"Using variant: {selected.template.name}")
print(f"Response: {response.choices[0].message.content}")Manage documents for retrieval-augmented generation:
from fi.kb import KnowledgeBase
# Initialize client
kb_client = KnowledgeBase(
fi_api_key="your_api_key",
fi_secret_key="your_secret_key"
)
# Create a knowledge base with documents
kb = kb_client.create_kb(
name="product_docs",
file_paths=["manual.pdf", "faq.txt", "guide.docx"]
)
print(f"β Knowledge base created: {kb.kb.name}")
print(f" Files uploaded: {len(kb.kb.files)}")
# Update with more files
updated_kb = kb_client.update_kb(
kb_name=kb.kb.name,
file_paths=["updates.pdf"]
)
# Delete specific files
kb_client.delete_files_from_kb(file_names=["updates.pdf"])
# Clean up
kb_client.delete_kb(kb_ids=[kb.kb.id])| Feature | Use Case | Benefit |
|---|---|---|
| Datasets | Store and version training/test data | Reproducible experiments, automated evaluations |
| Prompt Workbench | Version control for prompts | A/B testing, deployment management, rollback |
| Knowledge Base | Evaluations and synthetic data | Intelligent retrieval, document versioning |
| Evaluations | Automated quality checks | No human-in-the-loop, 100% configurable |
| Protect | Real-time safety filters | Sub-100ms latency, production-ready |
| Feature | Future AGI | Traditional Tools | Other Platforms |
|---|---|---|---|
| Evaluation Speed | β‘ Sub-100ms | π Seconds-Minutes | π’ Minutes-Hours |
| Human in Loop | β Fully Automated | β Required | β Often Required |
| Multimodal Support | β Text, Image, Audio, Video | ||
| Setup Time | β±οΈ 2 minutes | β³ Days-Weeks | β³ Hours-Days |
| Configurability | π― 100% Customizable | π Fixed Metrics | βοΈ Some Flexibility |
| Privacy Options | π Cloud + Self-hosted | βοΈ Cloud Only | βοΈ Cloud Only |
| A/B Testing | β Built-in | β Manual | |
| Prompt Versioning | β Git-like Control | β Not Available | |
| Real-time Guardrails | β Production-ready | β Not Available |
Future AGI works seamlessly with your existing AI stack:
LLM Providers
OpenAI β’ Anthropic β’ Google Gemini β’ Azure OpenAI β’ AWS Bedrock β’ Cohere β’ Mistral β’ Ollama β’ vLLM
Frameworks
LangChain β’ LlamaIndex β’ CrewAI β’ AutoGen β’ Haystack β’ Semantic Kernel
Vector Databases
Pinecone β’ Weaviate β’ Qdrant β’ Milvus β’ Chroma β’ FAISS
Observability
OpenTelemetry β’ Custom Logging β’ Trace Context Propagation
| Language | Package | Status |
|---|---|---|
| Python | futureagi |
β Full Support |
| TypeScript/JavaScript | @future-agi/sdk |
β Full Support |
| REST API | cURL/HTTP | β Available |
- π§ Email: support@futureagi.com
- πΌ LinkedIn: Future AGI Company
- π¦ X (Twitter): @FutureAGI_
- π° Substack: Future AGI Blog
We welcome contributions! Here's how to get involved:
- π Report bugs: Open an issue
- π‘ Request features: Start a discussion
- π§ Submit PRs: Fork, create a feature branch, and submit a pull request
- π Improve docs: Help us make our documentation better
See CONTRIBUTING.md for detailed guidelines.
"Future AGI cut our evaluation time from days to minutes. The automated critiques are spot-on!"
β AI Engineering Team, Fortune 500 Company
"The prompt versioning alone saved us countless headaches. A/B testing is now trivial."
β ML Lead, Healthcare Startup
"Sub-100ms guardrails in production. Game changer for our customer-facing AI."
β CTO, E-commerce Platform
- Datasets with automated evaluations
- Prompt workbench with versioning
- Knowledge base for RAG
- Real-time guardrails (sub-100ms)
- Multi-language SDK (Python + TypeScript)
- Bulk Annotations for Human in the Loop
- On-premise deployment toolkit
Import Error: `ModuleNotFoundError: No module named 'fi'`
Make sure Future AGI is installed:
pip install futureagi --upgradeAuthentication Error: Invalid API credentials
- Check your API keys at Dashboard
- Ensure environment variables are set correctly:
echo $FI_API_KEY
echo $FI_SECRET_KEY- Try setting them programmatically in your code
How do I switch between environments (dev/staging/prod)?
Use prompt labels to manage different deployment environments:
client.assign_label("Development", version="v1")
client.assign_label("Staging", version="v2")
client.assign_label("Production", version="v3")Can I use Future AGI without sending data to the cloud?
Yes! Future AGI supports self-hosted deployments. Contact us at support@futureagi.com for enterprise on-premise options.
What LLM providers are supported?
All major providers: OpenAI, Anthropic, Google, Azure, AWS Bedrock, Cohere, Mistral, and open-source models via vLLM/Ollama.
Need more help? Check our complete FAQ or join our community.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Built with β€οΈ by the Future AGI team and contributors.
If Future AGI helps you ship better AI, a β helps more teams find us.
π futureagi.com Β· π docs.futureagi.com Β· βοΈ app.futureagi.com
