10 changes: 5 additions & 5 deletions index.toml
@@ -12,7 +12,7 @@ notebook = "27_First_RAG_Pipeline.ipynb"
aliases = []
completion_time = "10 min"
created_at = 2023-12-05
dependencies = ["datasets>=2.6.1", "sentence-transformers>=4.1.0", "mistral-haystack"]
dependencies = ["datasets>=2.6.1", "sentence-transformers>=4.1.0", "mistral-haystack", "transformers-haystack"]
featured = true

[[tutorial]]
@@ -35,7 +35,7 @@ notebook = "29_Serializing_Pipelines.ipynb"
aliases = []
completion_time = "10 min"
created_at = 2024-01-29
dependencies = ["transformers[torch]"]
dependencies = ["transformers-haystack"]

[[tutorial]]
title = "Preprocessing Different File Types"
@@ -98,7 +98,7 @@ notebook = "34_Extractive_QA_Pipeline.ipynb"
aliases = []
completion_time = "10 min"
created_at = 2024-02-09
dependencies = ["accelerate", "sentence-transformers", "datasets", "transformers<5"]
dependencies = ["sentence-transformers", "datasets", "transformers-haystack"]

[[tutorial]]
title = "Evaluating RAG Pipelines"
@@ -154,7 +154,7 @@ notebook = "41_Query_Classification_with_TransformersTextRouter_and_Transformers
aliases = []
completion_time = "25 min"
created_at = 2024-10-15
dependencies = ["sentence-transformers>=4.1.0", "gradio", "torch", "sentencepiece", "datasets", "accelerate", "transformers<5"]
dependencies = ["sentence-transformers>=4.1.0", "gradio", "datasets", "transformers-haystack"]

[[tutorial]]
title = "Retrieving a Context Window Around a Sentence"
@@ -258,6 +258,6 @@ notebook = "49_TurboQuant_Quantization_with_HuggingFace.ipynb"
aliases = []
completion_time = "20 min"
created_at = 2026-03-30
dependencies = ["haystack-ai", "turboquant-vllm", "transformers"]
dependencies = ["haystack-ai", "turboquant-vllm", "transformers-haystack"]
featured = false
python_version = "3.12"
2,173 changes: 1,068 additions & 1,105 deletions tutorials/27_First_RAG_Pipeline.ipynb

Large diffs are not rendered by default.

181 changes: 24 additions & 157 deletions tutorials/29_Serializing_Pipelines.ipynb
@@ -5,15 +5,7 @@
"metadata": {
"id": "cFFW8D-weE2S"
},
"source": [
"# Tutorial: Serializing LLM Pipelines\n",
"\n",
"- **Level**: Beginner\n",
"- **Time to complete**: 10 minutes\n",
"- **Components Used**: [`HuggingFaceLocalChatGenerator`](https://docs.haystack.deepset.ai/docs/huggingfacelocalchatgenerator), [`ChatPromptBuilder`](https://docs.haystack.deepset.ai/docs/chatpromptbuilder)\n",
"- **Prerequisites**: None\n",
"- **Goal**: After completing this tutorial, you'll understand how to serialize and deserialize between YAML and Python code."
]
"source": "# Tutorial: Serializing LLM Pipelines\n\n- **Level**: Beginner\n- **Time to complete**: 10 minutes\n- **Components Used**: [`TransformersChatGenerator`](https://docs.haystack.deepset.ai/docs/transformerschatgenerator), [`ChatPromptBuilder`](https://docs.haystack.deepset.ai/docs/chatpromptbuilder)\n- **Prerequisites**: None\n- **Goal**: After completing this tutorial, you'll understand how to serialize and deserialize between YAML and Python code."
},
{
"cell_type": "markdown",
@@ -35,11 +27,7 @@
"metadata": {
"id": "TLaHxdJcfWtI"
},
"source": [
"## Installing Haystack\n",
"\n",
"Install Haystack with `pip`:"
]
"source": "## Installing Haystack\n\nInstall Haystack and the [`transformers-haystack`](https://haystack.deepset.ai/integrations/huggingface) integration (which provides `TransformersChatGenerator`) with `pip`:"
},
{
"cell_type": "code",
@@ -52,11 +40,7 @@
"outputId": "e304450a-24e3-4ef8-e642-1fbb573e7d55"
},
"outputs": [],
"source": [
"%%bash\n",
"\n",
"pip install haystack-ai"
]
"source": "%%bash\n\npip install haystack-ai transformers-haystack"
},
{
"cell_type": "markdown",
@@ -71,51 +55,12 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": null,
"metadata": {
"id": "odZJjD7KgO1g"
},
"outputs": [
{
"data": {
"text/plain": [
"<haystack.core.pipeline.pipeline.Pipeline object at 0x13cc77370>\n",
"🚅 Components\n",
" - builder: ChatPromptBuilder\n",
" - llm: HuggingFaceLocalChatGenerator\n",
"🛤️ Connections\n",
" - builder.prompt -> llm.messages (List[ChatMessage])"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from haystack import Pipeline\n",
"from haystack.components.builders import ChatPromptBuilder\n",
"from haystack.dataclasses import ChatMessage\n",
"from haystack.components.generators.chat import HuggingFaceLocalChatGenerator\n",
"\n",
"template = [\n",
" ChatMessage.from_user(\n",
" \"\"\"\n",
"Please create a summary about the following topic:\n",
"{{ topic }}\n",
"\"\"\"\n",
" )\n",
"]\n",
"\n",
"builder = ChatPromptBuilder(template=template)\n",
"llm = HuggingFaceLocalChatGenerator(model=\"Qwen/Qwen2.5-1.5B-Instruct\", generation_kwargs={\"max_new_tokens\": 150})\n",
"\n",
"pipeline = Pipeline()\n",
"pipeline.add_component(name=\"builder\", instance=builder)\n",
"pipeline.add_component(name=\"llm\", instance=llm)\n",
"\n",
"pipeline.connect(\"builder.prompt\", \"llm.messages\")"
]
"outputs": [],
"source": "from haystack import Pipeline\nfrom haystack.components.builders import ChatPromptBuilder\nfrom haystack.dataclasses import ChatMessage\nfrom haystack_integrations.components.generators.transformers import TransformersChatGenerator\n\ntemplate = [\n ChatMessage.from_user(\n \"\"\"\nPlease create a summary about the following topic:\n{{ topic }}\n\"\"\"\n )\n]\n\nbuilder = ChatPromptBuilder(template=template)\nllm = TransformersChatGenerator(model=\"Qwen/Qwen2.5-1.5B-Instruct\", generation_kwargs={\"max_new_tokens\": 150})\n\npipeline = Pipeline()\npipeline.add_component(name=\"builder\", instance=builder)\npipeline.add_component(name=\"llm\", instance=llm)\n\npipeline.connect(\"builder.prompt\", \"llm.messages\")"
},
{
"cell_type": "code",
@@ -157,7 +102,7 @@
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
@@ -175,23 +120,26 @@
" init_parameters:\n",
" required_variables: null\n",
" template:\n",
" - _content:\n",
" - content:\n",
" - text: '\n",
"\n",
" Please create a summary about the following topic:\n",
" Please create a summary about the following topic:\n",
"\n",
" {{ topic }}\n",
" {{ topic }}\n",
"\n",
" '\n",
" _meta: {}\n",
" _name: null\n",
" _role: user\n",
" '\n",
" meta: {}\n",
" name: null\n",
" role: user\n",
" variables: null\n",
" type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder\n",
" llm:\n",
" init_parameters:\n",
" chat_template: null\n",
" enable_thinking: false\n",
" generation_kwargs:\n",
" max_new_tokens: 150\n",
" return_full_text: false\n",
" stop_sequences: []\n",
" huggingface_pipeline_kwargs:\n",
" device: cpu\n",
@@ -204,7 +152,10 @@
" - HF_TOKEN\n",
" strict: false\n",
" type: env_var\n",
" type: haystack.components.generators.chat.hugging_face_local.HuggingFaceLocalChatGenerator\n",
" tool_parsing_function: haystack_integrations.components.generators.transformers.chat.chat_generator.default_tool_parser\n",
" tools: null\n",
" type: haystack_integrations.components.generators.transformers.chat.chat_generator.TransformersChatGenerator\n",
"connection_type_validation: true\n",
"connections:\n",
"- receiver: llm.messages\n",
" sender: builder.prompt\n",
@@ -225,54 +176,7 @@
"metadata": {
"id": "0C7zGsUCGszq"
},
"source": [
"You should get a pipeline YAML that looks like the following:\n",
"\n",
"```yaml\n",
"components:\n",
" builder:\n",
" init_parameters:\n",
" required_variables: null\n",
" template:\n",
" - _content:\n",
" - text: '\n",
"\n",
" Please create a summary about the following topic:\n",
"\n",
" {{ topic }}\n",
"\n",
" '\n",
" _meta: {}\n",
" _name: null\n",
" _role: user\n",
" variables: null\n",
" type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder\n",
" llm:\n",
" init_parameters:\n",
" init_parameters:\n",
" generation_kwargs:\n",
" max_new_tokens: 150\n",
" stop_sequences: []\n",
" huggingface_pipeline_kwargs:\n",
" device: cpu\n",
" model: Qwen/Qwen2.5-1.5B-Instruct\n",
" task: text-generation\n",
" streaming_callback: null\n",
" token:\n",
" env_vars:\n",
" - HF_API_TOKEN\n",
" - HF_TOKEN\n",
" strict: false\n",
" type: env_var\n",
" type: haystack.components.generators.chat.hugging_face_local.HuggingFaceLocalChatGenerator\n",
"connections:\n",
"- receiver: llm.messages\n",
" sender: builder.prompt\n",
"max_runs_per_component: 100\n",
"metadata: {}\n",
"\n",
"```"
]
"source": "You should get a pipeline YAML that looks like the following:\n\n```yaml\ncomponents:\n builder:\n init_parameters:\n required_variables: null\n template:\n - content:\n - text: '\n\n Please create a summary about the following topic:\n\n {{ topic }}\n\n '\n meta: {}\n name: null\n role: user\n variables: null\n type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder\n llm:\n init_parameters:\n chat_template: null\n enable_thinking: false\n generation_kwargs:\n max_new_tokens: 150\n return_full_text: false\n stop_sequences: []\n huggingface_pipeline_kwargs:\n device: cpu\n model: Qwen/Qwen2.5-1.5B-Instruct\n task: text-generation\n streaming_callback: null\n token:\n env_vars:\n - HF_API_TOKEN\n - HF_TOKEN\n strict: false\n type: env_var\n tool_parsing_function: haystack_integrations.components.generators.transformers.chat.chat_generator.default_tool_parser\n tools: null\n type: haystack_integrations.components.generators.transformers.chat.chat_generator.TransformersChatGenerator\nconnection_type_validation: true\nconnections:\n- receiver: llm.messages\n sender: builder.prompt\nmax_runs_per_component: 100\nmetadata: {}\n\n```"
},
{
"cell_type": "markdown",
@@ -287,49 +191,12 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": null,
"metadata": {
"id": "U332-VjovFfn"
},
"outputs": [],
"source": [
"yaml_pipeline = \"\"\"\n",
"components:\n",
" builder:\n",
" init_parameters:\n",
" template:\n",
" - _content:\n",
" - text: 'Please translate the following to French: \\n{{ sentence }}\\n'\n",
" _meta: {}\n",
" _name: null\n",
" _role: user\n",
" variables: null\n",
" type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder\n",
" llm:\n",
" init_parameters:\n",
" generation_kwargs:\n",
" max_new_tokens: 150\n",
" stop_sequences: []\n",
" huggingface_pipeline_kwargs:\n",
" device: cpu\n",
" model: Qwen/Qwen2.5-1.5B-Instruct\n",
" task: text-generation\n",
" streaming_callback: null\n",
" chat_template : \"{% for message in messages %}{% if message['role'] == 'user' %}{{ ' ' }}{% endif %}{{ message['content'] }}{% if not loop.last %}{{ ' ' }}{% endif %}{% endfor %}{{ eos_token }}\"\n",
" token:\n",
" env_vars:\n",
" - HF_API_TOKEN\n",
" - HF_TOKEN\n",
" strict: false\n",
" type: env_var\n",
" type: haystack.components.generators.chat.hugging_face_local.HuggingFaceLocalChatGenerator\n",
"connections:\n",
"- receiver: llm.messages\n",
" sender: builder.prompt\n",
"max_runs_per_component: 100\n",
"metadata: {}\n",
"\"\"\""
]
"source": "yaml_pipeline = \"\"\"\ncomponents:\n builder:\n init_parameters:\n required_variables: null\n template:\n - content:\n - text: 'Please translate the following to French: \\n{{ sentence }}\\n'\n meta: {}\n name: null\n role: user\n variables: null\n type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder\n llm:\n init_parameters:\n chat_template: \"{% for message in messages %}{% if message['role'] == 'user' %}{{ ' ' }}{% endif %}{{ message['content'] }}{% if not loop.last %}{{ ' ' }}{% endif %}{% endfor %}{{ eos_token }}\"\n enable_thinking: false\n generation_kwargs:\n max_new_tokens: 150\n return_full_text: false\n stop_sequences: []\n huggingface_pipeline_kwargs:\n device: cpu\n model: Qwen/Qwen2.5-1.5B-Instruct\n task: text-generation\n streaming_callback: null\n token:\n env_vars:\n - HF_API_TOKEN\n - HF_TOKEN\n strict: false\n type: env_var\n tool_parsing_function: haystack_integrations.components.generators.transformers.chat.chat_generator.default_tool_parser\n tools: null\n type: haystack_integrations.components.generators.transformers.chat.chat_generator.TransformersChatGenerator\nconnection_type_validation: true\nconnections:\n- receiver: llm.messages\n sender: builder.prompt\nmax_runs_per_component: 100\nmetadata: {}\n\"\"\""
},
{
"cell_type": "markdown",
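The YAML changes in this notebook fall into two groups visible in the hunks above: the serialized `ChatMessage` keys lose their leading underscore (`_content`, `_meta`, `_name`, `_role` become `content`, `meta`, `name`, `role`), and the `llm` component's `type` now points into the integration package. Below is a rough sketch of that rewrite on an already-parsed pipeline dictionary, assuming these are the only differences that matter for a given file. `migrate_pipeline_dict` is a hypothetical helper, not part of Haystack, and it does not add the new init parameters shown in the diff (`enable_thinking`, `tools`, and so on).

```python
from copy import deepcopy

# Type paths copied from the old and new YAML in the diff above.
OLD_TYPE = "haystack.components.generators.chat.hugging_face_local.HuggingFaceLocalChatGenerator"
NEW_TYPE = (
    "haystack_integrations.components.generators.transformers"
    ".chat.chat_generator.TransformersChatGenerator"
)

# Underscore-prefixed ChatMessage keys and their replacements, as seen in the diff.
MESSAGE_KEYS = {"_content": "content", "_meta": "meta", "_name": "name", "_role": "role"}


def migrate_pipeline_dict(pipeline: dict) -> dict:
    """Return a migrated copy of a parsed pipeline definition (hypothetical helper)."""
    migrated = deepcopy(pipeline)
    for component in migrated.get("components", {}).values():
        # Point the local chat generator at the integration package.
        if component.get("type") == OLD_TYPE:
            component["type"] = NEW_TYPE
        # Rename the ChatMessage keys inside any prompt template.
        params = component.get("init_parameters", {})
        template = params.get("template")
        if isinstance(template, list):
            params["template"] = [
                {MESSAGE_KEYS.get(key, key): value for key, value in message.items()}
                if isinstance(message, dict)
                else message
                for message in template
            ]
    return migrated


old_pipeline = {
    "components": {
        "builder": {
            "init_parameters": {
                "template": [
                    {"_content": [{"text": "Hi {{ topic }}"}], "_meta": {}, "_name": None, "_role": "user"}
                ]
            },
            "type": "haystack.components.builders.chat_prompt_builder.ChatPromptBuilder",
        },
        "llm": {"init_parameters": {"model": "Qwen/Qwen2.5-1.5B-Instruct"}, "type": OLD_TYPE},
    }
}

new_pipeline = migrate_pipeline_dict(old_pipeline)
print(new_pipeline["components"]["llm"]["type"])
```

Working on a copy keeps the original definition available for comparison, which helps when checking the result against the YAML that `pipeline.dumps()` produces in the updated notebook.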
26 changes: 5 additions & 21 deletions tutorials/33_Hybrid_Retrieval.ipynb
@@ -5,15 +5,7 @@
"metadata": {
"id": "kTas9ZQ7lXP7"
},
"source": [
"# Tutorial: Creating a Hybrid Retrieval Pipeline\n",
"\n",
"- **Level**: Intermediate\n",
"- **Time to complete**: 15 minutes\n",
"- **Components Used**: [`DocumentSplitter`](https://docs.haystack.deepset.ai/docs/documentsplitter), [`SentenceTransformersDocumentEmbedder`](https://docs.haystack.deepset.ai/docs/sentencetransformersdocumentembedder), [`InMemoryDocumentStore`](https://docs.haystack.deepset.ai/docs/inmemorydocumentstore), [`InMemoryBM25Retriever`](https://docs.haystack.deepset.ai/docs/inmemorybm25retriever), [`InMemoryEmbeddingRetriever`](https://docs.haystack.deepset.ai/docs/inmemoryembeddingretriever), and [`TransformersSimilarityRanker`](https://docs.haystack.deepset.ai/docs/transformerssimilarityranker)\n",
"- **Prerequisites**: None\n",
"- **Goal**: After completing this tutorial, you will have learned about creating a hybrid retrieval and when it's useful."
]
"source": "# Tutorial: Creating a Hybrid Retrieval Pipeline\n\n- **Level**: Intermediate\n- **Time to complete**: 15 minutes\n- **Components Used**: [`DocumentSplitter`](https://docs.haystack.deepset.ai/docs/documentsplitter), [`SentenceTransformersDocumentEmbedder`](https://docs.haystack.deepset.ai/docs/sentencetransformersdocumentembedder), [`InMemoryDocumentStore`](https://docs.haystack.deepset.ai/docs/inmemorydocumentstore), [`InMemoryBM25Retriever`](https://docs.haystack.deepset.ai/docs/inmemorybm25retriever), [`InMemoryEmbeddingRetriever`](https://docs.haystack.deepset.ai/docs/inmemoryembeddingretriever), and [`SentenceTransformersSimilarityRanker`](https://docs.haystack.deepset.ai/docs/sentencetransformerssimilarityranker)\n- **Prerequisites**: None\n- **Goal**: After completing this tutorial, you will have learned about creating a hybrid retrieval and when it's useful."
},
{
"cell_type": "markdown",
@@ -230,24 +222,16 @@
"metadata": {
"id": "r8_jHzmosbC_"
},
"source": [
"### 2) Rank the Results\n",
"\n",
"Use the [TransformersSimilarityRanker](https://docs.haystack.deepset.ai/docs/transformerssimilarityranker) that scores the relevancy of all retrieved documents for the given search query by using a cross encoder model. In this example, you will use [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) model to rank the retrieved documents but you can replace this model with other cross-encoder models on Hugging Face."
]
"source": "### 2) Rank the Results\n\nUse the [SentenceTransformersSimilarityRanker](https://docs.haystack.deepset.ai/docs/sentencetransformerssimilarityranker) that scores the relevancy of all retrieved documents for the given search query by using a cross encoder model. In this example, you will use [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) model to rank the retrieved documents but you can replace this model with other cross-encoder models on Hugging Face."
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": null,
"metadata": {
"id": "cN0woIxHs4Ng"
},
"outputs": [],
"source": [
"from haystack.components.rankers import TransformersSimilarityRanker\n",
"\n",
"ranker = TransformersSimilarityRanker(model=\"BAAI/bge-reranker-base\")"
]
"source": "from haystack.components.rankers import SentenceTransformersSimilarityRanker\n\nranker = SentenceTransformersSimilarityRanker(model=\"BAAI/bge-reranker-base\")"
},
{
"cell_type": "markdown",
@@ -533,4 +517,4 @@
},
"nbformat": 4,
"nbformat_minor": 0
}
}
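Across the notebooks in this diff the same two class renames recur: `HuggingFaceLocalChatGenerator` becomes `TransformersChatGenerator` (with a new import path), and `TransformersSimilarityRanker` becomes `SentenceTransformersSimilarityRanker` (same import path). The sketch below shows how such a rename could be applied to a notebook's cell sources, assuming only the names and import lines visible in the hunks above. `rename_in_notebook` is a hypothetical helper, and the word-boundary regex is what keeps an already-migrated `SentenceTransformersSimilarityRanker` from being prefixed twice.

```python
import re

# Full import-line rewrite taken from the old and new code cells in the diff.
IMPORT_REWRITES = {
    "from haystack.components.generators.chat import HuggingFaceLocalChatGenerator":
        "from haystack_integrations.components.generators.transformers import TransformersChatGenerator",
}

# Class renames seen in the diff.
CLASS_RENAMES = {
    "HuggingFaceLocalChatGenerator": "TransformersChatGenerator",
    "TransformersSimilarityRanker": "SentenceTransformersSimilarityRanker",
}

# \b stops "TransformersSimilarityRanker" from matching inside the longer new name.
_CLASS_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, CLASS_RENAMES)) + r")\b")


def _rewrite(text: str) -> str:
    # Rewrite whole import lines first, then any remaining class references.
    for old, new in IMPORT_REWRITES.items():
        text = text.replace(old, new)
    return _CLASS_PATTERN.sub(lambda match: CLASS_RENAMES[match.group(1)], text)


def rename_in_notebook(notebook: dict) -> dict:
    """Rewrite cell sources in place; handles both list and single-string sources."""
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        if isinstance(source, list):
            cell["source"] = [_rewrite(line) for line in source]
        else:
            cell["source"] = _rewrite(source)
    return notebook


sample = {
    "cells": [
        {
            "cell_type": "code",
            "source": [
                "from haystack.components.generators.chat import HuggingFaceLocalChatGenerator\n",
                'llm = HuggingFaceLocalChatGenerator(model="Qwen/Qwen2.5-1.5B-Instruct")\n',
            ],
        },
        {
            "cell_type": "code",
            "source": "from haystack.components.rankers import TransformersSimilarityRanker\n\n"
                      'ranker = TransformersSimilarityRanker(model="BAAI/bge-reranker-base")',
        },
    ]
}

migrated = rename_in_notebook(sample)
print(migrated["cells"][1]["source"])
```

Handling both source shapes matters here because the diff itself switches several cells from a list of lines to a single string.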