Merged
73 changes: 72 additions & 1 deletion apps/docs/content/docs/en/tools/apify.mdx
@@ -30,7 +30,7 @@ These operations equip your agents to automate, scrape, and orchestrate data col

## Usage Instructions

Integrate Apify into your workflow. Run any Apify actor with custom input and retrieve results. Supports both synchronous and asynchronous execution with automatic dataset fetching.
Integrate Apify into your workflow. Run any Apify actor or saved task with custom input, fetch dataset items, and check run status. Supports both synchronous and asynchronous execution with automatic dataset fetching.



@@ -87,4 +87,75 @@ Run an APIFY actor asynchronously with polling for long-running tasks
| `datasetId` | string | Dataset ID containing results |
| `items` | array | Dataset items \(if completed\) |

### `apify_run_task`

Run a saved APIFY actor task synchronously and get dataset items (max 5 minutes)

#### Input

| Parameter | Type | Required | Description |
| --------- | ---- | -------- | ----------- |
| `apiKey` | string | Yes | APIFY API token from console.apify.com/account#/integrations |
| `taskId` | string | Yes | Task ID or username/task-name. Examples: "janedoe/my-task", "moJRLRc85AitArpNN" |
| `input` | string | No | JSON string that overrides the task's saved input. Example: \{"startUrls": \[\{"url": "https://example.com"\}\]\} |
| `itemLimit` | number | No | Max dataset items to return \(1-250000\). Example: 500 |
| `memory` | number | No | Memory in megabytes allocated for the run \(128-32768\). Example: 1024 for 1GB |
| `timeout` | number | No | Timeout in seconds for the run. Example: 300 for 5 minutes |
| `build` | string | No | Actor build to run. Examples: "latest", "beta", "1.2.3" |

#### Output

| Parameter | Type | Description |
| --------- | ---- | ----------- |
| `success` | boolean | Whether the task run succeeded |
| `status` | string | Run status \(SUCCEEDED, FAILED, etc.\) |
| `items` | array | Dataset items produced by the run |
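
The ranges in the input table above can be checked before a run is started. The following is a minimal sketch, not code from this PR: `RunTaskParams` and `validateRunTaskParams` are hypothetical names, and only the limits stated in the table (1-250000 items, 128-32768 MB) are assumed.

```typescript
// Hypothetical helper (not part of this PR): validates `apify_run_task`
// inputs against the ranges documented in the parameter table.
interface RunTaskParams {
  apiKey: string
  taskId: string
  input?: string // JSON string that overrides the task's saved input
  itemLimit?: number // 1-250000
  memory?: number // megabytes, 128-32768
  timeout?: number // seconds
  build?: string
}

function validateRunTaskParams(params: RunTaskParams): string[] {
  const errors: string[] = []

  if (!params.apiKey) errors.push('apiKey is required')
  if (!params.taskId) errors.push('taskId is required')

  // `input` is passed as a string, so confirm it parses as JSON.
  if (params.input !== undefined) {
    try {
      JSON.parse(params.input)
    } catch {
      errors.push('input must be a valid JSON string')
    }
  }

  const inRange = (value: number, min: number, max: number): boolean =>
    Number.isFinite(value) && value >= min && value <= max

  if (params.itemLimit !== undefined && !inRange(params.itemLimit, 1, 250000)) {
    errors.push('itemLimit must be between 1 and 250000')
  }
  if (params.memory !== undefined && !inRange(params.memory, 128, 32768)) {
    errors.push('memory must be between 128 and 32768')
  }

  return errors
}
```

An empty array means the parameters are within the documented limits. Whether the tool itself enforces these limits client-side is not shown in this diff.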

### `apify_get_dataset_items`

Retrieve items stored in an APIFY dataset

#### Input

| Parameter | Type | Required | Description |
| --------- | ---- | -------- | ----------- |
| `apiKey` | string | Yes | APIFY API token from console.apify.com/account#/integrations |
| `datasetId` | string | Yes | Dataset ID to read items from. Example: "9RnD3Pql2vGZkc5H5" |
| `itemLimit` | number | No | Max items to return \(1-250000\). Default: all items. Example: 500 |
| `offset` | number | No | Number of items to skip at the start. Default: 0 |
| `fields` | string | No | Comma-separated list of fields to include. Example: "title,url,price" |

#### Output

| Parameter | Type | Description |
| --------- | ---- | ----------- |
| `success` | boolean | Whether the items were retrieved |
| `datasetId` | string | Dataset ID the items were read from |
| `items` | array | Items stored in the dataset |
| `count` | number | Number of items returned |
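
To illustrate how `itemLimit`, `offset`, and `fields` might translate into a request, here is a hedged sketch that only builds a URL. It assumes the dataset items endpoint and the `limit`, `offset`, and `fields` query parameters of Apify's public API v2; the tool's actual request code is not part of this diff, and `buildDatasetItemsUrl` is a hypothetical name.

```typescript
// Assumption: URL shape follows Apify's public API v2. The real tool
// implementation is not shown in this PR.
interface DatasetItemsQuery {
  datasetId: string
  itemLimit?: number
  offset?: number
  fields?: string // comma-separated, e.g. "title,url,price"
}

function buildDatasetItemsUrl(query: DatasetItemsQuery): string {
  const pairs: string[] = []

  if (query.itemLimit !== undefined) pairs.push(`limit=${query.itemLimit}`)
  // An offset of 0 is valid, so compare against undefined instead of truthiness.
  if (query.offset !== undefined) pairs.push(`offset=${query.offset}`)

  if (query.fields) {
    // Trim whitespace around each field and drop empty entries.
    const fields = query.fields
      .split(',')
      .map((field) => field.trim())
      .filter((field) => field.length > 0)
    if (fields.length > 0) pairs.push(`fields=${encodeURIComponent(fields.join(','))}`)
  }

  const base = `https://api.apify.com/v2/datasets/${encodeURIComponent(query.datasetId)}/items`
  return pairs.length > 0 ? `${base}?${pairs.join('&')}` : base
}
```

The API token is deliberately left out of the URL here; how the tool attaches `apiKey` to the request is not visible in this diff.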

### `apify_get_run`

Get the status and details of an APIFY actor run

#### Input

| Parameter | Type | Required | Description |
| --------- | ---- | -------- | ----------- |
| `apiKey` | string | Yes | APIFY API token from console.apify.com/account#/integrations |
| `runId` | string | Yes | Actor run ID to fetch. Example: "HG7ML7M8z78YcAPEB" |

#### Output

| Parameter | Type | Description |
| --------- | ---- | ----------- |
| `success` | boolean | Whether the run was found |
| `runId` | string | APIFY run ID |
| `status` | string | Run status \(READY, RUNNING, SUCCEEDED, FAILED, etc.\) |
| `startedAt` | string | When the run started \(ISO timestamp\) |
| `finishedAt` | string | When the run finished \(ISO timestamp\) |
| `datasetId` | string | Default dataset ID for the run |
| `keyValueStoreId` | string | Default key-value store ID for the run |
| `stats` | json | Run statistics \(memory, CPU, duration\) |
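
A workflow that checks run status typically needs to know whether a run has finished. The sketch below works on the `status`, `startedAt`, and `finishedAt` outputs listed above. The set of terminal statuses is an assumption taken from Apify's documented run lifecycle, since the table only says "READY, RUNNING, SUCCEEDED, FAILED, etc.", and the helper names are hypothetical.

```typescript
// Assumption: these are the terminal statuses in Apify's run lifecycle.
// This PR does not enumerate them.
const TERMINAL_STATUSES = new Set<string>(['SUCCEEDED', 'FAILED', 'TIMED-OUT', 'ABORTED'])

// Subset of the `apify_get_run` output used by this sketch.
interface RunSnapshot {
  status: string
  startedAt?: string // ISO timestamp
  finishedAt?: string // ISO timestamp
}

// True once the run can no longer change state.
function isTerminal(run: RunSnapshot): boolean {
  return TERMINAL_STATUSES.has(run.status)
}

// Wall-clock duration in milliseconds, or null while the run is unfinished
// or if either timestamp cannot be parsed.
function durationMs(run: RunSnapshot): number | null {
  if (!run.startedAt || !run.finishedAt) return null
  const ms = Date.parse(run.finishedAt) - Date.parse(run.startedAt)
  return Number.isNaN(ms) ? null : ms
}
```

A caller could poll `apify_get_run` until `isTerminal` returns true and then pass `datasetId` to `apify_get_dataset_items`.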


16 changes: 14 additions & 2 deletions apps/sim/app/(landing)/integrations/data/integrations.json
@@ -772,7 +772,7 @@
"slug": "apify",
"name": "Apify",
"description": "Run Apify actors and retrieve results",
"longDescription": "Integrate Apify into your workflow. Run any Apify actor with custom input and retrieve results. Supports both synchronous and asynchronous execution with automatic dataset fetching.",
"longDescription": "Integrate Apify into your workflow. Run any Apify actor or saved task with custom input, fetch dataset items, and check run status. Supports both synchronous and asynchronous execution with automatic dataset fetching.",
"bgColor": "#E0E0E0",
"iconName": "ApifyIcon",
"docsUrl": "https://docs.sim.ai/tools/apify",
@@ -784,9 +784,21 @@
{
"name": "Run Actor (Async)",
"description": "Run an APIFY actor asynchronously with polling for long-running tasks"
},
{
"name": "Run Task",
"description": "Run a saved APIFY actor task synchronously and get dataset items (max 5 minutes)"
},
{
"name": "Get Dataset Items",
"description": "Retrieve items stored in an APIFY dataset"
},
{
"name": "Get Run",
"description": "Get the status and details of an APIFY actor run"
}
],
"operationCount": 2,
"operationCount": 5,
"triggers": [],
"triggerCount": 0,
"authType": "api-key",
133 changes: 95 additions & 38 deletions apps/sim/blocks/blocks/apify.ts
@@ -1,14 +1,18 @@
import { ApifyIcon } from '@/components/icons'
import type { BlockConfig } from '@/blocks/types'
import { IntegrationType } from '@/blocks/types'
import { AuthMode, IntegrationType } from '@/blocks/types'
import type { RunActorResult } from '@/tools/apify/types'

const RUN_OPERATIONS = ['apify_run_actor_sync', 'apify_run_actor_async']
const RUN_OR_TASK_OPERATIONS = [...RUN_OPERATIONS, 'apify_run_task']

export const ApifyBlock: BlockConfig<RunActorResult> = {
type: 'apify',
name: 'Apify',
description: 'Run Apify actors and retrieve results',
authMode: AuthMode.ApiKey,
longDescription:
'Integrate Apify into your workflow. Run any Apify actor with custom input and retrieve results. Supports both synchronous and asynchronous execution with automatic dataset fetching.',
'Integrate Apify into your workflow. Run any Apify actor or saved task with custom input, fetch dataset items, and check run status. Supports both synchronous and asynchronous execution with automatic dataset fetching.',
docsLink: 'https://docs.sim.ai/tools/apify',
category: 'tools',
integrationType: IntegrationType.Search,
@@ -24,6 +28,9 @@ export const ApifyBlock: BlockConfig<RunActorResult> = {
options: [
{ label: 'Run Actor', id: 'apify_run_actor_sync' },
{ label: 'Run Actor (Async)', id: 'apify_run_actor_async' },
{ label: 'Run Task', id: 'apify_run_task' },
{ label: 'Get Dataset Items', id: 'apify_get_dataset_items' },
{ label: 'Get Run', id: 'apify_get_run' },
],
value: () => 'apify_run_actor_sync',
},
@@ -40,7 +47,32 @@ export const ApifyBlock: BlockConfig<RunActorResult> = {
title: 'Actor ID',
type: 'short-input',
placeholder: 'e.g., janedoe/my-actor or actor ID',
required: true,
condition: { field: 'operation', value: RUN_OPERATIONS },
required: { field: 'operation', value: RUN_OPERATIONS },
},
{
id: 'taskId',
title: 'Task ID',
type: 'short-input',
placeholder: 'e.g., janedoe/my-task or task ID',
condition: { field: 'operation', value: 'apify_run_task' },
required: { field: 'operation', value: 'apify_run_task' },
},
{
id: 'datasetId',
title: 'Dataset ID',
type: 'short-input',
placeholder: 'e.g., 9RnD3Pql2vGZkc5H5',
condition: { field: 'operation', value: 'apify_get_dataset_items' },
required: { field: 'operation', value: 'apify_get_dataset_items' },
},
{
id: 'runId',
title: 'Run ID',
type: 'short-input',
placeholder: 'e.g., HG7ML7M8z78YcAPEB',
condition: { field: 'operation', value: 'apify_get_run' },
required: { field: 'operation', value: 'apify_get_run' },
},
{
id: 'input',
@@ -49,6 +81,7 @@ export const ApifyBlock: BlockConfig<RunActorResult> = {
language: 'json',
placeholder: '{\n "startUrl": "https://example.com",\n "maxPages": 10\n}',
required: false,
condition: { field: 'operation', value: RUN_OR_TASK_OPERATIONS },
wandConfig: {
enabled: true,
prompt: `Generate a JSON configuration object for an Apify actor based on the user's description.
@@ -82,79 +115,95 @@ Return ONLY the valid JSON object - no explanations, no markdown.`,
type: 'short-input',
placeholder: 'Memory in MB (e.g., 1024 for 1GB, 2048 for 2GB)',
required: false,
mode: 'advanced',
condition: { field: 'operation', value: RUN_OR_TASK_OPERATIONS },
},
{
id: 'timeout',
title: 'Timeout',
type: 'short-input',
placeholder: 'Timeout in seconds (e.g., 300 for 5 min)',
required: false,
mode: 'advanced',
condition: { field: 'operation', value: RUN_OR_TASK_OPERATIONS },
},
{
id: 'build',
title: 'Build',
type: 'short-input',
placeholder: 'Build version (e.g., "latest", "beta", "1.2.3")',
required: false,
mode: 'advanced',
condition: { field: 'operation', value: RUN_OR_TASK_OPERATIONS },
},
{
id: 'waitForFinish',
title: 'Wait For Finish',
type: 'short-input',
placeholder: 'Initial wait time in seconds (0-60)',
required: false,
condition: {
field: 'operation',
value: 'apify_run_actor_async',
},
mode: 'advanced',
condition: { field: 'operation', value: 'apify_run_actor_async' },
},
{
id: 'itemLimit',
title: 'Item Limit',
type: 'short-input',
placeholder: 'Max dataset items to fetch (1-250000)',
required: false,
mode: 'advanced',
condition: {
field: 'operation',
value: 'apify_run_actor_async',
value: ['apify_run_actor_async', 'apify_run_task', 'apify_get_dataset_items'],
},
},
{
id: 'offset',
title: 'Offset',
type: 'short-input',
placeholder: 'Number of items to skip (default 0)',
required: false,
mode: 'advanced',
condition: { field: 'operation', value: 'apify_get_dataset_items' },
},
{
id: 'fields',
title: 'Fields',
type: 'short-input',
placeholder: 'Comma-separated fields (e.g., title,url,price)',
required: false,
mode: 'advanced',
condition: { field: 'operation', value: 'apify_get_dataset_items' },
},
],

tools: {
access: ['apify_run_actor_sync', 'apify_run_actor_async'],
access: [
'apify_run_actor_sync',
'apify_run_actor_async',
'apify_run_task',
'apify_get_dataset_items',
'apify_get_run',
],
config: {
tool: (params) => params.operation,
params: (params: Record<string, any>) => {
const { operation, ...rest } = params
const result: Record<string, any> = {
apiKey: rest.apiKey,
actorId: rest.actorId,
}

if (rest.input) {
result.input = rest.input
}

if (rest.memory) {
result.memory = Number(rest.memory)
}

if (rest.timeout) {
result.timeout = Number(rest.timeout)
}

if (rest.build) {
result.build = rest.build
}

if (rest.waitForFinish) {
result.waitForFinish = Number(rest.waitForFinish)
}

if (rest.itemLimit) {
result.itemLimit = Number(rest.itemLimit)
}
const result: Record<string, any> = { apiKey: rest.apiKey }

if (rest.actorId) result.actorId = rest.actorId
if (rest.taskId) result.taskId = rest.taskId
if (rest.datasetId) result.datasetId = rest.datasetId
if (rest.runId) result.runId = rest.runId
if (rest.input) result.input = rest.input
if (rest.build) result.build = rest.build
if (rest.fields) result.fields = rest.fields
if (rest.memory) result.memory = Number(rest.memory)
if (rest.timeout) result.timeout = Number(rest.timeout)
if (rest.waitForFinish) result.waitForFinish = Number(rest.waitForFinish)
if (rest.itemLimit) result.itemLimit = Number(rest.itemLimit)
if (rest.offset !== undefined && rest.offset !== null && rest.offset !== '')
result.offset = Number(rest.offset)

return result
},
@@ -165,19 +214,27 @@ Return ONLY the valid JSON object - no explanations, no markdown.`,
operation: { type: 'string', description: 'Operation to perform' },
apiKey: { type: 'string', description: 'Apify API token' },
actorId: { type: 'string', description: 'Actor ID or username/actor-name' },
taskId: { type: 'string', description: 'Task ID or username/task-name' },
datasetId: { type: 'string', description: 'Dataset ID to read items from' },
runId: { type: 'string', description: 'Actor run ID to fetch' },
input: { type: 'string', description: 'Actor input as JSON string' },
memory: { type: 'number', description: 'Memory in MB (128-32768)' },
timeout: { type: 'number', description: 'Timeout in seconds' },
build: { type: 'string', description: 'Actor build version' },
waitForFinish: { type: 'number', description: 'Initial wait time in seconds' },
itemLimit: { type: 'number', description: 'Max dataset items to fetch' },
offset: { type: 'number', description: 'Number of items to skip' },
fields: { type: 'string', description: 'Comma-separated fields to include' },
},

outputs: {
success: { type: 'boolean', description: 'Whether the actor run succeeded' },
success: { type: 'boolean', description: 'Whether the operation succeeded' },
runId: { type: 'string', description: 'Apify run ID' },
status: { type: 'string', description: 'Run status (SUCCEEDED, FAILED, etc.)' },
datasetId: { type: 'string', description: 'Dataset ID containing results' },
items: { type: 'json', description: 'Dataset items (if completed)' },
count: { type: 'number', description: 'Number of items returned (Get Dataset Items)' },
startedAt: { type: 'string', description: 'When the run started (Get Run)' },
finishedAt: { type: 'string', description: 'When the run finished (Get Run)' },
},
}
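
The parameter mapping in `tools.config.params` treats `offset` differently from the other numeric fields: most values are forwarded only when truthy, while `offset` is checked against `undefined`, `null`, and the empty string so that a numeric `0` is still passed through. A standalone sketch of that behavior follows. It is a table-driven restatement for illustration, not the code from the PR, and `mapApifyParams` is a hypothetical name.

```typescript
// Illustrative restatement of the block's param mapping (not the PR's code).
const STRING_KEYS = ['actorId', 'taskId', 'datasetId', 'runId', 'input', 'build', 'fields']
const NUMERIC_KEYS = ['memory', 'timeout', 'waitForFinish', 'itemLimit']

function mapApifyParams(params: Record<string, unknown>): Record<string, unknown> {
  // `operation` selects the tool and is intentionally not forwarded.
  const result: Record<string, unknown> = { apiKey: params['apiKey'] }

  // Strings are forwarded only when non-empty.
  for (const key of STRING_KEYS) {
    if (params[key]) result[key] = params[key]
  }

  // Numeric inputs arrive from short-input fields as strings; coerce them.
  for (const key of NUMERIC_KEYS) {
    if (params[key]) result[key] = Number(params[key])
  }

  // A numeric 0 is a valid offset but falsy, and Number('') is 0, so the
  // empty string is excluded explicitly instead of relying on truthiness.
  const offset = params['offset']
  if (offset !== undefined && offset !== null && offset !== '') {
    result['offset'] = Number(offset)
  }

  return result
}
```

With this mapping, an `offset` of `0` reaches the tool, while an empty Offset field leaves the key out so the documented default applies.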