4 changes: 0 additions & 4 deletions sources/academy/ai/ai-agents.mdx
Original file line number Diff line number Diff line change
@@ -6,10 +6,6 @@ sidebar_position: 1
slug: /ai/ai-agents
---

**In this section of the Apify Academy, we show you how to build an AI agent with the CrewAI Python framework. You’ll learn how to create an agent for Instagram analysis and integrate it with LLMs and Apify Actors.**

---

AI agents are goal-oriented systems that make independent decisions. They interact with environments using predefined tools and workflows to automate complex tasks.

On Apify, AI agents are built as Actors—serverless cloud programs for web scraping, data processing, and AI deployment. Apify evolved from running scrapers in the cloud to supporting LLMs that follow predefined workflows with dynamically defined goals.
@@ -6,10 +6,6 @@ category: build-and-publish
slug: /actor-marketing-playbook/actor-basics/how-to-create-an-actor-readme
---

**Learn how to write a comprehensive README to help users better navigate, understand and run public Actors in Apify Store.**

---

## What's a README in the Apify sense?

At Apify, when we talk about a README, we don’t mean a guide mainly aimed at developers that explains what a project is, how to set it up, or how to contribute to it. At least, not in its traditional sense.
@@ -6,10 +6,6 @@ category: build-and-publish
slug: /actor-marketing-playbook/actor-basics/name-your-actor
---

**Apify's standards for Actor naming. Learn how to choose the right name for scraping and automation Actors and how to optimize your Actor for search engines.**

---

Naming your Actor can be tricky, especially after you’ve worked hard on it. To help people find your Actor and make it stand out, we’ve set some naming guidelines. These will help your Actor rank better on Google and keep things consistent on [Apify Store](https://apify.com/store).

Ideally, you should choose a name that clearly shows what your Actor does and includes keywords people might use to search for it.
@@ -6,10 +6,6 @@ category: build-and-publish
slug: /actor-marketing-playbook/product-optimization/actor-bundles
---

**Learn what an Actor bundle is, explore existing examples, and discover how to promote them.**

---

## What is an Actor bundle?

If an Actor is an example of web automation software, what is an Actor bundle? An Actor bundle is basically a chain of multiple Actors unified by a common use case. Bundles can include both scrapers and automation tools, and they are usually designed to achieve an overarching goal related to scraping or automation.
@@ -6,10 +6,6 @@ category: build-and-publish
slug: /running-a-web-server
---

**A web server running in an Actor can act as a communication channel with the outside world. Learn how to set one up with Node.js.**

---

Sometimes, an Actor needs a channel for communication with other systems (or humans). This channel might be used to receive commands, to provide info about progress, or both. To implement this, we will run an HTTP web server inside the Actor that will provide:

- An API to receive commands.
4 changes: 0 additions & 4 deletions sources/academy/platform/apify_platform.md
@@ -6,10 +6,6 @@ category: apify platform
slug: /apify-platform
---

**Learn all about the Apify platform, all of the tools it offers, and how it can improve your overall development experience.**

---

The [Apify platform](https://apify.com) was built to serve large-scale and high-performance web scraping and automation needs. It provides easy access to compute instances ([Actors](./getting_started/actors.md)), convenient request and result storages, proxies, scheduling, webhooks and more - all accessible through the **Console** web interface, [Apify's API](/api/v2), or our [JavaScript](/api/client/js) and [Python](/api/client/python) API clients.

## Category outline {#this-category}
4 changes: 0 additions & 4 deletions sources/academy/platform/deploying_your_code/deploying.md
@@ -5,10 +5,6 @@ sidebar_position: 5
slug: /deploying-your-code/deploying
---

**Push local code to the platform, or create an Actor and integrate it with a Git repository for automatic rebuilds.**

---

Once you've **actorified** your code, there are two ways to deploy it to the Apify platform. You can either push the code directly from your local machine onto the platform, or you can create a blank Actor in the web interface, and then integrate its source code with a GitHub repository.

## With a Git repository {#with-git-repository}
4 changes: 0 additions & 4 deletions sources/academy/platform/deploying_your_code/docker_file.md
@@ -8,10 +8,6 @@ slug: /deploying-your-code/docker-file
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

**Learn to write a Dockerfile for your project so it can run in a Docker container on the Apify platform.**

---

The **Dockerfile** is a file which gives the Apify platform (or Docker, more specifically) instructions on how to create an environment for your code to run in. Every Actor must have a Dockerfile, as Actors run in Docker containers.

:::note Local testing
4 changes: 0 additions & 4 deletions sources/academy/platform/deploying_your_code/index.md
@@ -9,10 +9,6 @@ slug: /deploying-your-code
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

**In this course learn how to take an existing project of yours and deploy it to the Apify platform as an Actor.**

---

This section discusses how to use your newfound knowledge of the Apify platform and Actors from the [**Getting started**](../getting_started/index.md) section to deploy your existing project's code to the Apify platform as an Actor.
Any program running in a Docker container can become an Apify Actor.

@@ -5,10 +5,6 @@ sidebar_position: 2
slug: /deploying-your-code/input-schema
---

**Learn how to generate a user interface on the platform for your Actor's input with a single file - the INPUT_SCHEMA.json file.**

---

Though writing an [input schema](/platform/actors/development/actor-definition/input-schema) for an Actor is not a required step, it's definitely an ideal one. The Apify platform will read the `INPUT_SCHEMA.json` file within the root of your project and generate a user interface for entering input into your Actor, which makes it significantly easier for non-developers (and even developers) to configure and understand the inputs your Actor can receive. Because of this, we'll be writing an input schema for our example Actor.
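
To make the shape of such a file concrete, here is a minimal sketch of what an input schema might contain, written as a JavaScript object that serializes to the contents of `INPUT_SCHEMA.json`. The `numbers` field, its title, and its description are hypothetical and chosen only for illustration; they are not taken from the example Actor.

```js
// A minimal sketch of an Actor input schema, assuming a single
// hypothetical input field called "numbers".
const inputSchema = {
    title: 'Example Actor input',
    type: 'object',
    schemaVersion: 1,
    properties: {
        numbers: {
            title: 'Numbers',
            type: 'array',
            description: 'A list of numbers for the Actor to process.',
            editor: 'json',
        },
    },
    // Fields listed here must be provided before the Actor can be started.
    required: ['numbers'],
};

// Serialize the object to the text that would be saved as INPUT_SCHEMA.json.
const schemaJson = JSON.stringify(inputSchema, null, 4);
console.log(schemaJson);
```

Each entry in `properties` describes one input field, and the `editor` value hints at which kind of form control the platform should render for it.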

:::note JSON requirement
@@ -5,10 +5,6 @@ sidebar_position: 1
slug: /deploying-your-code/inputs-outputs
---

**Learn to accept input into your Actor, process it, and return output. This concept applies to Actors in any language.**

---

Most of the time when you're creating a project, you expect some sort of input for your software to work with. Often, you also want to provide some sort of output once your software has finished running. Apify provides a convenient way to handle inputs and deliver outputs.

Understanding inputs and outputs is essential because they are read/written differently depending on where the Actor is running:
@@ -5,10 +5,6 @@ sidebar_position: 3
slug: /deploying-your-code/dataset-schema
---

**Learn how to generate an appealing Overview table interface to preview your Actor results in real time on the Apify platform.**

---

The dataset schema generates an interface that enables users to instantly preview their Actor results in real time.

![Dataset Schema](../../../platform/actors/development/actor_definition/images/output-schema-example.png)
@@ -6,8 +6,6 @@ sidebar_label: I - Webhooks & advanced Actor overview
slug: /expert-scraping-with-apify/actors-webhooks
---

**Learn more advanced details about Actors, how they work, and the default configurations they can take. Also, learn how to integrate your Actor with webhooks.**

:::caution Updates coming

This lesson is subject to change because it currently relies on code from our archived **Web scraping basics for JavaScript devs** course. For now you can still access the archived course, but we plan to completely retire it in a few months. This lesson will be updated to remove the dependency.
@@ -6,10 +6,6 @@ sidebar_label: IV - Apify API & client
slug: /expert-scraping-with-apify/apify-api-and-client
---

**Gain an in-depth understanding of the two main ways of programmatically interacting with the Apify platform - through the API, and through a client.**

---

You can use one of the two main ways to programmatically interact with the Apify platform: by directly using [Apify's RESTful API](/api/v2), or by using the [JavaScript](/api/client/js) and [Python](/api/client/python) API clients. In the next two lessons, we'll be focusing on the API and the JavaScript client.

> Apify's API and JavaScript API client allow us to do anything a regular user can do when interacting with the platform's web interface, only programmatically.
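
As a rough illustration of the first approach, the sketch below builds the URL for starting an Actor run and sends the request with plain HTTP. The helper names `buildRunActorUrl` and `startActorRun` are hypothetical, as are the Actor name and token placeholder. The sketch assumes the public API base URL `https://api.apify.com/v2` and Node.js 18+ for the global `fetch()`.

```js
// Base URL of the Apify API (assumed here; check the API reference).
const API_BASE_URL = 'https://api.apify.com/v2';

// Hypothetical helper: builds the URL of the endpoint that starts an Actor run.
// actorId is either the Actor's ID or a name in the form "username~actor-name".
function buildRunActorUrl(actorId, token) {
    const url = new URL(`${API_BASE_URL}/acts/${encodeURIComponent(actorId)}/runs`);
    url.searchParams.set('token', token);
    return url.toString();
}

// Hypothetical helper: starts a run, sending the Actor input as a JSON body.
async function startActorRun(actorId, token, input) {
    const response = await fetch(buildRunActorUrl(actorId, token), {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(input),
    });
    if (!response.ok) {
        throw new Error(`Apify API responded with status ${response.status}`);
    }
    return response.json();
}

// Only the URL is built here; no request is sent.
console.log(buildRunActorUrl('my-username~my-actor', 'MY_TOKEN'));
```

An API client wraps this kind of request handling for you, so the same operation becomes a method call rather than a hand-built URL.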
@@ -6,10 +6,6 @@ sidebar_label: VI - Bypassing anti-scraping methods
slug: /expert-scraping-with-apify/bypassing-anti-scraping
---

**Learn about bypassing anti-scraping methods using proxies and proxy/session rotation together with Crawlee and the Apify SDK.**

---

Effectively bypassing anti-scraping software is one of the most crucial, but also one of the most difficult skills to master. The different types of [anti-scraping protections](../../webscraping/anti_scraping/index.md) can vary a lot on the web. Some websites aren't even protected at all, some require only moderate IP rotation, and some cannot even be scraped without using advanced techniques and workarounds. Additionally, because the web is evolving, anti-scraping techniques are also evolving and becoming more advanced.

It is generally quite difficult to recognize the anti-scraping protections a page may have when first inspecting it, so it is important to thoroughly investigate a site prior to writing any lines of code, as anti-scraping measures can significantly change your approach as well as complicate the development process of an Actor. As your skills expand, you will be able to spot anti-scraping measures quicker, and better evaluate the complexity of a new project.
@@ -6,10 +6,6 @@ category: apify platform
slug: /expert-scraping-with-apify
---

**After learning the basics of Actors and Apify, learn to develop pro-level scrapers on the Apify platform with this advanced course.**

---

This course will teach you the nitty-gritty of what it takes to build pro-level scrapers with Apify. We recommend that you've at least looked through all of the other courses in the academy prior to taking this one.

## Preparations
@@ -6,10 +6,6 @@ sidebar_label: II - Managing source code
slug: /expert-scraping-with-apify/managing-source-code
---

**Learn how to manage your Actor's source code more efficiently by integrating it with a GitHub repository. This is standard on the Apify platform.**

---

In this brief lesson, we'll discuss how to better manage an Actor's source code. Up 'til now, you've been developing your scripts locally, and then pushing the code directly to the Actor on the Apify platform; however, there is a much more optimal (and standard) way.

## Learning 🧠 {#learning}
@@ -6,10 +6,6 @@ sidebar_label: V - Migrations & maintaining state
slug: /expert-scraping-with-apify/migrations-maintaining-state
---

**Learn about what Actor migrations are and how to handle them properly so that the state is not lost and runs can safely be resurrected.**

---

We already know that Actors are Docker containers that can be run on any server. This means that they can be allocated anywhere there is space available, making them very efficient. Unfortunately, there is one big caveat: Actors move - a lot. When an Actor moves, it is called a **migration**.

On migration, the process inside of an Actor is completely restarted and everything in its memory is lost, meaning that any values stored within variables or classes are lost.
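
One way to picture the problem and a possible remedy is the sketch below: state kept only in memory disappears on restart, while state written to a key-value store can be loaded again. `StateTracker` and `MemoryStore` are hypothetical names used only for this illustration. The in-memory store stands in for a real key-value store, and the sketch assumes the store exposes async `getValue(key)` and `setValue(key, value)` methods.

```js
// Keeps counters in memory and persists them through an injected store.
class StateTracker {
    constructor(store, key = 'STATE') {
        this.store = store; // anything with async getValue(key) / setValue(key, value)
        this.key = key;
        this.state = {};
    }

    // Load previously persisted state, if any, when the process (re)starts.
    async initialize() {
        const saved = await this.store.getValue(this.key);
        this.state = saved ?? {};
    }

    // Count one more occurrence of the given ID.
    increment(id) {
        this.state[id] = (this.state[id] ?? 0) + 1;
    }

    // Write the current state to the store so it survives a restart.
    async persist() {
        await this.store.setValue(this.key, this.state);
    }
}

// Hypothetical in-memory stand-in for a key-value store (demo only).
class MemoryStore {
    constructor() {
        this.data = new Map();
    }

    async getValue(key) {
        const raw = this.data.get(key);
        return raw === undefined ? null : JSON.parse(raw);
    }

    async setValue(key, value) {
        this.data.set(key, JSON.stringify(value));
    }
}

// Simulates a migration: the first tracker is discarded, the store survives.
async function demo() {
    const store = new MemoryStore();

    const beforeMigration = new StateTracker(store);
    await beforeMigration.initialize();
    beforeMigration.increment('B000TEST');
    beforeMigration.increment('B000TEST');
    await beforeMigration.persist();

    const afterMigration = new StateTracker(store);
    await afterMigration.initialize();
    return afterMigration.state;
}

demo().then((state) => console.log(state));
```

In a real Actor, the call to `persist()` would typically be triggered when the platform signals an upcoming migration, and the store would be the Actor's key-value store rather than an in-memory map. How that is wired up with the Apify SDK is not shown here.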
@@ -6,10 +6,6 @@ sidebar_label: VII - Saving useful run statistics
slug: /expert-scraping-with-apify/saving-useful-stats
---

**Understand how to save statistics about an Actor's run, what types of statistics you can save, and why you might want to save them for a large-scale scraper.**

---

Using Crawlee and the Apify SDK, we are now able to collect and format data coming directly from websites and save it into a Key-Value store or Dataset. This is great, but sometimes, we want to store some extra data about the run itself, or about each request. We might want to store some extra general run information separately from our results or potentially include statistics about each request within its corresponding dataset item.

The types of values that are saved are totally up to you, but the most common are error scores, number of total saved items, number of request retries, number of CAPTCHAs hit, etc. Storing these values is not always necessary, but can be valuable when debugging and maintaining an Actor. As your projects scale, this will become more and more useful and important.
@@ -6,10 +6,6 @@ sidebar_label: V - Handling migrations
slug: /expert-scraping-with-apify/solutions/handling-migrations
---

**Get real-world experience of maintaining a stateful object stored in memory, which will be persisted through migrations and even graceful aborts.**

---

Let's first head into our **demo-actor** and create a new file named **asinTracker.js** in the **src** folder. Within this file, we are going to build a utility class which will allow us to store, modify, persist, and log our tracked ASIN data.

Here's the skeleton of our class:
@@ -5,10 +5,6 @@ sidebar_position: 6.7
slug: /expert-scraping-with-apify/solutions
---

**View all of the solutions for all of the activities and tasks of this course. Please try to complete each task on your own before reading the solution!**

---

The final section of each lesson in this course will be a task which you as the course-taker are expected to complete before moving on to the next lesson. Each task's completion and understanding plays an important role in the ability to continue through the course.

If you ever get stuck, or if you feel like your solution could be more optimal, you can always refer to the **Solutions** section of the course. Each solution will have all of the code and explanations needed to understand it.
@@ -6,8 +6,6 @@ sidebar_label: I - Integrating webhooks
slug: /expert-scraping-with-apify/solutions/integrating-webhooks
---

**Learn how to integrate webhooks into your Actors. Webhooks are a super powerful tool, and can be used to do almost anything!**

:::caution Updates coming

This lesson is subject to change because it currently relies on code from our archived **Web scraping basics for JavaScript devs** course. For now you can still access the archived course, but we plan to completely retire it in a few months. This lesson will be updated to remove the dependency.
@@ -6,10 +6,6 @@ sidebar_label: II - Managing source
slug: /expert-scraping-with-apify/solutions/managing-source
---

**View in-depth answers for all three of the quiz questions that were provided in the corresponding lesson about managing source code.**

---

In the lesson corresponding to this solution, we discussed an extremely important topic: source code management. Though we solved the task right in the lesson, we've still included the quiz answers here.

## Quiz answers {#quiz-answers}
@@ -6,10 +6,6 @@ sidebar_label: VI - Rotating proxies/sessions
slug: /expert-scraping-with-apify/solutions/rotating-proxies
---

**Learn firsthand how to rotate proxies and sessions in order to avoid the majority of the most common anti-scraping protections.**

---

If you take a look at our current code for the Amazon scraping Actor, you might notice this snippet:

```js
@@ -6,10 +6,6 @@ sidebar_label: VII - Saving run stats
slug: /expert-scraping-with-apify/solutions/saving-stats
---

**Implement the saving of general statistics about an Actor's run, as well as adding request-specific statistics to dataset items.**

---

The code in this solution will be similar to what we already did in the **Handling migrations** solution; however, we'll be storing and logging different data. First, let's create a new file called **Stats.js** and write a utility class for storing our run stats:

```js
@@ -6,10 +6,6 @@ sidebar_label: IV - Using the Apify API & JavaScript client
slug: /expert-scraping-with-apify/solutions/using-api-and-client
---

**Learn how to interact with the Apify API directly through the well-documented RESTful routes, or by using the proprietary Apify JavaScript client.**

---

Since we need to create another Actor, we'll once again use the `apify create` command and start from an empty template. This time, let's call our project **actor-caller**:

```text
@@ -6,10 +6,6 @@ sidebar_label: III - Tasks & storage
slug: /expert-scraping-with-apify/tasks-and-storage
---

**Understand how to save the configurations for Actors with Actor tasks. Also, learn about storage and the different types Apify offers.**

---

Both of these are very different things; however, they are also tied together in many ways. **Tasks** run Actors, Actors return data, and data is stored in different types of **Storages**.

## Tasks {#tasks}
@@ -6,10 +6,6 @@ slug: /get-most-of-actors/monetizing-your-actor
unlisted: true
---

**Learn how you can monetize your web scraping and automation projects by publishing Actors to users in Apify Store.**

---

When you publish your Actor on the Apify platform, you have the option to make it a _Paid Actor_ and earn revenue from users who benefit from your tool. You can choose between two pricing models:

- Rental
4 changes: 0 additions & 4 deletions sources/academy/platform/getting_started/actors.md
@@ -5,10 +5,6 @@ sidebar_position: 1
slug: /getting-started/actors
---

**What is an Actor? How do we create them? Learn the basics of what Actors are, how they work, and try out an Actor yourself right on the Apify platform!**

---

After you've followed the **Getting started** lesson, you're almost ready to start creating some Actors! But before we get into that, let's discuss what an Actor is, and a bit about how they work.

## What's an Actor? {#what-is-an-actor}
4 changes: 0 additions & 4 deletions sources/academy/platform/getting_started/apify_api.md
@@ -5,10 +5,6 @@ sidebar_position: 4
slug: /getting-started/apify-api
---

**Learn how to use the Apify API to programmatically call your Actors, retrieve data stored on the platform, view Actor logs, and more!**

---

[Apify's API](/api/v2) is your ticket to the Apify platform without even needing to access the [Apify Console](https://console.apify.com?asrc=developers_portal) web-interface. The API is organized around RESTful HTTP endpoints.

In this lesson, we'll be learning how to use the Apify API to call an Actor and view its results. We'll be using the Actor we created in the previous lesson, so if you haven't already gotten that one set up, go ahead and do that before moving forward if you'd like to follow along.
4 changes: 0 additions & 4 deletions sources/academy/platform/getting_started/apify_client.md
@@ -8,10 +8,6 @@ slug: /getting-started/apify-client
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

**Interact with the Apify API in your code by using the apify-client package, which is available for both JavaScript and Python.**

---

Now that you've gotten your toes wet with interacting with the Apify API through raw HTTP requests, you're ready to become familiar with the **Apify client**, which is a package available for both JavaScript and Python that allows you to interact with the API in your code without explicitly needing to make any GET or POST requests.

This lesson will provide code examples for both Node.js and Python, so regardless of the language you are using, you can follow along!
4 changes: 0 additions & 4 deletions sources/academy/platform/getting_started/index.md
@@ -6,10 +6,6 @@ category: apify platform
slug: /getting-started
---

**Get started with the Apify platform by creating an account and learning about the Apify Console, which is where all Apify Actors are born!**

---

Your gateway to the Apify platform is your Apify account. The great thing about creating an account is that we support integration with both Google and GitHub, which takes only about 30 seconds!

1. Create your account on the [sign up](https://console.apify.com/sign-up?asrc=developers_portal) page.