diff --git a/sources/academy/ai/ai-agents.mdx b/sources/academy/ai/ai-agents.mdx
index 8ccea0a205..22f4b95974 100644
--- a/sources/academy/ai/ai-agents.mdx
+++ b/sources/academy/ai/ai-agents.mdx
@@ -6,10 +6,6 @@ sidebar_position: 1
 slug: /ai/ai-agents
 ---

-**In this section of the Apify Academy, we show you how to build an AI agent with the CrewAI Python framework. You’ll learn how to create an agent for Instagram analysis and integrate it with LLMs and Apify Actors.**
-
----
-
 AI agents are goal-oriented systems that make independent decisions. They interact with environments using predefined tools and workflows to automate complex tasks. On Apify, AI agents are built as Actors—serverless cloud programs for web scraping, data processing, and AI deployment. Apify evolved from running scrapers in the cloud to supporting LLMs that follow predefined workflows with dynamically defined goals.
diff --git a/sources/academy/build-and-publish/apify-store-basics/how_to_create_actor_readme.md b/sources/academy/build-and-publish/apify-store-basics/how_to_create_actor_readme.md
index 6b0d9d2e12..97dd8eeb69 100644
--- a/sources/academy/build-and-publish/apify-store-basics/how_to_create_actor_readme.md
+++ b/sources/academy/build-and-publish/apify-store-basics/how_to_create_actor_readme.md
@@ -6,10 +6,6 @@ category: build-and-publish
 slug: /actor-marketing-playbook/actor-basics/how-to-create-an-actor-readme
 ---

-**Learn how to write a comprehensive README to help users better navigate, understand and run public Actors in Apify Store.**
-
----
-
 ## What's a README in the Apify sense?

 At Apify, when we talk about a README, we don’t mean a guide mainly aimed at developers that explains what a project is, how to set it up, or how to contribute to it. At least, not in its traditional sense.
diff --git a/sources/academy/build-and-publish/apify-store-basics/name_your_actor.md b/sources/academy/build-and-publish/apify-store-basics/name_your_actor.md
index 05013561a4..2f539e893b 100644
--- a/sources/academy/build-and-publish/apify-store-basics/name_your_actor.md
+++ b/sources/academy/build-and-publish/apify-store-basics/name_your_actor.md
@@ -6,10 +6,6 @@ category: build-and-publish
 slug: /actor-marketing-playbook/actor-basics/name-your-actor
 ---

-**Apify's standards for Actor naming. Learn how to choose the right name for scraping and automation Actors and how to optimize your Actor for search engines.**
-
----
-
 Naming your Actor can be tricky, especially after you’ve worked hard on it. To help people find your Actor and make it stand out, we’ve set some naming guidelines. These will help your Actor rank better on Google and keep things consistent on [Apify Store](https://apify.com/store). Ideally, you should choose a name that clearly shows what your Actor does and includes keywords people might use to search for it.
diff --git a/sources/academy/build-and-publish/how-to-build/actor_bundles.md b/sources/academy/build-and-publish/how-to-build/actor_bundles.md
index 3ad238f54d..7a8b22f7ff 100644
--- a/sources/academy/build-and-publish/how-to-build/actor_bundles.md
+++ b/sources/academy/build-and-publish/how-to-build/actor_bundles.md
@@ -6,10 +6,6 @@ category: build-and-publish
 slug: /actor-marketing-playbook/product-optimization/actor-bundles
 ---

-**Learn what an Actor bundle is, explore existing examples, and discover how to promote them.**
-
----
-
 ## What is an Actor bundle?

 If an Actor is an example of web automation software, what is an Actor bundle? An Actor bundle is basically a chain of multiple Actors unified by a common use case. Bundles can include both scrapers and automation tools, and they are usually designed to achieve an overarching goal related to scraping or automation.
diff --git a/sources/academy/build-and-publish/how-to-build/running_a_web_server.md b/sources/academy/build-and-publish/how-to-build/running_a_web_server.md
index 8a4eaecc86..35ae9c14a1 100644
--- a/sources/academy/build-and-publish/how-to-build/running_a_web_server.md
+++ b/sources/academy/build-and-publish/how-to-build/running_a_web_server.md
@@ -6,10 +6,6 @@ category: build-and-publish
 slug: /running-a-web-server
 ---

-**A web server running in an Actor can act as a communication channel with the outside world. Learn how to set one up with Node.js.**
-
----
-
 Sometimes, an Actor needs a channel for communication with other systems (or humans). This channel might be used to receive commands, to provide info about progress, or both. To implement this, we will run a HTTP web server inside the Actor that will provide:

 - An API to receive commands.
diff --git a/sources/academy/platform/apify_platform.md b/sources/academy/platform/apify_platform.md
index 8b56843984..54c3f9dd42 100644
--- a/sources/academy/platform/apify_platform.md
+++ b/sources/academy/platform/apify_platform.md
@@ -6,10 +6,6 @@ category: apify platform
 slug: /apify-platform
 ---

-**Learn all about the Apify platform, all of the tools it offers, and how it can improve your overall development experience.**
-
----
-
 The [Apify platform](https://apify.com) was built to serve large-scale and high-performance web scraping and automation needs. It provides easy access to compute instances ([Actors](./getting_started/actors.md)), convenient request and result storages, proxies, scheduling, webhooks and more - all accessible through the **Console** web interface, [Apify's API](/api/v2), or our [JavaScript](/api/client/js) and [Python](/api/client/python) API clients.
 ## Category outline {#this-category}
diff --git a/sources/academy/platform/deploying_your_code/deploying.md b/sources/academy/platform/deploying_your_code/deploying.md
index 2f3185affa..e0a3131749 100644
--- a/sources/academy/platform/deploying_your_code/deploying.md
+++ b/sources/academy/platform/deploying_your_code/deploying.md
@@ -5,10 +5,6 @@ sidebar_position: 5
 slug: /deploying-your-code/deploying
 ---

-**Push local code to the platform, or create an Actor and integrate it with a Git repository for automatic rebuilds.**
-
----
-
 Once you've **actorified** your code, there are two ways to deploy it to the Apify platform. You can either push the code directly from your local machine onto the platform, or you can create a blank Actor in the web interface, and then integrate its source code with a GitHub repository.

 ## With a Git repository {#with-git-repository}
diff --git a/sources/academy/platform/deploying_your_code/docker_file.md b/sources/academy/platform/deploying_your_code/docker_file.md
index c7ae50d9d9..0f0eae77f2 100644
--- a/sources/academy/platform/deploying_your_code/docker_file.md
+++ b/sources/academy/platform/deploying_your_code/docker_file.md
@@ -8,10 +8,6 @@ slug: /deploying-your-code/docker-file
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';

-**Learn to write a Dockerfile for your project so it can run in a Docker container on the Apify platform.**
-
----
-
 The **Dockerfile** is a file which gives the Apify platform (or Docker, more specifically) instructions on how to create an environment for your code to run in. Every Actor must have a Dockerfile, as Actors run in Docker containers.
 :::note Local testing
diff --git a/sources/academy/platform/deploying_your_code/index.md b/sources/academy/platform/deploying_your_code/index.md
index bd67d39e27..fae905fbb7 100644
--- a/sources/academy/platform/deploying_your_code/index.md
+++ b/sources/academy/platform/deploying_your_code/index.md
@@ -9,10 +9,6 @@ slug: /deploying-your-code
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';

-**In this course learn how to take an existing project of yours and deploy it to the Apify platform as an Actor.**
-
----
-
 This section discusses how to use your newfound knowledge of the Apify platform and Actors from the [**Getting started**](../getting_started/index.md) section to deploy your existing project's code to the Apify platform as an Actor.

 Any program running in a Docker container can become an Apify Actor.
diff --git a/sources/academy/platform/deploying_your_code/input_schema.md b/sources/academy/platform/deploying_your_code/input_schema.md
index 82eeab6b30..1b8eb72944 100644
--- a/sources/academy/platform/deploying_your_code/input_schema.md
+++ b/sources/academy/platform/deploying_your_code/input_schema.md
@@ -5,10 +5,6 @@ sidebar_position: 2
 slug: /deploying-your-code/input-schema
 ---

-**Learn how to generate a user interface on the platform for your Actor's input with a single file - the INPUT_SCHEMA.json file.**
-
----
-
 Though writing an [input schema](/platform/actors/development/actor-definition/input-schema) for an Actor is not a required step, it's definitely an ideal one. The Apify platform will read the `INPUT_SCHEMA.json` file within the root of your project and generate a user interface for entering input into your Actor, which makes it significantly easier for non-developers (and even developers) to configure and understand the inputs your Actor can receive. Because of this, we'll be writing an input schema for our example Actor.
 :::note JSON requirement
diff --git a/sources/academy/platform/deploying_your_code/inputs_outputs.md b/sources/academy/platform/deploying_your_code/inputs_outputs.md
index db05c0338f..60228ced74 100644
--- a/sources/academy/platform/deploying_your_code/inputs_outputs.md
+++ b/sources/academy/platform/deploying_your_code/inputs_outputs.md
@@ -5,10 +5,6 @@ sidebar_position: 1
 slug: /deploying-your-code/inputs-outputs
 ---

-**Learn to accept input into your Actor, process it, and return output. This concept applies to Actors in any language.**
-
----
-
 Most of the time when you're creating a project, you are expecting some sort of input from which your software will run off. Oftentimes as well, you want to provide some sort of output once your software has completed running. Apify provides a convenient way to handle inputs and deliver outputs. Understanding inputs and outputs is essential because they are read/written differently depending on where the Actor is running:
diff --git a/sources/academy/platform/deploying_your_code/output_schema.md b/sources/academy/platform/deploying_your_code/output_schema.md
index ed3c6156ee..2418236e45 100644
--- a/sources/academy/platform/deploying_your_code/output_schema.md
+++ b/sources/academy/platform/deploying_your_code/output_schema.md
@@ -5,10 +5,6 @@ sidebar_position: 3
 slug: /deploying-your-code/dataset-schema
 ---

-**Learn how to generate an appealing Overview table interface to preview your Actor results in real time on the Apify platform.**
-
----
-
 The dataset schema generates an interface that enables users to instantly preview their Actor results in real time.
 ![Dataset Schema](../../../platform/actors/development/actor_definition/images/output-schema-example.png)
diff --git a/sources/academy/platform/expert_scraping_with_apify/actors_webhooks.md b/sources/academy/platform/expert_scraping_with_apify/actors_webhooks.md
index c5c7d1d4a8..51f3cd6cdd 100644
--- a/sources/academy/platform/expert_scraping_with_apify/actors_webhooks.md
+++ b/sources/academy/platform/expert_scraping_with_apify/actors_webhooks.md
@@ -6,8 +6,6 @@ sidebar_label: I - Webhooks & advanced Actor overview
 slug: /expert-scraping-with-apify/actors-webhooks
 ---

-**Learn more advanced details about Actors, how they work, and the default configurations they can take. Also, learn how to integrate your Actor with webhooks.**
-
 :::caution Updates coming

 This lesson is subject to change because it currently relies on code from our archived **Web scraping basics for JavaScript devs** course. For now you can still access the archived course, but we plan to completely retire it in a few months. This lesson will be updated to remove the dependency.
diff --git a/sources/academy/platform/expert_scraping_with_apify/apify_api_and_client.md b/sources/academy/platform/expert_scraping_with_apify/apify_api_and_client.md
index e4ade9c4fb..f960c494c6 100644
--- a/sources/academy/platform/expert_scraping_with_apify/apify_api_and_client.md
+++ b/sources/academy/platform/expert_scraping_with_apify/apify_api_and_client.md
@@ -6,10 +6,6 @@ sidebar_label: IV - Apify API & client
 slug: /expert-scraping-with-apify/apify-api-and-client
 ---

-**Gain an in-depth understanding of the two main ways of programmatically interacting with the Apify platform - through the API, and through a client.**
-
----
-
 You can use one of the two main ways to programmatically interact with the Apify platform: by directly using [Apify's RESTful API](/api/v2), or by using the [JavaScript](/api/client/js) and [Python](/api/client/python) API clients.
In the next two lessons, we'll be focusing on the first two.

 > Apify's API and JavaScript API client allow us to do anything a regular user can do when interacting with the platform's web interface, only programmatically.
diff --git a/sources/academy/platform/expert_scraping_with_apify/bypassing_anti_scraping.md b/sources/academy/platform/expert_scraping_with_apify/bypassing_anti_scraping.md
index e5f034e1aa..765879be73 100644
--- a/sources/academy/platform/expert_scraping_with_apify/bypassing_anti_scraping.md
+++ b/sources/academy/platform/expert_scraping_with_apify/bypassing_anti_scraping.md
@@ -6,10 +6,6 @@ sidebar_label: VI - Bypassing anti-scraping methods
 slug: /expert-scraping-with-apify/bypassing-anti-scraping
 ---

-**Learn about bypassing anti-scraping methods using proxies and proxy/session rotation together with Crawlee and the Apify SDK.**
-
----
-
 Effectively bypassing anti-scraping software is one of the most crucial, but also one of the most difficult skills to master. The different types of [anti-scraping protections](../../webscraping/anti_scraping/index.md) can vary a lot on the web. Some websites aren't even protected at all, some require only moderate IP rotation, and some cannot even be scraped without using advanced techniques and workarounds. Additionally, because the web is evolving, anti-scraping techniques are also evolving and becoming more advanced. It is generally quite difficult to recognize the anti-scraping protections a page may have when first inspecting it, so it is important to thoroughly investigate a site prior to writing any lines of code, as anti-scraping measures can significantly change your approach as well as complicate the development process of an Actor. As your skills expand, you will be able to spot anti-scraping measures quicker, and better evaluate the complexity of a new project.
diff --git a/sources/academy/platform/expert_scraping_with_apify/index.md b/sources/academy/platform/expert_scraping_with_apify/index.md
index 040dbcf6c8..56e82675cb 100644
--- a/sources/academy/platform/expert_scraping_with_apify/index.md
+++ b/sources/academy/platform/expert_scraping_with_apify/index.md
@@ -6,10 +6,6 @@ category: apify platform
 slug: /expert-scraping-with-apify
 ---

-**After learning the basics of Actors and Apify, learn to develop pro-level scrapers on the Apify platform with this advanced course.**
-
----
-
 This course will teach you the nitty gritty of what it takes to build pro-level scrapers with Apify. We recommend that you've at least looked through all of the other courses in the academy prior to taking this one.

 ## Preparations
diff --git a/sources/academy/platform/expert_scraping_with_apify/managing_source_code.md b/sources/academy/platform/expert_scraping_with_apify/managing_source_code.md
index e9cf36d342..d318ed203b 100644
--- a/sources/academy/platform/expert_scraping_with_apify/managing_source_code.md
+++ b/sources/academy/platform/expert_scraping_with_apify/managing_source_code.md
@@ -6,10 +6,6 @@ sidebar_label: II - Managing source code
 slug: /expert-scraping-with-apify/managing-source-code
 ---

-**Learn how to manage your Actor's source code more efficiently by integrating it with a GitHub repository. This is standard on the Apify platform.**
-
----
-
 In this brief lesson, we'll discuss how to better manage an Actor's source code. Up 'til now, you've been developing your scripts locally, and then pushing the code directly to the Actor on the Apify platform; however, there is a much more optimal (and standard) way.
 ## Learning 🧠 {#learning}
diff --git a/sources/academy/platform/expert_scraping_with_apify/migrations_maintaining_state.md b/sources/academy/platform/expert_scraping_with_apify/migrations_maintaining_state.md
index 707e64fd7d..175e04d638 100644
--- a/sources/academy/platform/expert_scraping_with_apify/migrations_maintaining_state.md
+++ b/sources/academy/platform/expert_scraping_with_apify/migrations_maintaining_state.md
@@ -6,10 +6,6 @@ sidebar_label: V - Migrations & maintaining state
 slug: /expert-scraping-with-apify/migrations-maintaining-state
 ---

-**Learn about what Actor migrations are and how to handle them properly so that the state is not lost and runs can safely be resurrected.**
-
----
-
 We already know that Actors are Docker containers that can be run on any server. This means that they can be allocated anywhere there is space available, making them very efficient. Unfortunately, there is one big caveat: Actors move - a lot. When an Actor moves, it is called a **migration**. On migration, the process inside of an Actor is completely restarted and everything in its memory is lost, meaning that any values stored within variables or classes are lost.
diff --git a/sources/academy/platform/expert_scraping_with_apify/saving_useful_stats.md b/sources/academy/platform/expert_scraping_with_apify/saving_useful_stats.md
index 6bc13433f1..4fcac3560d 100644
--- a/sources/academy/platform/expert_scraping_with_apify/saving_useful_stats.md
+++ b/sources/academy/platform/expert_scraping_with_apify/saving_useful_stats.md
@@ -6,10 +6,6 @@ sidebar_label: VII - Saving useful run statistics
 slug: /expert-scraping-with-apify/saving-useful-stats
 ---

-**Understand how to save statistics about an Actor's run, what types of statistics you can save, and why you might want to save them for a large-scale scraper.**
-
----
-
 Using Crawlee and the Apify SDK, we are now able to collect and format data coming directly from websites and save it into a Key-Value store or Dataset.
This is great, but sometimes, we want to store some extra data about the run itself, or about each request. We might want to store some extra general run information separately from our results or potentially include statistics about each request within its corresponding dataset item. The types of values that are saved are totally up to you, but the most common are error scores, number of total saved items, number of request retries, number of CAPTCHAs hit, etc. Storing these values is not always necessary, but can be valuable when debugging and maintaining an Actor. As your projects scale, this will become more and more useful and important.
diff --git a/sources/academy/platform/expert_scraping_with_apify/solutions/handling_migrations.md b/sources/academy/platform/expert_scraping_with_apify/solutions/handling_migrations.md
index 24399f8df6..0f37b0f62c 100644
--- a/sources/academy/platform/expert_scraping_with_apify/solutions/handling_migrations.md
+++ b/sources/academy/platform/expert_scraping_with_apify/solutions/handling_migrations.md
@@ -6,10 +6,6 @@ sidebar_label: V - Handling migrations
 slug: /expert-scraping-with-apify/solutions/handling-migrations
 ---

-**Get real-world experience of maintaining a stateful object stored in memory, which will be persisted through migrations and even graceful aborts.**
-
----
-
 Let's first head into our **demo-actor** and create a new file named **asinTracker.js** in the **src** folder. Within this file, we are going to build a utility class which will allow us to store, modify, persist, and log our tracked ASIN data.
 Here's the skeleton of our class:
diff --git a/sources/academy/platform/expert_scraping_with_apify/solutions/index.md b/sources/academy/platform/expert_scraping_with_apify/solutions/index.md
index 171cfd8dc3..a4fca45ca7 100644
--- a/sources/academy/platform/expert_scraping_with_apify/solutions/index.md
+++ b/sources/academy/platform/expert_scraping_with_apify/solutions/index.md
@@ -5,10 +5,6 @@ sidebar_position: 6.7
 slug: /expert-scraping-with-apify/solutions
 ---

-**View all of the solutions for all of the activities and tasks of this course. Please try to complete each task on your own before reading the solution!**
-
----
-
 The final section of each lesson in this course will be a task which you as the course-taker are expected to complete before moving on to the next lesson. Each task's completion and understanding plays an important role in the ability to continue through the course. If you ever get stuck, or if you feel like your solution could be more optimal, you can always refer to the **Solutions** section of the course. Each solution will have all of the code and explanations needed to understand it.
diff --git a/sources/academy/platform/expert_scraping_with_apify/solutions/integrating_webhooks.md b/sources/academy/platform/expert_scraping_with_apify/solutions/integrating_webhooks.md
index 926dc70131..f85ca7539f 100644
--- a/sources/academy/platform/expert_scraping_with_apify/solutions/integrating_webhooks.md
+++ b/sources/academy/platform/expert_scraping_with_apify/solutions/integrating_webhooks.md
@@ -6,8 +6,6 @@ sidebar_label: I - Integrating webhooks
 slug: /expert-scraping-with-apify/solutions/integrating-webhooks
 ---

-**Learn how to integrate webhooks into your Actors. Webhooks are a super powerful tool, and can be used to do almost anything!**
-
 :::caution Updates coming

 This lesson is subject to change because it currently relies on code from our archived **Web scraping basics for JavaScript devs** course.
For now you can still access the archived course, but we plan to completely retire it in a few months. This lesson will be updated to remove the dependency.
diff --git a/sources/academy/platform/expert_scraping_with_apify/solutions/managing_source.md b/sources/academy/platform/expert_scraping_with_apify/solutions/managing_source.md
index 9fed4b5d2d..83a827a641 100644
--- a/sources/academy/platform/expert_scraping_with_apify/solutions/managing_source.md
+++ b/sources/academy/platform/expert_scraping_with_apify/solutions/managing_source.md
@@ -6,10 +6,6 @@ sidebar_label: II - Managing source
 slug: /expert-scraping-with-apify/solutions/managing-source
 ---

-**View in-depth answers for all three of the quiz questions that were provided in the corresponding lesson about managing source code.**
-
----
-
 In the lesson corresponding to this solution, we discussed an extremely important topic: source code management. Though we solved the task right in the lesson, we've still included the quiz answers here.
 ## Quiz answers {#quiz-answers}
diff --git a/sources/academy/platform/expert_scraping_with_apify/solutions/rotating_proxies.md b/sources/academy/platform/expert_scraping_with_apify/solutions/rotating_proxies.md
index 88755208eb..e2f9eafe11 100644
--- a/sources/academy/platform/expert_scraping_with_apify/solutions/rotating_proxies.md
+++ b/sources/academy/platform/expert_scraping_with_apify/solutions/rotating_proxies.md
@@ -6,10 +6,6 @@ sidebar_label: VI - Rotating proxies/sessions
 slug: /expert-scraping-with-apify/solutions/rotating-proxies
 ---

-**Learn firsthand how to rotate proxies and sessions in order to avoid the majority of the most common anti-scraping protections.**
-
----
-
 If you take a look at our current code for the Amazon scraping Actor, you might notice this snippet:

 ```js
diff --git a/sources/academy/platform/expert_scraping_with_apify/solutions/saving_stats.md b/sources/academy/platform/expert_scraping_with_apify/solutions/saving_stats.md
index 3915dee01c..cee3081cea 100644
--- a/sources/academy/platform/expert_scraping_with_apify/solutions/saving_stats.md
+++ b/sources/academy/platform/expert_scraping_with_apify/solutions/saving_stats.md
@@ -6,10 +6,6 @@ sidebar_label: VII - Saving run stats
 slug: /expert-scraping-with-apify/solutions/saving-stats
 ---

-**Implement the saving of general statistics about an Actor's run, as well as adding request-specific statistics to dataset items.**
-
----
-
 The code in this solution will be similar to what we already did in the **Handling migrations** solution; however, we'll be storing and logging different data.
First, let's create a new file called **Stats.js** and write a utility class for storing our run stats:

 ```js
diff --git a/sources/academy/platform/expert_scraping_with_apify/solutions/using_api_and_client.md b/sources/academy/platform/expert_scraping_with_apify/solutions/using_api_and_client.md
index e2cf1cf0d3..cfcff3e0e5 100644
--- a/sources/academy/platform/expert_scraping_with_apify/solutions/using_api_and_client.md
+++ b/sources/academy/platform/expert_scraping_with_apify/solutions/using_api_and_client.md
@@ -6,10 +6,6 @@ sidebar_label: IV - Using the Apify API & JavaScript client
 slug: /expert-scraping-with-apify/solutions/using-api-and-client
 ---

-**Learn how to interact with the Apify API directly through the well-documented RESTful routes, or by using the proprietary Apify JavaScript client.**
-
----
-
 Since we need to create another Actor, we'll once again use the `apify create` command and start from an empty template. This time, let's call our project **actor-caller**:

 ```text
diff --git a/sources/academy/platform/expert_scraping_with_apify/tasks_and_storage.md b/sources/academy/platform/expert_scraping_with_apify/tasks_and_storage.md
index ea5640c14d..6fbaeb745e 100644
--- a/sources/academy/platform/expert_scraping_with_apify/tasks_and_storage.md
+++ b/sources/academy/platform/expert_scraping_with_apify/tasks_and_storage.md
@@ -6,10 +6,6 @@ sidebar_label: III - Tasks & storage
 slug: /expert-scraping-with-apify/tasks-and-storage
 ---

-**Understand how to save the configurations for Actors with Actor tasks. Also, learn about storage and the different types Apify offers.**
-
----
-
 Both of these are very different things; however, they are also tied together in many ways. **Tasks** run Actors, Actors return data, and data is stored in different types of **Storages**.
 ## Tasks {#tasks}
diff --git a/sources/academy/platform/get_most_of_actors/monetizing_your_actor.md b/sources/academy/platform/get_most_of_actors/monetizing_your_actor.md
index 0f3726b1f7..135e70492b 100644
--- a/sources/academy/platform/get_most_of_actors/monetizing_your_actor.md
+++ b/sources/academy/platform/get_most_of_actors/monetizing_your_actor.md
@@ -6,10 +6,6 @@ slug: /get-most-of-actors/monetizing-your-actor
 unlisted: true
 ---

-**Learn how you can monetize your web scraping and automation projects by publishing Actors to users in Apify Store.**
-
----
-
 When you publish your Actor on the Apify platform, you have the option to make it a _Paid Actor_ and earn revenue from users who benefit from your tool. You can choose between two pricing models:

 - Rental
diff --git a/sources/academy/platform/getting_started/actors.md b/sources/academy/platform/getting_started/actors.md
index 2f8ebfc87c..b7e2b5d030 100644
--- a/sources/academy/platform/getting_started/actors.md
+++ b/sources/academy/platform/getting_started/actors.md
@@ -5,10 +5,6 @@ sidebar_position: 1
 slug: /getting-started/actors
 ---

-**What is an Actor? How do we create them? Learn the basics of what Actors are, how they work, and try out an Actor yourself right on the Apify platform!**
-
----
-
 After you've followed the **Getting started** lesson, you're almost ready to start creating some Actors! But before we get into that, let's discuss what an Actor is, and a bit about how they work.

 ## What's an Actor?
{#what-is-an-actor}
diff --git a/sources/academy/platform/getting_started/apify_api.md b/sources/academy/platform/getting_started/apify_api.md
index 5414aa5b0d..d2c5ff3d73 100644
--- a/sources/academy/platform/getting_started/apify_api.md
+++ b/sources/academy/platform/getting_started/apify_api.md
@@ -5,10 +5,6 @@ sidebar_position: 4
 slug: /getting-started/apify-api
 ---

-**Learn how to use the Apify API to programmatically call your Actors, retrieve data stored on the platform, view Actor logs, and more!**
-
----
-
 [Apify's API](/api/v2) is your ticket to the Apify platform without even needing to access the [Apify Console](https://console.apify.com?asrc=developers_portal) web-interface. The API is organized around RESTful HTTP endpoints. In this lesson, we'll be learning how to use the Apify API to call an Actor and view its results. We'll be using the Actor we created in the previous lesson, so if you haven't already gotten that one set up, go ahead do that before moving forward if you'd like to follow along.
diff --git a/sources/academy/platform/getting_started/apify_client.md b/sources/academy/platform/getting_started/apify_client.md
index 4cc4b4e6d2..2b9732dfd9 100644
--- a/sources/academy/platform/getting_started/apify_client.md
+++ b/sources/academy/platform/getting_started/apify_client.md
@@ -8,10 +8,6 @@ slug: /getting-started/apify-client
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';

-**Interact with the Apify API in your code by using the apify-client package, which is available for both JavaScript and Python.**
-
----
-
 Now that you've gotten your toes wet with interacting with the Apify API through raw HTTP requests, you're ready to become familiar with the **Apify client**, which is a package available for both JavaScript and Python that allows you to interact with the API in your code without explicitly needing to make any GET or POST requests.
 This lesson will provide code examples for both Node.js and Python, so regardless of the language you are using, you can follow along!
diff --git a/sources/academy/platform/getting_started/index.md b/sources/academy/platform/getting_started/index.md
index ab8768919d..8d40c463e5 100644
--- a/sources/academy/platform/getting_started/index.md
+++ b/sources/academy/platform/getting_started/index.md
@@ -6,10 +6,6 @@ category: apify platform
 slug: /getting-started
 ---

-**Get started with the Apify platform by creating an account and learning about the Apify Console, which is where all Apify Actors are born!**
-
----
-
 Your gateway to the Apify platform is your Apify account. The great thing about creating an account is that we support integration with both Google and GitHub, which takes only about 30 seconds!

 1. Create your account on the [sign up](https://console.apify.com/sign-up?asrc=developers_portal) page.
diff --git a/sources/academy/platform/getting_started/inputs_outputs.md b/sources/academy/platform/getting_started/inputs_outputs.md
index 1b942658df..1ad0d3de3e 100644
--- a/sources/academy/platform/getting_started/inputs_outputs.md
+++ b/sources/academy/platform/getting_started/inputs_outputs.md
@@ -5,10 +5,6 @@ sidebar_position: 3
 slug: /getting-started/inputs-outputs
 ---

-**Create an Actor from scratch which takes an input, processes that input, and then outputs a result that can be used elsewhere.**
-
----
-
 Actors, as any other programs, take inputs and generate outputs. The Apify platform has a way how to specify what inputs the Actor expects, and a way to temporarily or permanently store its results. In this lesson, we'll be demonstrating inputs and outputs by building an Actor which takes two numbers as input, adds them up, and then outputs the result.
diff --git a/sources/academy/tutorials/api/index.md b/sources/academy/tutorials/api/index.md
index 4cd998bdd9..af75155b54 100644
--- a/sources/academy/tutorials/api/index.md
+++ b/sources/academy/tutorials/api/index.md
@@ -6,10 +6,6 @@ category: tutorials
 slug: /api
 ---

-**A collection of various tutorials explaining how to interact with the Apify platform programmatically using its API.**
-
----
-
 This section explains how you can run [Apify Actors](/platform/actors) using Apify's [API](/api/v2), retrieve their results, and integrate them into your own product and workflows. You can do this using a raw HTTP client, or you can benefit from using one of our API clients for:

 - [JavaScript](/api/client/js/)
diff --git a/sources/academy/tutorials/api/run_actor_and_retrieve_data_via_api.md b/sources/academy/tutorials/api/run_actor_and_retrieve_data_via_api.md
index ad154cd126..cca1bbfd63 100644
--- a/sources/academy/tutorials/api/run_actor_and_retrieve_data_via_api.md
+++ b/sources/academy/tutorials/api/run_actor_and_retrieve_data_via_api.md
@@ -4,10 +4,6 @@ description: Learn how to run an Actor/task via the Apify API, wait for the job
 slug: /api/run-actor-and-retrieve-data-via-api
 ---

-**Learn how to run an Actor/task via the Apify API, wait for the job to finish, and retrieve its output data. Your key to integrating Actors with your projects.**
-
----
-
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';

diff --git a/sources/academy/tutorials/apify_scrapers/index.md b/sources/academy/tutorials/apify_scrapers/index.md
index f22f4999ef..5d76a2baa5 100644
--- a/sources/academy/tutorials/apify_scrapers/index.md
+++ b/sources/academy/tutorials/apify_scrapers/index.md
@@ -5,10 +5,6 @@ sidebar_position: 5
 slug: /apify-scrapers
 ---

-**Discover Apify's ready-made web scraping and automation tools.
Compare Web Scraper, Cheerio Scraper and Puppeteer Scraper to decide which is right for you.** - ---- - Scraping and crawling the web can be difficult and time-consuming without the right tools. That's why Apify provides ready-made solutions to crawl and scrape any website. They are based on our [Actors](https://apify.com/actors), the [Apify SDK](/sdk/js) and [Crawlee](https://crawlee.dev/). Don't let the number of options confuse you. Unless you're really sure you need to use a specific tool, go ahead and use **Web Scraper** ([apify/web-scraper](./web_scraper.md)). It is the easiest to pick up and can handle almost anything. Look at **Puppeteer Scraper** ([apify/puppeteer-scraper](./puppeteer_scraper.md)) or **Cheerio Scraper** ([apify/cheerio-scraper](./cheerio_scraper.md)) only after you know your target websites well and need to optimize your scraper. diff --git a/sources/academy/tutorials/node_js/analyzing_pages_and_fixing_errors.md b/sources/academy/tutorials/node_js/analyzing_pages_and_fixing_errors.md index d2314a7bd7..1f82089c64 100644 --- a/sources/academy/tutorials/node_js/analyzing_pages_and_fixing_errors.md +++ b/sources/academy/tutorials/node_js/analyzing_pages_and_fixing_errors.md @@ -5,10 +5,6 @@ sidebar_position: 14.1 slug: /node-js/analyzing-pages-and-fixing-errors --- -**Learn how to deal with random crashes in your web-scraping and automation jobs. Find out the essentials of debugging and fixing problems in your crawlers.** - ---- - Debugging is absolutely essential in programming. Even if you don't call yourself a programmer, having basic debugging skills will make building crawlers easier. It will also help you save money by allowing you to avoid hiring an expensive developer to solve your issue for you. This quick lesson covers the absolute basics by discussing some of the most common problems and the simplest tools for analyzing and fixing them. 
diff --git a/sources/academy/tutorials/node_js/caching_responses_in_puppeteer.md b/sources/academy/tutorials/node_js/caching_responses_in_puppeteer.md
index 50a86ff4d8..acdc5d7092 100644
--- a/sources/academy/tutorials/node_js/caching_responses_in_puppeteer.md
+++ b/sources/academy/tutorials/node_js/caching_responses_in_puppeteer.md
@@ -7,10 +7,6 @@ slug: /node-js/caching-responses-in-puppeteer
 
 import Example from '!!raw-loader!roa-loader!./caching_responses_in_puppeteer.js';
 
-**Learn why it is important for performance to cache responses in memory when intercepting requests in Puppeteer and how to implement it in your code.**
-
----
-
 > In the latest version of Puppeteer, the request-interception function inconveniently disables the native cache and significantly slows down the crawler. Therefore, it's not recommended to follow the examples shown in this article unless you have a very specific use-case where the default browser cache is not enough (e.g. cashing over multiple scraper runs)
 
 When running crawlers that go through a single website, each open page has to load all resources again. The problem is that each resource needs to be downloaded through the network, which can be slow and/or unstable (especially when proxies are used).
diff --git a/sources/academy/tutorials/node_js/choosing_the_right_scraper.md b/sources/academy/tutorials/node_js/choosing_the_right_scraper.md
index 296bf338e5..6bd4015277 100644
--- a/sources/academy/tutorials/node_js/choosing_the_right_scraper.md
+++ b/sources/academy/tutorials/node_js/choosing_the_right_scraper.md
@@ -5,10 +5,6 @@ sidebar_position: 14.3
 slug: /node-js/choosing-the-right-scraper
 ---
 
-**Learn basic web scraping concepts to help you analyze a website and choose the best scraper for your particular use case.**
-
----
-
 You can use one of the two main ways to proceed with building your crawler:
 
 1. Using plain HTTP requests.
diff --git a/sources/academy/tutorials/node_js/dealing_with_dynamic_pages.md b/sources/academy/tutorials/node_js/dealing_with_dynamic_pages.md
index d1c1b60ecc..6bceb9fad5 100644
--- a/sources/academy/tutorials/node_js/dealing_with_dynamic_pages.md
+++ b/sources/academy/tutorials/node_js/dealing_with_dynamic_pages.md
@@ -7,10 +7,6 @@ slug: /node-js/dealing-with-dynamic-pages
 
 import Example from '!!raw-loader!roa-loader!./dealing_with_dynamic_pages.js';
 
-**Learn about dynamic pages and dynamic content. How can we find out if a page is dynamic? How do we programmatically scrape dynamic content?**
-
----
-
 ## A quick experiment {#quick-experiment}
 
 From our adored and beloved [Fakestore](https://demo-webstore.apify.org/), we have been tasked to scrape each product's title, price, and image from the [new arrivals](https://demo-webstore.apify.org/search/new-arrivals) page.
diff --git a/sources/academy/tutorials/node_js/how_to_fix_target_closed.md b/sources/academy/tutorials/node_js/how_to_fix_target_closed.md
index 1046980ab5..1f8796c670 100644
--- a/sources/academy/tutorials/node_js/how_to_fix_target_closed.md
+++ b/sources/academy/tutorials/node_js/how_to_fix_target_closed.md
@@ -5,10 +5,6 @@ sidebar_position: 14.2
 slug: /node-js/how_to_fix_target-closed
 ---
 
-**Learn about common causes for the 'Target closed' error in browser automation and what you can do to fix it.**
-
----
-
 The `Target closed` error happens when you try to access the `page` object (or some of its parent objects like the `browser`), but the underlying browser tab has already been closed. The exact error message can appear in several variants, such as `Target page, context or browser has been closed`, but none of them are very helpful for debugging. To debug it, attach logs in multiple places or use the headful mode.
 
 ## Out of memory
diff --git a/sources/academy/tutorials/node_js/index.md b/sources/academy/tutorials/node_js/index.md
index 8513387ed2..83aeb0ac2d 100644
--- a/sources/academy/tutorials/node_js/index.md
+++ b/sources/academy/tutorials/node_js/index.md
@@ -6,8 +6,4 @@ category: tutorials
 slug: /node-js
 ---
 
-**A collection of various Node.js tutorials on scraping sitemaps, optimizing your scrapers, using popular Node.js web scraping libraries, and more.**
-
----
-
 This section contains various web-scraping or web-scraping related tutorials for Node.js. Whether you're trying to scrape from a website with sitemaps, struggling with a dynamic page, want to optimize your slow Puppeteer scraper, or need some general tips for scraping in Node.js, this section is right for you.
diff --git a/sources/academy/tutorials/node_js/js_in_html.md b/sources/academy/tutorials/node_js/js_in_html.md
index 26e4225524..7e90fb91d7 100644
--- a/sources/academy/tutorials/node_js/js_in_html.md
+++ b/sources/academy/tutorials/node_js/js_in_html.md
@@ -5,10 +5,6 @@ sidebar_position: 14.5
 slug: /node-js/js-in-html
 ---
 
-**Learn about "hidden" data found within the JavaScript of certain pages, which can increase the scraper reliability and improve your development experience.**
-
----
-
 Depending on the technology the target website is using, the data to be collected not only can be found within HTML elements, but also in a JSON format within `