3 changes: 1 addition & 2 deletions README.rst
@@ -1,8 +1,7 @@
:orphan:

BigQuery DataFrames (BigFrames)
===============================


|GA| |pypi| |versions|

BigQuery DataFrames (also known as BigFrames) provides a Pythonic DataFrame
11 changes: 1 addition & 10 deletions docs/conf.py
@@ -118,6 +118,7 @@
"samples/AUTHORING_GUIDE.md",
"samples/CONTRIBUTING.md",
"samples/snippets/README.rst",
"README.rst", # used for include in overview.rst only
]

# The reST default role (used for this markup: `text`) to use for all
@@ -163,16 +164,6 @@
"logo": {
"text": "BigQuery DataFrames (BigFrames)",
},
"external_links": [
{
"name": "Getting started",
"url": "https://docs.cloud.google.com/bigquery/docs/dataframes-quickstart",
},
{
"name": "User guide",
"url": "https://docs.cloud.google.com/bigquery/docs/bigquery-dataframes-introduction",
},
],
"analytics": {
"google_analytics_id": "G-XVSRMCJ37X",
},
61 changes: 60 additions & 1 deletion docs/index.rst
@@ -1,4 +1,63 @@
.. include:: README.rst
.. BigQuery DataFrames documentation main file

Welcome to BigQuery DataFrames
==============================

**BigQuery DataFrames** (``bigframes``) provides a Pythonic interface for data analysis that scales to petabytes. It gives you the best of both worlds: the familiar API of **pandas** and **scikit-learn**, powered by the distributed computing engine of **BigQuery**.

BigQuery DataFrames consists of three main components:

* **bigframes.pandas**: A pandas-compatible API for data exploration and transformation.
* **bigframes.ml**: A scikit-learn-like interface for BigQuery ML, including integration with Gemini.
* **bigframes.bigquery**: Specialized functions for managing BigQuery resources and deploying custom logic.

Why BigQuery DataFrames?
Collaborator

Could we pull in some of the content from my go/why-bigframes draft? One of the missing elements here is enabling collaboration.

Contributor Author

ok, new revision has more content from your doc

------------------------

BigFrames allows you to process data where it lives. Instead of downloading massive datasets to your local machine, BigFrames translates your Python code into SQL and executes it across the BigQuery fleet.

* **Scalability:** Work with datasets that exceed local memory limits without complex refactoring.
* **Collaboration & Extensibility:** Bridge the gap between Python and SQL. Deploy custom Python functions to BigQuery, making your logic accessible to SQL-based teammates and data analysts.
* **Production-Ready Pipelines:** Move seamlessly from interactive notebooks to production. BigFrames simplifies data engineering by integrating with tools like **dbt** and **Airflow**, offering a simpler operational model than Spark.
* **Security & Governance:** Keep your data within the BigQuery perimeter. Benefit from enterprise-grade security, auditing, and data governance throughout your entire Python workflow.
* **Familiarity:** Use ``read_gbq``, ``merge``, ``groupby``, and ``pivot_table`` just like you do in pandas.
Collaborator

Might be nice to make these into links.


Quickstart
----------

Install the library via pip:

.. code-block:: bash

    pip install --upgrade bigframes

Load and aggregate a public dataset in just a few lines:

.. code-block:: python

    import bigframes.pandas as bpd

    # Load data from BigQuery
    df = bpd.read_gbq("bigquery-public-data.usa_names.usa_1910_2013")

    # Perform familiar pandas operations at scale
    top_names = (
        df.groupby("name")
        .agg({"number": "sum"})
        .sort_values("number", ascending=False)
        .head(10)
    )

    print(top_names.to_pandas())
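
To make the "Python is translated into SQL" idea concrete, here is a minimal toy sketch. It is not how BigFrames compiles queries: ``LazyQuery`` and its methods are hypothetical names invented for illustration, and an in-memory SQLite database stands in for BigQuery. It only shows the general pattern of recording operations lazily and executing a single SQL statement where the data lives.

```python
import sqlite3
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class LazyQuery:
    """Hypothetical deferred query: methods record intent, nothing runs yet."""

    table: str
    group_col: Optional[str] = None
    sum_col: Optional[str] = None
    descending: bool = False
    limit: Optional[int] = None

    def groupby_sum(self, group_col: str, sum_col: str) -> "LazyQuery":
        return replace(self, group_col=group_col, sum_col=sum_col)

    def sort_desc(self) -> "LazyQuery":
        return replace(self, descending=True)

    def head(self, n: int) -> "LazyQuery":
        return replace(self, limit=n)

    def to_sql(self) -> str:
        # Toy compiler: identifiers are interpolated directly, so this is
        # only safe for trusted, hard-coded names as in this example.
        if self.group_col is None or self.sum_col is None:
            raise ValueError("call groupby_sum() before compiling")
        sql = (
            f'SELECT "{self.group_col}", SUM("{self.sum_col}") AS total '
            f'FROM "{self.table}" GROUP BY "{self.group_col}"'
        )
        if self.descending:
            sql += " ORDER BY total DESC"
        if self.limit is not None:
            sql += f" LIMIT {int(self.limit)}"
        return sql


# SQLite stands in for the remote engine; the rows never enter a DataFrame.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (name TEXT, number INTEGER)")
conn.executemany(
    "INSERT INTO names VALUES (?, ?)",
    [("Mary", 5), ("James", 7), ("Mary", 4), ("Linda", 3)],
)

# Building the query executes nothing; only the final call hits the engine.
query = LazyQuery("names").groupby_sum("name", "number").sort_desc().head(2)
top_names = conn.execute(query.to_sql()).fetchall()

print(query.to_sql())
print(top_names)
```

Only the final aggregated rows come back to the client, which is the same reason the quickstart calls ``to_pandas()`` last.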


User Guide
----------

.. toctree::
   :maxdepth: 2

   user_guide/index

API reference
-------------
11 changes: 11 additions & 0 deletions docs/user_guide/index.rst
@@ -0,0 +1,11 @@
User Guide
**********

.. include:: ../README.rst

.. toctree::
   :caption: Guides
   :maxdepth: 1

   Getting Started <https://docs.cloud.google.com/bigquery/docs/dataframes-quickstart>
   Cloud Docs User Guides <https://docs.cloud.google.com/bigquery/docs/bigquery-dataframes-introduction>