diff --git a/README.md b/README.md
index 4345b52ed..b1d5397ef 100644
--- a/README.md
+++ b/README.md
@@ -19,16 +19,16 @@
 # DataFusion in Python
-[![Python test](https://github.com/apache/arrow-datafusion-python/actions/workflows/test.yaml/badge.svg)](https://github.com/apache/arrow-datafusion-python/actions/workflows/test.yaml)
-[![Python Release Build](https://github.com/apache/arrow-datafusion-python/actions/workflows/build.yml/badge.svg)](https://github.com/apache/arrow-datafusion-python/actions/workflows/build.yml)
+[![Python test](https://github.com/apache/datafusion-python/actions/workflows/test.yaml/badge.svg)](https://github.com/apache/datafusion-python/actions/workflows/test.yaml)
+[![Python Release Build](https://github.com/apache/datafusion-python/actions/workflows/build.yml/badge.svg)](https://github.com/apache/datafusion-python/actions/workflows/build.yml)
-This is a Python library that binds to [Apache Arrow](https://arrow.apache.org/) in-memory query engine [DataFusion](https://github.com/apache/arrow-datafusion).
+This is a Python library that binds to [Apache Arrow](https://arrow.apache.org/) in-memory query engine [DataFusion](https://github.com/apache/datafusion).
 DataFusion's Python bindings can be used as a foundation for building new data systems in Python. Here are some examples:
 - [Dask SQL](https://github.com/dask-contrib/dask-sql) uses DataFusion's Python bindings for SQL parsing, query planning, and logical plan optimizations, and then transpiles the logical plan to Dask operations for execution.
-- [DataFusion Ballista](https://github.com/apache/arrow-ballista) is a distributed SQL query engine that extends
+- [DataFusion Ballista](https://github.com/apache/datafusion-ballista) is a distributed SQL query engine that extends
 DataFusion's Python bindings for distributed use cases.
 It is also possible to use these Python bindings directly for DataFrame and SQL operations, but you may find that
@@ -120,23 +120,23 @@ See [examples](examples/README.md) for more information.
 ### Executing Queries with DataFusion
-- [Query a Parquet file using SQL](./examples/sql-parquet.py)
-- [Query a Parquet file using the DataFrame API](./examples/dataframe-parquet.py)
-- [Run a SQL query and store the results in a Pandas DataFrame](./examples/sql-to-pandas.py)
-- [Run a SQL query with a Python user-defined function (UDF)](./examples/sql-using-python-udf.py)
-- [Run a SQL query with a Python user-defined aggregation function (UDAF)](./examples/sql-using-python-udaf.py)
-- [Query PyArrow Data](./examples/query-pyarrow-data.py)
-- [Create dataframe](./examples/import.py)
-- [Export dataframe](./examples/export.py)
+- [Query a Parquet file using SQL](https://github.com/apache/datafusion-python/blob/main/examples/sql-parquet.py)
+- [Query a Parquet file using the DataFrame API](https://github.com/apache/datafusion-python/blob/main/examples/dataframe-parquet.py)
+- [Run a SQL query and store the results in a Pandas DataFrame](https://github.com/apache/datafusion-python/blob/main/examples/sql-to-pandas.py)
+- [Run a SQL query with a Python user-defined function (UDF)](https://github.com/apache/datafusion-python/blob/main/examples/sql-using-python-udf.py)
+- [Run a SQL query with a Python user-defined aggregation function (UDAF)](https://github.com/apache/datafusion-python/blob/main/examples/sql-using-python-udaf.py)
+- [Query PyArrow Data](https://github.com/apache/datafusion-python/blob/main/examples/query-pyarrow-data.py)
+- [Create dataframe](https://github.com/apache/datafusion-python/blob/main/examples/import.py)
+- [Export dataframe](https://github.com/apache/datafusion-python/blob/main/examples/export.py)
 ### Running User-Defined Python Code
-- [Register a Python UDF with DataFusion](./examples/python-udf.py)
-- [Register a Python UDAF with DataFusion](./examples/python-udaf.py)
+- [Register a Python UDF with DataFusion](https://github.com/apache/datafusion-python/blob/main/examples/python-udf.py)
+- [Register a Python UDAF with DataFusion](https://github.com/apache/datafusion-python/blob/main/examples/python-udaf.py)
 ### Substrait Support
-- [Serialize query plans using Substrait](./examples/substrait.py)
+- [Serialize query plans using Substrait](https://github.com/apache/datafusion-python/blob/main/examples/substrait.py)
 ## How to install (from pip)
@@ -172,7 +172,7 @@ Bootstrap (Conda):
 ```bash
 # fetch this repo
-git clone git@github.com:apache/arrow-datafusion-python.git
+git clone git@github.com:apache/datafusion-python.git
 # create the conda environment for dev
 conda env create -f ./conda/environments/datafusion-dev.yaml -n datafusion-dev
 # activate the conda environment
@@ -183,7 +183,7 @@ Bootstrap (Pip):
 ```bash
 # fetch this repo
-git clone git@github.com:apache/arrow-datafusion-python.git
+git clone git@github.com:apache/datafusion-python.git
 # prepare development environment (used to build wheel / install in development)
 python3 -m venv venv
 # activate the venv
diff --git a/docs/README.md b/docs/README.md
index 8cb101d92..b4b94120e 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -20,7 +20,7 @@
 # DataFusion Documentation
 This folder contains the source content of the [Python API](./source/api).
-This is published to https://arrow.apache.org/datafusion-python/ by a GitHub action
+This is published to https://datafusion.apache.org/python by a GitHub action
 when changes are merged to the main branch.
 ## Dependencies
@@ -66,15 +66,15 @@ firefox build/html/index.html
 ## Release Process
-This documentation is hosted at https://arrow.apache.org/datafusion-python/
+This documentation is hosted at https://datafusion.apache.org/python
 When the PR is merged to the `main` branch of the DataFusion
-repository, a [github workflow](https://github.com/apache/arrow-datafusion-python/blob/main/.github/workflows/docs.yaml) which:
+repository, a [github workflow](https://github.com/apache/datafusion-python/blob/main/.github/workflows/docs.yaml) which:
 1. Builds the html content
-2. Pushes the html content to the [`asf-site`](https://github.com/apache/arrow-datafusion-python/tree/asf-site) branch in this repository.
+2. Pushes the html content to the [`asf-site`](https://github.com/apache/datafusion-python/tree/asf-site) branch in this repository.
 The Apache Software Foundation provides https://arrow.apache.org/, which serves content based on the configuration in
-[.asf.yaml](https://github.com/apache/arrow-datafusion-python/blob/main/.asf.yaml),
-which specifies the target as https://arrow.apache.org/datafusion-python/.
\ No newline at end of file
+[.asf.yaml](https://github.com/apache/datafusion-python/blob/main/.asf.yaml),
+which specifies the target as https://datafusion.apache.org/python.
\ No newline at end of file
diff --git a/docs/source/_static/images/2x_bgwhite_original.png b/docs/source/_static/images/2x_bgwhite_original.png
new file mode 100644
index 000000000..abb5fca6e
Binary files /dev/null and b/docs/source/_static/images/2x_bgwhite_original.png differ
diff --git a/docs/source/_static/images/DataFusion-Logo-Background-White.png b/docs/source/_static/images/DataFusion-Logo-Background-White.png
deleted file mode 100644
index 023c2373f..000000000
Binary files a/docs/source/_static/images/DataFusion-Logo-Background-White.png and /dev/null differ
diff --git a/docs/source/_static/images/DataFusion-Logo-Background-White.svg b/docs/source/_static/images/DataFusion-Logo-Background-White.svg
deleted file mode 100644
index b3bb47c5e..000000000
--- a/docs/source/_static/images/DataFusion-Logo-Background-White.svg
+++ /dev/null
@@ -1 +0,0 @@
-DataFUSION-Logo-Dark
\ No newline at end of file
diff --git a/docs/source/_static/images/DataFusion-Logo-Dark.png b/docs/source/_static/images/DataFusion-Logo-Dark.png
deleted file mode 100644
index cc60f12a0..000000000
Binary files a/docs/source/_static/images/DataFusion-Logo-Dark.png and /dev/null differ
diff --git a/docs/source/_static/images/DataFusion-Logo-Dark.svg b/docs/source/_static/images/DataFusion-Logo-Dark.svg
deleted file mode 100644
index e16f24443..000000000
--- a/docs/source/_static/images/DataFusion-Logo-Dark.svg
+++ /dev/null
@@ -1 +0,0 @@
-DataFUSION-Logo-Dark
\ No newline at end of file
diff --git a/docs/source/_static/images/DataFusion-Logo-Light.png b/docs/source/_static/images/DataFusion-Logo-Light.png
deleted file mode 100644
index 8992213b0..000000000
Binary files a/docs/source/_static/images/DataFusion-Logo-Light.png and /dev/null differ
diff --git a/docs/source/_static/images/DataFusion-Logo-Light.svg b/docs/source/_static/images/DataFusion-Logo-Light.svg
deleted file mode 100644
index b3bef2193..000000000
--- a/docs/source/_static/images/DataFusion-Logo-Light.svg
+++ /dev/null
@@ -1 +0,0 @@
-DataFUSION-Logo-Light
\ No newline at end of file
diff --git a/docs/source/_static/images/original.png b/docs/source/_static/images/original.png
new file mode 100644
index 000000000..687f94676
Binary files /dev/null and b/docs/source/_static/images/original.png differ
diff --git a/docs/source/_static/images/original.svg b/docs/source/_static/images/original.svg
new file mode 100644
index 000000000..6ba0ece99
--- /dev/null
+++ b/docs/source/_static/images/original.svg
@@ -0,0 +1,31 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/docs/source/_static/images/original2x.png b/docs/source/_static/images/original2x.png
new file mode 100644
index 000000000..a7402109b
Binary files /dev/null and b/docs/source/_static/images/original2x.png differ
diff --git a/docs/source/_templates/docs-sidebar.html b/docs/source/_templates/docs-sidebar.html
index 6541b7713..44deeed25 100644
--- a/docs/source/_templates/docs-sidebar.html
+++ b/docs/source/_templates/docs-sidebar.html
@@ -1,6 +1,6 @@
-
+