From cc9321367bea462ac59226357570e31698deb2f3 Mon Sep 17 00:00:00 2001 From: Andy Grove Date: Sat, 26 Nov 2022 10:44:47 -0700 Subject: [PATCH 1/7] Update release instructions --- dev/release/README.md | 109 ++++++++++++++++++++++++++++++---- dev/release/create-tarball.sh | 2 + 2 files changed, 101 insertions(+), 10 deletions(-) diff --git a/dev/release/README.md b/dev/release/README.md index 6e4fc9ab6..4726b2463 100644 --- a/dev/release/README.md +++ b/dev/release/README.md @@ -19,18 +19,107 @@ # DataFusion Python Release Process -This is a work-in-progress that will be updated as we work through the next release. - ## Preparing a Release Candidate -- Update the version number in Cargo.toml -- Generate changelog -- Tag the repo with an rc tag e.g. `0.7.0-rc1` -- Create tarball and upload to ASF -- Start the vote +### Tag the Repository + +```bash +git tag 0.7.0-rc1 +git push apache 0.7.0-rc1 +``` + +### Publish Python Wheels to testpypi + +Pushing an rc tag to master will cause a GitHub Workflow to run that will build the Python wheels. + +Go to https://github.com/apache/arrow-datafusion-python/actions and look for an action named "Python Release Build" +that has run against the pushed tag. + +Click on the action and scroll down to the bottom of the page titled "Artifacts". Download `dist.zip`. + +Upload the wheels to testpypi. + +```bash +unzip dist.zip +python3 -m pip install --upgrade setuptools twine build +python3 -m twine upload --repository testpypi datafusion-0.7.0-cp37-abi3-*.whl +``` -## Releasing Artifacts +### Create a source release ```bash -maturin publish -``` \ No newline at end of file +./dev/create_tarball 0.7.0 1 +``` + +This will also create the email template to send to the mailing list. Here is an example: + +``` +To: dev@arrow.apache.org +Subject: [VOTE][RUST][DataFusion] Release DataFusion Python Bindings 0.7.0 RC2 +Hi, + +I would like to propose a release of Apache Arrow DataFusion Python Bindings, +version 0.7.0. 
+ +This release candidate is based on commit: bd1b78b6d444b7ab172c6aec23fa58c842a592d7 [1] +The proposed release tarball and signatures are hosted at [2]. +The changelog is located at [3]. +The Python wheels are located at [4]. + +Please download, verify checksums and signatures, run the unit tests, and vote +on the release. The vote will be open for at least 72 hours. + +Only votes from PMC members are binding, but all members of the community are +encouraged to test the release and vote with "(non-binding)". + +The standard verification procedure is documented at https://github.com/apache/arrow-datafusion-python/blob/master/dev/release/README.md#verifying-release-candidates. + +[ ] +1 Release this as Apache Arrow DataFusion Python 0.7.0 +[ ] +0 +[ ] -1 Do not release this as Apache Arrow DataFusion Python 0.7.0 because... + +Here is my vote: + ++1 + +[1]: https://github.com/apache/arrow-datafusion-python/tree/bd1b78b6d444b7ab172c6aec23fa58c842a592d7 +[2]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-datafusion-python-0.7.0-rc2 +[3]: https://github.com/apache/arrow-datafusion-python/blob/bd1b78b6d444b7ab172c6aec23fa58c842a592d7/CHANGELOG.md +[4]: https://test.pypi.org/project/datafusion/0.7.0/ +``` + +Create a draft email using this content, but do not send until after completing the next step. + +### Create Python Source Distribution + +Download the source tarball created in the previous step, untar it, and run: + +```bash +python3 -m build +``` + +This will create a file named `dist/datafusion-0.7.0.tar.gz`. Upload this to testpypi: + +```bash +python3 -m twine upload --repository testpypi dist/datafusion-0.7.0.tar.gz +``` + +### Send the Email + +Send the email to start the vote. + +## Publishing a Release + +### Publishing Apache Source Release + +Once the vote passes, we can publish the release. 
+ +Create the source release tarball: + +```bash +./dev/release-tarball.sh 0.7.0 1 +``` + +### Publishing Python Artifacts + +Download the artifacts from testpypi and re-publish on PyPi using twine. diff --git a/dev/release/create-tarball.sh b/dev/release/create-tarball.sh index 64150f503..1a5bd068e 100755 --- a/dev/release/create-tarball.sh +++ b/dev/release/create-tarball.sh @@ -89,6 +89,7 @@ version ${version}. This release candidate is based on commit: ${release_hash} [1] The proposed release tarball and signatures are hosted at [2]. The changelog is located at [3]. +The Python wheels are located at [4]. Please download, verify checksums and signatures, run the unit tests, and vote on the release. The vote will be open for at least 72 hours. @@ -109,6 +110,7 @@ Here is my vote: [1]: https://github.com/apache/arrow-datafusion-python/tree/${release_hash} [2]: ${url} [3]: https://github.com/apache/arrow-datafusion-python/blob/${release_hash}/CHANGELOG.md +[4]: https://test.pypi.org/project/datafusion/${version}/ MAIL echo "---------------------------------------------------------" From 7c154c5b8962b8ffa97bcb6424eb0f5545fe13cb Mon Sep 17 00:00:00 2001 From: Andy Grove Date: Sat, 26 Nov 2022 10:48:56 -0700 Subject: [PATCH 2/7] add notes on verifying release --- dev/release/README.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/dev/release/README.md b/dev/release/README.md index 4726b2463..e7c38c398 100644 --- a/dev/release/README.md +++ b/dev/release/README.md @@ -108,6 +108,12 @@ python3 -m twine upload --repository testpypi dist/datafusion-0.7.0.tar.gz Send the email to start the vote. 
+## Verifying a Release + +```bash +pip install --extra-index-url https://test.pypi.org/simple/ datafusion==0.7.0 +``` + ## Publishing a Release ### Publishing Apache Source Release From 152cc68c8d83fe546be204d562c1defff1a0b643 Mon Sep 17 00:00:00 2001 From: Andy Grove Date: Sat, 26 Nov 2022 10:55:02 -0700 Subject: [PATCH 3/7] tag the release --- dev/release/README.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/dev/release/README.md b/dev/release/README.md index e7c38c398..a1a86c598 100644 --- a/dev/release/README.md +++ b/dev/release/README.md @@ -129,3 +129,11 @@ Create the source release tarball: ### Publishing Python Artifacts Download the artifacts from testpypi and re-publish on PyPi using twine. + +### Push the Release Tag + +```bash +git checkout 0.7.0-rc1 +git tag 0.7.0 +git push apache 0.7.0 +``` From c9ff414f527469fe683bdfe932027ba13214914a Mon Sep 17 00:00:00 2001 From: Andy Grove Date: Sun, 27 Nov 2022 10:46:45 -0700 Subject: [PATCH 4/7] Update dev/release/README.md Co-authored-by: Batuhan Taskaya --- dev/release/README.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/dev/release/README.md b/dev/release/README.md index a1a86c598..65d91b766 100644 --- a/dev/release/README.md +++ b/dev/release/README.md @@ -128,7 +128,11 @@ Create the source release tarball: ### Publishing Python Artifacts -Download the artifacts from testpypi and re-publish on PyPi using twine. +Go to the Test PyPI page of Datafusion, and download [all published artifacts](https://test.pypi.org/project/datafusion/#files) under `dist-release/` directory. 
Then proceed uploading them using `twine`: + +```py +twine upload --repository pypi dist-release/* +``` ### Push the Release Tag From 2df9291b57b6e7bc204deb45ae3f6f5ae872677b Mon Sep 17 00:00:00 2001 From: Andy Grove Date: Sun, 27 Nov 2022 10:58:01 -0700 Subject: [PATCH 5/7] Improve instructions --- dev/release/README.md | 72 ++++++++++++++++++++++++++++++++++--------- 1 file changed, 57 insertions(+), 15 deletions(-) diff --git a/dev/release/README.md b/dev/release/README.md index 65d91b766..869c0689e 100644 --- a/dev/release/README.md +++ b/dev/release/README.md @@ -19,30 +19,45 @@ # DataFusion Python Release Process -## Preparing a Release Candidate +## Update Version -### Tag the Repository +The version number in Cargo.toml should be increased, according to semver. + +### Update CHANGELOG.md + +Define release branch (e.g. `master`), base version tag (e.g. `7.0.0`) and future version tag (e.g. `8.0.0`). Commits +between the base version tag and the release branch will be used to populate the changelog content. + +You will need a GitHub Personal Access Token for the following steps. Follow +[these instructions](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token) +to generate one if you do not already have one. 
```bash -git tag 0.7.0-rc1 -git push apache 0.7.0-rc1 +# create the changelog +CHANGELOG_GITHUB_TOKEN= ./dev/release/update_change_log-datafusion-python.sh master 0.7.0 0.6.0 +# review change log / edit issues and labels if needed, rerun until you are happy with the result +git commit -a -m 'Create changelog for release' ``` -### Publish Python Wheels to testpypi +_If you see the error `"You have exceeded a secondary rate limit"` when running this script, try reducing the CPU +allocation to slow the process down and throttle the number of GitHub requests made per minute, by modifying the +value of the `--cpus` argument in the `update_change_log.sh` script._ -Pushing an rc tag to master will cause a GitHub Workflow to run that will build the Python wheels. +You can add `invalid` or `development-process` label to exclude items from +release notes. -Go to https://github.com/apache/arrow-datafusion-python/actions and look for an action named "Python Release Build" -that has run against the pushed tag. +Send a PR to get these changes merged into `master` branch. If new commits that +could change the change log content landed in the `master` branch before you +could merge the PR, you need to rerun the changelog update script to regenerate +the changelog and update the PR accordingly. -Click on the action and scroll down to the bottom of the page titled "Artifacts". Download `dist.zip`. +## Preparing a Release Candidate -Upload the wheels to testpypi. +### Tag the Repository ```bash -unzip dist.zip -python3 -m pip install --upgrade setuptools twine build -python3 -m twine upload --repository testpypi datafusion-0.7.0-cp37-abi3-*.whl +git tag 0.7.0-rc1 +git push apache 0.7.0-rc1 ``` ### Create a source release @@ -90,7 +105,34 @@ Here is my vote: Create a draft email using this content, but do not send until after completing the next step. 
-### Create Python Source Distribution +### Publish Python Wheels to testpypi + +To securely upload your project, you’ll need a PyPI API token. Create one at +https://test.pypi.org/manage/account/#api-tokens, setting the “Scope” to “Entire account”. + +You will also need access to the [datafusion](https://test.pypi.org/project/datafusion/) project on testpypi. + +This section assumes some familiary with publishing Python packages to PyPi. For more information, refer to \ +[this tutorial](https://packaging.python.org/en/latest/tutorials/packaging-projects/#uploading-the-distribution-archives). + +Pushing an `rc` tag to master will cause a GitHub Workflow to run that will build the Python wheels. + +Go to https://github.com/apache/arrow-datafusion-python/actions and look for an action named "Python Release Build" +that has run against the pushed tag. + +Click on the action and scroll down to the bottom of the page titled "Artifacts". Download `dist.zip`. + +Upload the wheels to testpypi. + +```bash +unzip dist.zip +python3 -m pip install --upgrade setuptools twine build +python3 -m twine upload --repository testpypi datafusion-0.7.0-cp37-abi3-*.whl +``` + +When prompted for username, enter `__token__`. When prompted for a password, enter a valid GitHub Personal Access Token + +### Create Python Source Distribution to testpypi Download the source tarball created in the previous step, untar it, and run: @@ -128,7 +170,7 @@ Create the source release tarball: ### Publishing Python Artifacts -Go to the Test PyPI page of Datafusion, and download [all published artifacts](https://test.pypi.org/project/datafusion/#files) under `dist-release/` directory. Then proceed uploading them using `twine`: +Go to the Test PyPI page of Datafusion, and download [all published artifacts](https://test.pypi.org/project/datafusion/#files) under `dist-release/` directory. 
Then proceed uploading them using `twine`: ```py twine upload --repository pypi dist-release/* From ecc58d984242defa5e99637506c2aeb78fe0fca0 Mon Sep 17 00:00:00 2001 From: Andy Grove Date: Sun, 27 Nov 2022 10:59:06 -0700 Subject: [PATCH 6/7] Improve instructions --- dev/release/README.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/dev/release/README.md b/dev/release/README.md index 869c0689e..b4484d692 100644 --- a/dev/release/README.md +++ b/dev/release/README.md @@ -25,7 +25,7 @@ The version number in Cargo.toml should be increased, according to semver. ### Update CHANGELOG.md -Define release branch (e.g. `master`), base version tag (e.g. `7.0.0`) and future version tag (e.g. `8.0.0`). Commits +Define release branch (e.g. `master`), base version tag (e.g. `0.6.0`) and future version tag (e.g. `0.7.0`). Commits between the base version tag and the release branch will be used to populate the changelog content. You will need a GitHub Personal Access Token for the following steps. Follow @@ -170,7 +170,9 @@ Create the source release tarball: ### Publishing Python Artifacts -Go to the Test PyPI page of Datafusion, and download [all published artifacts](https://test.pypi.org/project/datafusion/#files) under `dist-release/` directory. Then proceed uploading them using `twine`: +Go to the Test PyPI page of Datafusion, and download +[all published artifacts](https://test.pypi.org/project/datafusion/#files) under `dist-release/` directory. 
Then proceed +uploading them using `twine`: ```py twine upload --repository pypi dist-release/* From b6210b8bd0af7933c41641a062e615f907a2d47d Mon Sep 17 00:00:00 2001 From: Andy Grove Date: Sun, 27 Nov 2022 11:06:50 -0700 Subject: [PATCH 7/7] more improvements --- dev/release/README.md | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/dev/release/README.md b/dev/release/README.md index b4484d692..dd378f86e 100644 --- a/dev/release/README.md +++ b/dev/release/README.md @@ -23,7 +23,7 @@ The version number in Cargo.toml should be increased, according to semver. -### Update CHANGELOG.md +## Update CHANGELOG.md Define release branch (e.g. `master`), base version tag (e.g. `0.6.0`) and future version tag (e.g. `0.7.0`). Commits between the base version tag and the release branch will be used to populate the changelog content. @@ -105,7 +105,7 @@ Here is my vote: Create a draft email using this content, but do not send until after completing the next step. -### Publish Python Wheels to testpypi +### Publish Python Artifacts to testpypi To securely upload your project, you’ll need a PyPI API token. Create one at https://test.pypi.org/manage/account/#api-tokens, setting the “Scope” to “Entire account”. @@ -115,6 +115,8 @@ You will also need access to the [datafusion](https://test.pypi.org/project/data This section assumes some familiary with publishing Python packages to PyPi. For more information, refer to \ [this tutorial](https://packaging.python.org/en/latest/tutorials/packaging-projects/#uploading-the-distribution-archives). +#### Publish Python Wheels to testpypi + Pushing an `rc` tag to master will cause a GitHub Workflow to run that will build the Python wheels. Go to https://github.com/apache/arrow-datafusion-python/actions and look for an action named "Python Release Build" @@ -132,7 +134,7 @@ python3 -m twine upload --repository testpypi datafusion-0.7.0-cp37-abi3-*.whl When prompted for username, enter `__token__`. 
When prompted for a password, enter a valid GitHub Personal Access Token -### Create Python Source Distribution to testpypi +#### Publish Python Source Distribution to testpypi Download the source tarball created in the previous step, untar it, and run: @@ -152,10 +154,15 @@ Send the email to start the vote. ## Verifying a Release +Install the release from testpypi: + ```bash pip install --extra-index-url https://test.pypi.org/simple/ datafusion==0.7.0 ``` +Try running one of the examples from the top-level README, or write some custom Python code to query some available +data files. + ## Publishing a Release ### Publishing Apache Source Release
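
The release steps documented in this patch series reuse a handful of names derived from the version and release candidate number: the `0.7.0-rc1` tag, the `apache-arrow-datafusion-python-0.7.0-rc2` directory on dist.apache.org, `dist/datafusion-0.7.0.tar.gz`, and the testpypi project URL. A minimal sketch of deriving them in one place follows. It assumes a hypothetical helper named `release_names`, which is not part of the repository's scripts and is shown only for illustration.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not in the repository): print the names that the
# release instructions derive from a version and an RC number.
release_names() {
  local version="$1"   # e.g. 0.7.0
  local rc="$2"        # e.g. 1
  printf 'rc_tag=%s-rc%s\n' "$version" "$rc"
  printf 'release_tag=%s\n' "$version"
  printf 'dist_dir=apache-arrow-datafusion-python-%s-rc%s\n' "$version" "$rc"
  printf 'sdist=dist/datafusion-%s.tar.gz\n' "$version"
  printf 'testpypi_url=https://test.pypi.org/project/datafusion/%s/\n' "$version"
}

# Example: names for the first release candidate of 0.7.0.
release_names 0.7.0 1
```

Keeping the derivation in a single function may help avoid mismatches between the tag, the tarball directory, and the vote email when the RC number changes.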