Skip to content

Conda install dependencies#648

Merged
vringar merged 41 commits into
masterfrom
conda-install-dependencies
May 18, 2020
Merged

Conda install dependencies#648
vringar merged 41 commits into
masterfrom
conda-install-dependencies

Conversation

@birdsarah

@birdsarah birdsarah commented May 13, 2020

Copy link
Copy Markdown
Contributor

Fixes: #615, #638, #628

This PR will result in the following changes:

  • Installation of OpenWPM will have a completely new procedure.
    • Documented in README
    • Implemented in docker
    • Implemented in .travisci
    • Same installation procedure for OSX and Linux (OpenWPM remains windows incompatible)
  • Geckodriver will be conda installed and so no geckodriver finding, it's just assumed to be on path
  • Re-pinning of dependencies will be done via a script, documented in README, not handled through pip-tools
  • There will only be one environment that contains both development and production needs. See "one environment" note below.
  • While cleaning up dependencies also add in domain_utils and resolve Replace automation.utilities.domain_utils with library #638
  • Should fix Running tests requires sudo #628, as it was fixed upstream and this PR will pick up those changes
  • Will standardize on one version of python. Because python installation is managed via conda, and all installation flows from conda, we do not need to support and test against multiple versions of python.
  • Will standardize on one version of node, for the same reason as above. This will reduce complexity of .travis.yaml
  • The majority of scripts / dev utils have been moved to a scripts directory, to keep things a bit tidier.

Firefox installation will not be managed by conda installation for now. There will need to be a follow-on PR and possible work done in conda packaging. The firefox available via conda install firefox -c conda-forge is release, not unbranded. (So #66 and #381 will not be resolved here.)

Note on one environment: With this PR we'll move to having only one environment instead of a requirements.txt and a requirements-dev.txt. Conda doesn't have the same tooling for managing dependencies as pip-tools. I'm happy with the repin.sh it's pretty simple. But I can't see a lightweight way to make a pinned environment-dev.yaml that has the same pinning environment.yaml. That same pinning is critical to ensure that the tests we run give the same result as someone who would just be using environment.yaml. That's not to say it isn't possible, we can write a bash script to do it, but I can't see that the extra maintenance and complexity is worth it, the only impact, as far as I can see, is that the conda environment grows a little. To measure this, I ran conda pack with and without the dev dependencies. The packed environments is 295M without dev dependencies and 322M with dev dependencies, a difference of <10%. I have, however, kept the input yaml files seperate so it's still clear what dependencies we have for dev and what for prod.

Package management desires / constraints:

  • Installation should be encapsulated in an environment and not change user's machine outside that environment.
  • Pinned packages for deployments to facilitate replicability
  • Pinning / replicability is not so important for dev environments / toolsets.
  • Run tests on OSX as well (future)

After merging this PR:

  • Update wiki to reflect new installation changes, in particular OSX workflow
  • Review need for run-on-osx-via-docker.sh script

@birdsarah birdsarah marked this pull request as draft May 13, 2020 19:04
@birdsarah birdsarah marked this pull request as ready for review May 14, 2020 07:47
@birdsarah birdsarah changed the title [WIP] Conda install dependencies Conda install dependencies May 14, 2020
@birdsarah birdsarah requested review from englehardt and vringar May 14, 2020 07:47
Comment thread Dockerfile
@vringar

vringar commented May 14, 2020

Copy link
Copy Markdown
Contributor

This looks really good to me but I still need to try it out. Thank you so much for this!

As for documenting stuff in the wiki @englehardt and I discussed that we want to move our documentation into a docs folder as this would allow us to get PRs against it instead of only maintainers being able to edit these documents.
I'm also hoping by moving this into the repository we'll be able to get our documentation onto readthedocs.io along side the actual interface documentation.

@vringar

vringar commented May 14, 2020

Copy link
Copy Markdown
Contributor

@nhnt11 do you have the time to try out run-on-osx-via-docker.sh because AFAIK you're the only one using MacOS who's involved with OpenWPM

Comment thread Dockerfile
@birdsarah

Copy link
Copy Markdown
Contributor Author

@nhnt11 do you have the time to try out run-on-osx-via-docker.sh because AFAIK you're the only one using MacOS who's involved with OpenWPM

I was going to leave this for a follow-up discussion, but with this PR I would debate the usefullness/appropriateness of “run-on-osx-via-docker.sh”.

I have run the install script on OSX on my husband’s machine and been able to tweak a few things, check it all runs. OSX and Linux should now be on equal footing when it comes to developer experience. My caveats in the docs are related to me not having fully explored, for example, xvfb and OSX. With that in mind, I don’t see the point of special casing OSX and docker. If we want general “run with docker” stuff - I think that’s well handled in the “openwpm-crawler” repo.

(That’s the gist of my thoughts, pushback welcome)

@vringar

vringar commented May 14, 2020

Copy link
Copy Markdown
Contributor

First of all sorry for being unable to read I read

Review need for run-on-osx-via-docker.sh script

as

Review need_ed_ for run-on-osx-via-docker.sh script

I was going to leave this for a follow-up discussion, but with this PR I would debate the usefullness/appropriateness of “run-on-osx-via-docker.sh”.

Having never looked into the script I expected it to do a lot more than it currently does.
We should/could just document the xhostcommand the same way we currently do for linux users.

And the actual docker exec is just the same thing as for the linux version, so yeah I agree with you, we don't need it anymore.

Add it to docker.
update readme.
@birdsarah

birdsarah commented May 15, 2020

Copy link
Copy Markdown
Contributor Author

Should be good to go @vringar. Thanks for testing it out. I have tested windows (via WSL), linux (ubuntu), the docker image, and osx.

@englehardt i think you’ll mostly be fine with this. Our biggest discussion point was pinning and I worked hard to try and find a good solution for you, there are detailed notes in the PR doc. There are compromises, but I really think it’s good enough like this. It’s certainly working and we can iterate if needed.

@englehardt

Copy link
Copy Markdown
Collaborator

@englehardt i think you’ll mostly be fine with this. Our biggest discussion point was pinning and I worked hard to try and find a good solution for you, there are detailed notes in the PR doc. There are compromises, but I really think it’s good enough like this. It’s certainly working and we can iterate if needed.

I read through the readme changes and Comment 0 of this PR. I agree you found the best solution, and I really appreciate that we're able to continue to pin all dependencies + easily update them. Simplifying things by removing the dev/non-dev distinction seems like a worthwhile tradeoff. Thanks for putting in the work to figure everything out and for thoroughly documenting it.

r+ from me for the meta-review. I'll leave it to @vringar to do a more thorough pass.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Replace automation.utilities.domain_utils with library Running tests requires sudo Manage openwpm dependencies with conda

3 participants