Why are we using "pythonlib:/app/pylib"?

The omegaml/README.md says that we need to install our package into "pylib":

6. Install your package created above

 # install in pylib
 $ mkdir /path/to/pylib
 $ pip install --target /path/to/pylib <package_name>

and to map it as a volume in docker-compose:

5. Map your config.yml into docker-compose

     volumes:
       - pythonlib:/app/pylib
       - /host/path/to/config.yml:/app/config.yml

 volumes:
   pythonlib:


Could you please explain why we cannot just use pip install <package_name>?

Why should we use /app/pylib/ and create a separate volume for this?

Comments

  • The omega|ml runtime consists of different workers, which are run in separate containers:

    • the web API
    • the celery worker
    • the apphub worker (currently at omegaml.io only)

    Each of these workers needs to have access to the same Python packages. The easiest way to achieve this is to install packages into /path/to/pylib.

    There are several reasons why the usual `pip install` is not a good option to install packages into omega|ml workers:

    • Due to the way docker containers work, they don't share a file system with the host system. Thus it is not possible for a worker to use your local Python environment (the one that `pip install` writes packages to).
    • We could of course map the Python environment into the docker container, but that only works reliably on one machine. If you want to run docker-compose on another machine where your Python env does not exist, this approach fails.
    • The docker containers run Linux, however your local environment could be Windows or Mac, or a different Linux release. Then your Python virtual env may not be compatible with the container. That's the case in particular with packages that install some platform-native binaries, as many of the data science packages do.
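
As a rough illustration of why a shared target directory is enough, the sketch below shows that anything placed in a plain directory becomes importable once that directory is on the interpreter's import path. It assumes (the thread does not state this) that the omega|ml images put /app/pylib on the Python path. A temporary directory stands in for /app/pylib, and `mypkg_demo` is a made-up module name.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for /app/pylib; a temp dir keeps the sketch runnable anywhere.
pylib = Path(tempfile.mkdtemp(prefix="pylib-"))

# Simulate what `pip install --target` leaves behind for a pure-Python package:
# an ordinary module file inside the target directory (name is made up).
(pylib / "mypkg_demo.py").write_text("def hello():\n    return 'hello from pylib'\n")

# Assumption: every worker has the shared directory on its import path.
sys.path.insert(0, str(pylib))
importlib.invalidate_caches()

mypkg_demo = importlib.import_module("mypkg_demo")
message = mypkg_demo.hello()

# True when the module was loaded from the shared directory.
loaded_from_pylib = Path(mypkg_demo.__file__).resolve().parent == pylib.resolve()
print(message, loaded_from_pylib)
```

Because the directory is just files on disk, mounting the same volume into every worker container gives each of them the same set of additional packages without touching the image's default installation.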

    If you are developing your own package, and it is a "pure Python" package that happens to be compatible between your host environment (where you develop) and the container runtime (where omega|ml workers run), you can try to install the package in editable mode:

    $ cd /path/to/your-package
    $ pip install -e . --target /app/pylib
    

    This way, whenever you change your code, it will be automatically visible to your workers without installing again. Note that you may need to restart the workers (using docker-compose) to load the new code version.

  • "Each of these workers needs to have access to the same Python packages"

    We can just install the same packages in 2 containers. Will that work?

  • edited October 2020

    It probably works technically, however it is not the intended way and can cause unintended side effects (we do not consider this a supported use case). The better way to install packages is to use /app/pylib. This makes sure that the packages are consistent with all other dependencies, while not changing the default installation. We plan to provide support for managed virtual environments which will build on /app/pylib.

    What would be the use case to install differently?

    PS: if you are considering using our managed services, these also provide /app/pylib support, but it is not possible to change the default Python installation in a persistent manner.

  • Thank you for the response!

    Could you, please, also explain the use of: pythonlib:/app/pylib?

    As I understand it, we create a named volume pythonlib and each of our containers mounts it, e.g. as pythonlib:/app/pylib. But this volume is empty unless we install packages in our Dockerfile using:

    RUN pip install --upgrade --target /app/pylib/ ctgan

    The package will then be installed inside /app/pylib/, which all the containers can use as a single source of packages?

    And we should not make a RUN pip install --upgrade --target ... in every Dockerfile (in different containers)?

  • Patric, can we have a call?

  • "This makes sure that the packages are consistent with all other dependencies"

    Do you mean that the purpose of the shared Python folder is to make sure that the omegaml and scikit versions are identical across all of the containers?


    It will still work, but I think that it would be more pythonic to use separate containers, because having separate containers means:

    + easier to understand: installing to a target folder rather than to site-packages is not the usual flow, and installing to /app/pylib/ is a little bit tricky

    + easier installation: what if I install one version from one container and a second container wants to install a different version?

    + easier to debug: errors introduced in one container may affect other containers

    + easier to test: I can be sure that I test one and only one container

    + easier to maintain: what if I want to install some library in one of the containers and that breaks a dependency in another container?

  • edited October 2020

    Correct. The `pythonlib:/app/pylib` syntax is part of the docker-compose.yml specification, i.e. it is not specific to omega|ml. It is just docker-compose's way of mounting the named volume pythonlib at /app/pylib inside the container.

    And we should not make a RUN pip install --upgrade --target ... in every Dockerfile (in different containers)?

    No, you only run this once, e.g. like so

    $ docker-compose up
    $ docker-compose exec omegaml pip install --target /app/pylib <package_name>
    $ docker-compose restart omegaml worker jyhub
    
  • Do you mean that the purpose of the shared Python folder is to make sure that the omegaml and scikit versions are identical across all of the containers?

    All omega|ml workers already use the same image by default and thus use the same scikit-learn, pandas etc. versions. The /app/pylib directory is for installing additional packages, e.g. custom backends or additional frameworks.
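
To check whether a given package would be picked up from the shared directory or from the image's default installation, something like the following could be run inside a worker. This is only a sketch: `resolves_from` is a hypothetical helper, a temporary directory stands in for /app/pylib, and `extra_backend_demo` is a made-up module name.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def resolves_from(module_name: str, directory: Path) -> bool:
    """Return True if module_name would be imported from inside directory."""
    spec = importlib.util.find_spec(module_name)
    # Built-in, frozen, and namespace modules have no file location.
    if spec is None or not spec.has_location:
        return False
    return directory.resolve() in Path(spec.origin).resolve().parents


# Hypothetical stand-in for /app/pylib with one additional (made-up) package.
pylib = Path(tempfile.mkdtemp(prefix="pylib-"))
(pylib / "extra_backend_demo.py").write_text("NAME = 'demo backend'\n")
sys.path.insert(0, str(pylib))
importlib.invalidate_caches()

extra_in_pylib = resolves_from("extra_backend_demo", pylib)  # added package
json_in_pylib = resolves_from("json", pylib)                 # part of the default installation
print(extra_in_pylib, json_in_pylib)
```

In a real worker you would pass `Path("/app/pylib")` and the name of the package you installed. Which copy wins when a package exists in both places depends on the order of `sys.path`, which this sketch does not assume anything about.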

    more pythonic to use separate containers, because having separate containers means:

    I don't see the relationship to being Pythonic. omega|ml runs different containers for different services, as noted above. The purpose of these containers is to provide omega|ml services, in particular model storage, building, running/predictions.

    Other applications (e.g. a front-end app) should be run in a separate container, from a separate image, and not be installed inside the omega|ml containers. While that is technically possible, the purpose of having microservices is to disentangle the different components of an application. You may, however, install omega|ml in your application containers as a client library, e.g. to access the REST API or to use the omega|ml runtime's Python API.
