Vespa for Data Scientists
Motivation
This library contains application-specific code for data manipulation and analysis across different Vespa use cases. It uses the Vespa Python API to interact with Vespa applications from Python for faster exploration.
The main goal of this space is to facilitate prototyping and experimentation for data scientists. Please visit Vespa sample apps for production-ready use cases and Vespa docs for in-depth Vespa documentation.
Install
Code to support and reproduce the use cases documented here can be found in the learntorank library.
Install via PyPI:
pip install learntorank
Development
All the code and content of this repo are created with nbdev by editing notebooks. Below is a summary of the main points required to contribute, but we suggest going through the nbdev tutorials to learn more.
Setting up environment
Create and activate a virtual environment of your choice. We recommend pipenv.
pipenv shell
Install Jupyter Lab (or Jupyter Notebook if you prefer).
pip3 install jupyterlab
Create a new kernel for Jupyter that uses the virtual environment created at step 1.
- Check where the current list of kernels is located with jupyter kernelspec list.
- Copy one of the existing folders and rename it to learntorank.
- Modify the kernel.json file inside the new folder to point to the python3 executable associated with your virtual env.
Install the nbdev library:
pip3 install nbdev
Install learntorank in development mode:
pip3 install -e .[dev]
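The kernel step above (copying a kernel folder and editing kernel.json) can also be scripted. The following is a minimal sketch, not part of learntorank: the helper names `build_kernel_spec` and `write_kernel_spec` are hypothetical, and it assumes that ipykernel is installed in the virtual environment and that the kernel uses the usual ipykernel launch arguments.

```python
import json
import sys
import tempfile
from pathlib import Path


def build_kernel_spec(python_executable: str, display_name: str = "learntorank") -> dict:
    """Return kernel.json contents that launch ipykernel with the given interpreter."""
    return {
        # Assumption: ipykernel is installed for this interpreter.
        "argv": [python_executable, "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": display_name,
        "language": "python",
    }


def write_kernel_spec(kernels_dir: Path, python_executable: str, name: str = "learntorank") -> Path:
    """Create <kernels_dir>/<name>/kernel.json and return its path."""
    kernel_dir = kernels_dir / name
    kernel_dir.mkdir(parents=True, exist_ok=True)
    kernel_file = kernel_dir / "kernel.json"
    kernel_file.write_text(json.dumps(build_kernel_spec(python_executable, name), indent=1))
    return kernel_file


if __name__ == "__main__":
    # Demo only: write into a temporary directory instead of a real kernels folder.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_kernel_spec(Path(tmp), sys.executable)
        print(path.read_text())
```

To use it for real, run it with the virtual environment activated and pass the kernels directory reported by jupyter kernelspec list in place of the temporary directory.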
Most used nbdev commands
From your terminal:
- nbdev_help: List all nbdev commands available.
- nbdev_readme: Update README.md based on index.ipynb.
Preview documentation while editing the notebooks:
nbdev_preview --port 3000
Workflow before pushing code:
- nbdev_test --n_workers 2: Execute all the tests inside notebooks.
  - Tests can run in parallel, but since we create Docker containers we suggest a low number of workers to preserve memory.
- nbdev_export: Export code from notebooks to the Python library.
- nbdev_clean: Clean notebooks to avoid merge conflicts.
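The three pre-push commands can be chained so that a failing step stops the run. This is a rough sketch under stated assumptions: `run_pre_push` and `run_command` are hypothetical helpers, not part of nbdev or learntorank, and the commands run in the order listed above with only the flags shown there.

```python
import subprocess
from typing import Callable, List, Sequence

# The commands from the workflow list, in the order they appear there.
PRE_PUSH_COMMANDS: List[List[str]] = [
    ["nbdev_test", "--n_workers", "2"],
    ["nbdev_export"],
    ["nbdev_clean"],
]


def run_command(command: Sequence[str]) -> int:
    """Run one command in the current directory and return its exit code."""
    return subprocess.run(list(command)).returncode


def run_pre_push(
    commands: Sequence[Sequence[str]] = PRE_PUSH_COMMANDS,
    runner: Callable[[Sequence[str]], int] = run_command,
) -> List[str]:
    """Run each command in order and stop at the first non-zero exit code.

    Returns the names of the commands that completed successfully.
    """
    completed: List[str] = []
    for command in commands:
        if runner(command) != 0:
            raise RuntimeError(f"{' '.join(command)} failed; fix it before pushing")
        completed.append(command[0])
    return completed
```

Calling run_pre_push() from the repository root, with the virtual environment activated, runs the real commands. The runner parameter exists only so the sequence can be exercised without nbdev installed.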
Publish library
- nbdev_bump_version: Bump library version.
- nbdev_pypi: Publish library to PyPI.