Use Jupyter, SciPy, pandas, and scikit-learn without installing any Pythons.
kingofsnake is a template for a reproducible data analysis lab which:
- serves Jupyter notebooks from a Docker container.
- never conflicts with existing Python(s), Anaconda, or virtualenvs.
- includes pinned versions of these packages and their dependencies:
networkx
notebook
pandas[all]
requests
scikit-learn
scipy
seaborn
See the books folder for examples of:
- data cleaning
- unsupervised clustering
- supervised classification
- principal component analysis
- force-directed graph drawing
- Generate a new repo from this template.
- Open a terminal and cd to this folder.
- Edit the Dockerfile to choose a Python version.
- Edit requirements.txt to choose Python packages.
- Run ./kitchen bake to build a kingofsnake:latest Docker image.
- Run ./kitchen freeze to update requirements.txt and rebuild.
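To make the idea of pinned versions concrete, here is a minimal Python sketch that reads `name==version` lines of the kind `pip freeze` emits and reports any requirement without an exact pin. The helper names `parse_pins` and `unpinned` are hypothetical, and the sample text uses made-up version numbers; neither comes from this repo's requirements.txt or from the kitchen script.

```python
# Sketch: inspect pinned "name==version" requirement lines.
# The sample below is invented for illustration; it is NOT this repo's
# actual requirements.txt.


def parse_pins(text):
    """Return {package: version} for every 'name==version' line."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blank lines and unpinned requirements
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def unpinned(text):
    """Return requirement lines that lack an exact '==' pin."""
    loose = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if line and "==" not in line:
            loose.append(line)
    return loose


sample = """\
# hypothetical versions, for illustration only
pandas==2.2.0
scipy==1.12.0
requests
"""

print(parse_pins(sample))  # {'pandas': '2.2.0', 'scipy': '1.12.0'}
print(unpinned(sample))    # ['requests']
```

After a freeze, every line in the file would be expected to carry an exact pin, so a check like `unpinned` should come back empty.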
- Open a terminal and cd to this folder.
- Run ./kitchen serve to start a Jupyter server.
- Open a web browser and enter localhost:8888 in the address bar.
This runs Jupyter in a container, publishes port 8888, and mounts some folders from this repo:
- etc/ipython is mounted as /home/kos/.ipython
- etc/jupyter is mounted as /home/kos/.jupyter
- books is mounted as /home/kos/books
- code is mounted as /home/kos/code
- data is mounted as /home/kos/data
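The mapping above can be pictured as a `docker run` invocation. The Python sketch below assembles one plausible argument list from the folder table; it is an illustration, not the kitchen script's actual command. It assumes the standard `docker run` flags `-p` (publish a port), `-v` (bind-mount a folder), and `--rm` (remove the container on exit). The function name `docker_run_args` and the `/repo` path are hypothetical.

```python
from pathlib import PurePosixPath

# Repo folder -> container path, as listed in this README.
MOUNTS = {
    "etc/ipython": "/home/kos/.ipython",
    "etc/jupyter": "/home/kos/.jupyter",
    "books": "/home/kos/books",
    "code": "/home/kos/code",
    "data": "/home/kos/data",
}


def docker_run_args(repo_root, tag="latest", port=8888):
    """Build a `docker run` argument list (an assumed shape, not the script's own)."""
    # --rm is an assumption about how the containers clean up after themselves.
    args = ["docker", "run", "--rm", "-p", f"{port}:{port}"]
    for folder, target in MOUNTS.items():
        source = PurePosixPath(repo_root) / folder
        args += ["-v", f"{source}:{target}"]
    args.append(f"kingofsnake:{tag}")
    return args


# "/repo" is a placeholder for wherever this repo is checked out.
print(" ".join(docker_run_args("/repo")))
```

Building the command as a list rather than a single string keeps each path a separate argument, which avoids quoting problems when a folder name contains spaces.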
Jupyter security: On the first run, Jupyter might ask you to copy and paste a token and create a password. It saves the hashed password and any custom settings to etc/ipython and etc/jupyter in this repo. If those folders do not exist, they are created automatically. Git ignores the contents of both folders.
- Open a terminal and cd to this folder.
- ./kitchen clean stops and deletes all kingofsnake containers.
- ./kitchen eightysix deletes the kingofsnake:latest image.
The clean command is rarely necessary because kingofsnake containers self-destruct.
The books folder contains example notebooks:
- classify.ipynb trains and tests an sklearn.linear_model classifier.
- clean.ipynb standardizes, sorts, and filters pandas DataFrames.
- cluster.ipynb finds clusters with scipy.cluster.hierarchy.
- components.ipynb finds principal components with sklearn.decomposition.PCA.
- graph.ipynb draws graphs using the ForceAtlas2 energy model.
- plot.ipynb uses matplotlib to visualize data.
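As a taste of the kind of analysis these notebooks cover, here is a minimal sketch of hierarchical clustering with scipy.cluster.hierarchy. The data points are invented and the choice of Ward linkage is an assumption for illustration; this is not code taken from cluster.ipynb.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Two well-separated, made-up groups of 2-D points.
points = np.array([
    [0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
    [5.0, 5.0], [5.1, 5.2], [5.2, 5.1],
])

# Build the merge tree with Ward linkage, then cut it into two flat clusters.
tree = linkage(points, method="ward")
labels = fcluster(tree, t=2, criterion="maxclust")

# One integer label per point; points in the same group share a label.
print(labels)
```

Cutting the tree with `criterion="maxclust"` asks for at most two clusters, so the first three points should share one label and the last three another.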
The code folder contains example Python modules:
- classify.py for classification
- cluster.py for clustering
- graph.py for graph drawing
- plot.py for data visualization
- tools.py for constants and convenience methods
This folder is for storing data files. Git ignores everything in it except a few examples.
kingofsnake has one dependency: Docker.
Windows users may need to edit the kitchen script for path compatibility.
Show all available kitchen commands.
./kitchen help
Run a container as root without Jupyter, folder mounts, or published ports.
./kitchen runit latest
Bake another image called kingofsnake:karl, freeze it, and serve Jupyter.
./kitchen bake karl
./kitchen freeze karl
./kitchen serve karl
Delete the kingofsnake:karl image and its containers.
./kitchen eightysix karl
Don't install anything. Use this repo as a template.
Click on the terminal running Jupyter and press CTRL-C.
Yes. See the Docker run reference.
No. Delete them if you want to.
