Command-line bioinformatics with reproducible containerized workflows
This guide is for students using the command-line pathway (Docker containers with terminal-based bioinformatics tools).
GUI pathway students: You don't need Docker! Instead, go to GUI Tools Setup to download graphical applications.
Before starting Docker installation, make sure you've completed:
Using your @ucr.edu email address
You can continue while waiting for approval (1-3 days)
Haven't completed GitHub setup? Go to GitHub Setup Guide first!
Docker is a platform that packages applications and their dependencies into containers. For this course, we've created a Docker container with all the bioinformatics tools you need pre-installed (Python, R, BLAST, MAFFT, IQ-TREE, etc.).
Why Docker for bioinformatics?
Our course container image is published to two registries: Docker Hub and GitHub Container Registry (GHCR). Knowing when to use each helps you choose the best option for your system.
Container registries are storage repositories for Docker images (the templates used to create containers). Think of them like app stores for containers.
What is it?
Docker Hub is the default, official registry for Docker images. It's maintained by Docker Inc.
Pros:
How to use:
docker pull cosmelab/dna-barcoding-analysis:latest
What is it?
GitHub Container Registry (GHCR) stores container images directly on GitHub alongside the source code. Access requires GitHub authentication.
Pros:
Cons:
How to use:
# Authenticate with GitHub (one-time setup)
echo "YOUR_GITHUB_TOKEN" | docker login ghcr.io -u YOUR_USERNAME --password-stdin
# Pull the image
docker pull ghcr.io/cosmelab/dna-barcoding-analysis:latest
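The two pull commands differ only in the registry prefix. As a rough illustration, the sketch below splits an image reference into registry, repository, and tag. The function name describe_image is hypothetical, and the parsing is deliberately simplified: it assumes Docker's convention that a reference without a registry host points at Docker Hub (docker.io), and it does not handle registry ports or digests.

```shell
# Hypothetical helper: split an image reference into its parts.
describe_image() {
    ref=$1
    # The tag is the text after the last colon; default to "latest" if absent.
    case "$ref" in
        *:*) tag=${ref##*:}; name=${ref%:*} ;;
        *)   tag=latest;     name=$ref ;;
    esac
    # Simplification: the first path component counts as a registry host
    # only if it contains a dot (for example ghcr.io).
    first=${name%%/*}
    case "$first" in
        *.*) registry=$first;    repo=${name#*/} ;;
        *)   registry=docker.io; repo=$name ;;
    esac
    echo "registry=$registry repository=$repo tag=$tag"
}

describe_image cosmelab/dna-barcoding-analysis:latest
describe_image ghcr.io/cosmelab/dna-barcoding-analysis:latest
```

Both references name the same repository and tag; only the registry that serves the image changes.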
Windows Users: Use Docker Hub
Docker Hub is the simplest option on Windows: Docker Desktop pulls from it by default and runs the container through its WSL 2 integration.
Command: docker pull cosmelab/dna-barcoding-analysis:latest
Mac Users: Either Works
Both Docker Hub and GHCR work well on macOS. Docker Hub is easier (no authentication needed).
Recommended: docker pull cosmelab/dna-barcoding-analysis:latest (Docker Hub)
Linux Users: GHCR with Podman (Lightweight)
Linux users can use Podman (a Docker alternative that doesn't require a daemon) with GHCR. Podman is lightweight, accepts the same commands as Docker, and can run containers without root privileges. The install command below is for Debian and Ubuntu systems.
sudo apt-get install podman
podman pull ghcr.io/cosmelab/dna-barcoding-analysis:latest
For This Course: We recommend Docker Hub for all students unless you have specific reasons to use GHCR. Docker Hub is simpler and works consistently across all platforms.
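If you switch between machines, a small helper can report which container engine is installed. This is only a sketch: pick_engine is a hypothetical name, and it simply returns the first command from its argument list that exists on your PATH.

```shell
# Hypothetical helper: print the first installed command from the arguments.
pick_engine() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    echo "no container engine found (tried: $*)" >&2
    return 1
}

# Example use (prefers Docker, falls back to Podman):
#   ENGINE=$(pick_engine docker podman) && "$ENGINE" --version
```

Because Podman accepts the same commands as Docker, the chosen engine name can stand in for docker in the commands used throughout this guide.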
1. Check System Requirements
2. Open PowerShell as Administrator
Right-click the Start button and select "Windows PowerShell (Admin)" or "Terminal (Admin)"
3. Enable WSL 2
In the PowerShell window, run:
wsl --install
Restart your computer when prompted.
4. Install Ubuntu
After restarting, WSL 2 is enabled. On recent Windows versions, wsl --install also installs Ubuntu by default; if Ubuntu was not installed, open PowerShell again and run:
wsl --install -d Ubuntu
When Ubuntu launches for the first time, it asks you to create a Linux username and password.
Alternative: You can also install Ubuntu from the Microsoft Store by searching for "Ubuntu".
5. Download Docker Desktop
Visit Docker Desktop for Windows and download the installer.
6. Install Docker Desktop
7. Verify Installation
Open PowerShell and run:
docker --version
docker run hello-world
Success! If you see version info and "Hello from Docker!", Docker is properly installed!
After installing Docker, verify it's working correctly by running these commands in your terminal (PowerShell on Windows, Terminal on Mac/Linux):
Step 1: Check Docker version
docker --version
Expected output: Docker version XX.X.X, build ...
Step 2: Run test container
docker run hello-world
You should see a "Hello from Docker!" message explaining that Docker is working correctly.
Both commands worked? Great! Docker is properly installed and ready to use.
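The two checks above can be combined into one function for a Mac, Linux, or WSL terminal. This is a minimal sketch with a hypothetical name (check_docker); it relies on docker info failing when the Docker engine is not running, which is a common cause of errors right after installation.

```shell
# Hypothetical helper: report whether Docker is installed and its engine is running.
check_docker() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker: not installed or not on PATH"
        return 1
    fi
    # docker info needs to reach the engine, so it fails if Docker is not started.
    if ! docker info >/dev/null 2>&1; then
        echo "docker: installed, but the Docker engine is not running"
        return 1
    fi
    echo "docker: ready ($(docker --version))"
}

# Example use:
#   check_docker && docker run hello-world
```

If the second message appears, starting Docker Desktop (or the Docker service on Linux) and running the check again is usually the next thing to try.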
Having issues? See the Troubleshooting Guide for common Docker installation problems.
VS Code can connect directly to Docker containers, allowing you to edit code, run terminal commands, and use all your extensions inside the container. This gives you a seamless development experience!
Open the Extensions view (Ctrl+Shift+X / Cmd+Shift+X) and install the Dev Containers extension.
Direct link: Dev Containers Extension
These extensions will make your coding experience better! Install them from the Extensions marketplace in VS Code:
Open the Extensions view (Ctrl+Shift+X / Cmd+Shift+X) to find and install them.
Dracula Theme Setup: After installing Dracula Official, press Ctrl+K Ctrl+T (or Cmd+K Cmd+T on Mac) and select "Dracula" from the list. Now your VS Code matches the course website!
Volume mounts let you share files between your computer and the container. This is crucial for saving your work!
What is a volume mount?
A volume mount makes a folder on your computer (the "host") appear as a folder inside the Docker container. Both sides see the same files, so changes made in either location show up immediately in the other.
Example volume mount command:
docker run -v /Users/yourname/assignment:/workspace cosmelab/dna-barcoding-analysis:latest
Breaking it down:
-v - flag that tells Docker to create a volume mount
/Users/yourname/assignment - folder on your computer (host path)
:/workspace - folder inside the container (container path)
Files written to /workspace inside the container are saved to your computer!
Why this matters: Without volume mounts, all your work would be lost when the container stops! Volume mounts ensure your files persist on your computer.
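The host side of -v should be an absolute path: a bare name such as assignment is interpreted by Docker as a named volume rather than a folder on your computer. The sketch below builds the mount argument from any folder, relative or absolute. The name mount_arg is hypothetical, and /workspace is the container path used in the example above.

```shell
# Hypothetical helper: turn a host folder into a "-v" value with an absolute path.
mount_arg() {
    # cd into the folder and ask the shell for its full path.
    host_dir=$(cd "$1" && pwd) || return 1
    echo "$host_dir:/workspace"
}

# Example use (lists the mounted files from inside the container):
#   docker run --rm --entrypoint="" -v "$(mount_arg assignment)" \
#     cosmelab/dna-barcoding-analysis:latest ls /workspace
```

If the folder does not exist, the function prints the shell's error and returns a failure status instead of producing a mount argument.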
Before pulling containers, you need a Docker account (free) to access Docker Hub.
After Docker is installed and running, log in from your terminal:
docker login
You'll be prompted for your Docker Hub username and password.
Success message: Login Succeeded
You only need to log in once - Docker will remember your credentials!
If you prefer, you can also log in through Docker Desktop:
Note: Log in to Docker Hub before pulling the course container. If you see "unauthorized" errors when pulling, run docker login and try again!
After logging in to Docker Hub, pull our course container:
docker pull cosmelab/dna-barcoding-analysis:latest
What's included in this container?
First-time download: The container is about 2-3 GB and may take 10-20 minutes to download depending on your internet speed. This is a one-time download!
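Since the download is large, it can be worth checking whether the image is already on your machine before pulling. A minimal sketch, assuming the hypothetical function name ensure_image; it uses docker image inspect, which fails when the image is not present locally.

```shell
IMAGE="cosmelab/dna-barcoding-analysis:latest"

# Hypothetical helper: pull the course image only if it is not already downloaded.
ensure_image() {
    if docker image inspect "$IMAGE" >/dev/null 2>&1; then
        echo "already downloaded: $IMAGE"
    else
        docker pull "$IMAGE"
    fi
}

# Example use:
#   ensure_image
```

Note that this skips the pull even when a newer version of the image has been published; run docker pull directly when your instructor announces an update.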
Let's test the container by starting a Jupyter server!
Step 1: Run the container with Jupyter
docker run -it --rm -p 8888:8888 cosmelab/dna-barcoding-analysis:latest
What do these flags mean?
-it - Interactive terminal (lets you interact with the container)
--rm - Automatically remove container when it stops (keeps things clean)
-p 8888:8888 - Map port 8888 in container to port 8888 on your computer (for Jupyter)
Step 2: Access Jupyter in your browser
After running the command, look for output like this:
http://127.0.0.1:8888/?token=abc123def456...
Copy the entire URL and paste it into your web browser. You should see the Jupyter interface!
Success! If you see the Jupyter interface in your browser, your Docker container is working perfectly!
Note: To stop Jupyter and exit the container, press Ctrl+C twice in the terminal where the container is running.
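If the URL scrolls past before you can copy it, it can be recovered from the container's log output. The sketch below is an assumption-laden helper (extract_jupyter_url is a hypothetical name) that reads log text on standard input and prints the first URL containing a token, assuming the URL has the shape shown above.

```shell
# Hypothetical helper: print the first tokenized Jupyter URL found on stdin.
extract_jupyter_url() {
    grep -Eo 'https?://(127\.0\.0\.1|localhost):[0-9]+/[^[:space:]]*token=[[:alnum:]]+' | head -n 1
}

# Example use from a second terminal (find CONTAINER_ID with: docker ps):
#   docker logs CONTAINER_ID 2>&1 | extract_jupyter_url
```

The function prints nothing when no matching URL is present, for example if Jupyter has not finished starting yet.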
The course container comes with zsh (Z Shell) pre-installed and configured with useful features like syntax highlighting and auto-suggestions. This makes working in the terminal much more pleasant!
Zsh is an advanced shell that enhances the command-line experience with:
Interactive shell (recommended for exploration):
docker run -it --rm -v "$(pwd)":/workspace -w /workspace cosmelab/dna-barcoding-analysis:latest zsh
This gives you a zsh shell inside the container where you can run multiple commands interactively!
One-off commands (for running specific scripts):
docker run --rm --entrypoint="" -v "$(pwd)":/workspace -w /workspace \
cosmelab/dna-barcoding-analysis:latest python3 script.py
The --entrypoint="" flag lets you run specific commands instead of starting the default shell.
-it - Interactive terminal (keeps the shell open so you can type commands)
--rm - Automatically remove container when you exit (keeps things clean)
-v "$(pwd)":/workspace - Mount current directory to /workspace in container
-w /workspace - Set working directory to /workspace inside container
--entrypoint="" - Override default entry point (use for one-off commands)
Pro tip: The $(pwd) automatically expands to your current directory path. On Windows PowerShell, use ${PWD} instead.
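Typing the full one-off command every time gets tedious, so on Mac, Linux, or WSL you might wrap it in a shell function. This is a sketch, not part of the course tooling: run_in_container and the DRY_RUN variable are hypothetical names. Setting DRY_RUN prints the docker command instead of running it, which is a safe way to see how the flags fit together.

```shell
IMAGE="cosmelab/dna-barcoding-analysis:latest"

# Hypothetical helper: run any command inside the course container,
# with the current directory mounted at /workspace.
run_in_container() {
    # Prepend the docker invocation to the command that was passed in.
    # --entrypoint= is the same as --entrypoint="" once the shell removes the quotes.
    set -- docker run --rm --entrypoint= -v "$PWD":/workspace -w /workspace "$IMAGE" "$@"
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"   # show the command instead of running it
    else
        "$@"
    fi
}

# Example use:
#   DRY_RUN=1 run_in_container python3 script.py   # print the command only
#   run_in_container python3 script.py             # actually run it
```

The function uses $PWD, which holds the same path that $(pwd) prints in the commands above.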
Here's how you'd run a real analysis from the DNA barcoding repository:
# Mac/Linux:
docker run --rm --entrypoint="" -v "$(pwd)":/workspace -w /workspace \
cosmelab/dna-barcoding-analysis:latest \
python3 modules/04_phylogeny/build_tree.py \
results/tutorial/03_alignment/aligned_sequences.fasta \
results/tutorial/04_phylogeny/
# Windows PowerShell:
docker run --rm --entrypoint="" -v "${PWD}:/workspace" -w /workspace `
cosmelab/dna-barcoding-analysis:latest `
python3 modules/04_phylogeny/build_tree.py `
results/tutorial/03_alignment/aligned_sequences.fasta `
results/tutorial/04_phylogeny/
Why this works: The container has Python, BioPython, IQ-TREE, and all dependencies pre-installed. Your files are mounted at /workspace, so the script can read inputs and save outputs directly to your computer!
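Because the script reads and writes through the mounted folder, a quick check on the host before launching the container can catch a mistyped path early. The helper below is a hypothetical sketch (prepare_run is not part of the repository), and whether build_tree.py creates its own output folder is not stated here, so creating it up front is only a precaution.

```shell
# Hypothetical helper: verify the input file and create the output folder.
prepare_run() {
    input=$1
    outdir=$2
    # -s is true only for a file that exists and is not empty.
    if [ ! -s "$input" ]; then
        echo "missing or empty input: $input" >&2
        return 1
    fi
    mkdir -p "$outdir"
    echo "ready: $input -> $outdir"
}

# Example use with the paths from the command above:
#   prepare_run results/tutorial/03_alignment/aligned_sequences.fasta \
#     results/tutorial/04_phylogeny/
```

Run it from the repository folder, the same directory you mount into the container, so the relative paths match what the script sees under /workspace.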
Here's the complete workflow for using Docker containers with VS Code for your assignments:
Before starting assignments, clone the course analysis repository. This contains tutorials, example data, and Python scripts for DNA barcoding workflows:
https://github.com/cosmelab/dna-barcoding-analysis
# Clone the repository
git clone https://github.com/cosmelab/dna-barcoding-analysis.git
cd dna-barcoding-analysis
# Explore the tutorials and example data
ls modules/ # Analysis scripts organized by module
ls data/ # Example FASTA files and sequences
This repository contains everything you need to learn the bioinformatics workflows!
1. Accept GitHub Classroom assignment
Click the assignment invitation link from your instructor. GitHub Classroom creates a private repository for you.
2. Clone the repository to your computer
git clone https://github.com/entm201l-fall2025/assignment-name-yourUsername.git
cd assignment-name-yourUsername
3. Open the folder in VS Code
code .
Or use File → Open Folder in VS Code
4. Reopen in Dev Container
VS Code will detect the .devcontainer configuration and prompt you to "Reopen in Container". Click it!
Or use Command Palette (Ctrl+Shift+P / Cmd+Shift+P) → "Dev Containers: Reopen in Container"
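If the "Reopen in Container" prompt never appears, one thing to check is whether the repository actually contains a dev container configuration. A minimal sketch, assuming the two common locations for the file; find_devcontainer is a hypothetical name, and less common layouts (such as a configuration in a subfolder of .devcontainer) are not covered.

```shell
# Hypothetical helper: print the path of the dev container configuration, if any.
find_devcontainer() {
    dir=${1:-.}
    for candidate in "$dir/.devcontainer/devcontainer.json" "$dir/.devcontainer.json"; do
        if [ -f "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    echo "no dev container configuration found in $dir" >&2
    return 1
}

# Example use from inside the cloned assignment folder:
#   find_devcontainer
```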
5. Work on your assignment
See README.md for instructions
6. Save your work to GitHub
git add .
git commit -m "Completed DNA barcoding analysis"
git push
What these commands do:
git add . - Stage all changed files for commit
git commit -m "message" - Save changes with a descriptive message
git push - Upload changes to GitHub (submits your work!)
Assignment submitted! Your instructor can now see your work on GitHub. You can push updates until the deadline.
Pro tip: Commit and push frequently! This ensures you don't lose work and lets your instructor see your progress.
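To make frequent saving less error-prone, the three commands can be wrapped in a function that skips the commit when nothing has changed. This is a sketch with hypothetical names (count_changes, save_work); it relies on git status --porcelain, which prints one line per changed or untracked entry.

```shell
# Hypothetical helper: number of changed or untracked entries in the repository.
count_changes() {
    git status --porcelain | wc -l | tr -d ' '
}

# Hypothetical helper: stage, commit, and push only when there is something to save.
save_work() {
    if [ "$(count_changes)" -eq 0 ]; then
        echo "nothing to commit"
        return 0
    fi
    git add . && git commit -m "$1" && git push
}

# Example use:
#   save_work "Completed DNA barcoding analysis"
```

Each step runs only if the previous one succeeded, so a failed commit never triggers a push.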
Congratulations! You now have Docker, VS Code, and all the tools needed for command-line bioinformatics.
Head to the DNA Barcoding Analysis repository for tutorials and workflows:
If you encounter any issues with Docker, VS Code, or containers:
Repository URL: https://github.com/cosmelab/dna-barcoding-analysis
Essential Docker commands:
# Pull latest course container
docker pull cosmelab/dna-barcoding-analysis:latest
# List downloaded images
docker images
# Run container with Jupyter
docker run -it --rm -p 8888:8888 cosmelab/dna-barcoding-analysis:latest
# Start interactive zsh shell in container (Mac/Linux)
docker run -it --rm -v "$(pwd)":/workspace -w /workspace \
cosmelab/dna-barcoding-analysis:latest zsh
# Start interactive zsh shell in container (Windows PowerShell)
docker run -it --rm -v "${PWD}:/workspace" -w /workspace `
cosmelab/dna-barcoding-analysis:latest zsh
# Run a specific Python script (Mac/Linux)
docker run --rm --entrypoint="" -v "$(pwd)":/workspace -w /workspace \
cosmelab/dna-barcoding-analysis:latest \
python3 modules/04_phylogeny/build_tree.py input.fasta output/
# Run a specific Python script (Windows PowerShell)
docker run --rm --entrypoint="" -v "${PWD}:/workspace" -w /workspace `
cosmelab/dna-barcoding-analysis:latest `
python3 modules/04_phylogeny/build_tree.py input.fasta output/
# List running containers
docker ps
# Stop a running container
docker stop CONTAINER_ID