Coding and Engineering

Tools used to build AI systems.


Table of contents:

Python (1)

PyTorch (7)

data transforms (transform and augment data), data loader (build and load dataset), operation (tensor operations), module (modules to build models), activation function (activation functions), optimizer (optimizers), huggingface (huggingface tools)

Distributed Framework (4)

fsdp (fully sharded data parallel framework), torchrun (console script for Distributed Framework), deepspeed (deepspeed framework), ray (ray framework)

Tools (4)

vscode (vscode configs), macbook-reimage (macOS setup checklist), git (git tools), docker (docker tools)

Python

argparse

command-line arguments

Sep 29, 2025 | argparse

Parser for command-line options, arguments, and sub-commands.


import argparse

parser = argparse.ArgumentParser()

parser.add_argument("--name", type=str, default="John", help="your name")
parser.add_argument("--debug", action="store_true", help="debug mode")  # "store_true" means default is False

args = parser.parse_args()

print(args.name)
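A minimal sketch of how the two arguments above behave, assuming the argument list is passed explicitly to `parse_args` instead of being read from the command line. The helper name `build_parser` is hypothetical and only used for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the same parser as above (hypothetical helper for illustration)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--name", type=str, default="John", help="your name")
    parser.add_argument("--debug", action="store_true", help="debug mode")
    return parser


parser = build_parser()
defaults = parser.parse_args([])  # no flags given: fall back to defaults
custom = parser.parse_args(["--name", "Alice", "--debug"])  # flags given explicitly

print(defaults.name, defaults.debug)  # John False
print(custom.name, custom.debug)  # Alice True
```

Passing a list to `parse_args` is convenient for testing a parser without launching a separate process.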

PyTorch

Data Transforms

transform and augment data

Jul 01, 2024 | data transforms

category	class / function (alphabetical)
geometry	RandomHorizontalFlip
resizing	Resize
conversion	Normalize ToTensor
else	Compose


from torchvision import transforms
from torchvision.transforms import InterpolationMode  # InterpolationMode.BILINEAR, NEAREST, BICUBIC, LANCZOS

RandomHorizontalFlip: horizontally flip the image randomly with a probability.


p = 0.5  # *** float. Probability to flip
trans = transforms.RandomHorizontalFlip(p)
image_trans = trans(image)  # PIL Image => PIL Image, or Tensor => Tensor

Resize: resize the image to a size.


## When `size` is int, the shorter side of the image is resized to `size`, keeping the aspect ratio
## When `size` is a tuple (h, w), the image is resized to exactly that size, which may change the aspect ratio
size = /  # *** tuple or int
## NEAREST: fastest; lowest quality, jagged
## BILINEAR: fast; low quality, blur
## (recommend) BICUBIC: slow; good quality
## (recommend) LANCZOS: slowest; best quality
interpolation = InterpolationMode.BILINEAR  # *** InterpolationMode
## The shorter side may end up smaller than `size` if the longer side would exceed `max_size` after resizing
max_size = None  # int. Maximum allowed length of the longer side; only supported if `size` is int
trans = transforms.Resize(size, interpolation, max_size)
image_trans = trans(image)  # PIL Image => PIL Image, or Tensor => Tensor
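The sizing rules for `size` and `max_size` can be sketched in plain Python. This is an illustration of the rules as described in the comments above, not torchvision's own code; the function name `resized_hw` is hypothetical, and exact rounding may differ between torchvision versions.

```python
from typing import Optional, Tuple, Union


def resized_hw(height: int, width: int, size: Union[int, Tuple[int, int]],
               max_size: Optional[int] = None) -> Tuple[int, int]:
    """Return the (height, width) expected after resizing (illustrative sketch)."""
    if not isinstance(size, int):
        return size  # tuple: used as-is, so the aspect ratio may change
    short, long = min(height, width), max(height, width)
    # int: the shorter side becomes `size`, the longer side scales with it
    new_short, new_long = size, int(size * long / short)
    if max_size is not None and new_long > max_size:
        # cap the longer side and shrink the shorter side accordingly
        new_short, new_long = int(max_size * new_short / new_long), max_size
    return (new_short, new_long) if height <= width else (new_long, new_short)


print(resized_hw(480, 640, 256))  # (256, 341)
print(resized_hw(480, 640, 256, max_size=300))  # (225, 300)
print(resized_hw(480, 640, (224, 224)))  # (224, 224)
```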

ToTensor: convert a PIL Image or ndarray to tensor and scale the values accordingly.


## Input: PIL Image / numpy.ndarray (np.uint8) of shape (HxWxC) in the range [0, 255]
## Output: torch.FloatTensor of shape (CxHxW) in the range [0.0, 1.0]
## Other inputs: converted to a tensor without scaling
trans = transforms.ToTensor()
image_trans = trans(image)

Compose: compose several transforms.


transform_list = /  # *** list of Transform objects (do not name it `transforms`, which would shadow the module)
trans = transforms.Compose(transform_list)
image_trans = trans(image)  # PIL Image / ndarray / Tensor => Tensor

Normalize: normalize a tensor image with mean and standard deviation.


mean = /  # *** sequence. Means for each channel
std = /  # *** sequence. Standard deviations for each channel
inplace = False  # bool. If True, normalize the tensor in place
trans = transforms.Normalize(mean, std, inplace)
image_trans = trans(image)  # Tensor => Tensor
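To make the arithmetic concrete, here is a plain-Python sketch of what `ToTensor` followed by `Normalize` does to a single RGB pixel: scale to [0, 1], then apply `(value - mean) / std` per channel. The helper names and the mean/std values of 0.5 are assumptions chosen for illustration.

```python
def to_unit_range(pixel):
    """Scale uint8 channel values from [0, 255] to [0.0, 1.0], as ToTensor does."""
    return [value / 255.0 for value in pixel]


def normalize(pixel, mean, std):
    """Apply (value - mean) / std independently to each channel."""
    return [(value - m) / s for value, m, s in zip(pixel, mean, std)]


rgb = (255, 128, 0)  # one pixel with R, G, B channels
scaled = to_unit_range(rgb)
normalized = normalize(scaled, mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))

print(scaled)
print(normalized)
```

With a mean and standard deviation of 0.5, values in [0, 1] are mapped to [-1, 1].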

Data Loader

build and load dataset

Jun 30, 2024 | data loader


import torch
from torch.utils.data import DataLoader, Dataset
from torch.utils.data.distributed import DistributedSampler

# Example: Build a dataset
class MyDataset(Dataset):
    def __init__(self, data):
        self.data = data

    def __len__(self):
        return len(self.data)
    
    def __getitem__(self, index):
        return self.data[index]

# Example: Build a distributed sampler
dataset = /  # *** Dataset
num_replicas = world_size  # *** int. Number of replicas
rank = rank  # *** int. Rank of the current process
shuffle = False  # *** bool. If True, have the data shuffled at every epoch
seed = 0  # *** int. Random seed used to shuffle the sampler if `shuffle` is True
drop_last = False  # *** bool. If True, drop the tail of the data so it divides evenly across replicas; if False, pad with extra indices
sampler = DistributedSampler(dataset, num_replicas, rank, shuffle, seed, drop_last)

# Example: Build a data loader
# Note: if sampler is not None, shuffle must be False, drop_last can be either True or False
# Note: if sampler is None, one can set `torch.manual_seed(SEED)` to fix the random seed
dataset = /  # *** Dataset
batch_size = 1  # *** int. Number of samples per batch
shuffle = False  # *** bool. If True, have the data shuffled at every epoch
sampler = None  # Sampler or Iterable. Define how to draw samples
num_workers = 0  # *** int. Number of subprocesses to use for data loading
collate_fn = None  # Callable. Merge a list of samples to form a batch
pin_memory = False  # *** bool. If True, copy Tensors into CUDA pinned memory
drop_last = False  # *** bool. If True, drop the last incomplete batch
timeout = 0  # numeric. If positive, set timeout for collecting a batch from workers
prefetch_factor = None  # int. Default = None if num_workers == 0 else 2
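# Pass by keyword: positionally, `batch_sampler` sits between `sampler` and `num_workers`
data_loader = DataLoader(dataset, batch_size=batch_size, shuffle=shuffle, sampler=sampler,
    num_workers=num_workers, collate_fn=collate_fn, pin_memory=pin_memory,
    drop_last=drop_last, timeout=timeout, prefetch_factor=prefetch_factor)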
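How a distributed sampler splits indices across ranks can be sketched without PyTorch. The following is an illustration of the documented behavior for `shuffle=False` (pad with repeated indices, or drop the tail when `drop_last=True`), not the library's implementation; `shard_indices` is a hypothetical name and it assumes a non-empty dataset.

```python
import math


def shard_indices(dataset_len, num_replicas, rank, drop_last=False):
    """Return the dataset indices that one rank would iterate over (shuffle=False)."""
    indices = list(range(dataset_len))
    if drop_last and dataset_len % num_replicas != 0:
        # drop the tail so that every rank gets the same number of samples
        num_samples = math.ceil((dataset_len - num_replicas) / num_replicas)
    else:
        # pad so that every rank gets the same number of samples
        num_samples = math.ceil(dataset_len / num_replicas)
    total_size = num_samples * num_replicas
    # repeat the index list if padding is needed, then cut to total_size
    indices = (indices * math.ceil(total_size / len(indices)))[:total_size]
    # each rank takes every num_replicas-th index, starting at its own rank
    return indices[rank:total_size:num_replicas]


for rank in range(4):
    print(rank, shard_indices(10, 4, rank), shard_indices(10, 4, rank, drop_last=True))
```

With 10 samples on 4 ranks, padding gives each rank 3 indices (two indices are repeated), while `drop_last=True` gives each rank 2 indices and skips the last two samples.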

Operation

tensor operations

Jun 30, 2023 | operation

category	class / function (alphabetical)
operations	basic operations einsum isclose & allclose matmul mean & var softmax
data generation	arange uniform & normal zeros & ones
size	cat chunk & split flatten permute reshape & view size & shape squeeze & unsqueeze transpose unbind
else	where


import torch

basic operations: exp, sin, cos, sqrt.


y = torch.function(x)  # function: exp, sin, cos, sqrt

mean & var.


dim = /  # *** int or tuple of ints. Dims to reduce
keepdim = False  # *** bool. If True, keep the reduced dims with size 1
mean = x.mean(dim, keepdim)

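## In version>=2.0, `correction=1` is equivalent to `unbiased=True`, and `correction=0` to `unbiased=False`
correction = 1  # *** int. Difference between the sample size and the degrees of freedom
var = x.var(dim, correction=correction, keepdim=keepdim)  # pass `correction` and `keepdim` by keyword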
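The effect of `correction` can be checked by hand with a short plain-Python sketch: the sum of squared deviations is divided by `n - correction`. The function name `variance` is hypothetical, and it assumes `n > correction`.

```python
def variance(values, correction=1):
    """Variance with the divisor n - correction (assumes n > correction)."""
    n = len(values)
    mean = sum(values) / n
    squared_dev = sum((value - mean) ** 2 for value in values)
    return squared_dev / (n - correction)


data = [1.0, 2.0, 3.0, 4.0]
print(variance(data, correction=1))  # sample variance, divisor n - 1
print(variance(data, correction=0))  # population variance, divisor n
```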

softmax.


dim = /  # *** int. Dim along which softmax is computed, e.g., -1
y = x.softmax(dim)
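For reference, a plain-Python sketch of softmax along a single dimension. Subtracting the maximum before exponentiating is a common stability trick and is an addition here for illustration; it does not change the result.

```python
import math


def softmax(values):
    """Softmax over a list of floats, shifted by the max for numerical stability."""
    shift = max(values)
    exps = [math.exp(value - shift) for value in values]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 3.0])
print([round(p, 4) for p in probs])  # [0.09, 0.2447, 0.6652]
```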

matmul: matrix multiplication.


other = /  # *** tensor
y = x.matmul(other)

einsum: Einstein summation convention.


equation = /  # *** str. The subscript for the Einstein summation
operands = /  # *** list of tensors. The tensors to be computed
## torch.einsum("ii", tensor)  # trace
## torch.einsum("ii->i", tensor)  # diagonal
## torch.einsum("i,j->ij", tensor1, tensor2)  # outer product
## torch.einsum("bij,bjk->bik", tensor1, tensor2)  # batch matrix multiplication
## torch.einsum("...ij->...ji", tensor)  # batch transpose
y = torch.einsum(equation, *operands)

isclose & allclose: check whether two tensors are close.


other = /  # *** tensor. The second tensor to compare
rtol = 1e-5  # float. Relative tolerance
atol = 1e-8  # float. Absolute tolerance
equal_nan = False  # bool. If True, then two NaN will be considered equal
## Check if elements satisfy: |input - other| <= atol + rtol * |other|
x.isclose(other, rtol, atol, equal_nan)  # return a tensor of bool
x.allclose(other, rtol, atol, equal_nan)  # return True or False
## --------------------------------------------------------------------------------
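The closeness test can be written out for two scalars in plain Python. This sketch only covers the tolerance formula; the NaN handling controlled by `equal_nan` is left out, and the name `is_close` is hypothetical.

```python
def is_close(a, b, rtol=1e-5, atol=1e-8):
    """Scalar version of the check |a - b| <= atol + rtol * |b|."""
    return abs(a - b) <= atol + rtol * abs(b)


print(is_close(1.0, 1.0 + 1e-9))  # True: the difference is far below the tolerance
print(is_close(1.0, 1.001))  # False: the difference of 1e-3 exceeds about 1e-5
print(is_close(0.0, 1e-9))  # True: near zero, atol dominates
```

Because the relative term vanishes near zero, `atol` is what decides closeness there.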

zeros & ones: create a tensor filled with zeros or ones.


size = /  # *** sequence of int. The shape of output
y = torch.zeros(size)
y = torch.ones(size)

uniform & normal: create a tensor of random samples from a uniform or normal distribution.


size = /  # *** sequence of int. The shape of output
generator = None  # torch.Generator. A pseudorandom number generator for sampling
requires_grad = False  # bool. If use autograd
dtype = None  # torch.dtype. The desired data type
device = None  # torch.device. The desired device 
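## All arguments except `size` are keyword-only
y = torch.rand(size, generator=generator, dtype=dtype, device=device, requires_grad=requires_grad)  # uniform distribution U[0, 1)
y = torch.randn(size, generator=generator, dtype=dtype, device=device, requires_grad=requires_grad)  # standard normal distribution N(0, 1)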

arange: a sequence in order.


start = 0  # *** number. The starting value
end = /  # *** number. The ending value
step = 1  # *** number. The gap between adjacent points
arange = torch.arange(start, end, step)

size & shape: get tensor size.


dim = None  # int. Dim to retrieve the size
size = x.size(dim)  # => torch.Size or int
size = x.shape  #  => torch.Size

reshape & view: reshape a tensor with the given shape.


shape = /  # sequence of int. The new shape. A single dim could be -1
y = x.reshape(shape)  # recommended: returns a view when possible, otherwise copies the data
y = x.view(shape)  # never copies; raises an error if the memory layout is incompatible

flatten: flatten along the given dimensions.


start_dim = 0  # *** int. The first dimension to flatten
end_dim = -1  # *** int. The last dimension to flatten
y = x.flatten(start_dim, end_dim)

transpose: swap two dimensions.


dim0 = /  # *** int. The first dim to be transposed
dim1 = /  # *** int. The second dim to be transposed
y = x.transpose(dim0, dim1)

permute: permute dimensions of a tensor.


dims = /  # *** sequence of int. The desired ordering of dims
y = x.permute(dims)

squeeze & unsqueeze: remove and insert singleton dimensions.


dim = None  # *** int or tuple of ints. If given, only the dim will be squeezed
y = x.squeeze(dim)

dim = /  # *** int. The index at which to insert the singleton dim
y = x.unsqueeze(dim)  # Equal to y = x[:, :, None, :] when dim = 2

cat: concatenate tensors along a dimension.


tensors = /  # *** tuple of tensors. Tensors with the same shape except in the cat dim
dim = 0  # *** int. The concatenation dim
y = torch.cat(tensors, dim)

unbind: remove a dimension by splitting it.


dim = 0  # *** int. Dim to remove
y = x.unbind(dim)

chunk & split: split a tensor with chunk numbers or split sizes.


chunks = /  # *** int
dim = 0  # *** int
## If the given dim is divisible by chunks, all returned chunks will be the same size
## If the given dim is not divisible by chunks, the last one will not be the same size
## If such division is not possible, it returns fewer than the specified number of chunks
y = x.chunk(chunks, dim)

split_size_or_sections = /  # *** int or list of ints
dim = 0  # *** int. Dim along which to split the tensor
## If split_size_or_sections is an int, split into chunks of that size (the last one may be smaller)
## If split_size_or_sections is a list, split into len(split_size_or_sections) chunks with those sizes
y = x.split(split_size_or_sections, dim)
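The resulting piece sizes can be worked out with a small plain-Python sketch, assuming `chunk` uses a chunk size of `ceil(length / chunks)` and `split` with an int uses that int as the piece size. The function names are hypothetical.

```python
import math


def split_sizes(length, split_size):
    """Sizes produced by splitting `length` into pieces of `split_size`."""
    sizes = [split_size] * (length // split_size)
    if length % split_size:
        sizes.append(length % split_size)  # smaller last piece
    return sizes


def chunk_sizes(length, chunks):
    """Sizes produced by asking for `chunks` pieces of a dim of size `length`."""
    return split_sizes(length, math.ceil(length / chunks))


print(chunk_sizes(10, 3))  # [4, 4, 2]: last chunk is smaller
print(chunk_sizes(5, 4))  # [2, 2, 1]: fewer chunks than requested
print(split_sizes(10, 3))  # [3, 3, 3, 1]
```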

where: select elements from a tensor.


condition = /  # *** bool tensor. Where True, yield input, otherwise yield other
input = /  # *** tensor or scalar
other = /  # *** tensor or scalar
y = torch.where(condition, input, other)

Module

modules to build models

Jun 29, 2024 | module

category	tool (alphabetical)
parameter	Parameter & Buffer
convolution	Conv2d Conv3d
other module	Linear
else	Dropout


import torch
from torch import nn

Parameter & Buffer.


data = /  # *** tensor. Parameter tensor
requires_grad = True
gamma = nn.Parameter(data, requires_grad)

persistent = True  # whether the buffer is part of the module's state_dict
self.register_buffer("gamma", data, persistent)  # usually called in __init__; the buffer is then available as self.gamma
gamma = nn.parameter.Buffer(data, persistent=persistent)  # not usually used

Linear: affine linear transformation.


in_features = /  # *** int. Input features
out_features = /  # *** int. Output features
bias = True  # *** bool. If True, learn an additive bias
device = None  # torch.device or int 
dtype = None  # torch.dtype
linear = nn.Linear(in_features, out_features, bias, device, dtype)
y = linear(x)  # [..., H_in] => [..., H_out]

Conv2d: 2D convolution.


in_channels = /  # *** int. Number of channels in the input
out_channels = /  # *** int. Number of channels in the output
kernel_size = /  # *** int, tuple. Size of convolving kernel
stride = 1  # *** int, tuple. Stride of convolution
padding = 0  # int, tuple, str. Padding added to all four sides of the input
dilation = 1  # int, tuple. Spacing between kernel elements
groups = 1  # int. Number of blocked connections from input channels to output
bias = True  # bool. If True, add a learnable bias to the output
padding_mode = "zeros"  # str. "zeros", "reflect", "replicate", or "circular"
device = None  # torch.device or int
dtype = None  # torch.dtype
## Weight. Shape: [out_channels, in_channels/groups, k_size[0], k_size[1]]
## Bias. Shape: [out_channels,]
conv2d = nn.Conv2d(in_channels, out_channels, kernel_size, stride, 
    padding, dilation, groups, bias, padding_mode, device, dtype)
## H_out = floor((H_in + 2*padding[0] - dilation[0]*(kernel_size[0]-1) - 1) / stride[0] + 1)
## W_out = floor((W_in + 2*padding[1] - dilation[1]*(kernel_size[1]-1) - 1) / stride[1] + 1)
y = conv2d(x)  # [B, C_in, H_in, W_in] => [B, C_out, H_out, W_out]

Conv3d: 3D convolution.


in_channels = /  # *** int. Number of channels in the input
out_channels = /  # *** int. Number of channels in the output
kernel_size = /  # *** int, tuple. Size of convolving kernel
stride = 1  # *** int, tuple. Stride of convolution
padding = 0  # int, tuple, str. Padding added to all six sides of the input
dilation = 1  # int, tuple. Spacing between kernel elements
groups = 1  # int. Number of blocked connections from input channels to output
bias = True  # bool. If True, add a learnable bias to the output
padding_mode = "zeros"  # str. "zeros", "reflect", "replicate", or "circular"
device = None  # torch.device or int
dtype = None  # torch.dtype
## Weight. Shape: [out_channels, in_channels/groups, k_size[0], k_size[1], k_size[2]]
## Bias. Shape: [out_channels,]
conv3d = nn.Conv3d(in_channels, out_channels, kernel_size, stride, 
    padding, dilation, groups, bias, padding_mode, device, dtype)
## D_out = floor((D_in + 2*padding[0] - dilation[0]*(kernel_size[0]-1) - 1) / stride[0] + 1)
## H_out = floor((H_in + 2*padding[1] - dilation[1]*(kernel_size[1]-1) - 1) / stride[1] + 1)
## W_out = floor((W_in + 2*padding[2] - dilation[2]*(kernel_size[2]-1) - 1) / stride[2] + 1)
y = conv3d(x)  # [B, C_in, D_in, H_in, W_in] => [B, C_out, D_out, H_out, W_out]
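The output-size formula is the same for every spatial dimension, so it can be checked with one small plain-Python helper. The name `conv_out_size` and the example layer settings are assumptions for illustration.

```python
def conv_out_size(in_size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of one spatial dim: floor((in + 2p - d*(k-1) - 1) / s + 1)."""
    return (in_size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


print(conv_out_size(224, kernel_size=7, stride=2, padding=3))  # 112
print(conv_out_size(32, kernel_size=3, padding=1))  # 32: "same" size for k=3, p=1
# Conv3d: apply the same formula to D, H, and W
print([conv_out_size(s, 3, stride=2, padding=1) for s in (16, 112, 112)])  # [8, 56, 56]
```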

Dropout.


p = 0.5  # *** float. Probability of an element to be zeroed
inplace = False  # bool
dropout = nn.Dropout(p, inplace)
y = dropout(x)

Activation Function

activation functions

Jun 28, 2024 | activation function

tool (alphabetical)	popular applications
GELU	/
SiLU/swish	/


from torch import nn

GELU: Gaussian Error Linear Unit function, $\mathrm{GELU}(x)=x\cdot\Phi(x)$, where $\Phi$ is the cumulative distribution function of the standard normal distribution.


gelu = nn.GELU()
y = gelu(x)

SiLU/swish: Sigmoid Linear Unit function, $\mathrm{SiLU}(x)=x\cdot\sigma(x)$, where $\sigma$ is the logistic sigmoid function.


inplace = False
silu = nn.SiLU(inplace)
y = silu(x)
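Both formulas can be evaluated for a scalar with the standard library, which is handy for sanity-checking values. This is a sketch of the exact (erf-based) GELU and of SiLU, not PyTorch's implementation; the function names are chosen here for illustration.

```python
import math


def gelu(x):
    """x * Phi(x), with Phi the standard normal CDF written via erf."""
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def silu(x):
    """x * sigmoid(x), written as x / (1 + exp(-x))."""
    return x / (1.0 + math.exp(-x))


for value in (-1.0, 0.0, 1.0):
    print(value, round(gelu(value), 4), round(silu(value), 4))
```

Both functions are zero at the origin and approach the identity for large positive inputs.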

Optimizer

optimizers

Jun 28, 2024 | optimizer

It includes tools for building optimization algorithms: AdamW.


from torch.optim import AdamW

## --------------------------------------------------------------------------------
## AdamW
## --------------------------------------------------------------------------------
params = /  # *** iterable. Parameters / named_parameters / parameter groups to optimize
lr = 0.001  # *** float, Tensor. Learning rate
betas = (0.9, 0.999)  # tuple. For computing running averages of gradients & squares
weight_decay = 0.01  # float. Weight decay coefficient
# ...
adam_optim = AdamW(params, lr=lr, betas=betas, weight_decay=weight_decay)  # by keyword: the 4th positional argument is `eps`
## --------------------------------------------------------------------------------
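What one AdamW update does to a single scalar parameter can be sketched in plain Python, assuming the standard formulation with decoupled weight decay and bias-corrected moments. The function name `adamw_step` is hypothetical, and the sketch ignores options such as `amsgrad`.

```python
import math


def adamw_step(param, grad, m, v, step, lr=0.001, betas=(0.9, 0.999),
               eps=1e-8, weight_decay=0.01):
    """One AdamW update for a scalar parameter; returns (param, m, v)."""
    beta1, beta2 = betas
    param = param * (1.0 - lr * weight_decay)  # decoupled weight decay
    m = beta1 * m + (1.0 - beta1) * grad  # running average of gradients
    v = beta2 * v + (1.0 - beta2) * grad * grad  # running average of squared gradients
    m_hat = m / (1.0 - beta1 ** step)  # bias correction
    v_hat = v / (1.0 - beta2 ** step)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


param, m, v = adamw_step(param=1.0, grad=0.5, m=0.0, v=0.0, step=1)
print(round(param, 5))  # 0.99899
```

On the first step the bias-corrected update has magnitude close to `lr`, regardless of the gradient scale.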

huggingface

huggingface tools

May 30, 2023 | huggingface

It includes tools from huggingface: snapshot_download


from huggingface_hub import snapshot_download

## --------------------------------------------------------------------------------
## Download checkpoints from huggingface
## --------------------------------------------------------------------------------
repo_id = /  # *** str. A user name and a repo name, e.g., "Qwen/Qwen-VL-Chat"
repo_type = None  # *** str. "dataset", "space", or "model"
local_dir = None  # *** str or Path. If provided, directory to place the downloaded files
token = None  # str, bool. User token
max_workers = 8  # int. Number of concurrent threads to download files
# ...

snapshot_download(repo_id, repo_type=repo_type, local_dir=local_dir, token=token, max_workers=max_workers)  # all but `repo_id` are keyword-only

Distributed Framework

FSDP

fully sharded data parallel framework

Dec 03, 2025 | fsdp

Wrap model with FSDP to enable parallelism.


import functools

from torch.distributed.fsdp import BackwardPrefetch, ShardingStrategy
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# ShardingStrategy.FULL_SHARD: shard parameters, gradients, and optimizer states
# ShardingStrategy.SHARD_GRAD_OP: shard gradients and optimizer states
# ShardingStrategy.NO_SHARD: no sharding, like DDP
# ShardingStrategy.HYBRID_SHARD: apply FULL_SHARD within a node and replicate across nodes
# ShardingStrategy._HYBRID_SHARD_ZERO2: apply SHARD_GRAD_OP within a node and replicate across nodes
sharding_strategy = None  # *** ShardingStrategy. The sharding strategy to use (FULL_SHARD if None)

# The auto wrap policy to use. If None, only `module` itself is wrapped, not its submodules
auto_wrap_policy = None  # *** ModuleWrapPolicy, CustomPolicy. User can specify the classes to wrap
# My example:
def custom_auto_wrap_policy(module, recurse, nonwrapped_numel, min_num_params: int = int(1e8)) -> bool:
    if recurse:
        return True
    return nonwrapped_numel >= min_num_params
my_auto_wrap_policy = functools.partial(custom_auto_wrap_policy, min_num_params=int(1e5))

# BACKWARD_PRE: prefetch the next set of parameters before the current set's gradient computation
# BACKWARD_POST: prefetch the next set of parameters after the current set's gradient computation
backward_prefetch = BackwardPrefetch.BACKWARD_PRE  # BackwardPrefetch.BACKWARD_PRE, BackwardPrefetch.BACKWARD_POST or None

module = /  # *** nn.Module. The module to be wrapped
process_group = None  # ProcessGroup. The process group to work on (use the default if None)
cpu_offload = None  # *** CPUOffload. E.g., CPUOffload(offload_params=True) offloads parameters and gradients to CPU
mixed_precision = None  # *** MixedPrecision. The mixed precision to use
ignored_modules = None  # Iterable of Module. Modules to ignore. To be deprecated, use `ignored_states` instead
param_init_fn = None  # Callable taking a Module. How to initialize parameters onto a device
device_id = None  # *** int or torch.device. The device to use
sync_module_states = False  # bool. If True, synchronize module states across processes
forward_prefetch = False  # bool. If True, prefetch the next forward before current forward
limit_all_gathers = True  # bool. If True, rate-limit all-gathers by synchronizing the CPU thread
use_orig_params = False  # bool. If True, expose the original parameters instead of the sharded parameters
ignored_states = None  # *** Iterable of Parameter or Module. States to ignore
device_mesh = None  # *** DeviceMesh. The device mesh to use
# Shard module parameters across data parallel workers
sharded_model = FSDP(
    module, process_group, sharding_strategy, cpu_offload, auto_wrap_policy, backward_prefetch, 
    mixed_precision, ignored_modules, param_init_fn, device_id, sync_module_states, 
    forward_prefetch, limit_all_gathers, use_orig_params, ignored_states, device_mesh
)
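The custom policy above is an ordinary Python callable, so its decisions can be inspected without a GPU or a process group. The sketch below assumes the policy receives `module`, `recurse`, and `nonwrapped_numel` as keyword arguments; `None` stands in for a real module because this policy never inspects it.

```python
import functools


def custom_auto_wrap_policy(module, recurse, nonwrapped_numel, min_num_params=int(1e8)):
    """Size-based policy: always recurse, wrap once enough parameters are unwrapped."""
    if recurse:
        return True  # keep traversing into child modules
    return nonwrapped_numel >= min_num_params  # wrap only sufficiently large modules


policy = functools.partial(custom_auto_wrap_policy, min_num_params=int(1e5))

print(policy(module=None, recurse=True, nonwrapped_numel=0))  # True
print(policy(module=None, recurse=False, nonwrapped_numel=250_000))  # True
print(policy(module=None, recurse=False, nonwrapped_numel=50_000))  # False
```

Lowering `min_num_params` produces more, smaller wrapped units; raising it produces fewer, larger ones.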

torchrun

console script for Distributed Framework

Sep 29, 2025 | torchrun

Launch distributed training with torchrun.


# Method 1: Use rdzv_endpoint (recommended)
$ torchrun \
    --nnodes ${NNODES} \
    --nproc_per_node ${NPROC_PER_NODE} \
    --node_rank ${NODE_RANK} \
    --rdzv_backend c10d \
    --rdzv_endpoint ${MASTER_ADDR}:${MASTER_PORT} \
    --rdzv_id ${RDZV_ID} \
    train.py

# Method 2: Use master_addr and master_port
$ torchrun \
    --nnodes ${NNODES} \
    --nproc_per_node ${NPROC_PER_NODE} \
    --node_rank ${NODE_RANK} \
    --master_addr ${MASTER_ADDR} \
    --master_port ${MASTER_PORT} \
    train.py


from torch.distributed import barrier
group = None  # ProcessGroup. The process group to work on
async_op = False  # bool. If True, the barrier is asynchronous
device_ids = None  # list[int]. If provided, the barrier will only synchronize the devices in this list
barrier(group, async_op, device_ids)
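Inside `train.py`, each process launched by torchrun can find out who it is from environment variables such as `RANK`, `LOCAL_RANK`, and `WORLD_SIZE`. A minimal sketch of reading them follows; the helper `read_dist_env` and its single-process fallback values are assumptions for illustration.

```python
import os


def read_dist_env(env=None):
    """Read rank information set by the launcher, falling back to one process."""
    env = os.environ if env is None else env
    return {
        "rank": int(env.get("RANK", 0)),  # global rank of this process
        "local_rank": int(env.get("LOCAL_RANK", 0)),  # rank within this node
        "world_size": int(env.get("WORLD_SIZE", 1)),  # total number of processes
    }


# Simulate the environment of process 5 on a 2-node x 4-GPU job
info = read_dist_env({"RANK": "5", "LOCAL_RANK": "1", "WORLD_SIZE": "8"})
print(info)
```

The local rank is typically used to pick the GPU for the process, while the global rank and world size describe its position in the whole job.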

DeepSpeed

deepspeed framework

Aug 21, 2025 | deepspeed

DeepSpeed is an open-sourced deep learning optimization library developed by Microsoft Research, designed to simplify and accelerate the training and deployment of large-scale deep learning models.

stage	partition	memory saving	complexity
stage 1	optimizer states	~40% - 60% (for Adam)	low
stage 2	optimizer states & gradients	additional ~15% - 25%	medium
stage 3	optimizer states & gradients & model parameters	up to 80% - 90%	high
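The stage is selected through the `zero_optimization.stage` field of the DeepSpeed config. Below is a minimal sketch that builds such a config as a Python dict and prints it as JSON; the helper `make_zero_config` is hypothetical, and the batch size and bf16 setting are placeholder values rather than recommendations.

```python
import json


def make_zero_config(stage, micro_batch_size=1):
    """Build a small DeepSpeed-style config dict for the given ZeRO stage."""
    if stage not in (1, 2, 3):
        raise ValueError("stage must be 1, 2, or 3")
    return {
        "train_micro_batch_size_per_gpu": micro_batch_size,  # placeholder value
        "bf16": {"enabled": True},  # placeholder: mixed precision setting
        "zero_optimization": {"stage": stage},  # which states get partitioned
    }


config = make_zero_config(stage=2)
print(json.dumps(config, indent=2))
```

Switching between the stages in the table then only requires changing the `stage` value.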

Ray

ray framework

Oct 01, 2025 | ray

Ray is a distributed computing framework, developed by UC Berkeley, allowing you to scale machine learning and data processing workflows across multiple machines and GPUs. Ray employs a dynamic task graph computation model. Some important concepts:

Task: a remote function on a stateless worker. Create by applying `@ray.remote` decorator to a function.
Actor: a remote class on a stateful worker. Create by applying `@ray.remote` decorator to a class.
Data edges: data1 -> task1 -> data2 means data1 and data2 are the input and output of task1 respectively.
Control edges: task1 -> task2 means task2 is invoked by task1 (a nested remote call).
Ray builds a dynamic task graph consisting of data edges and control edges.
Driver: a process executing the user program.
Worker: a stateless process executing tasks (remote functions).
Actor (another concept): a stateful process that executes only the methods it exposes.

Tools

VSCode

vscode configs

Oct 15, 2025 | vscode

Extensions (Remote SSH, Dev Containers), SSH keys, and LaTeX (LaTeX Workshop + TeX Live).

Extensions: (1) "Remote - SSH": connect to remote servers; (2) "Dev Containers": run containers.
SSH passwordless login: (1) If there is no SSH key in `~/.ssh/`, run `ssh-keygen -t rsa` to generate a key pair, such as `id_rsa` and `id_rsa.pub`; (2) Edit `~/.ssh/config`: "Host [name] HostName [ip] User [user] IdentityFile [private_key_path, such as `~/.ssh/id_rsa`]"; (3) Append the public key in `~/.ssh/id_rsa.pub` to `~/.ssh/authorized_keys` on the remote server.

LaTeX:

Install the LaTeX Workshop extension.
Install TeX Live.
Merge the following into your user settings.json.


"latex-workshop.latex.recipes": [
    {
        "name": "XeLaTeX",
        "tools": [
            "xelatexmk"
        ]
    },
    {
        "name": "PdfLaTeX",
        "tools": [
            "pdflatexmk"
        ]
    }
],
"latex-workshop.latex.tools": [
    {
        "args": [
            "-synctex=1",
            "-pdfxe",
            "-interaction=nonstopmode",
            "-file-line-error",
            "-outdir=%OUTDIR%",
            "%DOC%"
        ],
        "command": "latexmk",
        "env": {},
        "name": "xelatexmk"
    },
    {
        "args": [
            "-synctex=1",
            "-pdf",
            "-interaction=nonstopmode",
            "-file-line-error",
            "-outdir=%OUTDIR%",
            "%DOC%"
        ],
        "command": "latexmk",
        "env": {},
        "name": "pdflatexmk"
    }
],

MacBook Reimage

macOS setup checklist

Apr 02, 2026 | macbook-reimage

Checklist after reimaging a MacBook: System Settings, Logi Options+, and daily apps.

System Settings: Appearance → Dark; Displays → extend screen; Touch ID & Password → Add Fingerprint; Keyboard → Input Sources → + → Pinyin - Simplified; Trackpad → disable Force Click and haptic feedback; search "time" → 24-hour time; Accessibility → Pointer Control → Trackpad options → Dragging style → Three Finger Drag.
Logi Options+: Screen capture → screen capture | region | clipboard; top button → Mission Control; thumb wheel → zoom in/out; forward button → desktop left; back button → desktop right; thumb button → close window; pointer speed → 2300 DPI; scroll wheel → scroll direction → standard.
Daily apps: Microsoft Edge, QQ Music, Weixin Input, Cursor / VSCode.

Git

git tools

May 14, 2026 | git

Git is a distributed version control system that allows you to track changes in your code and collaborate with others.

category	tool (alphabetical)
Setup and configure	config ssh keys
Get and create projects	clone
Branching and workspace	worktree

clone: clone a repository into a new directory.


git clone git@github.com:[user name]/[repo name].git

config: get and set repository or global options.


git config --list  # list all config
git config user.name [your name] && git config user.email [your email]  # repo config; cd to repo and execute
git config --global user.name [your name] && git config --global user.email [your email]  # global config

ssh keys: generate a new SSH key to use for authentication.


## 1. Check if SSH key exists: *.pub
ls -al ~/.ssh
## 2. If not, generate a new SSH key
ssh-keygen -t rsa -b 4096 -C [your GitHub email]
## 3. Copy SSH key
cat ~/.ssh/[your key name].pub
## 4. Open GitHub -> Settings -> SSH and GPG keys -> New SSH key -> paste the SSH key
## 5. Test
ssh -T git@github.com  # to see if it prints "Hi *! You've successfully ..."

worktree: check out multiple branches into separate directories from the same repo. All worktrees share the same .git object store, so adding one costs almost no disk space and avoids the stash + checkout dance when switching branches.


## Typical uses: 
## 1. Handle a hotfix without disturbing in-progress work
## 2. Run builds/tests on multiple branches in parallel
## 3. Review a PR branch alongside your own
## 4. Let an AI agent edit code in an isolated workspace
## Note: 
## 1. A branch can only be checked out in one worktree at a time
## 2. The main repo directory itself is a worktree
## 3. Deleting a worktree folder with `rm -rf` leaves dangling metadata that `git worktree prune` must clean up

## Add a worktree for a branch
git worktree add [worktree path] [remote/local branch name]

## Remove a worktree
git worktree remove [worktree path]
## Clean up records of worktrees whose directories were deleted
git worktree prune

## List all worktrees
git worktree list

Docker

docker tools

Sep 23, 2025 | docker

Docker is a containerization tool that allows you to package your application with all its dependencies into a container.

category	tool (alphabetical)
docker	start & restart & stop
image	ls & pull & rm
container	ps & run & enter & stop

ps & run & enter & stop: list containers, run a container, enter a container, stop a container.


docker ps -a  # list all containers
## --------------------------------------------------------------------------------
## Recommended params: -i: interactive mode; -d: detached mode; -t: allocate a pseudo-TTY
## To mount a directory: add "-v [local path, e.g., /home/user]:/root"
## To use GPUs: add "--gpus all", "--ipc host"
## To keep the container running without executing any command, append "tail -f /dev/null"
docker run [params, e.g., -dit] --name [container name] [image name]  # run a container
## --------------------------------------------------------------------------------
docker exec -it [container name or container id] /bin/bash  # enter a container
docker stop [container name or container id]  # stop a container
docker restart [container name or container id]  # restart a container
docker rm [container name or container id]  # remove a container

ls & pull & rm: list, pull, and remove images.


docker images  # or docker image ls; list all images
docker pull [image name]  # pull an image from a registry
docker rmi [image name or image id]  # remove an image

start & restart & stop: start, restart, stop docker.


sudo service docker start  # start docker
sudo service docker restart  # restart docker
sudo service docker stop  # stop docker

Last updated on May 18, 2026 at 10:47 (UTC-7).