For Anki
- Config file
- A file (usually JSON or YAML) that contains all the settings and parameters needed to run a model, including architecture details, training settings, and other hyperparameters. Think of it as the “recipe” for how to set up and run the model.
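A minimal sketch of writing and reading such a config with Python's json module; every field name here is made up for illustration rather than taken from any specific framework:

```python
import json

# Hypothetical settings for a small transformer model.
config = {
    "model_type": "transformer",
    "hidden_size": 768,
    "num_hidden_layers": 12,
    "num_attention_heads": 12,
    "vocab_size": 32000,
    "max_position_embeddings": 2048,
}

with open("config.json", "w") as f:
    json.dump(config, f, indent=2)

with open("config.json") as f:
    loaded = json.load(f)  # the "recipe" a model loader would read back
```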
- Model checkpoint
- A saved snapshot of a model’s weights and settings at a particular point during training. Used to resume training or share trained models.
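A minimal sketch of the save/resume cycle using PyTorch's torch.save and torch.load; the tiny nn.Linear stands in for a real model:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)  # stand-in for a real model
optimizer = torch.optim.Adam(model.parameters())

# Snapshot the weights plus training state partway through training.
torch.save({
    "epoch": 5,
    "model_state_dict": model.state_dict(),
    "optimizer_state_dict": optimizer.state_dict(),
}, "checkpoint.pt")

# Later (or on another machine): restore everything and resume.
ckpt = torch.load("checkpoint.pt")
model.load_state_dict(ckpt["model_state_dict"])
optimizer.load_state_dict(ckpt["optimizer_state_dict"])
start_epoch = ckpt["epoch"] + 1
```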
- Model Card
- A documentation file (like README.md) that describes a model’s capabilities, limitations, intended uses, and technical details. Like a user manual for the model.
- Inference
- The process of using a trained model to make predictions on unseen data.
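A minimal PyTorch sketch of the standard inference pattern (model.eval() plus torch.no_grad()); the linear layer and random input are placeholders for a trained model and real data:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)        # stand-in for a trained model
model.eval()                   # disable training-only behavior (e.g., dropout)

new_input = torch.randn(1, 4)  # one unseen example
with torch.no_grad():          # no gradients needed at inference time
    prediction = model(new_input)
print(prediction)
```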
- .safetensors file
- A file format specifically designed for storing AI model weights (parameters). It’s faster and more secure than traditional PyTorch files (.pth), with better memory efficiency and protection against malicious files.
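A short sketch using the safetensors library's PyTorch helpers (assuming pip install safetensors):

```python
import torch
from safetensors.torch import load_file, save_file

weights = {
    "layer1.weight": torch.randn(4, 4),
    "layer1.bias": torch.zeros(4),
}
save_file(weights, "model.safetensors")

# Loading only reads raw tensor data; unlike pickle-based .pth files,
# it never executes code embedded in the file.
restored = load_file("model.safetensors")
```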
- Tensor
- Tensors are basically containers for numbers: multi-dimensional arrays holding the data and weights an AI model uses
- Tensor Hierarchy
- 1D tensor (vector): [1, 2, 3, 4]
- 2D tensor (matrix): [[1, 2, 3], [4, 5, 6]]
- 3D tensor (stack of grids): e.g., multiple images
- 4D+ tensor (even more dimensions): e.g., video batches
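The same hierarchy in PyTorch, checking each tensor's number of dimensions and shape:

```python
import torch

v = torch.tensor([1, 2, 3, 4])            # 1D: shape (4,)
m = torch.tensor([[1, 2, 3], [4, 5, 6]])  # 2D: shape (2, 3)
imgs = torch.zeros(10, 28, 28)            # 3D: 10 grayscale 28x28 images
video = torch.zeros(2, 16, 3, 64, 64)     # 5D: 2 clips of 16 RGB 64x64 frames

for t in (v, m, imgs, video):
    print(t.ndim, tuple(t.shape))
```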
- Different tensor types
- FP16/BF16: normal precision, the standard choice
- INT8/INT4: lower precision but uses less VRAM (see the sketch below)
- If parameters are like music notes in a song, tensors are like the sheet music that organizes these notes, and tensor types are like different ways of writing music notation (some more detailed, some more compact).
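A quick PyTorch sketch of casting between precisions; note that INT8/INT4 weights normally come from a separate quantization step (via a quantization library), not a plain cast:

```python
import torch

x = torch.randn(3, 3)          # FP32 by default
half = x.to(torch.float16)     # FP16: half the memory of FP32
brain = x.to(torch.bfloat16)   # BF16: FP32-like range, coarser precision

print(x.element_size())        # 4 bytes per value
print(half.element_size())     # 2 bytes per value
```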
- Parameters/weights
- The learned “knowledge” a model gained during training. Stored in files like .safetensors or .pth.
- Ex) Llama-3.1-405B-Instruct has 405B parameters/weights
- Generally, more parameters = more capacity to learn complex patterns
- But also: more parameters = needs more compute power & memory to run
- Size comparison
- Small models: 1-3B parameters
- Medium models: 7-13B parameters
- Large models: 30B-70B parameters
- Very large models: 100B+ parameters
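Counting parameters in PyTorch; summing numel() over model.parameters() is the standard idiom, and the two-layer model here is just a toy:

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(512, 1024),
    nn.ReLU(),
    nn.Linear(1024, 512),
)

# Total learnable parameters (weights + biases).
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params:,} parameters")  # 1,050,112 for this toy model
```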
- VRAM
- Video RAM - the memory on your graphics card (GPU) used for running AI models. More VRAM lets you run larger models or process bigger batches of data at once.
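A back-of-the-envelope sketch: the weights alone need roughly parameter count × bytes per parameter of VRAM, and real usage is higher once activations, the KV cache, and framework overhead are added:

```python
def weight_vram_gb(n_params: float, bytes_per_param: float) -> float:
    """Rough VRAM (GB) just to hold the model weights."""
    return n_params * bytes_per_param / 1e9

print(weight_vram_gb(7e9, 2))    # 7B model in FP16/BF16 -> ~14 GB
print(weight_vram_gb(7e9, 1))    # same model in INT8    -> ~7 GB
print(weight_vram_gb(405e9, 2))  # 405B model in FP16    -> ~810 GB
```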
- Token
- The smallest unit of text that a model processes. Can be words, parts of words, or characters. For example, “butterflies” might be split into tokens like ["butter", "flies"] or ["but", "ter", "flies"].
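A sketch using Hugging Face's AutoTokenizer (assuming transformers is installed and the gpt2 files can be downloaded); the exact split depends on the tokenizer's learned vocabulary:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
print(tok.tokenize("butterflies"))  # subword pieces; exact split varies
print(tok.encode("butterflies"))    # the integer IDs the model actually sees
```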
- Context window
- The maximum amount of text (measured in tokens) that a model can process at once. Like a sliding window of attention - larger windows mean the model can “see” and work with more text at once.
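A toy sketch of the sliding-window idea (keep only the most recent tokens that fit); the function name and numbers here are illustrative:

```python
def fit_to_context(token_ids: list[int], max_tokens: int) -> list[int]:
    """Drop the oldest tokens so the rest fit in the context window."""
    return token_ids[-max_tokens:]

history = list(range(10_000))           # pretend conversation token IDs
window = fit_to_context(history, 4096)  # model with a 4,096-token window
print(len(window))                      # 4096
```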
- (ankify up to here)