AI & ML Terminology: Practical definitions

Essential AI & ML glossary

This glossary defines terms I use in my work. It is not comprehensive; it covers the terms I use most often. The definitions are meant to be useful when applying AI, not to be perfect academic definitions.

A

Agent

An AI system that makes decisions based on a given set of instructions. Its degree of freedom can range from a tightly structured task to a very open-ended one.
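
To make the idea concrete, here is a minimal sketch of a very structured agent. Everything in it is an assumption for illustration: the two tools, the `decide` function (a rule-based stand-in for what would normally be a call to an LLM), and the `run_agent` helper are hypothetical names, not part of any library.

```python
from typing import Callable, Dict, Tuple

# Tools the agent is allowed to use (hypothetical examples).
TOOLS: Dict[str, Callable[[str], str]] = {
    "word_count": lambda text: str(len(text.split())),
    "shout": lambda text: text.upper(),
}


def decide(instruction: str) -> str:
    """Stand-in for the model's decision; a real agent would ask an LLM here."""
    if "count" in instruction.lower():
        return "word_count"
    return "shout"


def run_agent(instruction: str, text: str) -> Tuple[str, str]:
    """Pick a tool based on the instruction, run it, and return (tool, result)."""
    tool_name = decide(instruction)
    return tool_name, TOOLS[tool_name](text)


chosen_tool, result = run_agent("Count the words", "a b c")
```

A more open-ended agent would have more tools, would choose among them with a model instead of a keyword rule, and would usually loop over several decisions before returning.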

C

Continued Pre-training // Domain Adaptation

Continuing to pre-train a model on a large number of documents from a specific domain, in a specific language, so that it learns what the tokens mean in that domain. Often used in domains like law, where tokens (words) carry different meanings than in everyday language.

I

Instruction Fine-tuning

Fine-tuning an LLM on a specific task with a small dataset of annotated examples, typically pairs of an instruction and the desired response.
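
As a rough illustration of what such annotated examples can look like, the sketch below turns two records into training strings. The field names (`instruction`, `input`, `output`) and the prompt layout are assumptions made for this example; real projects use whatever template their training setup expects.

```python
from typing import Dict, List

# Hypothetical annotated examples; the field names are an assumption.
EXAMPLES: List[Dict[str, str]] = [
    {
        "instruction": "Classify the sentiment of the review as positive or negative.",
        "input": "The delivery was fast and the product works great.",
        "output": "positive",
    },
    {
        "instruction": "Classify the sentiment of the review as positive or negative.",
        "input": "The package arrived broken and support never answered.",
        "output": "negative",
    },
]


def to_training_text(example: Dict[str, str]) -> str:
    """Join one annotated example into a single training string."""
    return (
        f"Instruction: {example['instruction']}\n"
        f"Input: {example['input']}\n"
        f"Response: {example['output']}"
    )


training_texts = [to_training_text(example) for example in EXAMPLES]
```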

L

Large Language Model (LLM)

A type of AI model trained on vast amounts of text data to understand and generate human-like text.

P

Pairwise ranking

Scores two outputs against each other. For example, two documents are given and the model has to decide which one is better.
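
A minimal sketch of pairwise ranking, assuming a toy judge. `toy_judge` is a hypothetical stand-in for a model or human annotator and simply prefers the output that shares more words with the query; `rank_by_wins` is likewise an illustrative helper that orders outputs by how many comparisons they win.

```python
from itertools import combinations
from typing import Dict, List


def toy_judge(query: str, output_a: str, output_b: str) -> str:
    """Hypothetical judge: returns 'A' or 'B' for the output that matches the query better."""
    query_words = set(query.lower().split())

    def overlap(text: str) -> int:
        return len(query_words & set(text.lower().split()))

    # Ties go to A; a real judge would need an explicit tie policy.
    return "A" if overlap(output_a) >= overlap(output_b) else "B"


def rank_by_wins(query: str, outputs: List[str]) -> List[str]:
    """Compare every pair once and sort outputs by number of wins."""
    wins: Dict[str, int] = {output: 0 for output in outputs}
    for a, b in combinations(outputs, 2):
        winner = a if toy_judge(query, a, b) == "A" else b
        wins[winner] += 1
    return sorted(outputs, key=lambda output: wins[output], reverse=True)


ranking = rank_by_wins(
    "contract termination notice",
    [
        "tomato soup recipe",
        "contract termination requires written notice",
        "termination of a contract",
    ],
)
```

Note that the judge never produces an absolute score; the ordering comes only from which output won each comparison.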

Pointwise ranking

Scores a single output against a single reference. This is the most common type of evaluation. For example, a query and a document are given and the model has to score how well the document matches the query.
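
A minimal sketch of pointwise scoring for the query and document example. `pointwise_score` is a hypothetical stand-in for a model: it returns the fraction of query words found in the document, which is only meant to show that each document gets its own score, independent of the others.

```python
from typing import List


def pointwise_score(query: str, document: str) -> float:
    """Toy relevance score: fraction of query words that appear in the document."""
    query_words = set(query.lower().split())
    document_words = set(document.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & document_words) / len(query_words)


query = "contract termination notice"
documents: List[str] = [
    "notice period for contract termination",
    "recipe for tomato soup",
]

# Each document is scored on its own, without looking at the other one.
scores = [pointwise_score(query, document) for document in documents]
```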

Pre-training

Training on vast amounts of data to learn patterns and relationships. LLMs are pre-trained on a large share of the publicly available text on the internet to learn linguistic patterns.

R

Reinforcement Learning from Human Feedback (RLHF)

Using feedback data from humans (e.g. clicks, likes) to train a second model, the reward model, that can label an output as good or bad, and in turn using that reward model to train the original model.

T

Transformer

An architecture for neural networks that uses self-attention mechanisms to process sequential data.
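
The self-attention step can be sketched in a few lines. This is a simplified illustration under stated assumptions: it uses the token vectors directly as queries, keys, and values, whereas a real transformer first applies learned projection matrices and uses several attention heads. The function names are chosen for this example.

```python
import math
from typing import List

Vector = List[float]


def softmax(values: Vector) -> Vector:
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(values)
    exps = [math.exp(v - highest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens: List[Vector]) -> List[Vector]:
    """Scaled dot-product self-attention without learned projections (simplification)."""
    dim = len(tokens[0])
    outputs = []
    for query in tokens:
        # How strongly this token attends to every token in the sequence.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        # The output is a weighted mix of all token vectors.
        outputs.append(
            [sum(w * value[j] for w, value in zip(weights, tokens)) for j in range(dim)]
        )
    return outputs


tokens = [[1.0, 0.0], [0.0, 1.0]]
mixed = self_attention(tokens)
```

Each output vector blends information from the whole sequence, with the largest weight going to the tokens most similar to the one being processed.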