Understanding AI: A Glossary of Terms
Algorithm
A set of rules or steps designed to solve a problem or perform a task. In AI, algorithms process data to generate outputs or predictions.
Artificial Intelligence (AI)
The simulation of human intelligence by machines, especially computer systems. AI includes learning, reasoning, and self-correction.
Big Data
Extremely large datasets that may be analysed computationally to reveal patterns, trends, and associations, especially relating to human behaviour and interactions.
Bias
A systematic error in a machine learning model that can lead to unfair outcomes. Bias can arise from training data that is not representative of the real-world population.
Chatbot
A software application used to conduct an online chat conversation via text or text-to-speech, often used in customer service.
Classification
A type of supervised learning where the goal is to predict the category to which new data will belong based on past observations.
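As a rough illustration of predicting a category from past observations, the sketch below uses a one-nearest-neighbour rule, which is only one of many possible classifiers. The function name, the measurements, and the "cat"/"dog" labels are all made up for the example.

```python
import math

def nearest_neighbour_predict(training_points, training_labels, new_point):
    """Return the label of the training point closest to new_point."""
    best_label = None
    best_distance = math.inf
    for point, label in zip(training_points, training_labels):
        distance = math.dist(point, new_point)  # Euclidean distance
        if distance < best_distance:
            best_distance = distance
            best_label = label
    return best_label

# Hypothetical past observations: (height_cm, weight_kg) with a known category
points = [(20.0, 4.0), (25.0, 5.0), (60.0, 30.0), (65.0, 32.0)]
labels = ["cat", "cat", "dog", "dog"]

# A new, unlabelled observation is assigned the category of its nearest neighbour
prediction = nearest_neighbour_predict(points, labels, (22.0, 4.5))
print(prediction)
```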
Clustering
An unsupervised learning method that involves grouping a set of objects so that objects in the same group are more similar to each other than to those in other groups.
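One common way to form such groups is k-means, sketched below for one-dimensional data. This is a minimal illustration under assumed inputs, not the only clustering method; the function name, the sample values, and the starting centroids are hypothetical.

```python
def k_means_1d(values, centroids, iterations=10):
    """Group numbers around centroids by repeated assign-then-average steps."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each value joins the cluster of its nearest centroid
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Update step: each centroid moves to the mean of its cluster
        # (an empty cluster keeps its previous centroid)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical unlabelled data with two visible groups
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = k_means_1d(values, centroids=[0.0, 5.0])
print(centroids, clusters)
```

No labels are supplied at any point; the groups emerge only from how close the values are to one another.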
Computer Vision
A field of AI that enables computers to interpret and make decisions based on visual data from the world, such as images and videos.
Data Mining
The practice of examining large databases in order to generate new information. It involves finding patterns and relationships in data.
Data Sovereignty
The concept that data is subject to the laws and governance structures of the nation in which it is collected. This is especially important for data privacy and security.
Deep Learning
A subset of machine learning that uses neural networks with three or more layers. These networks are loosely inspired by the way the human brain functions.
Feature
An individual measurable property or characteristic of a phenomenon being observed. In AI, features are used as input to models.
Generative Adversarial Network (GAN)
A class of machine learning frameworks in which two neural networks, a generator and a discriminator, compete against each other in a zero-sum game. GANs are often used to generate synthetic data.
Hallucination
In AI, hallucination refers to output that sounds plausible but is not supported by the input data or by real-world information. This is a common issue in language models.
Inference
The process of using a trained machine learning model to make predictions or decisions based on new data.
Machine Learning (ML)
A subset of AI that involves the development of algorithms that allow computers to learn from and make predictions based on data.
Model
In AI, a model is a mathematical representation of a real-world process. Models are trained using data and then used to make predictions or decisions.
Natural Language Processing (NLP)
A branch of AI that deals with the interaction between computers and humans using natural language. It involves the ability of computers to understand, interpret, and respond to human language.
Neural Network
A model made up of layers of interconnected nodes that attempts to recognise underlying relationships in a set of data, loosely inspired by the way the human brain operates.
Overfitting
A modelling error in machine learning where a model is too closely aligned to the training data and may fail to generalise to new data.
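An extreme case of overfitting is a model that simply memorises its training examples. The sketch below is a deliberately contrived illustration: the lookup-table "model", the fallback guess of 0.0, and the data are all assumptions made for the example.

```python
def memorising_model(training_pairs):
    """Build a 'model' that only remembers the exact examples it has seen."""
    table = dict(training_pairs)
    # Any input not seen during training gets an arbitrary fallback guess of 0.0
    return lambda x: table.get(x, 0.0)

def mean_abs_error(model, pairs):
    """Average absolute gap between the model's outputs and the true labels."""
    return sum(abs(model(x) - y) for x, y in pairs) / len(pairs)

# Hypothetical data following the pattern y = 2x
train_pairs = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
new_pairs = [(4.0, 8.0), (5.0, 10.0)]

model = memorising_model(train_pairs)
train_error = mean_abs_error(model, train_pairs)  # perfect on seen data
new_error = mean_abs_error(model, new_pairs)      # poor on unseen data
print(train_error, new_error)
```

The gap between the error on the training data and the error on new data is the usual warning sign of overfitting.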
Predictive Analytics
A branch of advanced analytics that uses historical data, statistical algorithms, and machine learning techniques to predict future outcomes.
Reinforcement Learning
A type of machine learning where an agent learns to make decisions by taking actions in an environment to maximise some notion of cumulative reward.
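The idea can be sketched with tabular Q-learning, one well-known reinforcement learning method, on a toy environment. Everything specific here is an assumption for illustration: a corridor of five positions, a reward of 1 for reaching the right-hand end, purely random exploration, and the chosen learning rate and discount factor.

```python
import random

N_STATES = 5            # positions 0..4; reaching position 4 ends the episode
ACTIONS = (-1, +1)      # index 0 moves left, index 1 moves right
ALPHA, GAMMA = 0.5, 0.9 # learning rate and discount factor (assumed values)

def step(state, action):
    """Move within the corridor; reward 1.0 only on reaching the final position."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # estimated value of each action
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            a = rng.randrange(len(ACTIONS))    # explore by acting at random
            next_state, reward = step(state, ACTIONS[a])
            # Q-learning update: nudge the estimate towards
            # reward + discounted value of the best next action
            target = reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q

q_table = train()
# Greedy policy for the non-terminal positions, read off the learned values
greedy_policy = ["left" if row[0] > row[1] else "right" for row in q_table[:-1]]
print(greedy_policy)
```

Because later rewards are discounted, actions that reach the goal sooner end up with higher estimated value, which is how the cumulative reward shapes the learned behaviour.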
Supervised Learning
A type of machine learning where the model is trained on labelled data. The goal is to learn a mapping from inputs to outputs based on the examples provided.
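As a small illustration of learning a mapping from labelled examples, the sketch below fits a straight line by ordinary least squares. The function name and the data (which follow y = 2x + 1 exactly) are made up for the example, and a straight line is only one possible form of model.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept to labelled examples by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical labelled data: each input comes with its correct output
inputs = [1.0, 2.0, 3.0, 4.0]
outputs = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(inputs, outputs)
# The learned mapping can now be applied to an input it has not seen
predicted = slope * 5.0 + intercept
print(slope, intercept, predicted)
```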
Synthetic Data
Artificially generated data that mimics the properties of real data, often used to train machine learning models when real data is scarce or sensitive.
Transfer Learning
A machine learning technique where a model developed for one task is reused as the starting point for a model on a second task.
Turing Test
A test proposed by Alan Turing to determine whether a machine can exhibit intelligent behaviour indistinguishable from that of a human.
Underfitting
A situation where a machine learning model is too simple to capture the underlying pattern of the data, resulting in poor performance.
Unsupervised Learning
A type of machine learning where the model is trained on unlabelled data. The goal is to identify patterns or structures in the data.
Validation Data
A set of data held out from training and used to tune the model's hyperparameters and to check that the model does not overfit the training data.
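The sketch below shows one way validation data might be used: part of the labelled data is held out, and a setting of the model (here k, the number of neighbours in a simple nearest-neighbour averager) is chosen by its error on the held-out part. The model, the data, the split, and the candidate values of k are all assumptions for illustration.

```python
def knn_predict(train, x, k):
    """Average the labels of the k training points nearest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mean_squared_error(train, held_out, k):
    """Error of the k-neighbour model on examples it was not trained on."""
    errors = [(knn_predict(train, x, k) - y) ** 2 for x, y in held_out]
    return sum(errors) / len(errors)

# Hypothetical labelled data: (input, label) pairs following y = 2x
data = [(float(x), 2.0 * x) for x in range(10)]

# Hold out some examples for validation; train on the rest
validation = data[3::4]
training = [pair for pair in data if pair not in validation]

# Pick the candidate setting with the lowest validation error
candidate_ks = [1, 2, 4]
best_k = min(candidate_ks,
             key=lambda k: mean_squared_error(training, validation, k))
print(best_k)
```

The validation examples never influence the predictions directly; they are used only to compare settings.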
White Box Model
An AI model whose inner workings are transparent and understandable, as opposed to a black box model, which is opaque and difficult to interpret.