February 12, 2024

AI Key Terms: AI Models

By Jill Hubbard Bowman

This is part of a series on Key Terms for the AI Law Maze Map blog. In prior posts, I described artificial intelligence, AI systems, and machine learning. Today, I’ll describe more about AI models. In the final installment, I’ll discuss terms related to data and processing software.

Models are the heart of AI systems. But models are hard to pin down. And models don’t fit well into existing legal and licensing frameworks. They fit even worse than traditional software, which itself fits poorly. This awkward fit leads to licensing confusion and litigation. Understanding more about models will help you develop better intellectual property rights and deal strategies.

Below are definitions of some key terms:

AI Model

Generally, an AI Model is “a component of an information system that implements AI technology and uses computational, statistical, or machine-learning techniques to produce outputs from a given set of inputs.”[i]

There are many types of AI models, from simple decision trees to complex neural networks.[ii]

Machine Learning (ML) Model

An ML model is the result of machine learning training techniques. An ML model makes its predictions based on past data. Many basic ML models are trained on labeled data sets. Examples of ML models include decision trees, random forests, and linear regression.

Deep Learning (DL) Model

A DL model uses an artificial neural network, which is why this type of model is sometimes referred to as a neural network model. The connections are modeled after the synapses in a human brain.

A DL model may be trained on vast amounts of data, both labeled and unlabeled. A DL model can use unstructured data to make new inferences.

When describing a neural network model, AI engineers may refer to a model as: (1) information (probabilistic representation, math, rules, algorithm, or statistical equation); (2) that reflects patterns and relationships in data; and (3) can generate output: predictions, classifications, content, or decisions.

A neural network model has an architecture and weights. A simple optimized DL model may consist of multiple, separate computer files, like an .XML file (architecture), a .BIN file (weights), and a wrapper file for inputs and outputs.
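For illustration, here is a minimal sketch of loading such a model with Intel’s open-source OpenVINO runtime, one toolkit that uses the .XML/.BIN file convention described above. The post does not name a specific toolkit, so the toolkit choice and file names are assumptions.

```python
# A hedged sketch: loading a DL model whose architecture (.xml) and weights
# (.bin) live in separate files, using Intel's OpenVINO runtime.
# The file names "model.xml" and "model.bin" are placeholders.
from openvino.runtime import Core

core = Core()
model = core.read_model(model="model.xml", weights="model.bin")  # architecture + weights
compiled_model = core.compile_model(model, device_name="CPU")    # ready to run inference
```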

Architecture

The architecture of a model is the arrangement of layers and connections that determines how data flows through a deep learning model. Common architectures are typically first defined in a technical paper, like the Transformer architecture for large language models, which was first described by Google engineers.[iii]
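As a rough sketch of what an “architecture” looks like in code, the example below uses the open-source PyTorch library with made-up layer sizes; it is an illustration only, not any real model’s design.

```python
# Minimal sketch of an "architecture": a stack of layers with chosen sizes
# that data flows through. The layer sizes are arbitrary examples.
import torch.nn as nn

architecture = nn.Sequential(
    nn.Linear(16, 32),  # input features flow into a hidden layer
    nn.ReLU(),          # non-linear activation between layers
    nn.Linear(32, 1),   # hidden layer flows into a single output
)
```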

Weights

Weights, also called parameters, are numbers in a neural network model reflecting information about the strength of network connections. Weights are determined and refined by training with examples. During training, the weights in the model change automatically as the model learns from those examples.
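Continuing the hypothetical PyTorch illustration, the sketch below shows that weights are simply numbers stored inside the model and that a single training step adjusts them.

```python
# A hedged sketch: weights are numbers inside the model, and a training
# step changes them. The data here is random, purely for illustration.
import torch
import torch.nn as nn

model = nn.Linear(4, 1)                     # one layer with a small weight array
print(model.weight.data)                    # the weights before training

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
inputs, targets = torch.randn(8, 4), torch.randn(8, 1)   # toy training examples
loss = nn.functional.mse_loss(model(inputs), targets)    # how far off the predictions are
loss.backward()                             # work out how each weight should move
optimizer.step()                            # the training step updates the weights
print(model.weight.data)                    # same weights, new values
```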

Application Programming Interfaces (APIs)

An application programming interface is a set of rules that allows programs to communicate. Models use APIs as the bridge between the model and other computer systems, letting them exchange information and interoperate.[iv]
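As a simple illustration of a model API, the sketch below sends a prompt to a hosted model over HTTP. The URL, authorization header, and JSON fields are invented placeholders, not any particular provider’s interface.

```python
# A hedged sketch of calling a model through an API. The endpoint, header,
# and JSON fields are invented placeholders, not a real provider's interface.
import requests

response = requests.post(
    "https://api.example.com/v1/generate",             # hypothetical endpoint
    headers={"Authorization": "Bearer YOUR_API_KEY"},  # placeholder credential
    json={"prompt": "Summarize this contract clause.", "max_tokens": 100},
    timeout=30,
)
print(response.json())  # the model's output comes back as structured data
```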

Simply put, a neural network model is a large array of numbers and calculus concepts, described in computer code.[v]

Models might also be characterized as computer-generated computer code.[vi]

Computer Vision (CV) Model

A computer vision model is a deep learning model that processes digital images or video inputs to generate predictive information to classify, detect, and track objects or make decisions. For example, computer vision models are used in advanced driver-assistance systems to identify and locate road markings, road signs, and pedestrians. Autonomous cars also use computer vision models to track objects in motion and make decisions to avoid collisions. CV models are also used in factories to detect manufacturing defects and in facial recognition systems.
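For illustration, the sketch below runs a pretrained object-detection model from the open-source torchvision library over a single image. The image file name is a placeholder, and the library choice is an assumption made for demonstration purposes.

```python
# A hedged sketch: object detection with a pretrained torchvision model.
# "street.jpg" is a placeholder image path.
import torch
import torchvision
from torchvision.io import read_image
from torchvision.transforms.functional import convert_image_dtype

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = convert_image_dtype(read_image("street.jpg"), torch.float)
with torch.no_grad():
    detections = model([image])[0]          # boxes, labels, and confidence scores
print(detections["labels"], detections["scores"])
```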

Generative AI Models

A generative AI model is a deep learning model that has been trained on massive data sets and can produce new synthetic outputs (like predictions of the next word in a sentence or an image) based on input prompts from a user. Think of ChatGPT and its ability to produce human-like content, or an image generator like Stable Diffusion. These models often use the transformer architecture (hence the T in ChatGPT, short for Generative Pre-trained Transformer).

Large Language Model (LLM)

A large language model is a specialized type of generative AI model that uses natural language processing to understand and generate humanlike text output based on inputs. LLMs are “large” both because they are trained on vast amounts of data, like text from the internet, and because they have billions to trillions of parameters in their arrays. Experts have explained that LLMs, and models in general, don’t store training datasets.[vii] The models don’t ingest and retrieve copied information. Rather, the models are statistical representations of information about information from the training datasets. LLMs gain an understanding of the relationships between parts of words and language context, resulting in a mathematical function with billions of terms.[viii] Many of these models use the transformer architecture. Examples include GPT-4, which generates language in response to prompts, and PaLM 2, a language model incorporated into Google’s Bard chatbot.
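As a concrete, hands-on illustration, the sketch below uses the open-source Hugging Face transformers library to generate text with GPT-2, a small, publicly available predecessor of the much larger commercial models named above.

```python
# A hedged sketch: text generation with a small open language model via the
# Hugging Face transformers library. GPT-2 is far smaller than GPT-4 or
# PaLM 2, but works the same way: it predicts likely next words from a prompt.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("A large language model is", max_new_tokens=30)
print(result[0]["generated_text"])
```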

Foundation Model

A foundation model is an AI model trained on massive amounts of broad data that can be adapted to many types of tasks. These models may be used in many varieties of AI applications and systems as the name implies—as a foundation. Examples include large language models but may also include models trained on images.[ix]

General Purpose AI (GPAI) Model (EU AI Act)

“‘general purpose AI model’ means an AI model, including when trained with a large amount of data using self-supervision at scale, that displays significant generality and is capable to competently perform a wide range of distinct tasks regardless of the way the model is placed on the market and that can be integrated into a variety of downstream systems or applications. This does not cover AI models that are used before release on the market for research, development and prototyping activities;”[x]

In the final installment of this Key Terms series, I will discuss data and processing software.

[i] Biden, Joseph, White House, “Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence”, 30 October 2023, https://www.whitehouse.gov/briefing-room/presidential-actions/2023/10/30/executive-order-on-the-safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence/

[ii] Hewlett Packard Enterprise, “AI Models”, 2023, https://www.hpe.com/us/en/what-is/ai-models.html.

[iii] Vaswani, Ashish, et al., “Attention is All You Need”, 2017, https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf (The foundational paper introducing the Transformer architecture used by LLMs.)

[iv] IBM, “What is an API?” https://www.ibm.com/topics/api

[v] Wolfram, Stephen, “What is ChatGPT Doing . . . Why Does It Work?”, 14 February 2023, https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/ (“In current neural nets, one’s essentially using the ideas of calculus—applied to real numbers—to do that incremental modification.”)

[vi] Hervey, Matt and Lavy, Matthew, The Law of Artificial Intelligence (London, England: Sweet & Maxwell, 2021) (see generally for good descriptions of AI terms), pp. 308-309 (discussing AI computer-generated computer programs).

[vii] House of Lords, Communications and Digital Committee, “Large language models and AI,” https://media.licdn.com/dms/document/media/D4E1FAQHLZImoZU9IEg/feedshare-document-pdf-analyzed/0/1706865285557?e=1707955200&v=beta&t=ifvxMFlrhQOvloLekzN9-lYw5srxTviySxYkGQIHp3c (expert testifying that LLMs don’t store training data).

[viii] House of Lords, Communications and Digital Committee, “Large language models and AI,” https://media.licdn.com/dms/document/media/D4E1FAQHLZImoZU9IEg/feedshare-document-pdf-analyzed/0/1706865285557?e=1707955200&v=beta&t=ifvxMFlrhQOvloLekzN9-lYw5srxTviySxYkGQIHp3c

[ix] Bommasani, Rishi and Liang, Percy, “Reflections on Foundation Models,” Stanford Institute for Human-Centered Artificial Intelligence (HAI), 18 October 2021, https://hai.stanford.edu/news/reflections-foundation-models

[x] Council of the European Union, “Proposal for a Regulation of the European Parliament and of the Council laying down harmonized rules on artificial intelligence (Artificial Intelligence Act) and amending certain Union legislative acts”, Interinstitutional File: 2021/0106(COD) Brussels, 26 January 2024, https://data.consilium.europa.eu/doc/document/ST-5662-2024-INIT/en/pdf

 

This AI Law Maze Map blog is for education only. It is not intended as legal advice.

By using this website and information, you acknowledge and agree that no attorney-client relationship is created or implied.


© 2023 Jill Hubbard Bowman. All rights reserved.