
An Analogy to Understand Open Source in Generative AI Models

We often hear the terms "Open Source" and "Open Weights" in the world of AI models - but what is the difference? In traditional software, open source means full transparency and reproducibility. AI models add layers of complexity that are fundamentally different from traditional software.

This post explains open source vs open weights in the context of AI models. It covers the key differences and uses a car analogy (again!) for clarity, to help you make informed decisions. For terms like Parameters (the Model Weights in "Open Weights") or Training, see the AI Glossary.

Open Source in Traditional Software

Open source has transformed software development - examples include Linux, Python, the Apache web server and many others. Its defining characteristics:

  • Full access and transparency. The complete source code acts like a recipe. Users can view, modify, and share changes.
  • Community collaboration. Developers contribute fixes and improvements.
  • Reproducibility. The code and instructions allow rebuilding from scratch.
  • Licenses. Options like MIT or GPL permit use, modification, and distribution - including commercial use.

This approach lowers costs, speeds innovation and allows full transparency. AI models differ in key ways.

Open Source Applied to AI

AI models involve more than code. They result from processes like architecture design, data curation, and compute-hungry, large-scale training. See also the post on how LLMs work.

Components of an AI model - all of which you would expect to be open for truly open source (a small classification sketch in Python follows the list):

  • Architecture. The structure, such as transformer-based designs.
  • Training code. Scripts for the training process.
  • Training data. Large datasets used in training.
  • Model weights. Learned parameters after training.
  • Inference code. Tools to run the model.
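
As an illustrative sketch only - not an official definition; the Open Source Initiative maintains the formal one - the components above can be read as a simple checklist. The snippet below labels a hypothetical release based on which parts are shared; the classification rules are a simplification for this post.

from dataclasses import dataclass

# Illustrative checklist for the five components listed above.
# The rules are a simplification for this post, not the formal
# definition maintained by the Open Source Initiative.
@dataclass
class ModelRelease:
    architecture: bool
    training_code: bool
    training_data: bool
    model_weights: bool
    inference_code: bool

    def openness(self) -> str:
        if all([self.architecture, self.training_code, self.training_data,
                self.model_weights, self.inference_code]):
            return "open source"   # everything needed to rebuild from scratch
        if self.architecture and self.model_weights and self.inference_code:
            return "open weights"  # runnable and fine-tunable, but not reproducible
        return "closed source"     # only reachable as a hosted service or API

# Example: weights, architecture and inference code are public,
# training data and training code stay private.
release = ModelRelease(architecture=True, training_code=False, training_data=False,
                       model_weights=True, inference_code=True)
print(release.openness())  # prints "open weights"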

Open Weights vs Open Source

Many AI models labeled as open source are actually only open weights. These provide weights, architecture, and inference code, while keeping training data and training processes hidden. Users can run, fine-tune, and deploy the model with this setup - but they cannot recreate it from scratch.
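
To make this concrete, here is a minimal sketch of what an open weights release allows in practice: loading a published checkpoint with the Hugging Face transformers library and generating text locally. The repository id is only a placeholder example - any open weights checkpoint you are licensed to use works the same way - and nothing in this snippet requires the original training data or training pipeline.

# Minimal sketch: local inference with an open weights checkpoint.
# The repo id below is a placeholder example; substitute any open weights
# model you are licensed to download. This gives you inference (and, with
# more tooling, fine-tuning) - but not the recipe to retrain from scratch.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "mistralai/Mistral-7B-Instruct-v0.2"  # example open weights model

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

inputs = tokenizer("Explain open weights in one sentence.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))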

  Upside: customization becomes easier, and community contributions grow.

  Downside: without insight into biases or data sources, risks like ethical issues remain unknown.

True open source models share all elements. Full reproducibility and transparency allow audits for safety and expand the scope for innovation.

By mid-2025, all major organisations (except Anthropic) had released Open Weights models.

  To get an overview - and also a ranking - LMArena is a good resource: it lists each model's licence, and Open Weights models typically carry a licence other than "Proprietary".

Car Analogy for Clarity

The car analogy illustrates the differences and highlights levels of control in AI:
From Closed Source to Open Weights to truly Open Source.

Closed Source:

A taxi ride. Users decide direction and route. The engine under the hood stays unknown.

Open Weights:

A purchased car. Users know engine details and specs. Build processes and component sources remain secret.

Open Source:

A self-built kit car, such as a Lotus 7 replica. All parts and specs arrive. Users can improve, change, and build from scratch.

Examples from 2025

The AI landscape is highly dynamic and changes constantly. The examples below give an indication, but in reality there are many thousands of additional models.

  Closed Source:

  • OpenAI GPT-5
  • Anthropic Claude Opus 4.1
  • Google Gemini 2.5 Pro
  • xAI Grok 3

  Open Weights:

  • Meta Llama family
  • Google Gemma family
  • xAI Grok 2.5
  • DeepSeek models like R1 or V3.1
  • selected Mistral AI models

  Open Source:

  • Apertus LLM from Swiss AI
  • EleutherAI models
  • BigScience Bloom
  • TII/UAE Falcon-40B

The Open Source Initiative (OSI) is the primary authority for defining and reviewing open source standards for AI within the open-source community.

Business Implications

These differences guide AI choices and support strategic decisions.

  • Innovation and savings. Open weights allow customization without full development costs.
  • Risk control. Partial openness may hide biases or compliance problems.
  • Ethics and trust. Transparency enables fairness checks for areas like healthcare or government.
  • Market trends. Open models already approach closed-source performance in benchmarks.

In Summary

What is often labeled as Open Source is really only Open Weights and does not share training data and methods. While Open Weights models are powerful and economical, they lack transparency and might surface unwanted biases.