Comprehensive Analysis of Transparency and Accessibility of ChatGPT, DeepSeek, and Other SoTA Large Language Models
Summary
A systematic analysis of over 100 state-of-the-art LLMs, examining their transparency and accessibility through the lenses of open-source and open-weight definitions.
Key quotes
Our findings reveal that while some models are labeled as open-source, this does not necessarily mean they are fully open-sourced.
The paper evaluates how well various LLMs adhere to transparency standards, specifically addressing the issue of ‘open-washing,’ and highlights gaps in the reporting of training data, code, and carbon emissions.