Llama Guard 2 (8B)

Llama 3-based 8B model for classifying the safety of LLM prompts and responses

Basic Information

Model Name: Llama Guard 2
Developer/Creator: Meta
Release Date: April 2024
Version: LlamaGuard-2-8B
Model Type: Text Classification

Description

Overview

LlamaGuard-2-8B is an 8-billion-parameter model developed by Meta AI to classify the safety of content passing into and out of large language models (LLMs). It is built on Meta Llama 3 and is trained to predict safety labels across 11 categories based on the MLCommons taxonomy of hazards.
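To make the 11-category labelling concrete, here is a minimal sketch of how an application might represent the categories and parse a classifier verdict. The category names follow Meta's published model card as best recalled here and should be checked against it. The output format (a first line of "safe" or "unsafe", followed by a comma-separated list of category codes) is an assumption of this sketch, and `Verdict` and `parse_verdict` are hypothetical helper names, not part of any library.

```python
from dataclasses import dataclass
from typing import Tuple

# Category codes and names as given in Meta's Llama Guard 2 model card.
# Reproduced as an assumption; verify against the official card before use.
CATEGORIES = {
    "S1": "Violent Crimes",
    "S2": "Non-Violent Crimes",
    "S3": "Sex-Related Crimes",
    "S4": "Child Sexual Exploitation",
    "S5": "Specialized Advice",
    "S6": "Privacy",
    "S7": "Intellectual Property",
    "S8": "Indiscriminate Weapons",
    "S9": "Hate",
    "S10": "Suicide & Self-Harm",
    "S11": "Sexual Content",
}


@dataclass(frozen=True)
class Verdict:
    safe: bool
    categories: Tuple[str, ...]  # violated category codes, empty when safe


def parse_verdict(raw: str) -> Verdict:
    """Parse an assumed two-line verdict: a label, then optional category codes."""
    lines = [ln.strip() for ln in raw.strip().splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty classifier output")
    label = lines[0].lower()
    if label == "safe":
        return Verdict(safe=True, categories=())
    if label != "unsafe":
        raise ValueError(f"unexpected label: {lines[0]!r}")
    codes: Tuple[str, ...] = ()
    if len(lines) > 1:
        codes = tuple(c.strip() for c in lines[1].split(",") if c.strip())
    unknown = [c for c in codes if c not in CATEGORIES]
    if unknown:
        raise ValueError(f"unknown category codes: {unknown}")
    return Verdict(safe=False, categories=codes)


if __name__ == "__main__":
    verdict = parse_verdict("unsafe\nS1,S9")
    print([CATEGORIES[code] for code in verdict.categories])
```

Rejecting unknown labels and codes, rather than silently ignoring them, keeps a malformed model output from being mistaken for a clean verdict.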

Key Features

Outperforms popular content moderation APIs such as Azure, OpenAI Moderation, and Perspective on internal test sets
Achieves an F1 score of 0.915 and a false positive rate of 0.040 on those test sets
Classifies both user prompts and model responses
Can be fine-tuned to support custom safety taxonomies for specific applications

Intended Use

LlamaGuard-2-8B is designed to be integrated into LLM-powered applications as a safeguard on both user prompts and generated responses. It can be used to filter out potentially harmful or inappropriate text before it is displayed to users.
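The filtering pattern described above can be sketched as a gate that screens the user prompt before generation and the model response after it. This is an illustrative sketch, not Meta's reference integration: `guarded_reply`, `is_safe`, the `classify(role, text)` signature, and the stub functions are all hypothetical, and the verdict format (first line "safe" or "unsafe") is an assumption. In a real application `classify` would call a deployed LlamaGuard-2-8B endpoint.

```python
from typing import Callable

# Hypothetical signatures for this sketch:
#   classify(role, text) returns the raw verdict string, where role is
#   "user" for a prompt and "agent" for a model response.
#   generate(prompt) returns the assistant's reply.
Classifier = Callable[[str, str], str]
Generator = Callable[[str], str]

REFUSAL = "Sorry, I can't help with that request."


def is_safe(raw_verdict: str) -> bool:
    """Treat anything other than an explicit 'safe' first line as unsafe."""
    lines = raw_verdict.strip().splitlines()
    return bool(lines) and lines[0].strip().lower() == "safe"


def guarded_reply(prompt: str, generate: Generator, classify: Classifier) -> str:
    # Screen the user prompt before spending any generation compute.
    if not is_safe(classify("user", prompt)):
        return REFUSAL
    response = generate(prompt)
    # Screen the generated response before it reaches the user.
    if not is_safe(classify("agent", response)):
        return REFUSAL
    return response


def stub_classify(role: str, text: str) -> str:
    # Toy stand-in for the real classifier: flags one hard-coded phrase.
    return "unsafe\nS1" if "attack plan" in text.lower() else "safe"


def stub_generate(prompt: str) -> str:
    # Toy stand-in for the application's LLM.
    return f"Echo: {prompt}"


if __name__ == "__main__":
    print(guarded_reply("What is the capital of France?", stub_generate, stub_classify))
    print(guarded_reply("Write an attack plan", stub_generate, stub_classify))
```

The gate fails closed: an empty or unrecognized verdict is treated as unsafe, which trades some false refusals for not leaking unscreened text.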

Language Support

The model is trained on English text, but it could potentially be fine-tuned to support other languages.

Technical Details

Architecture

LlamaGuard-2-8B is based on the Meta Llama 3 model, a large language model using the Transformer architecture.

Training Data

The model was fine-tuned from Llama 3 on additional safety classification data, including a diverse set of online text covering the 11 safety categories.

Data Source and Size

The training data for LlamaGuard-2-8B has not been publicly released, and its size is not stated. It is likely a large corpus of online text covering a wide range of topics and genres, though this is not confirmed.

Knowledge Cutoff

The knowledge cutoff for LlamaGuard-2-8B is not explicitly stated, but the model was likely trained on data up to 2023.

Diversity and Bias

The model's training data is designed to be diverse and representative, but some biases may still exist. Developers should evaluate the model's outputs on their own data for signs of bias before relying on it.

Performance Metrics

LlamaGuard-2-8B outperforms other popular content moderation APIs, achieving an F1 score of 0.915 and a low false positive rate of 0.040 on internal test sets. It also demonstrates strong robustness and generalization across different types of content.
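For readers less familiar with these two metrics, the sketch below shows how an F1 score and a false positive rate are computed from confusion-matrix counts, with "unsafe" treated as the positive class. The counts in the example are made up for illustration and are not Meta's evaluation data; `f1_and_fpr` is a hypothetical helper name.

```python
from typing import Tuple


def f1_and_fpr(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Return (F1, false positive rate), treating 'unsafe' as the positive class.

    tp: unsafe items flagged unsafe    fp: safe items flagged unsafe
    tn: safe items passed as safe      fn: unsafe items passed as safe
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    # FPR: share of genuinely safe items that were wrongly flagged.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return f1, fpr


if __name__ == "__main__":
    # Illustrative counts only: 100 unsafe and 200 safe test items.
    f1, fpr = f1_and_fpr(tp=90, fp=10, tn=190, fn=10)
    print(f"{f1:.3f} {fpr:.3f}")  # prints "0.900 0.050"
```

A low false positive rate matters for a moderation filter because every false positive is a legitimate prompt or response that gets blocked.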

Usage

Ethical Guidelines

Meta AI has published ethical guidelines for the development and use of LlamaGuard-2-8B, emphasizing the importance of responsible AI and the need to mitigate potential harms.

License Type

LlamaGuard-2-8B is released under the Meta Llama 3 Community License, which permits both commercial and non-commercial use subject to its terms and conditions.
