Dolly v2 (7B)

Dolly v2 (7B): Open-source, instruction-following LLM for commercial use.

API for Dolly v2 (7B)

Explore Dolly v2 (7B) API, an open-source large language model with instruction-following capabilities, suitable for various NLP tasks and commercial applications.

Model Overview Card for Dolly v2 (7B)

Basic Information
  • Model Name: Dolly v2 (7B)
  • Developer/Creator: Databricks
  • Release Date: April 12, 2023
  • Version: 2.0
  • Model Type: Instruction-following Large Language Model

Description

Overview: Dolly v2 (7B) is an instruction-following large language model trained on the Databricks machine learning platform and licensed for commercial use. It is based on the Pythia-6.9b model and fine-tuned on a dataset of approximately 15,000 instruction/response pairs.

Key Features:

  • Instruction-following capabilities
  • Open-source and commercially licensed
  • Fine-tuned on high-quality training data
  • Relatively small model size (6.9 billion parameters)

Intended Use: Dolly v2 (7B) is designed for various natural language processing tasks, including:

  • Brainstorming
  • Classification
  • Closed question answering
  • Text generation
  • Information extraction
  • Open question answering
  • Summarization

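Because Dolly v2 is instruction-tuned, all of the task types above are driven by the same interface: the wording of the instruction itself. The prompts below are made-up illustrations of that idea, not examples from the training data.

```python
# Illustrative prompts showing how one instruction-following model covers
# several of the task types listed above purely through the instruction
# wording. The prompt text is invented for demonstration.
example_prompts = {
    "classification": (
        "Label the sentiment of this review as positive or negative: "
        "'The battery life is fantastic.'"
    ),
    "closed_qa": (
        "Using only the passage below, answer: when was Dolly v2 released?\n\n"
        "Passage: Dolly v2 was released on April 12, 2023."
    ),
    "summarization": (
        "Summarize in one sentence: Dolly v2 (7B) is an open-source "
        "instruction-following model from Databricks."
    ),
    "information_extraction": (
        "List every company name mentioned: "
        "'Databricks built Dolly on top of EleutherAI's Pythia.'"
    ),
}

# Each prompt would be sent to the model as-is; only the instruction changes.
for task, prompt in example_prompts.items():
    print(f"{task}: {prompt[:40]}...")
```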
Language Support: The model primarily supports English language tasks.

Technical Details

Architecture: Dolly v2 (7B) is based on the Pythia-6.9b architecture, a decoder-only transformer model from EleutherAI.

Training Data:

  • Data Source and Size: The model was fine-tuned on the databricks-dolly-15k dataset, containing approximately 15,000 instruction/response pairs generated by Databricks employees.
  • Knowledge Cutoff: The model's factual knowledge comes from the Pythia-6.9b pre-training on The Pile; the instruction fine-tuning shapes its response behavior rather than adding new knowledge.
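
Each record in the databricks-dolly-15k dataset is a JSON object with an instruction, an optional context passage, the target response, and a task category. The snippet below sketches that record structure; the field names follow the published dataset schema, while the example text itself is invented for illustration.

```python
import json

# A sample record in the databricks-dolly-15k format. The field names
# (instruction / context / response / category) follow the dataset schema;
# the text content here is illustrative, not an actual dataset entry.
sample_line = json.dumps({
    "instruction": "Summarize the paragraph in one sentence.",
    "context": (
        "Dolly v2 is an instruction-following model released by Databricks "
        "under a commercial-friendly license."
    ),
    "response": (
        "Dolly v2 is a commercially licensed instruction-following model "
        "from Databricks."
    ),
    "category": "summarization",
})

# The dataset ships as JSON lines, so each line parses independently.
record = json.loads(sample_line)
print(record["category"])      # summarization
print(sorted(record.keys()))   # ['category', 'context', 'instruction', 'response']
```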

Performance Metrics: While Dolly v2 (7B) is not state-of-the-art, it demonstrates surprisingly high-quality instruction-following behavior. Some benchmark results include:

  • ARC (25-shot): 0.392
  • HellaSwag (10-shot): 0.633838
  • MMLU (5-shot): 0.406997
  • TruthfulQA (0-shot): 0.444444

Comparison to Other Models: Dolly v2 (7B) underperforms compared to larger models like GPT-3 (175B parameters) but offers a good balance between performance and resource requirements.

Usage

Code Sample:
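
A minimal sketch of calling the model through the Hugging Face `transformers` library, following the usage documented on the model's Hugging Face card (`databricks/dolly-v2-7b`). The prompt template reflects the instruction format the model was fine-tuned with; treat the specifics as assumptions rather than an official client.

```python
# Sketch: querying Dolly v2 (7B) via the Hugging Face `transformers`
# pipeline. Loading the checkpoint downloads roughly 14 GB of weights,
# so the heavy call is kept inside a function.

def load_dolly():
    # Heavy imports live inside the function so the prompt helper below
    # remains usable without torch/transformers installed.
    import torch
    from transformers import pipeline

    # trust_remote_code enables the custom instruction-following pipeline
    # bundled with the model repository.
    return pipeline(
        model="databricks/dolly-v2-7b",
        torch_dtype=torch.bfloat16,
        trust_remote_code=True,
        device_map="auto",
    )

def build_prompt(instruction: str) -> str:
    # The instruction template used during fine-tuning; only needed when
    # driving the raw model without the bundled pipeline.
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n### Response:\n"
    )

# Usage (requires a GPU and the full weight download):
# generate = load_dolly()
# print(generate("Explain the difference between nuclear fission and fusion."))
```

The bundled pipeline handles prompt construction internally; `build_prompt` is only relevant if you load the tokenizer and model directly instead.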

Ethical Considerations

Dolly v2 (7B) was developed with ethical considerations in mind. The training data does not contain obscenity, intellectual property, or personally identifying information about non-public figures. However, it may reflect biases present in the data generated by Databricks employees.

Licensing

Dolly v2 (7B) is released under the Apache 2.0 license, which allows for both research and commercial use.

Limitations

  • Not designed to perform competitively with more modern model architectures
  • Struggles with syntactically complex prompts
  • Limited capabilities in programming problems and mathematical operations
  • May contain factual errors and exhibit hallucination
  • Reflects potential biases of Databricks employees in the training data
