ML 120: Bloom LLM (30 pts extra)

What You Need

A computer with a Web browser and a Google account, to use Google Colab.

Purpose

To practice running a Large Language Model, using BLOOM. This project is based on this tutorial:

Hello Large Language Models

Using Google Colab

In a browser, go to
https://colab.research.google.com/
If you see a blue "Sign In" button at the top right, click it and log into a Google account.

From the menu, click File, "New notebook".

Installing Libraries

Execute these commands to install and import the required libraries:
!pip install transformers
import torch
import transformers
from transformers import BloomForCausalLM
from transformers import BloomTokenizerFast
As shown below, the software installs.
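As an optional check, not part of the tutorial, you can print the installed versions to confirm the imports worked. The exact version numbers you see will vary:

# Optional: confirm the libraries imported correctly (version numbers will vary)
print("torch:", torch.__version__)
print("transformers:", transformers.__version__)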

Downloading BLOOM

Execute these commands to create an LLM using the smallest version of BLOOM, which has about 560 million parameters:
model = BloomForCausalLM.from_pretrained("bigscience/bloom-560m")
tokenizer = BloomTokenizerFast.from_pretrained("bigscience/bloom-560m")
As shown below, the model downloads.
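As a quick sanity check, an optional step not in the tutorial, you can count the model's parameters to confirm the "560m" in its name:

# Optional: count the model's parameters; the total should be roughly 560 million
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params:,} parameters")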

Preparing a Prompt

Execute these commands to start a sentence for the model to complete:
prompt = "Why is the sky blue?" 
result_length = len(prompt.split()) + 50  # Target output length; note that max_length counts tokens, not words
inputs = tokenizer(prompt, return_tensors="pt") 
As shown below, the code runs without producing any output.
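If you're curious what the tokenizer produced, this optional sketch inspects inputs. It holds a tensor of integer token IDs, which is why max_length above counts tokens rather than words:

# Optional: examine the token IDs and the tokens they represent
print(inputs["input_ids"])
print(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist()))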

Greedy Search

Execute this command to perform a greedy search, which always picks the single most likely next token:
print(tokenizer.decode(model.generate(inputs["input_ids"], 
                       max_length=result_length
                      )[0]))
As shown below, the model gets stuck, repeating the same sentences over and over.
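Because greedy search always takes the most likely token, once it falls into a loop it can never escape. This optional sketch, not part of the tutorial, shows the probabilities the model assigns to the next token after the prompt; greedy search would always pick the first one listed:

# Optional: show the five most likely next tokens for the prompt
with torch.no_grad():
    logits = model(inputs["input_ids"]).logits   # shape: (1, sequence_length, vocab_size)
probs = torch.softmax(logits[0, -1], dim=-1)     # probability of each possible next token
top = torch.topk(probs, 5)
for p, i in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(i))!r}  probability: {p.item():.3f}")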

Sampling Top-k + Top-p

Execute this command to perform top-k + top-p sampling, which picks each next token at random from at most the 50 most likely candidates (top_k), further limited to a cumulative probability of 0.9 (top_p):
print(tokenizer.decode(model.generate(inputs["input_ids"],
                       max_length=result_length, 
                       do_sample=True, 
                       top_k=50, 
                       top_p=0.9
                      )[0]))
As shown below, the model creates a fictional story without repeating sentences.
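Because this method chooses tokens at random, the output changes every run. If you want repeatable results, an optional step, you can fix the random seed with transformers' set_seed function before re-running the command above:

from transformers import set_seed

set_seed(42)   # any fixed number makes the sampled output repeatable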

Beam Search

Execute this command to perform a beam search, which tracks the 2 most likely sequences at each step (num_beams), blocks any repeated 2-token sequence (no_repeat_ngram_size), and ends the search as soon as the beams finish (early_stopping):
print(tokenizer.decode(model.generate(inputs["input_ids"],
                       max_length=result_length, 
                       num_beams=2, 
                       no_repeat_ngram_size=2,
                       early_stopping=True
                      )[0]))
As shown below, the model adds a series of questions after the prompt.

Flag ML 120.1: Colors (10 pts)

Use this prompt: "What are the three primary colors?"

The flag is covered by a green rectangle in the image below.

Flag ML 120.2: A Better Prompt (5 pts)

Use this prompt: "Make a bulleted list of the three primary colors below."

The flag is covered by a green rectangle in the image below. You may need to scroll right to see it.

Flag ML 120.3: An Even Better Prompt (5 pts)

Use this prompt: "Find a reliable references explaining primary colors. Using it, determine the three primary colors. List them:"

With this good prompt, all three search types find the correct answer, as shown below.

The flag is covered by a green rectangle in the image below.

Flag ML 120.4: A Bigger Model (10 pts)

Replace the "bloom-560m" model with the larger "bloom-1b1" model.
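Concretely, that means changing the model name in both from_pretrained() calls:

# Download the larger 1.1-billion-parameter version of BLOOM
model = BloomForCausalLM.from_pretrained("bigscience/bloom-1b1")
tokenizer = BloomTokenizerFast.from_pretrained("bigscience/bloom-1b1")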

Use this prompt: "Find a reliable references explaining primary colors. Using it, determine the three primary colors. List them, and explain their importance:"

The flag is covered by a green rectangle in the image below.

Posted 5-10-23