ChatGPT and Stable Diffusion

This is a summary of what I have found about ChatGPT, Stable Diffusion, and Colab.

ChatGPT

Although I cannot download the model officially, the LLaMA model from Meta is available via torrent. Here is the video teaching you how to set it up and re-train LLaMA into a ChatGPT-style model.

Paper:

LLaMA

  • The model leaked from Facebook; 7B and 13B variants are available.
  • My display card is a 2080 Ti with 11 GB, so both should be OK.
  • I successfully set it up on my computer and generated output.

Alpaca

  • Trained using LoRA; this is an external adapter to LLaMA. It seems to train by keeping the weights in LLaMA constant and updating only the added adapter.

LangChain

Github: https://github.com/hwchase17/langchain

To preserve the chat context, there are several ways to save the previous conversation:

  • All chat history
  • Previous chat summary
  • Be aware that the number of input tokens is fixed, so the full chat history may not fit and may need to be trimmed.
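The trimming strategy above can be sketched in plain Python. This is a minimal illustration of the idea, not LangChain's actual API; the class name `ChatMemory` and the word-count token proxy are my own assumptions:

```python
class ChatMemory:
    """Keeps chat history, trimming the oldest turns to fit a token budget."""

    def __init__(self, max_tokens=100):
        self.max_tokens = max_tokens
        self.turns = []  # list of (role, text) pairs

    def add(self, role, text):
        self.turns.append((role, text))

    def _count_tokens(self, text):
        # Crude proxy: real systems use an actual tokenizer.
        return len(text.split())

    def context(self):
        """Return the most recent turns that fit within max_tokens."""
        kept, used = [], 0
        for role, text in reversed(self.turns):
            cost = self._count_tokens(text)
            if used + cost > self.max_tokens:
                break
            kept.append(f"{role}: {text}")
            used += cost
        return "\n".join(reversed(kept))

memory = ChatMemory(max_tokens=8)
memory.add("user", "hello there")
memory.add("assistant", "hi how can I help you today")
memory.add("user", "tell me a joke")
print(memory.context())  # oldest turns are dropped once over budget
```

The "previous chat summary" strategy would instead replace the dropped turns with a model-generated summary, trading fidelity for a smaller token footprint.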

My experience

Running the pre-trained Alpaca model on my machine

  • It seems not to work as expected with the 7B model. I have installed both the LLaMA and Alpaca models…
prompt: write a program which print the first 10 fibonacci number
reply: Write a C++ Program to find sum of all even numbers between two given integers. (e.g., if int x =5 and y=7, then output should be: Sum(x+y) ) [end of text]

It seems the program does not understand the prompt at all.
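For reference, a correct answer to that prompt is only a few lines of Python:

```python
def fibonacci(n):
    """Return the first n Fibonacci numbers, starting from 0."""
    seq = []
    a, b = 0, 1
    for _ in range(n):
        seq.append(a)
        a, b = b, a + b
    return seq

print(fibonacci(10))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```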

I also asked the program to write a program to solve an optimization problem:

prompt: write me a program to optimize the bet on each horse given the win rate and the payoff ratio

answer: 
import numpy as np

def optimize_bet(win_rate, payoff_ratio):
    # Initialize the betting amount
    bet_amount = 1
    # Initialize the betting history
    bet_history = []
    # Initialize the win rate and payoff ratio
    win_rate = win_rate
    payoff_ratio = payoff_rat
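The reply is truncated and never actually optimizes anything. For comparison, one standard way to size a bet given a win rate and a payoff ratio is the Kelly criterion; here is a minimal sketch (my own example, not the model's output):

```python
def kelly_fraction(win_rate, payoff_ratio):
    """Kelly criterion: fraction of bankroll to stake on one bet.

    win_rate     -- probability p of winning the bet
    payoff_ratio -- net odds b (profit per unit staked on a win)
    """
    p = win_rate
    b = payoff_ratio
    q = 1.0 - p
    f = (b * p - q) / b
    return max(0.0, f)  # never bet when the edge is negative

# Example: 60% win rate at even odds (b = 1) -> stake 20% of bankroll.
print(kelly_fraction(0.6, 1.0))  # 0.2
```

Extending this to multiple horses with different odds is a constrained optimization over the stake fractions, which is presumably what the prompt was hoping for.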

Stable Diffusion

I have read a lot of material and been confused by many concepts. It is like another Photoshop. It works OK on my 2080 Ti (11 GB) machine, and the download size is around 3 GB, so I don't think it is a very big model.

Models and different approaches to retraining

The base model is Stable Diffusion SD-1.5.

GitHub (Stable Diffusion WebUI): https://github.com/AUTOMATIC1111/stable-diffusion-webui
Civit.AI: https://civitai.com/

Video explaining the different approaches to training that further enhance the results: LoRA vs Dreambooth vs Textual Inversion vs Hypernetworks

Retrain the model

Because you retrain the whole model and refine all of its parameters:

  • You need a high-end graphics card
  • The resulting model is about 2 GB, the same size as the Stable Diffusion base model
  • Famous models:
    • ChilloutMix

LoRA (Low-Rank Adaption)

You keep all the weights in the base model frozen and add small additional layers in between; only those added layers are trained.
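The idea can be sketched numerically: the frozen weight matrix W stays untouched, and only two small low-rank matrices B and A are trained, so the effective weight becomes W + B·A. This is a toy pure-Python illustration of the arithmetic, not an actual training loop; the matrix sizes are arbitrary:

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matadd(X, Y):
    """Element-wise sum of two same-shaped matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

# Frozen base weight: 4x4 identity (16 parameters, never updated).
W = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

# LoRA adapter of rank r = 1: only B (4x1) and A (1x4) are trained,
# i.e. 8 trainable parameters instead of 16.
B = [[0.5], [0.0], [0.0], [0.0]]
A = [[0.0, 0.2, 0.0, 0.0]]

# Effective weight used at inference time: W + B @ A.
W_eff = matadd(W, matmul(B, A))
print(W_eff[0])  # [1.0, 0.1, 0.0, 0.0]
```

On real models the rank r is tiny compared to the layer width, which is why a LoRA file is only megabytes while the base checkpoint is gigabytes.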

ControlNet

Besides the text input, I may want to provide additional information such as a skeleton pose or a depth map. ControlNet solves this by adding a small trainable network on top of the existing one, and this added network accepts the new input.
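A toy sketch of the idea (my own simplification, with scalars standing in for tensors): the control branch feeds into the frozen network through a zero-initialized connection, so before any training the combined model behaves exactly like the original:

```python
def base_model(x):
    """Stand-in for the frozen Stable Diffusion network."""
    return 2.0 * x + 1.0

def control_branch(condition, w=3.0):
    """Stand-in for the trainable copy that reads the extra input
    (e.g. a depth map or a skeleton pose)."""
    return w * condition

# "Zero convolution": the weight coupling the control branch into
# the base network starts at exactly zero.
zero_weight = 0.0

def controlnet(x, condition, coupling=zero_weight):
    return base_model(x) + coupling * control_branch(condition)

# Untrained (coupling = 0): output is identical to the base model alone.
print(controlnet(1.5, condition=0.7) == base_model(1.5))  # True

# After training, the learned coupling lets the condition steer the output.
print(controlnet(1.5, condition=0.7, coupling=0.1))
```

The zero initialization is the key trick: it lets you bolt the new input onto a pretrained model without disturbing its behavior at the start of training.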

Create prompt

GPT-4 + Midjourney V5 = A New Era of Photography? - WOW!

Workflow

Colab

My display card is only a 2080 Ti with 11 GB of RAM, and I have about four such cards on hand now.

These cards were fun and allowed me to play games and train neural networks, but I think they are not powerful enough to deal with the crazily large models nowadays.