About the Recipe
Topics covered: What are LLMs, Prompt Design & Engineering, Tuning, Accessible LLM Tools.

Ingredients
Preparation
What are Large Language Models (LLMs)
Large Language Models (LLMs) are a subset of Deep Learning. They refer to large, general-purpose language models that can be pre-trained and then fine-tuned for specific purposes. Common modern generative language models include LaMDA, Gemini, GPT, etc.
They are trained to solve common language problems, such as: text classification, question answering, document summarization, text generation.
Large meaning a large training dataset and a large number of parameters.
Language representing the commonality of human language and resource restrictions.
Models meaning pre-trained and then fine-tuned.
LLMs can be used either 1. few-shot (prompted with a minimal number of examples) or 2. zero-shot (prompted with no examples at all). Few-shot and zero-shot use centers on prompt design. This is important to note, since traditional ML development required ML expertise, training examples, time to train the model, and compute hardware.
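As a sketch of the difference, here is how the same sentiment task could be posed zero-shot versus few-shot. The helper, task wording, and labels are illustrative assumptions, not any specific API:

```python
# Illustrative prompt construction for zero-shot vs. few-shot use.
def build_prompt(task, examples, query):
    """Assemble a prompt: task instruction, optional labeled examples, then the query."""
    parts = [task]
    for text, label in examples:  # few-shot: include worked examples
        parts.append(f"Text: {text}\nSentiment: {label}")
    parts.append(f"Text: {query}\nSentiment:")  # leave the answer for the model
    return "\n\n".join(parts)

task = "Classify the sentiment of the text as positive or negative."

# Zero-shot: the model relies on the instruction alone.
zero_shot = build_prompt(task, [], "I loved this film.")

# Few-shot: a handful of examples demonstrate the expected format and labels.
few_shot = build_prompt(
    task,
    [("Great service!", "positive"), ("Terrible food.", "negative")],
    "I loved this film.",
)
```

Either string would then be sent as the prompt; no weights are updated, which is why this approach needs no training data pipeline or compute for training.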
LLMs are primarily built using a Transformer model.
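As a toy sketch (not a production implementation), the Transformer's core building block is scaled dot-product attention, which mixes information across tokens:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Core Transformer operation: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Numerically stable softmax over the keys.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V               # weighted sum of value vectors

rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))  # 3 tokens, d_k = 4
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4): one output vector per token
```

A full Transformer stacks many such attention layers (with multiple heads) alongside feed-forward layers; this sketch shows only the attention step itself.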

Prompt Design & Engineering
Prompt Design is the process of creating a prompt that is tailored to the specific task the system is being asked to perform, while Prompt Engineering is the process of creating a prompt that is designed to improve performance. In general, prompt engineering is only necessary for systems that require a high degree of accuracy.
Generic (or raw) language models: predict the next word (technically, token) based on the language in the training data

Instruction tuned: trained to predict a response to the instructions given in the input

Dialog tuned: trained to have a dialog by predicting the next response.

Chain-of-thought reasoning: models are better at getting the right answer when they first output text that explains the reason for the answer.
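A chain-of-thought prompt typically includes a worked example whose answer spells out the reasoning, nudging the model to reason step by step before answering. A classic illustrative example (the questions here are just sample arithmetic word problems):

```python
# Illustrative chain-of-thought prompt: the first Q/A pair demonstrates
# reasoning in the answer, so the model imitates that style for the new Q.
cot_prompt = """Q: Roger has 5 tennis balls. He buys 2 cans of 3 balls each.
How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 balls is 6 balls. 5 + 6 = 11.
The answer is 11.

Q: The cafeteria had 23 apples. It used 20 and bought 6 more.
How many apples are there?
A:"""
```

Without the worked reasoning in the example answer, models are more likely to jump straight to a (often wrong) final number.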
Task-specific tuning makes LLMs more reliable. Let's look at Model Garden's task-specific models.
e.g. If you need to gauge how your customers are feeling about a product or service, you'd use a sentiment analysis task model.
Tuning
The process of adapting a model to a new domain or set of custom use cases by training the model on new data.
Parameter-efficient tuning methods (PETM): methods for tuning an LLM on your own custom data without duplicating the model. Meaning, the base model is not altered; instead, a small number of add-on layers are tuned, and these can be swapped in and out at inference time.
Prompt tuning: One of the easiest parameter-efficient tuning methods
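The idea behind prompt tuning can be sketched as follows (the shapes and names here are illustrative assumptions, not a real library API): the model's weights and token embeddings stay frozen, and only a small matrix of "soft prompt" vectors, prepended to every input, would be trained:

```python
import numpy as np

d_model, prompt_len = 8, 4
rng = np.random.default_rng(0)
# The soft prompt is the ONLY trainable parameter set in prompt tuning.
soft_prompt = rng.standard_normal((prompt_len, d_model)) * 0.02

def embed_with_soft_prompt(token_embeddings):
    """Prepend the trainable soft prompt to the frozen token embeddings."""
    return np.concatenate([soft_prompt, token_embeddings], axis=0)

tokens = np.ones((5, d_model))  # 5 input tokens (frozen embeddings, dummy values)
inputs = embed_with_soft_prompt(tokens)
print(inputs.shape)  # (9, 8): prompt_len + sequence length
```

Because only `prompt_len * d_model` values are learned per task, different soft prompts can be stored cheaply and swapped in at inference time, exactly the PETM property described above.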
Accessible LLM tools
Vertex AI Studio → quickly explore and customize generative AI models that can be leveraged on Google Cloud. Developers can create and deploy models using a library of pre-trained models, tools for fine-tuning models, tools for deploying models to production, and a community forum for developers to share ideas and collaborate.
Vertex AI → you can build generative AI search and conversations for customers and employees with Vertex AI Agent Builder, with little or no coding experience: chatbots, digital assistants, custom search engines, knowledge bases, training applications, etc.
Gemini → is a multimodal AI model, meaning it's not limited to understanding text alone. It can analyze images, understand the nuances of audio, interpret code, etc. It's adaptable and scalable, making it suitable for diverse applications. (Gemini is available in Model Garden.)