Generative AI in Action


Generative AI can transform your business by streamlining the process of creating text, images, and code. This book will show you how to get in on the action! Generative AI has created new opportunities for organizations of all sizes. You can easily use tools like ChatGPT, Bard, and Stable Diffusion to generate text and images for product catalogs, marketing campaigns, technical reporting, and other common tasks. Coding assistants like Copilot are accelerating productivity in software teams. In this insightful book, author Amit Bahree shares his experience leading generative AI projects at Microsoft for nearly a decade, starting well before the current GPT revolution.

Inside Generative AI in Action you will find:

• A practical overview of generative AI applications
• Architectural patterns, integration guidance, and best practices for generative AI
• The latest techniques like RAG, prompt engineering, and multimodality
• The challenges and risks of generative AI like hallucinations and jailbreaks
• How to integrate generative AI into your business and IT strategy

Generative AI in Action stays away from hype and speculation, delivering experience-based advice on how to incorporate AI into your products and processes. You’ll appreciate the relevant use cases that show you how to get started right away, as well as the application architectures for deploying GenAI in production at enterprise scale.

About the book
Generative AI in Action shows you exactly how to add generative AI tools for text, images, code, and more into your organization’s strategies and projects. The book begins with the fundamentals of generative AI models and architectures, and introduces practical use cases for creating efficient processes for marketing, software development, business report generation, and other common tasks. You’ll quickly master best practices for prompt engineering, model fine-tuning, and evaluation, and explore the emerging architecture patterns that support generative AI in your enterprise workflow. Along the way, you’ll learn important facts about AI safety and ethics, and look ahead to new trends such as explainable AI, transfer learning, and reinforcement learning. With a frank discussion of risks like hallucinations and jailbreaks, Generative AI in Action gives you the insight you need to incorporate these powerful technologies with confidence.

About the reader
For enterprise architects and senior developers interested in upgrading their architectures with generative AI.

About the author
Amit Bahree is a Principal Group TPM at Microsoft, where he is part of the engineering team building the next generation of AI products and services for millions of customers on the Azure AI platform. He is also responsible for custom engineering across the platform with key customers, solving complex enterprise scenarios using all forms of AI, including generative AI.

Author(s): Amit Bahree
Edition: 1
Publisher: Manning
Year: 2024

Language: English
Commentary: Publisher's PDF
Pages: 464
City: Shelter Island, NY
Tags: Ethics; Text Generation; Benchmarking; Application Development; Software Architecture; Scaling; Code Generation; Image Generation; Generative AI; Large Language Models; Prompt Engineering; Retrieval Augmented Generation

Generative AI in Action
brief contents
contents
foreword
preface
acknowledgments
about this book
Who should read this book
How this book is organized: A road map
About the code
liveBook discussion forum
about the author
about the cover illustration
Part 1 Foundations of generative AI
1 Introduction to generative AI
1.1 What is this book about?
1.2 What is generative AI?
1.3 What can we generate?
1.3.1 Entities extraction
1.3.2 Generating text
1.3.3 Generating images
1.3.4 Generating code
1.3.5 Ability to solve logic problems
1.3.6 Generating music
1.3.7 Generating videos
1.4 Enterprise use cases
1.5 When not to use generative AI
1.6 How is generative AI different from traditional AI?
1.7 What approach should enterprises take?
1.8 Architecture considerations
1.9 So your enterprise wants to use generative AI. Now what?
Summary
2 Introduction to large language models
2.1 Overview of foundational models
2.2 Overview of LLMs
2.3 Transformer architecture
2.4 Training cutoff
2.5 Types of LLMs
2.6 Small language models
2.7 Open source vs. commercial LLMs
2.7.1 Commercial LLMs
2.7.2 Open source LLMs
2.8 Key concepts of LLMs
2.8.1 Prompts
2.8.2 Tokens
2.8.3 Counting tokens
2.8.4 Embeddings
2.8.5 Model configuration
2.8.6 Context window
2.8.7 Prompt engineering
2.8.8 Model adaptation
2.8.9 Emergent behavior
Summary
3 Working through an API: Generating text
3.1 Model categories
3.1.1 Dependencies
3.1.2 Listing models
3.2 Completion API
3.2.1 Expanding completions
3.2.2 Azure content safety filter
3.2.3 Multiple completions
3.2.4 Controlling randomness
3.2.5 Controlling randomness using top_p
3.3 Advanced completion API options
3.3.1 Streaming completions
3.3.2 Influencing token probabilities: logit_bias
3.3.3 Presence and frequency penalties
3.3.4 Log probabilities
3.4 Chat completion API
3.4.1 System role
3.4.2 Finish reason
3.4.3 Chat completion API for nonchat scenarios
3.4.4 Managing conversation
3.4.5 Best practices for managing tokens
3.4.6 Additional LLM providers
Summary
4 From pixels to pictures: Generating images
4.1 Vision models
4.1.1 Variational autoencoders
4.1.2 Generative adversarial networks
4.1.3 Vision transformer models
4.1.4 Diffusion models
4.1.5 Multimodal models
4.2 Image generation with Stable Diffusion
4.2.1 Dependencies
4.2.2 Generating an image
4.3 Image generation with other providers
4.3.1 OpenAI DALLE 3
4.3.2 Bing image creator
4.3.3 Adobe Firefly
4.4 Editing and enhancing images using Stable Diffusion
4.4.1 Generating using image-to-image API
4.4.2 Using the masking API
4.4.3 Resize using the upscale API
4.4.4 Image generation tips
Summary
5 What else can AI generate?
5.1 Code generation
5.1.1 Can I trust the code?
5.1.2 GitHub Copilot
5.1.3 How Copilot works
5.2 Additional code-related tasks
5.2.1 Code explanation
5.2.2 Generate tests
5.2.3 Code referencing
5.2.4 Code refactoring
5.3 Other code generation tools
5.3.1 Amazon CodeWhisperer
5.3.2 Code Llama
5.3.3 Tabnine
5.3.4 Check yourself
5.3.5 Best practices for code generation
5.4 Video generation
5.5 Audio and music generation
Summary
Part 2 Advanced techniques and applications
6 Guide to prompt engineering
6.1 What is prompt engineering?
6.1.1 Why do we need prompt engineering?
6.2 The basics of prompt engineering
6.3 In-context learning and prompting
6.4 Prompt engineering techniques
6.4.1 System message
6.4.2 Zero-shot, few-shot, and many-shot learning
6.4.3 Use clear syntax
6.4.4 Making in-context learning work
6.4.5 Reasoning: Chain of Thought
6.4.6 Self-consistency sampling
6.5 Image prompting
6.6 Prompt injection
6.7 Prompt engineering challenges
6.8 Best practices
Summary
7 Retrieval-augmented generation: The secret weapon
7.1 What is RAG?
7.2 RAG benefits
7.3 RAG architecture
7.4 Retriever system
7.5 Understanding vector databases
7.5.1 What is a vector index?
7.5.2 Vector search
7.6 RAG challenges
7.7 Overcoming challenges for chunking
7.7.1 Chunking strategies
7.7.2 Factors affecting chunking strategies
7.7.3 Handling unknown complexities
7.7.4 Chunking sentences
7.7.5 Chunking using natural language processing
7.8 Chunking PDFs
Summary
8 Chatting with your data
8.1 Advantages to enterprises using their data
8.1.1 What about large context windows?
8.1.2 Building a chat application using our data
8.2 Using a vector database
8.3 Planning for retrieving the information
8.4 Retrieving the data
8.4.1 Retriever pipeline best practices
8.5 Search using Redis
8.6 An end-to-end chat implementation powered by RAG
8.7 Using Azure OpenAI on your data
8.8 Benefits of bringing your data using RAG
Summary
9 Tailoring models with model adaptation and fine-tuning
9.1 What is model adaptation?
9.1.1 Basics of model adaptation
9.1.2 Advantages and challenges for enterprises
9.2 When to fine-tune an LLM
9.2.1 Key stages of fine-tuning an LLM
9.3 Fine-tuning OpenAI models
9.3.1 Preparing a dataset for fine-tuning
9.3.2 LLM evaluation
9.3.3 Fine-tuning
9.3.4 Fine-tuning training metrics
9.3.5 Fine-tuning using Azure OpenAI
9.4 Deployment of a fine-tuned model
9.4.1 Inference: Fine-tuned model
9.5 Training an LLM
9.5.1 Pretraining
9.5.2 Supervised fine-tuning
9.5.3 Reward modeling
9.5.4 Reinforcement learning
9.5.5 Direct policy optimization
9.6 Model adaptation techniques
9.6.1 Low-rank adaptation
9.7 RLHF overview
9.7.1 Challenges with RLHF
9.7.2 Scaling an RLHF implementation
Summary
Part 3 Deployment and ethical considerations
10 Application architecture for generative AI apps
10.1 Generative AI: Application architecture
10.1.1 Software 2.0
10.1.2 The era of copilots
10.2 Generative AI: Application stack
10.2.1 Integrating the GenAI stack
10.2.2 GenAI architecture principles
10.2.3 GenAI application architecture: A detailed view
10.3 Orchestration layer
10.3.1 Benefits of an orchestration framework
10.3.2 Orchestration frameworks
10.3.3 Managing operations
10.3.4 Prompt management
10.4 Grounding layer
10.4.1 Data integration and preprocessing
10.4.2 Embeddings and vector management
10.5 Model layer
10.5.1 Model ensemble architecture
10.5.2 Model serving
10.6 Response filtering
Summary
11 Scaling up: Best practices for production deployment
11.1 Challenges for production deployments
11.2 Deployment options
11.3 Managed LLMs via API
11.4 Best practices for production deployment
11.4.1 Metrics for LLM inference
11.4.2 Latency
11.4.3 Scalability
11.4.4 PAYGO
11.4.5 Quotas and rate limits
11.4.6 Managing quota
11.4.7 Observability
11.4.8 Security and compliance considerations
11.5 GenAI operational considerations
11.5.1 Reliability and performance considerations
11.5.2 Managed identities
11.5.3 Caching
11.6 LLMOps and MLOps
11.7 Checklist for production deployment
Summary
12 Evaluations and benchmarks
12.1 LLM evaluations
12.2 Traditional evaluation metrics
12.2.1 BLEU
12.2.2 ROUGE
12.2.3 BERTScore
12.2.4 An example of traditional metric evaluation
12.3 LLM task-specific benchmarks
12.3.1 G-Eval: A measuring approach for NLG evaluation
12.3.2 An example of LLM-based evaluation metrics
12.3.3 HELM
12.3.4 HEIM
12.3.5 HellaSWAG
12.3.6 Massive Multitask Language Understanding
12.3.7 Using Azure AI Studio for evaluations
12.3.8 DeepEval: An LLM evaluation framework
12.4 New evaluation benchmarks
12.4.1 SWE-bench
12.4.2 MMMU
12.4.3 MoCa
12.4.4 HaluEval
12.5 Human evaluation
Summary
13 Guide to ethical GenAI: Principles, practices, and pitfalls
13.1 GenAI risks
13.1.1 LLM limitations
13.1.2 Hallucination
13.2 Understanding GenAI attacks
13.2.1 Prompt injection
13.2.2 Insecure output handling example
13.2.3 Model denial of service
13.2.4 Data poisoning and backdoors
13.2.5 Sensitive information disclosure
13.2.6 Overreliance
13.2.7 Model theft
13.3 A responsible AI lifecycle
13.3.1 Identifying harms
13.3.2 Measure and evaluate harms
13.3.3 Mitigate harms
13.3.4 Transparency and explainability
13.4 Red-teaming
13.4.1 Red-teaming example
13.4.2 Red-teaming tools and techniques
13.5 Content safety
13.5.1 Azure Content Safety
13.5.2 Google Perspective API
13.5.3 Evaluating content filters
Summary
appendix A The book’s GitHub repository
The book’s GitHub repository
appendix B Responsible AI tools
B.1 Model card
B.2 Transparency notes
B.3 HAX Toolkit
B.4 Responsible AI Toolbox
B.5 Learning Interpretability Tool (LIT)
B.6 AI Fairness 360
B.7 C2PA
References
Chapter 1
Chapter 2
Chapter 4
Chapter 6
Chapter 7
Chapter 9
Chapter 10
Chapter 11
Chapter 12
Chapter 13
index
A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
Y
Z
Generative AI in Action - back