The Developer's Guide to GenAI

Start with why?

GenAI will change everything

Where are we going…

The Landscape

Generate Code

Generate Images

Hands up if you have:

  • Used an AI coding assistant
  • Ghiblified a photo

The Ghibli AI Controversy

Studio Ghibli’s Stance

“An Insult to Life Itself”

— Hayao Miyazaki

Flux just isn’t the same

Don’t say the G-word

Raise your hand if you think your job will change…

  • a lot over the next 2 years.
  • a lot over the next 5 years.
  • a lot over the next 10 years.
  • Not much at all.

The Technology

To be precise: predict probabilities of the next word

Tokens != words
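
To make the gap between tokens and words concrete, here is a minimal sketch of a greedy longest-match subword splitter. The vocabulary is hypothetical and hand-picked for illustration; real tokenizers (BPE and friends) learn their vocabularies from data and will split text differently.

```python
# Hypothetical toy vocabulary, chosen only for this illustration.
VOCAB = {"un", "believ", "able", "token", "ization", " "}

def tokenize(text, vocab=VOCAB):
    """Greedy longest-match split; unknown characters become one-char tokens."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate substring first.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Nothing in the vocabulary matched: emit a single character.
            tokens.append(text[i])
            i += 1
    return tokens

text = "unbelievable tokenization"
words = text.split()
tokens = tokenize(text)
print(len(words), "words ->", len(tokens), "tokens:", tokens)
```

Two words become six tokens here, which is why limits and prices are quoted in tokens rather than words.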

Temperature
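
A small sketch of how temperature reshapes next-token probabilities, assuming the common formulation where raw scores (logits) are divided by the temperature before a softmax. The logits below are made up for illustration.

```python
import math

def next_token_probs(logits, temperature=1.0):
    """Softmax over logits scaled by 1/temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = {tok: score / temperature for tok, score in logits.items()}
    # Subtract the max for numerical stability before exponentiating.
    top = max(scaled.values())
    exps = {tok: math.exp(s - top) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# Made-up scores for three candidate next tokens.
logits = {"cat": 2.0, "dog": 1.0, "car": 0.0}

cold = next_token_probs(logits, temperature=0.5)  # sharper distribution
hot = next_token_probs(logits, temperature=2.0)   # flatter distribution
print(cold)
print(hot)
```

Lower temperature concentrates probability on the top candidate; higher temperature spreads it out, which makes sampled output more varied.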

Context windows

How much stuff you can put in.
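
One common way to live within a context window is to drop the oldest messages first. The sketch below assumes a crude whitespace word count as a stand-in for a real tokenizer; `count_tokens` and `fit_to_window` are illustrative names, not part of any library.

```python
def count_tokens(text):
    # Crude stand-in: a real tokenizer would give a different count.
    return len(text.split())

def fit_to_window(messages, max_tokens):
    """Keep the most recent messages whose estimated total fits the budget."""
    kept = []
    used = 0
    # Walk backwards so the newest messages are kept first.
    for msg in reversed(messages):
        cost = count_tokens(msg)
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))

history = ["one two three", "four five", "six"]
print(fit_to_window(history, max_tokens=3))
```

With a budget of 3, the oldest message no longer fits and is dropped.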

🎮 Pick Your LLM!

🔄 I/O Capabilities

  • 📥 Input modalities: text, image, audio, voice
  • 📝 Output modalities: text, image, voice
  • 🌍 Multilingual capabilities

⚙️ Technical Features

  • 🧠 Reasoning
  • 🛠 Tool Calling
  • 🔢 Structured Output
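
Structured output means the model returns machine-readable data, typically JSON, instead of free text. A minimal sketch of checking such a reply on the client side; the schema (`name`, `priority`) is hypothetical, and providers that enforce schemas server-side do it differently.

```python
import json

# Hypothetical schema for this illustration: field name -> expected type.
REQUIRED_FIELDS = {"name": str, "priority": int}

def parse_structured(raw):
    """Parse a model reply as JSON and check the required fields."""
    data = json.loads(raw)  # raises ValueError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key, expected in REQUIRED_FIELDS.items():
        if key not in data:
            raise ValueError(f"missing field: {key}")
        if not isinstance(data[key], expected):
            raise ValueError(f"wrong type for field: {key}")
    return data

reply = '{"name": "fix login bug", "priority": 2}'
task = parse_structured(reply)
print(task["name"], task["priority"])
```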

📊 Practical Factors

  • 📏 Maximum context window size
  • 💰 Price per token (input/output)
  • Response generation speed
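
Pricing is usually quoted per million tokens, with separate rates for input and output. A back-of-the-envelope sketch; the prices used here are placeholders, not any provider's actual rates.

```python
def estimate_cost(input_tokens, output_tokens, in_price_per_m, out_price_per_m):
    """Cost in currency units, given prices per one million tokens."""
    total = input_tokens * in_price_per_m + output_tokens * out_price_per_m
    return total / 1_000_000

# Placeholder prices: 3.00 per 1M input tokens, 15.00 per 1M output tokens.
cost = estimate_cost(10_000, 2_000, in_price_per_m=3.0, out_price_per_m=15.0)
print(f"{cost:.4f}")
```

Output tokens often cost several times more than input tokens, so long answers can dominate the bill even when prompts are large.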

🧠 Claude 3.7 Sonnet

🎮 Type: Reasoning & Coding

🧩 Input: Text, Image

🖼 Output: Text

📏 Context: 200K tokens

🛠 Function Calling:

🏗️ Structured Output:

🤔 Reasoning:

💻 Special Ability: Computer Use (mouse, keyboard, browser)


💪 Strengths:

  • Amazing coder
  • Caches content
  • Performs autonomous computer tasks

⚠️ Weaknesses:

  • No image output

✨ Gemini 2.0 Flash

⚡ Type: Speed & Efficiency / Long Context

🧩 Input: Text, Image, Audio, Video, Voice

🖼 Output: Text, Voice

📏 Context: 1M tokens

🛠 Function Calling:

🏗️ Structured Output:

🤔 Reasoning:


💪 Strengths:

  • Very fast response times
  • Cost-effective
  • Excellent for long context tasks
  • Good for high-volume/frequency tasks

⚠️ Weaknesses:

  • Less performant on highly complex reasoning vs Pro

🧠 DeepSeek-R1 (OS)

💡 Type: Coding & Technical Reasoning

🧩 Input: Text, Images (VL variant)

🖼 Output: Text

📏 Context: 128K tokens

🛠 Function Calling:

🏗️ Structured Output: 🔶

🤔 Reasoning:


💪 Strengths:

  • Exceptional coding & mathematical reasoning
  • Strong multilingual capabilities (Chinese+English)
  • Open-source

⚠️ Weaknesses:

  • Fewer modalities than some competitors
  • Less robust content moderation

Reasoning Models

The Tools

The GenAI Stack

APPLICATION
TOOLING
MODEL / API
PLATFORM AND STORAGE
HARDWARE
@ Sandi Besen

Visual AI Frameworks

  • LangGraph
  • FlowiseAI
  • n8n

Tools

🔍 Web Search

  • API Searches
  • News Analysis

🕸️ Web Scraping

  • Content Extraction
  • Browser Automation

📚 RAG Systems

  • Document Retrieval
  • Context Management

🗄️ Vector DBs

  • Similarity Search
  • Embedding Storage

🔌 API Clients

  • REST/GraphQL
  • Authentication

🗃️ Database

  • SQL/NoSQL
  • Data Querying

💻 Code Gen

  • Code Analysis
  • Autocompletion

🛠️ Dev Tooling

  • Git Operations
  • Execution Envs

⚙️ Shell Access

  • Command Execution
  • System Integration

📂 File System

  • File Operations
  • Data Processing

📧 Messaging

  • Email/SMS
  • Chat Platforms

🔔 Notifications

  • Push/Webhooks
  • Social Media

🎨 Image Tools

  • Generation
  • Analysis/OCR

🔊 Audio/Video

  • Speech Processing
  • Media Analysis

🧠 Reasoning

  • Chain-of-Thought
  • Logical Analysis

🧩 Planning

  • Goal Decomposition
  • Self-Reflection
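
The RAG and vector DB entries above both rest on similarity search over embeddings. A toy in-memory sketch, assuming hand-written 2-dimensional vectors in place of real embeddings; `TinyVectorStore` is an illustrative name, not a real library.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

class TinyVectorStore:
    """Stores (text, embedding) pairs and ranks them by cosine similarity."""

    def __init__(self):
        self.items = []

    def add(self, text, embedding):
        self.items.append((text, embedding))

    def search(self, query_embedding, k=1):
        ranked = sorted(
            self.items,
            key=lambda item: cosine(query_embedding, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]

store = TinyVectorStore()
store.add("All about cats", [1.0, 0.0])  # made-up embedding
store.add("All about cars", [0.0, 1.0])  # made-up embedding
print(store.search([0.9, 0.1], k=1))
```

In a RAG system the retrieved texts would then be placed into the prompt as context. Real vector databases add indexing so the search does not have to scan every item.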

Programmatic AI Frameworks

🔗 LangChain 🚩
Python JS

🦙 LlamaIndex Python

🌾 Haystack Python

PydanticAI Python

Self-Hosting Options

  • CLI Ollama
  • CLI LM Studio
  • CLI llama.cpp
  • CLI vLLM

Development Tools

📝 VS Code

👨‍💻 GitHub Copilot
📊 Cline
🦘 Roo Code

💻 IDE

Cursor
🌊 Windsurf

⌨️ CLI

🔧 Aider
🤖 Claude Code ⚠️ Only Claude Sonnet

🌐 Web

🔥 Firebase Studio

Biggest Challenge: Specificity

⚠️

GenAI will fail when given tasks that are:

  • Too complex without breakdown
  • Ambiguous in requirements

What to do instead:

  • Break down complex tasks
  • Be specific in instructions

"The quality of your output is directly proportional to the specificity of your input."

The Patterns

Techniques for Effective Prompting

  • Few-shot prompting
  • Chain of Thought
  • Tree of Thought
  • Self-Consistency
  • and many more...
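
Two of the techniques above are easy to sketch without calling any model. Few-shot prompting puts worked examples ahead of the real question; self-consistency samples several answers and keeps the most common one. The `Q:`/`A:` layout and both function names are assumptions made for this illustration.

```python
from collections import Counter

def few_shot_prompt(examples, question):
    """Build a prompt from (question, answer) examples plus the real question."""
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)

def self_consistent_answer(samples):
    """Return the most frequent answer among several sampled answers."""
    return Counter(samples).most_common(1)[0][0]

prompt = few_shot_prompt([("2+2", "4"), ("5+1", "6")], "3+3")
print(prompt)

# Pretend these came from sampling the same prompt three times.
print(self_consistent_answer(["6", "6", "7"]))
```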

Use LLMs to prompt LLMs

Development Patterns

Greenfield Development

💡

1. Idea Honing

🧩

2. Task Decomposition

🚀

3. Implementation

Existing Codebases

🔄

Incremental Iteration

🧪

For both: Lots of tests

Task Management Tools

Task Master AI

📋

Cursor Rules

🦘

Roo Code

Cost Management Strategies

Caching

📦

Batching

📊

Token Usage
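
A minimal sketch of client-side caching: identical prompts are answered from memory instead of triggering a second paid call. `CachedModel` and the fake model function are hypothetical; provider-side prompt caching is a separate feature and works differently.

```python
import hashlib

class CachedModel:
    """Wraps any prompt -> text function and memoizes its results."""

    def __init__(self, call_model):
        self.call_model = call_model
        self.cache = {}
        self.misses = 0

    def complete(self, prompt):
        # Hash the prompt so long prompts make compact cache keys.
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key not in self.cache:
            self.misses += 1
            self.cache[key] = self.call_model(prompt)
        return self.cache[key]

def fake_model(prompt):
    # Stand-in for a real (billable) API call.
    return prompt.upper()

model = CachedModel(fake_model)
model.complete("hello")
model.complete("hello")  # served from the cache
print(model.misses)
```

This only helps when prompts repeat exactly and a deterministic answer is acceptable.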

MCP: The Protocol That Connects Worlds

"MCP is an open protocol that enables seamless integration between LLM applications and external data sources and tools."

— Anthropic
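
MCP messages are JSON-RPC 2.0. As a rough sketch of what a client sends when it asks a server to run a tool, the snippet below builds a `tools/call` request by hand. The tool name and arguments are hypothetical, and a real client would normally use an MCP SDK rather than assemble messages manually.

```python
import json

def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request asking an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# "get_weather" is a made-up tool for illustration.
message = make_tool_call(1, "get_weather", {"city": "Berlin"})
wire = json.dumps(message)
print(wire)
```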

The Agents

Agent

Autonomously make decisions and take actions to achieve a goal
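
Stripped down, an agent is a loop: decide on an action, run a tool, look at the result, repeat until done. The sketch below uses a scripted `decide` function where a real agent would call an LLM; every name in it is illustrative.

```python
def run_agent(decide, tools, goal, max_steps=5):
    """Loop: ask for the next action, run the tool, stop on 'finish'."""
    history = []
    for _ in range(max_steps):
        action = decide(goal, history)
        if action["type"] == "finish":
            return action["answer"], history
        # Look up the requested tool and call it with the given arguments.
        result = tools[action["tool"]](**action["args"])
        history.append((action["tool"], result))
    # Step budget exhausted without a final answer.
    return None, history

def scripted_decide(goal, history):
    # Stand-in for an LLM: call the tool once, then finish with its result.
    if not history:
        return {"type": "tool", "tool": "add", "args": {"a": 2, "b": 3}}
    return {"type": "finish", "answer": history[-1][1]}

tools = {"add": lambda a, b: a + b}
answer, history = run_agent(scripted_decide, tools, goal="add 2 and 3")
print(answer, history)
```

The `max_steps` cap is a simple guard against an agent that never decides to finish.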

Key components of Agents

Reflection

Planning

Tools

Collaboration

Types of Agents: Single Agent System

Types of Agents: Hierarchical Multi-Agent System

Types of Agents: Network Multi-Agent System

AI Agent Frameworks

🔄

LangGraph

👥

CrewAI

🤖

AutoGen

🧠

Agents SDK

🛠️

Agent Development Kit

Exponential growth

Task Length

"The more it reasons, the more unpredictable it becomes"

— Ilya Sutskever

200 → 2000

Thank You!

Let's Connect

  • Connect with me on LinkedIn
  • Share your GenAI challenges
  • Discuss implementation strategies

Questions?

Scan to connect