Resources
Books
- "Writing for Software Developers" - Mentioned as a previous topic Philip Kiely discussed on Software Engineering Radio.
Tools & Software
- GPT-5 - Mentioned as an example of a capable, off-the-shelf AI model.
- Gemini - Mentioned as an example of a capable, off-the-shelf AI model.
- Claude - Mentioned as an example of a capable, off-the-shelf AI model.
- Llama - Mentioned as an example of an open-source large language model.
- Qwen - Mentioned as an example of an open-source large language model.
- Mistral - Mentioned as an example of an open-source large language model.
- Descript - Mentioned as an example of a product that uses multiple AI models for content creation.
- Sourcegraph - Mentioned as a company building code editors that integrate codebase context with AI.
- Zed - Mentioned as a company building code editors that integrate codebase context with AI.
- Grafana - Mentioned as a standard observability tool used for dashboards and alerts.
- Transformers - Hugging Face library mentioned as an underlying technology for running generative models.
- Diffusers - Hugging Face library mentioned as an underlying technology for image models.
- vLLM - Mentioned as an example of an inference engine for running models.
- SGLang - Mentioned as an example of an inference engine for running models.
- TensorRT-LLM - Mentioned as an example of an inference engine for running models.
People Mentioned
- Itamar Friedman - Guest on a previous episode discussing automated testing with generative AI (Episode 633).
- Rishi Singh - Guest on a previous episode discussing using GenAI for test code generation (Episode 603).
- Ipek Ozkaya - Guest on a previous episode discussing GenAI for software architecture (Episode 626).
- Simon Willison - Mentioned as someone who has written about prompt injection.
Organizations & Institutions
- Baseten - Philip Kiely's employer, an inference platform company.
- IEEE Computer Society - Sponsor of Software Engineering Radio.
- IEEE Software Magazine - Sponsor of Software Engineering Radio.
- OpenAI - Mentioned in the context of their models and services.
- Nvidia - Mentioned in the context of their GTC conference.
Websites & Online Resources
- se-radio.net - Website for Software Engineering Radio.
- computer.org - Website associated with the IEEE Computer Society.
- Hugging Face - Mentioned as a place to download open-source model weights.
- se-radio.slack.com - Slack channel for Software Engineering Radio.
Other Resources
- Multi-Agent AI - The primary topic of the episode, focusing on composing multiple AI models.
- AI Native Software - Software built from the ground up with AI capabilities.
- Function Calling / Tool Use - Technical implementation for agentic AI to interact with tools.
- Retrieval Augmented Generation (RAG) - A technique to introduce new context into models dynamically.
- Embedding Models - Used for RAG to encode semantic meaning.
- Prompt Injection - A security vulnerability where prompts can alter a model's intended behavior.
- Evals - Quality benchmarks created for specific products or domains.
- Alignment - The philosophical and technical challenge of ensuring AI models are helpful, harmless, and useful.
- Inference - The phase where a trained model is used to generate responses to user queries.
- Training - The phase where data is fed to a model to improve its performance.
- Weights (Model Weights) - The parameters within a neural network that determine its behavior.
- Parameters - Individual numbers within a model that influence its output.
- GPU - Graphics Processing Unit, hardware optimized for parallel computations essential for AI inference.
- CPU - Central Processing Unit, the primary processor in a computer, less suited for large-scale AI parallel processing.
- Tensor Cores - Specialized processing units within GPUs designed for matrix math.
- Quantization - A technique to reduce the precision of model weights to decrease memory usage and improve speed.
- Speculative Decoding - An inference technique that predicts likely future tokens ahead of time to speed up generation.
- Batch Sizes - The number of samples processed by the model at once during inference.
- Sequence Lengths - The number of tokens the model considers at once.
- Temperature - A parameter that controls the randomness of model output.
- Observability - The practice of monitoring and understanding the internal state of a system.
- Multi Cloud - Utilizing services from multiple cloud providers.
- Multi Region - Deploying applications across different geographical regions for resilience and performance.
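The Function Calling / Tool Use entry above can be illustrated with a minimal sketch of the agentic loop: the model emits a structured tool call, the application executes the named tool, and the result goes back into the conversation. The tool registry, the `get_weather` tool, and the JSON call format here are hypothetical stand-ins; real APIs each define their own schema, and the call would come from an LLM response rather than a hard-coded string.

```python
# Minimal function-calling / tool-use sketch. The tool and the JSON call
# format are illustrative assumptions, not any specific vendor's API.
import json

def get_weather(city: str) -> str:
    """Hypothetical example tool: returns a canned weather report."""
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def handle_model_output(model_output: str) -> str:
    """If the model emitted a JSON tool call, run the tool; else pass text through."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return model_output  # plain-text answer, no tool requested
    fn = TOOLS[call["name"]]          # look up the requested tool
    return fn(**call["arguments"])    # execute it with the model's arguments

# Simulated model turn requesting a tool call:
result = handle_model_output('{"name": "get_weather", "arguments": {"city": "Paris"}}')
```

In a real agent, the tool result would be appended to the conversation and the model called again, looping until it produces a final text answer.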
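The Retrieval Augmented Generation and Embedding Models entries describe a two-step pattern: embed documents and the query, retrieve the closest document, and splice it into the prompt. Below is a toy sketch of that flow; the bag-of-words "embedding" is an assumed stand-in for a real embedding model, which would encode semantic meaning rather than word overlap.

```python
# Toy RAG sketch: embed, retrieve by cosine similarity, build the prompt.
# The bag-of-words embed() is a deliberate simplification of a real
# embedding model.
from collections import Counter
import math

def embed(text: str) -> Counter:
    """Stand-in embedding: word counts instead of a learned vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str]) -> str:
    """Return the document most similar to the query."""
    q = embed(query)
    return max(docs, key=lambda d: cosine(q, embed(d)))

docs = ["GPUs accelerate matrix math", "RAG adds retrieved context to prompts"]
context = retrieve("how does RAG work", docs)
prompt = f"Context: {context}\n\nQuestion: how does RAG work"
```

The point of the pattern is that the model sees fresh context at inference time without retraining; production systems swap in a real embedding model and a vector database for `embed` and `retrieve`.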
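The Quantization entry can be made concrete with a small sketch of symmetric int8 quantization: map float weights to 8-bit integers with a shared scale factor, then dequantize back. This is a simplified per-tensor version of the idea; real inference engines typically quantize per channel and fuse the scales into their kernels.

```python
# Symmetric int8 quantization sketch: floats -> int8 range with one shared
# scale, then back. A simplification of what inference engines actually do.

def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Map floats into [-127, 127] using a shared per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.52, -1.27, 0.03, 0.91]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)  # close to the originals, within one scale step
```

Storing each weight in one byte instead of four (or two) is where the memory and bandwidth savings come from, at the cost of the small rounding error visible in `restored`.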
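The Temperature entry above can be sketched numerically: temperature divides the model's output logits before the softmax, so low values sharpen the distribution toward the top token and high values flatten it toward uniform, which is what makes sampled output more random. The logit values here are made up for illustration.

```python
# Temperature scaling sketch: divide logits by T before softmax.
# Low T concentrates probability on the top token; high T spreads it out.
import math

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    scaled = [l / temperature for l in logits]
    m = max(scaled)                          # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                     # illustrative logits for 3 tokens
cold = softmax_with_temperature(logits, 0.5)  # sharper: top token dominates
hot = softmax_with_temperature(logits, 2.0)   # flatter: nearer to uniform
```

Sampling from `cold` almost always returns the first token; sampling from `hot` picks the others noticeably more often, which is the knob users turn when they raise temperature.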