Blog of Luke Wroblewski

Letting the Machines Learn

2025-09-12 08:00:00

Every time I present on AI product design, I'm asked about AI and intellectual property. Specifically: aren't you worried about AI models "stealing" your work? I always answer that if I accused AI models of theft, I'd have to accuse myself as well. Let me explain…

I've spent 30 years writing three books and over two thousand articles on digital product design and strategy. But during those same 30 years? I've consumed exponentially more. Countless books, articles, tweets. Thousands of conversations. Products I've used, solutions I've analyzed. All of it shaped what I know and how I write.

Web sites as training data for AI models

If you asked me to trace the next sentence I type back to its sources, to properly attribute the influences that led to those specific words, I couldn't do it. The synthesis happens at a level I can't fully decompose.

AI models are doing what we do. Reading, viewing, learning, synthesizing. The only difference is scale. They process vastly more information than any human could. When they generate text, they're drawing from that accumulated knowledge. Sound familiar?

So when an AI model produces something influenced by my writings, how is that different from a designer who read my book and applies those principles? I put my books out there for people to buy and learn from. My articles? Free for anyone to read. Why should machines be excluded from that learning opportunity?

"But won't AI companies unfairly profit from training on your content?"

From AI model companies, for $20 per month, I get an assistant that's read more than I ever could, available instantly, capable of helping with everything from code reviews to strategic analysis. That same $20 couldn't buy me two hours of entry-level human assistance.

The benefit I receive from these models, trained on the collective knowledge of millions of contributors, including my microscopic contribution, dwarfs any hypothetical loss from my content being training data. In fact, I'm humbled that my thoughts could even be part of a knowledge base used by billions of people.

So let machines learn, just like humans do. For me, the value I get back from well-trained AI models far exceeds what my contribution puts in.

Unstructured Input in AI Apps Instead of Web Forms

2025-09-09 08:00:00

Web forms exist to put information from people into databases. The input fields and formatting rules in online forms are there to make sure the information fits the structure a database needs. But unstructured input in AI-enabled applications means machines, instead of humans, can do this work.

17 years ago, I wrote a book on Web Form Design that started with "Forms suck." Fast forward to today and the sentiment still holds true. No one likes filling in forms but forms remain ubiquitous because they force people to provide information in the way it's stored within the database of an application. You know the drill: First Name, Last Name, Address Line 2, State abbreviation, and so on.

Web forms exist to put information from people into databases

With Web forms, the burden is on people to adapt to databases. Today's AI models, however, can flip this requirement. That is, they allow people to provide information in whatever form they like and use AI to do the work necessary to put that information into the right structure for a database.

How does this work? Instead of a Web form enforcing the database's input requirements, a dynamic context system can handle it. One way of doing this is with AgentDB's templating system, which provides instructions to AI models for reading and writing information to a database.

With AgentDB connected to an AI model (via an MCP server), a person can simply say "add this" and provide an image, PDF, audio, video, you name it. The model will use AgentDB's template to decide what information to extract from this unstructured input and how to format it for the database. In the case where something is missing or incomplete, the model can ask for clarification or use tools (like search) to find possible answers.

Unstructured input using AI to format for a database structure

In the example above, I upload a screenshot from Instagram announcing a concert and ask the AI model to add it to my concert tracker. The AgentDB template tells the model it needs Show, Date, Venue, City, Time, and Ticket Price for each database entry. So the AI model pulls this information from the unstructured input (screenshot) and, if complete, turns it into the structured format a database needs.
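To make that step concrete, here's a minimal sketch of what a template-driven extraction could look like. This is not AgentDB's actual API: the CONCERT_TEMPLATE structure and the call_model() helper are hypothetical stand-ins for the template AgentDB provides and for whatever AI model the app is connected to.

```python
import json

# Hypothetical sketch, not AgentDB's actual API. The columns mirror the fields
# the concert tracker needs; call_model() stands in for the connected AI model
# (for example, one reached through an MCP server).

CONCERT_TEMPLATE = {
    "table": "concerts",
    "columns": ["show", "date", "venue", "city", "time", "ticket_price"],
    "instructions": (
        "Extract one row per concert from the input. Use ISO dates and "
        "24-hour times. If a value can't be found, set it to null and list "
        "the column under 'missing' so the app can ask a follow-up question."
    ),
}

def add_from_unstructured_input(user_content: str, call_model) -> dict:
    """Ask the model to map unstructured input onto the template's columns."""
    prompt = (
        f"{CONCERT_TEMPLATE['instructions']}\n"
        f"Columns: {CONCERT_TEMPLATE['columns']}\n"
        f"Input:\n{user_content}\n"
        'Respond with JSON: {"row": {...}, "missing": [...]}'
    )
    # The returned row is already shaped for the database; the app can insert
    # it directly or prompt the person for anything listed under "missing".
    return json.loads(call_model(prompt))
```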

Web form vs unstructured input in AI apps

Of course, the unstructured input can also be a photo, a link to a Web page, a Word document, a PDF file, or even just audio where you say what you want to add. In each case the combination of AI model and AgentDB will fill in the database for you.

No Web form required. And no form is the best kind of Web Form Design.

World Knowledge Improves AI Apps

2025-09-03 08:00:00

Applications built on top of large-scale AI models benefit from the AI model's built-in capabilities without requiring app developers to write additional code. Essentially if the AI model can do it, an application built on top of it can do it as well. To illustrate, let's look at the impact of a model's World knowledge on an app.

For years, software applications consisted of running code and a database. As a result, their capabilities were defined by coded features and what was inside the database. When the running code is replaced by a large language model (LLM), however, the information encoded in the model's weights instantly becomes part of the capabilities of the application.

Traditional software application vs. AI application

With AI apps, end users are no longer constrained by the code developers had the time and foresight to write. All the World knowledge (and other capabilities) in an AI model is now part of the application's logic. Since that sounds abstract, let's look at a concrete example.

I created an AI app with AgentDB by uploading a database of NBA statistics spanning 77 years and 13.6 million play-by-play records. When I add the MCP link AgentDB makes for me to Anthropic's Claude, I have an application consisting of a database optimized for AI model use, and an AI model (Claude) to use as the application's brain. Here's a video tutorial on how to do this yourself.

In the past a developer would need to write code to render the user interface for an application front-end to this database. That code would determine what kind of questions people could get answers to. Usually this meant a bunch of UI input elements to search and filter games by date, team, player, etc. The NBA's stats page (below) is a great example of this kind of interface.

NBA Stats search Web application UI

But no matter how much code developers write, they can't cover all the ways people might want to interact with 77 years of NBA information. For instance, a question like "What were the last 5 plays in the Malice in the Palace game?" requires either running code that can translate "Malice in the Palace" to a specific date and game or an extra field in the database for game nicknames.

World Knowledge in AI model for an NBA app

When a large language model is an application's compute, however, no extra code needs to be written. The association between Malice in the Palace and November 19, 2004 is present in an AI model's weights, and it can translate the natural language question into a form the associated database can answer.

An AI model can use its World knowledge to translate people's questions into the kind of multi-step queries needed to answer what seem like simple questions. Consider the example below: "Who was the tallest player drafted in Ant-Man's NBA draft class?" We need to figure out which player Ant-Man refers to, what year he was drafted, who else was drafted that year, get all their heights, and then compare them. Not a simple query to write by hand, but with AI acting as an application's brain... it's quick and easy.
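Here's a rough sketch of how that work splits between the model and the database, assuming a SQLite database and invented table and column names (players, draft_year, height_inches) rather than the actual schema of the uploaded NBA stats database.

```python
import sqlite3

# Step 1 happens in the model's weights: World knowledge resolves the nickname
# ("Ant-Man" -> Anthony Edwards -> the 2020 draft class). Step 2 is the
# structured query the model can then issue. Table and column names here are
# assumptions for illustration, not the real schema.

def tallest_in_draft_class(conn: sqlite3.Connection, draft_year: int):
    return conn.execute(
        """
        SELECT name, height_inches
        FROM players
        WHERE draft_year = ?
        ORDER BY height_inches DESC
        LIMIT 1
        """,
        (draft_year,),
    ).fetchone()

# The model supplies draft_year=2020 after resolving the nickname; no nickname
# column or hand-written lookup code is required in the application.
```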

World Knowledge in AI model for an NBA app

World knowledge, of course, isn't the only capability built into large language models. There's multi-language support, vision (for image parsing), tool use, and more emerging. All of these are also application capabilities when you build apps on top of AI models.

Chat is: the Future or a Terrible UI

2025-08-28 08:00:00

As the proliferation of AI-powered chat interfaces in software continues, people increasingly take one of two sides: chat is the future of all UI, or chat is a terrible UI. Turns out there are reasons to believe both; here's a bunch of them.

Back in 2013, I proposed a variant of Jamie Zawinski's popular Law of Software Envelopment reframed as:

Every mobile app attempts to expand until it includes chat. Those applications which do not are replaced by ones which can.

Today every major mobile app has some form of chat function, whether it's a social network, e-commerce app, ride-share, and so on. So chat is already pervasive and thereby familiar, which made it a great interface to usher in the age of AI. But is it AI's final form?

“Chat is the future of software.”

  • People already know how to use chat interfaces. This familiarity means people can jump right in and start using powerful AI systems.
  • An empty text box is great at capturing user intent: people can simply tell chat apps what they want to get done. “Just look at Google.”
  • Natural language allows people to communicate what they want like they would in the real World, no need to learn a UI.
  • The best interface is…  no interface, an invisible interface, etc.
  • Conversational interfaces can shift topics and goals, providing a way to compose information and actions that's just right for specific needs.
  • Voice input means people don’t have to type but can still simply chat with powerful systems.
  • Chat user interfaces for AI models are a fundamental shift from forcing humans to learn computers to computers understanding human language.

Chat is the future of UI vs. Chat is a terrible UI

“Chat is a terrible user interface.”

  • Chat interfaces face the classic "invisible UI" problem: without clear affordances, people don't know what they can do, nor how to get the best results from them.
  • Walls of text are suboptimal for communicating and displaying complex information and relationships, unlike images, tables, charts, and more.
  • Scrolling through conversation threads to find and extract relevant information is painful, especially as chat conversations run long.
  • Context gets lost in back-and-forth interactions, which slow everything down. Typing everything you want to do is cumbersome.
  • Language is a terrible way to describe visual, spatial, and temporal things.
  • Voice-based interfaces make it even harder to communicate information better suited to images and user interfaces.
  • We’re very early in the evolution of AI-powered software and lots of different and useful interfaces for interacting with AI will emerge.

It's also worth noting that chat isn't the only way to integrate AI into software products, and increasingly agent-based applications outperform chat-only solutions. So expect things to keep changing.

Platform Shifts Redefine Apps

2025-08-27 08:00:00

With each major technology platform shift, people underestimate how much "what an application is and how it's built" changes. From mainframes to PCs, to the Web, to mobile, and now AI, each computing platform change has redefined software and created new opportunities and constraints for application design and development.

These shifts not only impacted how applications work but also where they run, what they look like, how they're built, delivered, and experienced by people.

Technology shifts redefine what an application is

Mainframe era: Applications lived on massive shared computers in climate-controlled rooms, with people typing text-only commands into terminals that were basically windows into a distant brain. All the intelligence sat somewhere else, and you just got text back.

PC era: Software became physical products you'd buy in boxes, install from floppy disks or CDs, and run entirely on your own machine. Suddenly computing power lived under your desk, and applications could use rich graphical interfaces instead of just green text on black screens.

Web era: Applications moved into browsers accessed through URLs, shifting from installed software to services that updated automatically. No more version numbers or install wizards, just type an address and you're using the latest version built out of cross-platform Web standards UI components.

Mobile era: Applications shrank into task-focused apps downloaded from curated stores, designed for fingers not mice, and aware of your location and orientation. Computing became something in your pocket that could make use of the environment around you through cameras, GPS, and on-device sensors.

AI era: Instead of screens and buttons, applications are conversations where AI models understand intent, execute complex tasks, and adapt to context without explicit programming for every scenario. And we're just getting started.

While it's true that AI applications sound a lot like the mainframe applications of old, those apps required exact syntax and returned predetermined responses. AI applications understand natural language and generate solutions on the fly. They don't just process commands, they reason through problems and build UI as needed.

Mainframe vs AI apps

During each of these platform shifts, companies react the same way. They attempt to port the application models they had without thinking through and embracing what's different. Early Web sites were posters and brochures. Early mobile apps were ported Web sites. Just like early TV shows were just radio shows with cameras pointed at them.

But at the start of a technology platform shift, how applications will change isn't clear. It takes time for new forms to develop. As they do, most companies will end up rebuilding their apps like they did for the Web, mobile, and more. Companies that embrace new capabilities and modes of building early on can gain a foothold and grow. That's why technology shifts are accompanied by a surge of new start-ups. Change is opportunity.

Five Paths to Solving Robotics

2025-08-22 08:00:00

In his AI Speaker Series presentation at Sutter Hill Ventures, Google DeepMind's Ted Xiao outlined five worldviews on how to achieve useful, ubiquitous robotics and dug into his team's work integrating frontier models like Gemini directly into robotic systems. Here are my notes from his talk:

We're at a unique moment in robotics where there's no consensus on the path forward. Unlike other AI breakthroughs where approaches quickly consolidated, robotics remains wide open with multiple reasonable paths showing early signs of success. Ted presented five worldviews, each with smart researchers and builders pursuing them with conviction:

AI Speaker Series presentation at Sutter Hill Ventures with Ted Xiao

Industry Incumbent

These researchers believe general-purpose robotics is the wrong goal. Purpose-built solutions actually work today - from industrial automation to appliances we don't even call robots anymore. When robotics succeeds, we just call them tools. The path forward: directly optimize for specific use cases using decades of control theory and hardware expertise.

Humanoid Company

These researchers see hardware as the primary bottleneck. Once platforms stabilize, researchers excel at extracting performance - drones went from fragile research prototypes to consumer products, quadrupeds became robust commercial platforms. Humanoid form factors matter because the world is built for humans, and human-like robots can better leverage internet-scale human data.

Robot Foundation Model Startup

These researchers focus on robot data and algorithms as the key. Generality is non-negotiable - transformative technologies are general by nature. The core challenge: building an "internet of robotics data" either vertically (solve one domain completely, then expand) or horizontally (achieve robotics' GPT-2 moment first, then improve).

Bitter Lesson Believer

These researchers argue frontier models are the only existence proof of technology that can model internet-scale data with human-level performance. You can't solve robotics without incorporating these "magical artifacts" into the exploration process. Frontier model trends and compute lead robotics by about two years.

AGI Bro

These researchers take the most radical position: just solve AGI and ask it to solve robotics. The Platonic Representation Hypothesis suggests that as AI models improve across domains, their internal representations converge. Perfect language understanding might inherently include physical understanding.

Gemini Robotics

Ted's team at Google DeepMind pursued the Bitter Lesson approach, building robotics capabilities directly into Gemini rather than treating frontier models as black boxes.

Their Gemini Robotics system first enhanced embodied reasoning - teaching the model to understand the physical world better through 2D bounding boxes in cluttered scenes, 3D understanding with depth and orientation, pointing for granular precision, and grasp angles for manipulation. The system then learned low-level control with diverse robot actions, operating at 50Hz control frequency with quarter-second end-to-end latency. This unlocked three key advances:

  • Interactivity: The robot responds to dynamic scenes, following objects as they move and adapting to human interference
  • Dexterity: Beyond rigid objects, it can fold clothes, wrap headphone wires, and manipulate shoelaces
  • Generalization: Handles visual distribution shifts (new lighting, distractors), semantic variations (typos, different languages), and spatial changes (different sized objects requiring different strategies)
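As a back-of-the-envelope illustration (not from the talk), the 50Hz control rate and quarter-second latency mentioned above imply each inference result has to cover roughly a dozen control steps. One common way to reconcile slow inference with fast control is to have the policy emit a short chunk of future actions per call; the talk doesn't detail Gemini Robotics' exact mechanism, so treat this as an assumption.

```python
# Back-of-the-envelope sketch (not from the talk): how a 50Hz control rate can
# coexist with ~250ms end-to-end inference latency. The "emit a chunk of future
# actions per inference" arrangement is an assumption for illustration, not a
# description of Gemini Robotics' actual mechanism.

CONTROL_HZ = 50     # low-level actions consumed per second
LATENCY_S = 0.25    # quarter-second end-to-end latency

steps_per_chunk = round(CONTROL_HZ * LATENCY_S)  # ~12-13 control steps
print(
    f"Each inference must supply at least {steps_per_chunk} future actions "
    f"to keep the robot acting at {CONTROL_HZ}Hz while the next "
    f"{LATENCY_S * 1000:.0f}ms inference call is in flight."
)
```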

When deployed at a conference with completely novel conditions - crowds, different lighting, new table - the system maintained reasonable behavior for arbitrary user requests, showing sparks of that GPT-2 moment where it attempts something sensible regardless of input.

Dark Horses and Emerging Paradigms

Several emerging paradigms could completely upend current approaches:

  • Video World Models: learning physics without robots through action-conditioned video generation
  • Robot-Free Data: from simulation or humans with head-mounted cameras
  • Thinking Models: applying frontier models' reasoning capabilities to robotics
  • Locomotion-Manipulation Unity: bridging RL-based locomotion with foundation model manipulation

There's no consensus on which path will win. Each approach has reasonable arguments and early signs of success. The lack of agreement isn't a weakness - it's what makes this the most exciting time in robotics history.