Chip Huyen

Co-founder of Claypot AI; Stanford graduate; grew up in Vietnam. Previously at NVIDIA, Snorkel AI, and Netflix.

RSS preview of Chip Huyen's Twitter feed

2025-01-10 00:16:51

Finally got my copy! “AI Engineering” is officially out 🙏 🎉

2024-12-13 11:44:56

During the process of writing AI Engineering, I went through so many papers, case studies, blog posts, repos, tools, etc. This repo contains ~100 resources that really helped me understand various aspects of building with foundation models.

https://github.com/chiphuyen/aie-book/blob/main/resources.md

Here are the highlights:

1. Anthropic’s Prompt Engineering Interactive Tutorial

The Google Sheets-based interactive exercises make it easy to experiment with different prompts and see immediately what works and what doesn’t. I’m surprised other model providers don’t have similar interactive guides: https://docs.google.com/spreadsheets/d/19jzLgRruG9kjUQNKtCg1ZjdD6l6weA6qRXG5zLIAhC8/edit

2. OpenAI’s best practices for finetuning

While this guide focuses on GPT-3, many techniques are applicable to full finetuning in general. It explains how finetuning works, how to prepare training data (see the short data-prep sketch after these highlights), how to pick training hyperparameters, and common finetuning mistakes: https://docs.google.com/document/d/1rqj7dkuvl7Byd5KQPUJRxc19BJt8wo0yHNwK84KfU3Q/edit

3. Llama 3 paper

The section on post-training data is a gold mine: it details the different techniques they used to generate 2.7 million examples for supervised finetuning. It also covers a crucial but less-discussed topic, data verification, i.e., how to evaluate the quality of synthetic data: https://arxiv.org/abs/2407.21783

4. Efficiently Scaling Transformer Inference (Pope et al., 2022)

An amazing paper co-authored by Jeff Dean about inference optimization for transformer models. It not only covers different optimization techniques and their tradeoffs, but also provides guidelines for optimizing toward different goals, e.g., lowest possible latency, highest possible throughput, or longest context length: https://arxiv.org/abs/2211.05102

5. Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models (Lu et al., 2023)

My favorite study on LLM planners, how they use tools, and their failure modes. An interesting finding is that different LLMs have different tool preferences: https://arxiv.org/abs/2304.09842

6. AI Incident Database

For those interested in seeing how AI can go wrong, this contains over 3000 reports of AI harms: https://incidentdatabase.ai/

7. I find case studies from teams that have successfully deployed AI applications extremely educational. Here are some of my favorite enterprise examples; I'll add more soon!

- LinkedIn: https://www.linkedin.com/blog/engineering/generative-ai/musings-on-building-a-generative-ai-product

- Pinterest's Text-to-SQL: https://medium.com/pinterest-engineering/how-we-built-text-to-sql-at-pinterest-30bad30dabff

- Gmail’s Smart Compose (2019): https://arxiv.org/abs/1906.00080

- Grab: https://engineering.grab.com/llm-powered-data-classification
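
Related to item 2 above: a minimal, hypothetical sketch of preparing finetuning data as JSONL prompt/completion pairs, the format used for GPT-3-era finetuning (newer chat models use a messages-based format instead). The examples and file name are made up for illustration.

```python
import json

# Made-up labeled examples; in practice these come from your own data.
examples = [
    {"prompt": "Classify the sentiment: 'The battery dies in an hour.'\n\nSentiment:",
     "completion": " negative"},
    {"prompt": "Classify the sentiment: 'Setup took two minutes, love it.'\n\nSentiment:",
     "completion": " positive"},
]

# GPT-3-era finetuning expects one JSON object per line (JSONL),
# each with a "prompt" and a "completion" field.
with open("finetune_train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```

A consistent prompt suffix (here "Sentiment:") makes it easier for the model to learn where its completion should start.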

2024-12-05 03:02:17

It’s done! 150,000 words, 200+ illustrations, 250 footnotes, and over 1200 reference links.

My editor just told me the manuscript has been sent to the printers.

- The ebook will be coming out later this week.
- Paperback copies should be available in a few weeks (hopefully before the end of the year). Preorder: https://amzn.to/49j1cGS
- The full manuscript is also accessible on the O'Reilly platform: https://oreillymedia.pxf.io/c/5719111/2146021/15173

This wouldn’t have been possible without the help of so many people who reviewed the early drafts, answered my thousands of questions, introduced me to fascinating use cases, or helped me see the beauty of overlooked techniques.

Thank you everyone for making this happen!

2024-10-10 08:06:02

Re @Luke_Metz @barret_zoph @LiamFedus @johnschulman2 What a ride! Can't wait to see what you'll build next

2024-07-26 01:10:47

Building a platform for generative AI applications

https://huyenchip.com/2024/07/25/genai-platform.html

After studying how companies deploy generative AI applications, I noticed many similarities in their platforms. This post outlines these common components, what they do, and implementation considerations.

This post starts from the simplest architecture and progressively adds more components; a rough sketch of how the pieces fit together follows the list below.

1. Enhance context input into a model by giving the model access to external data sources and tools for information gathering.

2. Put in guardrails to protect your system and your users.

3. Add a model router and gateway to support complex pipelines and add more security.

4. Optimize for latency and cost with caching.

5. Add complex logic and write actions to maximize your system’s capabilities.
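
Below is a rough sketch of how these five components might compose in code. Every function, name, and heuristic here is a made-up placeholder, not any particular framework's or provider's API.

```python
import hashlib

# Illustrative stubs only; swap in real retrieval, guardrail, and model calls.

def passes_input_guardrails(query: str) -> bool:
    # 2. Guardrails: e.g., block obvious prompt-injection attempts.
    return "ignore previous instructions" not in query.lower()

def retrieve_context(query: str) -> str:
    # 1. Context enhancement: retrieval, web search, or tool calls go here.
    return "(documents retrieved for the query)"

def route_model(query: str) -> str:
    # 3. Router: send simple queries to a cheaper model, hard ones to a stronger one.
    return "small-model" if len(query) < 200 else "large-model"

def gateway_call(model: str, prompt: str) -> str:
    # 3. Gateway: one place for auth, rate limits, and logging across providers.
    return f"[{model}] response to: {prompt[:40]}..."

_cache: dict[str, str] = {}  # 4. Exact-match cache keyed on the query.

def answer(query: str) -> str:
    key = hashlib.sha256(query.encode()).hexdigest()
    if key in _cache:                       # 4. Cache hit: skip the model call.
        return _cache[key]
    if not passes_input_guardrails(query):  # 2. Protect the system and users.
        return "Sorry, I can't help with that."
    context = retrieve_context(query)       # 1. Enhance the model's input context.
    model = route_model(query)              # 3. Route, then call through the gateway.
    response = gateway_call(model, prompt=f"{context}\n\nQuestion: {query}")
    # 5. Complex logic and write actions (e.g., sending an email) would be added
    #    here, ideally behind their own output guardrails.
    _cache[key] = response
    return response

print(answer("What's our refund policy?"))
```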

I try my best to keep the architecture general, but certain applications might deviate. As always, feedback is appreciated!

2024-06-27 01:57:05

Re 3. Hallucinations make GenAI applications unusable

Models hallucinate because they are probabilistic. However, a model is much more likely to hallucinate when it doesn’t have access to the right information.

Multiple studies have shown that hallucinations can be significantly reduced by giving the model the right context via retrieval or tools that the model can use to gather context (e.g., web search).
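
As a rough illustration of "giving the model the right context": a hypothetical sketch that stuffs retrieved snippets into the prompt and explicitly allows the model to abstain when the context doesn't contain the answer. The retrieval function and documents are placeholders.

```python
def retrieve(query: str, k: int = 2) -> list[str]:
    # Placeholder: in practice this would be vector search, keyword search, or a web-search tool.
    docs = [
        "Refunds are processed within 14 days of receiving the return.",
        "Returns require the original receipt or order number.",
    ]
    return docs[:k]

def build_grounded_prompt(query: str) -> str:
    snippets = "\n".join(f"- {s}" for s in retrieve(query))
    # Give the model the relevant facts and an explicit way to say "I don't know",
    # so it doesn't have to guess (and hallucinate) an answer.
    return (
        "Answer the question using ONLY the context below. "
        'If the context does not contain the answer, say "I don\'t know."\n\n'
        f"Context:\n{snippets}\n\nQuestion: {query}\nAnswer:"
    )

print(build_grounded_prompt("How long do refunds take?"))
```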

While hallucinations are dealbreakers for many applications, they can be sufficiently curtailed to make GenAI usable for many more.