Answer.AI

A new kind of AI R&D lab which creates practical end-user products based on foundational research breakthroughs

RSS preview of the Answer.AI Twitter feed

2024-12-22 17:38:25

RT Jeremy Howard
Not too shabby...

2024-12-20 05:39:49

RT Wayde Gilliam
For those of you looking to get started with finetuning @answerdotai's new ModernBERT models with @huggingface Transformers ... I got you covered

https://github.com/AnswerDotAI/ModernBERT/blob/main/examples/finetune_modernbert_on_glue.ipynb

2024-12-20 01:05:13

RT Igor Carron
"We're gonna need a bigger graph"
Jeremy Howard: Post-credits easter egg:

Hey did you wonder what if we trained a bigger model? Where would that take us?

Yeah, us too.

So we're gonna train a "huge" version of this model in 2025. We might need to change the y-axis on this graph…

2024-12-20 00:43:38

RT Philipp Schmid
ModernBERT, BERT revisited in the age of LLMs and Generative AI! @LightOnIO and @answerdotai modernized BERT! Improved architecture with 8192 context length, flash attention, and trained on 2T tokens. ModernBERT outperforms both BERT and RoBERTa! 👀

TL;DR:
2️⃣ Comes in 2 sizes base (139M) and large (395M)
🚀 Better performance across all metrics than the original BERT
📏 8,192 token context length (16x longer than BERT)
⚡ Modern architecture with Flash Attention 2, RoPE embeddings, and alternating attention
📚 Trained on 2 trillion tokens, primarily English and Code
💨 2-4x faster than other models with mixed-length inputs
🔓 Released under Apache 2.0
🤗 Available on @huggingface and Transformers (main)
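The "16x longer" figure in the list above checks out against BERT's original 512-token limit:

```python
# Context-length comparison from the thread: the original BERT capped
# sequences at 512 tokens, ModernBERT at 8192.
bert_max_len = 512
modernbert_max_len = 8192
factor = modernbert_max_len // bert_max_len
print(factor)  # -> 16
```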

2024-08-02 02:07:08

RT Griffin Adams
Announcing Cold Compress 1.0 with @answerdotai

A hackable toolkit for using and creating KV cache compression methods.

Built on top of @cHHillee and Team’s GPT-Fast for torch.compile-able, lightweight performance.

Develop novel methods in as little as 1 line of new code.

2024-07-30 05:32:34

RT Jeremy Howard
Announcing FastHTML. A new way to create modern interactive web apps.

Scales down to a 6-line Python file; scales up to complex production apps.

Auth, DBs, caching, styling, etc. built in, replaceable, and extensible. 1-click deploy to @Railway, @vercel, @huggingface, & more.