Blog of Manas J. Saloi

A product leader who has held key product management roles at Gojek, Directi, Craftsvilla, CouponDunia, and Kore, responsible for product development and growth.

Throne ad

2025-06-19 08:00:00

I really like the idea of ThroneScience.

One of my favourite VC frameworks is: take what only the wealthy can get today and make it available to everyone. But what if we could offer people something that was once fit only for royalty?

I wanted to play around with this idea.

Throne brands itself as the Whoop for your poop. Imagine Apple designing a toilet-based health tracker.

The founder shared their slide deck yesterday, so I tried out a few ad ideas. The numbers are from their deck.

Throne ad 1

Throne ad 2

I tried two formats with slightly different copy and alignment. These exercises are part of my effort to improve my design and marketing skills.

Underutilised knowledge is the new underutilised asset

2025-06-18 08:00:00

A lot of VCs are excited by Scale’s outcome. Accel just made $2.5B as an early investor through Meta’s Scale AI deal. Everyone is talking about the insane success of Scale AI and the other labellers, including their revenue growth, but missing the real story.

It’s not about being a better Mechanical Turk. It’s about timing the supply waves perfectly.

3 waves of labelling marketplace supply:

  • Wave 1: generic labour. 
  • Wave 2: the COVID talent surplus.
  • Wave 3: the underutilised idle specialists.

Just like idle cars and homes created Uber and Airbnb, idle expertise (cultural, legal, financial) will now fuel a new supply wave in data labeling.

Like Uber or Airbnb, data labeling operates as a two-sided marketplace.

  • Demand: AI labs and product companies (copilot for X).
  • Supply: human labellers/evaluators, ranging from freelancers in India, Africa, the Philippines, LatAm, and elsewhere to domain experts with PhDs.

Samasource was one of the earliest players to work with these foundational models. It positioned itself as the “ethical” player in this space, sourcing labellers in Africa and focusing on ‘impact’.

Scale is the one that grew to massive ‘scale’.

There are a lot of players now, such as Surge AI, Invisible Labs, Turing AI, Deccan AI (based in Hyderabad), and Handshake, that have a distribution advantage and existing relationships with “demand”.

But I think there will be more players. Not just because of more demand; demand is proven, not just from foundational labs, which spend a lot, but also from enterprise clients (verticals such as insurance, telcos, copilots for X, and more). The differentiation will be in capturing specialised supply better, cheaper, and at scale. And there has never been a better time to source that.

Let’s start with the ‘supply’ side of marketplaces and how it changes over time. You initially overpay to attract supply. Uber drivers at one point were making six figures INR per month in India.

Drivers were so happy that they bought more cars on loan and started running their own fleet. The incentives were that good.

Scale AI and others paid generously to build their initial workforce. Good money for people in developing countries looking for flexible work.

There are some good examples of how they seeded the supply side in the book ‘Empire of AI’. Over time, your buyers (in this case, AI labs) start squeezing margins. They want cheaper, faster labeling. They want more from their spend. You have two choices: subsidise the cost or cut the incentives you generously provided earlier to your labellers. Labeller payouts declined over time. For basic tasks, you could always launch in a few more countries where people were equally desperate for work. Supply was abundant.

And the models became better over time. For basic tasks, you do not need labellers anymore. So you start going after specialised tasks.

For specialised tasks you need domain experts. Domain experts (supply) need a reason to come to your platform.

This is why we need to understand not just the demand side, but also supply.

I think it is the best time to start a data labelling company, if you can figure out how to gather differentiated supply, and assuming demand from the big labs persists.

The best marketplaces grow during economic crises. Not because they’re predatory, but because that’s when skilled supply becomes available. The 2008 recession gave us Airbnb and Uber. People needed money. They had idle assets (homes, cars).

Refer to Kevin Kwok’s underutilised fixed assets essay.

Covid created a similar moment for data labeling companies. Suddenly you had tech workers laid off in the first wave of layoffs, people working from home, professionals with time on their hands.

These were not labellers earning a few dollars per hour in the Philippines, but experts who could charge far more for their domain knowledge.

Surge AI capitalised on this. Their supply quality shot up overnight.

Scale could still provide you labellers at scale, but Surge positioned themselves as a more premium option, with higher quality experts.

Now we’re in another interesting moment.

AI is slowly displacing knowledge workers. This is the worst hiring market in a long time.

College graduates are unemployed.

Lawyers, analysts, domain experts are finding themselves with idle time.

Models don’t need basic labeling anymore. They need specialised knowledge.

Because they are getting better at general tasks but still struggle with domain-specific knowledge. Every AI company trying to build vertical copilots (for legal, for doctors, and others) needs this specialised labeling.

Supply that did not need these jobs earlier, considered them low status, and maybe even feared providing expertise to help train the models that would eventually replace them, now has no choice but to join these platforms and monetise its knowledge.

A corporate lawyer reviewing contract clauses. A Bengali literature professor annotating cultural references. A former Goldman analyst explaining financial models. All these folks will be helping frontier models with the remaining 10% needed to deploy fully specialised agents that replace human workers.

One might ask: If AI is displacing knowledge workers and creating a surplus of desperate talent, and if these same workers can command premium prices for specialised labeling, why would a “Copilot for X” pay top dollar for their annotations rather than hiring them directly at lower wages?

The answer lies in how AI companies optimise for revenue per employee. They aim to minimise operational overhead and avoid adding non-core roles to their payroll.

If a “Copilot for legal” company employed hundreds of lawyers, it would raise doubts about whether the company is truly leveraging AI to perform legal tasks. This is the scenario that invites satire like the Builder.ai story, thousands of Indians doing the work instead of any real “AI”, with “AI” standing for “Artificial Indians”. Ultimately, this means operationally heavy tasks will always be outsourced to trusted, high-quality marketplaces rather than handled in house.

While reading Tegus interviews, I have come across examples of labs paying $1,000 for labelling tasks.

Not just foundational model labs; even companies like Apple might go to Mercor and ask for 50-100 MDs to help train on their health data.

A copilot for legal doesn’t need 10,000 random people telling it about law. It needs 100 lawyers who can spot when its legal reasoning is flawed. A Gujarati literature professor is more valuable for cultural alignment than 1,000 Gujarati mechanical annotators.

Companies are innovating too. Players like Mercor are betting on a more flexible, pure marketplace model that is low touch and just connects demand with specialised supply. I think it is more scalable in the long term.

They are the anti-Scale AI.

Namma Yatri to Scale AI’s Uber.

They are more Tegus than GLG, if that makes sense.

Data labeling started with basic image tagging. Moved to text annotation. Now heading toward expert knowledge curation. The platforms that can attract and retain domain experts will win.

Smart founders would be building for this future now. Not competing with Scale on volume. But creating boutique marketplaces for specific expertise. Law. Medicine. Finance. Regional knowledge and nuance. Each with its own supply dynamics and quality requirements.

AI will be simultaneously destroying jobs while creating new ones: knowledge workers losing traditional roles but finding new work teaching AI their expertise.

Note 1: I’m using ‘labeling’ as a catch-all. In practice these firms ship full data-ops stacks: APIs, Python SDKs, in-platform QA, and RLHF/SFT workflows, and it is not easy to recreate all of this. I still think there are opportunities in this space, and Mercor is a prime example of a new kind of player that can enter late and capture significant revenue, even though players like Scale exist.

Note 2: One underutilised supply source: retirees. Retired doctors. Retired lawyers. Many retirees feel lonely and seek purpose. They don’t want full-time jobs, making them an ideal supply source for marketplaces needing specialised skills. Gig work on typical on-demand marketplaces tends to be low status and doesn’t use their specialised knowledge. Leveraging retirees in roles aligned with their expertise offers a much better alternative. This supply will not be as high intent as a specialised worker out of work. You can operate a marketplace with fewer but highly dedicated freelancers, almost acting as full-time employees, working longer hours and achieving higher utilisation, or you can have a larger pool of freelancers working fewer hours at moderate utilisation. Retirees will be the latter.
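
To put rough numbers on that trade-off, here is a minimal sketch in Python; all figures are invented for illustration:

```python
# Toy capacity math for the two supply models described above.
# All numbers are invented for illustration.
weekly_demand_hours = 1500  # expert-hours of labelling demand per week

# Model A: fewer, near-full-time freelancers at high utilisation.
dedicated_headcount = weekly_demand_hours / 30  # 30 h/week each
# Model B: a larger, retiree-style pool at moderate utilisation.
casual_headcount = weekly_demand_hours / 6      # 6 h/week each

print(f"Model A: {dedicated_headcount:.0f} freelancers at 30 h/week")
print(f"Model B: {casual_headcount:.0f} freelancers at 6 h/week")
```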

AI remix

2025-06-17 08:00:00

If everything is a remix, then how do you remix well:

  • I like using Mobbin for inspiration. I take screenshots of designs I like, then use ChatGPT to remix various images and styles. Once I’m happy with the output, I ask ChatGPT to break it down into layers, separating out each component, and then continue designing in Figma.
  • I use Gemini 2.5 Pro for its multimodal context capabilities. Claude can’t process video, and for most animations or screen transitions, video is key to conveying how something works. So I start with Gemini 2.5 Pro in AI Studio to build that video context, then use it to craft a more informed prompt for Claude Opus, which helps me create the final output. A rough sketch of this pipeline follows the list.
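
Here is a rough sketch of what that two-step pipeline can look like in Python, assuming the google-generativeai and anthropic SDKs, API keys in environment variables, and placeholder model ids and file names:

```python
# Sketch: build video context with Gemini, then reuse it in a Claude prompt.
# Assumes `pip install google-generativeai anthropic`; model ids are placeholders.
import os
import time

import anthropic
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# 1. Upload the screen recording and wait until the file is processed.
video = genai.upload_file(path="screen_transition.mp4")
while video.state.name == "PROCESSING":
    time.sleep(5)
    video = genai.get_file(video.name)

# 2. Ask Gemini to describe the animation in implementation terms.
gemini = genai.GenerativeModel("gemini-2.5-pro")  # placeholder model id
context = gemini.generate_content([
    video,
    "Describe this screen transition frame by frame: easing, timing, "
    "which layers move, and any opacity or scale changes.",
]).text

# 3. Feed that description into Claude as a richer, video-informed prompt.
claude = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
reply = claude.messages.create(
    model="claude-opus-4-20250514",  # placeholder model id
    max_tokens=2000,
    messages=[{
        "role": "user",
        "content": f"Recreate this animation as a React component:\n{context}",
    }],
)
print(reply.content[0].text)
```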

Truths, Religion, and Currency

2025-06-15 08:00:00

I was talking to a designer friend yesterday.

I was wondering whether I should get a Mobbin subscription.

I had one when I was working at my last company. I was not sure it made sense to buy a subscription when I am neither a designer nor someone who would use it regularly.

And I saw quite a few cheaper alternatives. So I asked my friend if I should just go for a knock off of Mobbin.

I also asked him why his company pays a premium for it when cheaper alternatives exist. His answer was that if existing apps change their UI, Mobbin will have the updated UI soon, while the cheaper alternatives will still be showing the same old design six months from now.

Mobbin isn’t just a collection of screenshots, it’s a live feed of an app’s design evolution. You pay because you have access to an ever increasing supply of design inspiration that is refreshed constantly.

That conversation got me thinking about data companies.

Why do some charge X while others command 5X the price for what looks like the same information?

The answer isn’t about having more data. It’s about having data that’s alive. We will talk about two ideas in this post:

  • Museums and news channels.
  • Truth vs religion vs currency (an idea I picked up from the Crunchbase CEO, who discussed Crunchbase’s moat in a recent podcast appearance).

There are really two kinds of data businesses: museums (with a fixed collection that never adds new art) and news channels.

Museums just preserve old artefacts.

Once you’ve seen the Mona Lisa as a user, you don’t need to check it again next week. The artefact becomes stale after the first visit. There is no new version of the Mona Lisa every month. Most data vendors, especially the copycat ones who just scrape the incumbent, are museums.

They scrape something once, file it away, charge admission forever.

Take a typical Crunchbase competitor for example. Company founding dates. Patent records. Historical funding rounds. Cap table.

Useful? Sure. But also easy to copy. And because they are copycats whose only differentiator is pricing, they can’t even update their artefacts or add new ones at the same velocity as Crunchbase.

News channels are different. They matter because what they show today is different from yesterday. No one turns on the television to catch yesterday’s news.

Museums sell truth in the past tense. News channels update you every day on the latest ‘truth’ or ‘facts’ and give you signals about the future. Think of the weatherman telling you what the weather will be today so that you can be better prepared.

You tune in because today’s truths aren’t yesterday’s. They are fresh. They are evolving.

Static data is a race to the bottom. Someone will scrape your museum and build a cheaper one next door. First mover charges X/month. Six months later, five competitors offer the “same” data for 1/5th the price. Live data that is updated constantly is different.

Here’s what makes data alive:

  • It changes faster than competitors can copy it.
  • Missing an update breaks something downstream.
  • Users need it fresh, not just accurate.

Facts or truths aren’t truly defensible long term, especially with AI eating the world.

Yes, you can have some unique data that competitors do not have, but they can often automate similar pipelines or license the same sources. Without barriers such as exclusive capture rights, strong network effects, or switching costs, truths (stale or live) by themselves may not sustain a price premium.

Now coming to Truths vs Religion vs Currency.

Religion is prediction.

Excerpts from the Crunchbase CEO’s appearance on the World of DaaS podcast:

“Can I predict with any sort of accuracy what we think a future valuation might be, for instance? So we’re talking about like, how do we do valuation predictions? And that’s where Crunchbase actually does the valuable thing that you always wanted to do. Because when you were coming to Crunchbase before, those are the questions you were asking and we’re just giving you like one data point saying, well, here’s the last funding round. And you would have to sort of figure that stuff out on your own. Now we’re using thousands of feature vectors that go and figure it out.

There’s religion, which is what will happen in the future, which is what you were just talking about. Here’s the companies that will be selling, here’s the companies, there’s some sort of prediction. It sounds like what you’re saying is that religion will be more valuable than truth. It has to be.

The tricky part of all that, of course, is tell me any prediction engine you’ve ever seen that’s good. They’re terrible. Almost all of them are terrible. That’s changing. In our case, we have 95% precision. It’s hard to even believe it’s that good because we have all these little secrets that make it easy to figure out, but no one else has access to them. They don’t think that way.

The user has to start believing it and the user has to say, I’m going to put my business future into the hands of a company that’s predicting stuff when I have a history of not believing in predictions. Like real religions, if you believe a certain thing and someone else will think you’re going to hell, you say you’re a Bayesian, I hate Bayesians, well then it’s off right there.

And the only way that any of these religion companies have ever become big companies is if they’ve crossed the chasm from religion to currency. And then it’s like, well, people actually don’t really believe in the religion, but it’s priced like a FICO. Maybe people don’t even believe in the FICO score, but it’s the currency. It’s in every single loan. So therefore, you have to trade in FICO scores. It’s almost like the US dollar in a way. And so therefore, the company becomes super valuable.

How do you go from religion, where you could have thousands of religions to, okay, this is the canonical one that we’re all gonna agree with. You don’t want to have thousands of competitors either. At the end of the day, it does come down to prove us in the pudding. The nice thing about predictions is someday they come true. So I can show, it’s not just a faith-based thing then. It is now this thing like, look, we have proven time and time again.

In our case, how we talk about it is, in April, 600 of our predictions came true. So we were able to go and say, these companies got acquired and they got funding. What was that worth? Because you missed it, what did you miss out on that? For you doing deals, if you miss out on the hot company, that company sells for $5 billion, zero ROI on not buying Crunchbase is very, very clear. If we do get competitors in that space, we’ll go and say, well, here’s our accuracy rate in the last six months, let’s compare, let the best one win. And that’s where we feel really good because no one else has any in the now data, unless they’re somehow getting into the process itself before it happens, which is unlikely to happen.”

(Any error in the above transcription & interpretation is mine.)

Back to this post.

Tegus is another successful data play. Started as an archive of expert call transcripts. Sounded like a museum. But investors schedule new calls every week. Tegus adds these fresh transcripts almost daily. Miss a month, and you miss the latest insights on things that matter to you as an investor. It is a living feed because users themselves demanded fresh insights.

I believe the strongest moat sits at the intersection of two things: predictions that refresh continuously and can become currency.

Currency > Religion > Live truths > Truths.

From “This company has raised funding” to “This company might raise funding” to “This company’s probability of raising in the next 3 months just went from 50% to 80% this week, based on its hiring velocity and positive feedback from Tegus experts.”
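
As a toy illustration of such a living score (the features, weights, and numbers are entirely made up, not anyone’s actual model):

```python
# Toy "religion" layer: a raise-probability score that refreshes as signals land.
# Features and weights are invented for illustration, not a real model.
import math


def raise_probability(hiring_velocity: float, expert_sentiment: float) -> float:
    """Logistic score for 'raises in the next 3 months'."""
    z = -1.5 + 2.0 * hiring_velocity + 1.2 * expert_sentiment
    return 1 / (1 + math.exp(-z))


# Last week: modest hiring, lukewarm expert chatter.
print(round(raise_probability(0.63, 0.20), 2))  # ~0.50
# This week: hiring spikes and expert calls turn positive.
print(round(raise_probability(1.00, 0.75), 2))  # ~0.80
```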

Building a successful data business will be determined by:

  • You capturing data at the source.
  • The data being accurate, and hence the ‘truth’.
  • The data being captured in a scalable way.
  • Ideally multiple connectors/sources from which you pull the data, so that you are not dependent on one connector.
  • The data being constantly refreshed, time-series data, not stale data, so that if someone scrapes your data, their copy goes stale while yours stays current.
  • Transparency around data. “Last updated 2 days ago” builds more trust than vague claims of “comprehensive, up-to-date data.”
  • Predicting (religion) and helping key decision makers. You need to sell decisions, not just data.
  • Proving your predictions work. Your prediction scores will become a currency only if they work. If your predictions help customers make more money, they’ll keep paying. Show the ROI (a minimal version of this accounting is sketched below).
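
A minimal sketch of that proof, in the spirit of the “600 predictions came true” pitch; the records below are invented:

```python
# Sketch: score a month of predictions against outcomes. Records are invented.
from datetime import date

# Each record: (predicted_on, claim, came_true)
predictions = [
    (date(2025, 4, 2), "Acme raises a Series B within 3 months", True),
    (date(2025, 4, 9), "Globex gets acquired within 6 months", False),
    (date(2025, 4, 20), "Initech raises a seed round within 3 months", True),
]

april = [p for p in predictions if date(2025, 4, 1) <= p[0] <= date(2025, 4, 30)]
hits = sum(1 for _, _, came_true in april if came_true)
print(f"April precision: {hits}/{len(april)} = {hits / len(april):.0%}")
```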

There is another reason I have been digging into this topic. Every venture capitalist says that proprietary data is the key moat in today’s AI world.

You might remember some of my research prompts about finding overlooked proprietary data sources. I’ve been focused on this over the past few months. Tegus has become one of my favourite products, and I often wish I had built something similar. Tegus commands pricing power, charging $20,000 per subscription because its insights help VCs and hedge funds confidently deploy tens of millions.

Recently, a founder urged me to copy Tracxn and rebuild it for the AI age, claiming that deep research and AI agents now make it easy to gather information and launch a data platform.

However, I believe that simply scraping Tracxn’s database will never be enough to create a lasting data business. To succeed, you need a richer plan. This post shares my research on how to do that.

Synthetic paneer

2025-06-14 08:00:00

The synthetic paneer scandal was big on X dot com a few months back.
People were freaking out over food adulteration.

But did this actually change how people order food?

Every food delivery company has this goldmine of data sitting right there. They know who’s vegetarian based on order history. They can see if you suddenly stopped ordering paneer makhani after those videos went viral. They can track if you ever came back.

What I want to know is whether that outrage led to action and long-term behavior change in people.

The data analysis would be straightforward. I am sure all food delivery apps already tag users as vegetarian or non-vegetarian based on their order history. They should look at ordering patterns before and after the scandal.

If there was a dip, how big was it? Furthermore, they should examine order frequency across different cohorts, basket value (maybe people order fewer items), and the specific dishes where users suspect adulteration.
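
A minimal sketch of that pre/post cut in pandas; the orders table, column names, and scandal date are hypothetical stand-ins:

```python
# Sketch: paneer ordering before vs after the scandal, by veg/non-veg segment.
# Column names (veg_segment, order_value, dish) and the date are stand-ins.
import pandas as pd

SCANDAL_DATE = pd.Timestamp("2025-02-15")  # placeholder

orders = pd.read_csv("orders.csv", parse_dates=["order_date"])
orders["is_paneer"] = orders["dish"].str.contains("paneer", case=False)
orders["period"] = orders["order_date"].lt(SCANDAL_DATE).map(
    {True: "pre", False: "post"}
)

summary = orders.groupby(["veg_segment", "period"]).agg(
    orders=("order_date", "size"),
    paneer_orders=("is_paneer", "sum"),
    avg_basket=("order_value", "mean"),
)
summary["paneer_share"] = summary["paneer_orders"] / summary["orders"]
print(summary)  # the dip = pre vs post paneer_share, per segment
```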

Did they switch categories? Maybe non-vegetarians who ordered veg items from time to time stopped ordering veg items? Perhaps they still ordered veg items, but not paneer?

For different segments: Heavy, Medium, and Light food delivery users, whose behavior changed the most? Did people replace paneer with tofu, mushroom, soy, or other vegetarian alternatives, or did they stop ordering (cut ordering frequency) for a while?

Did these food delivery platforms measure, in the weeks that followed, the percentage of users who returned to old patterns, the percentage who never came back, and the percentage who stayed off restaurant food but migrated to groceries and cooking at home?

People who were heavy users, say ordering five times a week, cannot suddenly change their behavior and replace all these orders by cooking these meals. Convenience usually wins. If they hired a cook, and in India you can always get a cook, did they drop off completely from these platforms or did they continue to order occasionally?

Did their behavior shift from ordering from new restaurants to older, more established ones where trust is higher? Are they now ordering more from higher priced restaurants, assuming that better quality ingredients are used and the risk of adulteration is lower?

Did cloud kitchens suffer more than established restaurants? Did some places advertise “100% pure paneer”? Maybe they should have.

Now comes the most interesting part.

Most of these delivery apps aren’t just food delivery anymore. They’re super apps with grocery delivery. Swiggy has Instamart. Zomato has Blinkit. Zepto has Zepto Cafe alongside its core grocery service. So when someone stops ordering cooked paneer from restaurants, do they start buying raw paneer from the grocery side?

Do they trust Amul paneer from the grocery section of the super app but not restaurant paneer?

Did trusted labels (Amul, Mother Dairy) see higher grocery sales than generic paneer did? Do people lose trust in the in-house brands of these super apps?

Even more interesting: what if more and more systemic gaps don’t lead to outrage, but instead lead to people just giving up? What if there’s a threshold after which people stop caring entirely, and start accepting it all as “the cost of living in India”?

A bridge collapses. A plane crashes. Food adulteration hits the news again. People panic, but for a bit. They complain on Twitter. They protest by posting memes. And then? They go into “fuck it” mode because they feel like they have no control. And when you feel powerless long enough, you don’t resist, you rationalise. “Everything is broken anyway.” Why care so much? Why not just order that Paneer Manchurian from the nearby place at 30% discount? They order from the same old restaurants again. They eat the same cheap adulterated paneer. Life goes on.

What if it’s not convenience that brings people back to their old behavior? It’s resignation.

So yeah, I want to see how long it took for user behavior to go back to baseline, and for how many people ordering frequency changed permanently.

I think people have short memories. My hypothesis is that heavy users come back fastest. They’re addicted to convenience. Light users might stay away or cut ordering frequency permanently because they have alternatives.

Maybe people didn’t stop ordering. They just switched items. No more paneer butter masala, but the chicken tikka orders went up. That’s not a trust issue with the platform. That’s a trust issue with a specific ingredient.

The cross-platform data would be fascinating. If someone stops ordering food from the food delivery product but their grocery orders go up, the platform did not lose a customer; they just shifted spend. But if both drop? Then it could be a trust problem with the ingredient or how the sourcing was done, or maybe it is a brand problem. And these super apps need to regain trust.
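
As a rough sketch of that two-by-two, with hypothetical per-user spend deltas (post minus pre scandal):

```python
# Sketch: classify users by how food vs grocery spend moved after the scandal.
# The deltas below are invented; only the signs matter.
def classify(food_delta: float, grocery_delta: float) -> str:
    if food_delta < 0 and grocery_delta > 0:
        return "shifted spend"   # the super app kept the customer
    if food_delta < 0 and grocery_delta <= 0:
        return "trust problem"   # both dropped: ingredient, sourcing, or brand
    return "unaffected"


users = {"u1": (-800, 500), "u2": (-600, -200), "u3": (50, 0)}
for uid, (food, grocery) in users.items():
    print(uid, classify(food, grocery))
```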

What about the people who complained loudly on social media? Did their behavior actually change? (Food delivery apps won’t have this data though.)

This kind of analysis could influence product strategy. If people trust groceries more than restaurants, maybe you push the grocery product harder during food scandals and promise high quality ingredients. This is where having a super app helps. If certain restaurant brands maintain trust, maybe you highlight them more prominently.

Now, if only these super apps had payments integrated too, like some Chinese and Southeast Asian super apps. I would have loved to see whether spend moved offline: people ordering less, be it food or groceries, but going out to eat more, where they could physically watch their food being made.

Does it mean that Darshinis in Bengaluru now attract more people?

Running this analysis would not be hard. I would definitely have done it if I were a PM at one of these super apps.

Victor Lazarte’s AI investing framework

2025-06-13 08:00:00

Victor Lazarte, Benchmark: This is the biggest technology shift we have ever seen. The revenue ramps are incredible. Take Mercor: I invested nine months ago, and they just announced a 100x jump to a $75 million run rate, still growing 50 percent month over month. You do not see that every day.

But fast growth alone is not enough anymore. In the past, hitting $10 million ARR meant you were safe. Today the pace of change forces you to ask whether that revenue is durable.

At our partner meetings we keep seeing startups at $5 to $10 million ARR that may not exist in two years. We always ask, “If foundation models get ten times better, does this business become stronger or weaker?”

Many thin wrappers evaporate when the next model release absorbs their value. Imagine an app that formats building-permit paperwork with ChatGPT. It can reach a few million in revenue, but it disappears once the base model folds that feature in.

That single question about a ten-times-better model is very clarifying.

Our job as investors is less about predicting the far future and more about understanding the present. We focus on the areas where AI performance is improving fastest. That is easier than guessing distant scenarios.

We track benchmarks: Where are the eval scores rising most quickly? Then we invest there. The pattern is clear: AI improves fastest where you can measure the output objectively.

  • Code is the best example. You compile and test, so feedback is instant. Seed investing in coding tools is probably closed now, but we are glad we got in early.
  • Other text-based fields with clear exams or verifiable outputs—law, medicine, and so on—show the same rapid gain, which is why I backed two startups in AI video as well.
  • Domains without tight feedback loops, like household robotics, will take much longer.

So our framework is simple:

  1. Spot where benchmarks are improving the fastest.
  2. Ask whether those improvements make the business more or less durable as models advance.