Published on November 22, 2024 8:53 PM GMT
On the heels of Donald Trump’s election and his promises to end the Department of Education, you may have seen claims like these spreading around X.
This claim is based on two datapoints. The first is the literacy rate of around 99% in 1979, as measured by the US Census. After the Department of Education was created that same year, the Census stopped measuring literacy in its surveys, and literacy has since been tracked by the National Center for Education Statistics (NCES). The tweet’s second number comes from a recent NCES result showing that around 16% of sampled Americans are at or below level 1 English literacy.
The problem is that these claims compare two completely different standards of literacy. The census measure of illiteracy is defined as:
The inability to read and write a simple message in English or in any other language. Illiteracy, in this sense, should be clearly distinguished from *functional illiteracy,* a term used to refer to persons who were incapable of understanding the kinds of written instructions that are needed for carrying out basic functions or tasks.
- Source
If you can write a few words or even just your name in any language, this census measure will count you as literate.
The more recent NCES data point is a measure of English functional literacy which they define as:
The ability to understand, evaluate, use and engage with written texts to participate in society, to achieve one’s goals, and to develop one’s knowledge and potential.
. . .
English literacy [means completing] tasks that require comparing and contrasting information, paraphrasing, or making low-level inferences.
- Source
So the more recent data shows a lower literacy rate because it sets a higher bar: you need more reading comprehension to count as literate, and you need to know English specifically. No conclusion about how literacy has changed over time can be supported by comparing these two data points.
There are long term assessments of literacy that we can compare over time. Scores on the Long-Term Trend reading assessment from the NCES have been essentially flat since 1971.
So the claim that literacy rates have fallen substantially since the Department of Education was founded is false.
The real data on education is not as bad as collapsing literacy rates, but it is more than bad enough to merit removing or reforming the Department of Education.
Inflation-adjusted spending per pupil has tripled since 1970 while reading scores haven’t budged.
There has also been an astounding amount of credential inflation. The amount of time people spend in school has increased by more than three years since the 1970s as more people graduate high school and college, but performance on tests of skill or human capital is completely stagnant.
This suggests, à la Bryan Caplan’s The Case Against Education, that many of these extra years of schooling are actually a socially inefficient zero-sum competition: it pays individually to get the most schooling and come out on top of your peers, but everyone would be better off if people invested less time and money in competing. Hundred-billion-dollar subsidies to student loans and higher education institutions have exacerbated this zero-sum race for little material gain.
Evidence for this: The NCES ran two rounds of a literacy test, one in 1992 and one in 2003. The overall average score on the test didn’t change (276 vs 275 out of 500), but within every educational attainment group scores dropped massively.
High school dropouts got less literate on average because the highest-scoring dropouts of the 90s became the lowest-scoring graduates of the 2000s as standards were lowered and more students were pushed through into more education. Literacy scores among graduate degree holders dropped by 13-17 percentage points in a decade. If a graduate degree cannot even teach you how to read, it’s probably not having large effects on any other, more complex forms of human capital.
This means that across this decade of rising educational attainment, no one improved their reading skills at all. Instead, the standards for graduating from each level of schooling were just lowered and people spent more years slogging through high school or college.
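To see how every group can score worse while the overall average stays flat (a composition effect, i.e. Simpson's paradox), here is a toy calculation with made-up numbers, not the real NAAL data:

```python
# Toy illustration: shifting people from a low-scoring group to a high-scoring
# group can hold the average flat even as both groups' scores fall.
year1 = 0.30 * 250 + 0.70 * 287   # 30% dropouts at 250, 70% graduates at 287
year2 = 0.15 * 235 + 0.85 * 282   # fewer dropouts, but BOTH groups score lower
print(round(year1), round(year2))  # 276 275 -- flat average, every group worse
```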
The NCES hasn’t repeated this test and I couldn’t find breakdowns of scores by educational attainment over a longer period of time, but this trend of rising educational attainment due solely to lowering standards rather than rising ability has almost surely continued.
There are more issues one could cover here: subsidizing the student debt crisis, rewarding useless degrees at the expense of productive ones, and promoting the DEI ideology of the education profession.
But the main point is that fabricating data about the state of education in America is a terrible basis for reform. It’s also unnecessary given how dire many parts of our education system actually are.
Published on November 22, 2024 8:53 PM GMT
I've been thinking about something I will clumsily call paraddictions: desires that don't quite rise to the level of an addiction, but that have a disproportionate and hard-to-moderate influence over your behavior. Can they be used as a tool for motivation and behavior change?
epistemic status: Most of the ideas here are generally solid behavior change principles that I'm just applying to a specific type of situation. The larger thesis of "this kind of strategy works and might be generalizable" is one I have less evidence for, and experimenting with it could have significant downsides.
In 2011 my life wasn't going well. I was doing a postdoc in which I felt cynical yet overwhelmed, and thanks to depression and academic brainwashing, I felt like I wasn't good for anything else. I was spending most of my workday taking naps and playing flash games[1], and the high point of my day was when the clock hit 19:00 and my daily turns in Kingdom of Loathing would refill.
KoL is a turn-based HTML MMO[2] and I was deeply into its strategy, seasonal events, and community. You only get a limited number of turns per day, and if you miss them, they're gone.[3] I was intensely motivated to play those turns and missing a day felt like it would be a disaster.
One day, I'm not even sure why, I said to myself, Michael, you are going to get three hours of actual work on the diabetes study done today, or you're not playing KoL tonight. Michael, to his credit, accepted the challenge.
If you deal with depression or ADHD, you might not be shocked to hear that getting myself to start work was incredibly painful, and that once I started it went pretty well. I made honest-to-god progress and played my turns with extra gusto. I renewed the vow the next day, and kept adapting and expanding it over the weeks and months that followed.
There were a few days when I got distracted and didn't hit my goals. Not getting to play felt so unfair and awful, I just wanted to throw a tantrum. And after each one I tried to make hella sure it never happened again.
My life is better now! I eventually drifted away from KoL. I still use other rewards and scores and systems to keep myself on track, but nothing has ever motivated me like those daily KoL turns did. I'm not sure how I'd have been able to start working my way out of my hole without a reward I wanted so badly. Is there a way to find other motivators like that -- or at least to put them to use if you have them?
Was my relationship with Kingdom of Loathing unhealthy? It met most of the non-biochemical criteria for substance use disorder from the American Psychiatric Association. The most interesting ones were "cravings and urges", "continuing to use, even when it causes problems in relationships", and "continuing to use, even when you know you have a physical or psychological problem that could have been caused or made worse by the substance". But another way of looking at those patterns is that I had a disproportionately powerful reinforcer on my hands, one that could push my behavior in directions I wouldn't take from purely endogenous motivation.
In everyday language people usually refer to this as an addiction, and I think most people would say an addiction is something that by definition can't be controlled or put to constructive use. I don't see any value in arguing that point, so I'll just call the things I'm talking about paraddictions.
How can you identify if you have a paraddiction like this in your life that might be useful for changing your behavior? Some proposed criteria:
You know what else is a disproportionately behavior-shaping response that people engage in even when it's harmful? Ugh fields. Procrastination. Depressive withdrawal. Maybe it takes something equally irrational to get past them.
My process only worked because even though I wanted to play KoL so badly I'd push myself in new and insane ways, I was also able to deny it to myself.
To some extent that's just me -- I've always been hyper-responsive to gamification, rules, and reward/punishment systems. But I think there are a few features of what I did that could be generalized:
It also illustrates a couple of general good motivation hacking / gamification practices:
Q: For example, if I had been binge drinking every day, the best way to make progress on my work would have been to binge drink less.
A: That might be a fake alternative. If it wasn't in my power to immediately and completely stop drinking, maybe transforming it into a reward would help me bootstrap my way to doing better and drinking less.
On the other hand, that also sounds like a bullshit excuse that someone would use to keep binge drinking when they might really be capable of cutting back. I do see the risk. My advice would be:
First of all, definitely not. I'm not sure if anyone should ever take life advice from me, but especially not if it's about a behavioral pattern that could go very badly wrong.
I have toyed with doing this from time to time, taking up dumb little mobile games with brain-hijacking reinforcement loops just so I'd have something to reward myself with. But I don't even know how I'd find something I wanted as badly as I wanted to play KoL! Overall, this advice applies best if you already have a paraddiction in your life.
A third option might be to take something it's normal to want desperately and hold that back as a reward. Food? Sex? Sleep? If you fail to hit your goals rarely enough, missing out won't do too much harm. But that feels fundamentally inhumane in a way that withholding a non-necessary reward doesn't.
I wonder if paraddictions naturally appear when they might be useful and go away when they're not. That is: If life sucks (in at least one domain), it's easy to develop a dependence on whatever lets you escape. If your life improves, that almost by definition means that you're developing options that are close to your paraddiction on some combination of reward and meaningfulness. Then you're less likely to be dependent on any one thing. That again would suggest that this advice is most applicable if you already have both an existing paraddiction and a lot of room to make improvements in your life.
I got tremendous benefits from taking a compulsive behavior and regulating it for use as a motivator to do other things. I'm not sure if it's possible, or desirable, to engineer situations like that but if there's already something that you find disproportionately and intensely reinforcing, you might be able to put it to use.
This advice is potentially dangerous depending on the nature of your (para)addiction and its current effects on your life, so you probably should not listen to me without hard consideration and input from others.
I would love to hear if anyone else has tried something like this, whether they think it worked well, and how they responded if the motivator started losing its power.
If you weren't around for the golden age of games made in Adobe Flash, substitute "mobile games"
Also notable for peak Gen-X-ironic stick figure art, They Might Be Giants references, and a 15+ year catalog of interacting mechanics that makes Magic: the Gathering look tastefully minimalist.
KoL players, this is a simplification but you know what I mean.
Published on November 22, 2024 8:11 PM GMT
In software design, several key principles share a common foundation: the concept of factoring, or breaking complex problems into smaller, manageable parts. Three prominent examples are:
These principles, though known by different names, all rely on factoring to create efficient, reliable, and maintainable systems. Even when developers aren't sure how to factor a problem, they often intuitively recognize that it should be factored: this notion of factoring has a fundamental role in effective software design.
In this post, we consider how this idea of factoring applies to mathematical theorem proving. Similar to the Unix philosophy, there is a sort of factoring that takes place in mathematics: breaking down difficult conjectures into simpler lemmas. But just as in software, collaboration is crucial for solving these problems, with mathematicians often building upon one another’s work, even without full awareness of all the details involved. This is reminiscent of Milton Friedman’s Lesson of the Pencil: the wood chopper on one side of the world has nothing in common with the graphite miner on the other, and neither of them knows how to make a pencil by themselves. Yet their incentives fall such that pencils get manufactured as a global project.
Just as in software and in the manufacturing of a pencil, success relies on collaboration and breaking problems into manageable parts. To that end, we present a first attempt at an incentive structure to encourage collaborative theorem proving, inspired by a recent concept called Plausible Fiction.
In a recent Topos Institute blog post, Spivak proposed a general theory of change, which he called Plausible Fiction. A work of plausible fiction has four requirements:
The mechanism that makes plausible fiction work is gap-filling. To collaborate on a work of plausible fiction, one finds a gap in the story and fills it in with more plausible fiction. Eventually, the gaps get small enough that they can be fulfilled easily, e.g. “the bottle of water is carried out of the store and given to Alice” is only fiction until Bob carries the bottle of water out to Alice. Participants with different strengths can collaborate to make the changes they wish to see in the world.
The mathematics community is also one with a great diversity of strengths. Consider the Liquid Tensor Experiment, in which a global community of mathematicians, led by Peter Scholze, worked with the Lean proof assistant to verify many complex and interrelated proofs. Each contributor specialized in different areas, reminiscent of the modular components in software, bringing unique skills to tackle specific lemmas. The experiment's success showed how modern mathematical proofs can be global projects, with blueprint software helping to organize disparate contributions into one verified whole.
Just as blueprint software organizes collaboration in formalizing conjectures, Plausible Fiction enables participants to envision and construct desired futures by progressively filling gaps. Here, we aim to merge these ideas, using Plausible Fiction as a framework for collaborative mathematical theorem proving, where larger conjectural gaps are filled by simpler ones and eventually proven.
Indeed, plausible fiction and mathematical conjecture have a lot in common. When a mathematician thinks that a statement is likely to be true, or at least plausible, they publicize it as their conjecture. This gives an incentive for other mathematicians to prove it, though sometimes their proofs are based on further conjectures. For example, Bill Thurston proved that Poincaré's Conjecture about 3-dimensional spheres followed from his own Geometrization Conjecture, which Perelman later proved. And some of Perelman’s lemmas were considered difficult enough that other professional mathematicians had to complete the proofs.
One could say that a mathematician’s skill is in factoring conjectures into simpler ones: “You want to prove A, do you? Well, I can prove A for you, but only if someone can find a proof of B, C, and D”. Once the conjectures get simple enough, they’re just carried out by hand, as Bob did with the water bottle in the story above. In the Mechanism section below, we’ll give a simple data structure and process that could serve as a backend for a platform. We’re calling this the plausibly factored platform, which we believe will encourage theorem proving. In the Outlook section we’ll explain the sense in which it is very much a work in progress.
Past work has provided some market designs for incentivizing logical steps and mathematical proofs. Demski previously suggested a market in MathBucks. In this system, mathematicians are rewarded not only for proofs but for contributing to understanding with intermediary explanations. DaemonicSigil rightly points out, “Simply directly declaring a $100 million reward for a solution would probably not work. For one thing, there's the issue of corollary-sniping where the prize wouldn't give anyone an incentive to publish solutions to hard intermediate steps of the problem, since the prize as a whole only goes to the one who solves the entire problem as a whole”. DaemonicSigil proposes a mechanism in which traders can buy truth for a dollar and get false for free, but can trade any logically equivalent expressions with each other on the open market. Ought’s factored cognition research also touched on these ideas, especially the relay programming experiment.
DaemonicSigil’s design relies on a market maker selling Truth and Falsum symbols that parties can use to create logical “goods” to trade with each other. We are concerned that the overall transparency of collaboration between parties is lower in this setting, since they can profit off of each other’s follies, and alignment is lower.
What we propose here is only a first pass at a platform, and we expect it to be refined as time goes on. Indeed, this is how plausible fiction is meant to work. Our job is only to make the ideas more plausible and leave gaps for others to fill. Let’s begin by fixing a proof assistant, like Agda, Coq, Lean, etc., as well as an associated Knowledge Base of proven theorems.
The primary data type that is relevant here is the following record type:
Contract := { BuyerName : String, PleaseProve : Type, Bounty : ℝ>0 }
This is fairly self-explanatory: whoever wants a conjecture to be proven is called a buyer. They formulate a type in the proof assistant, and they offer a certain bounty for it to be proven. To submit their request, they pay the money to the platform, at which point it is in escrow, meaning they cannot rescind it. (Perhaps they can give some time-bound, e.g. “if this isn’t proven by 2025, then I rescind my offer”, but we won’t address this issue in the present post.)
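To make this concrete, here is a minimal Python sketch of the contract-and-escrow step. The names (`submit_contract`, `escrow`, `public_market`) are ours and purely illustrative; in the real platform, PleaseProve would be a type checked by the proof assistant rather than a string.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Contract:
    buyer_name: str
    please_prove: str   # a type in the proof assistant; here just its source text
    bounty: float       # must be positive

escrow: dict[str, float] = {}        # funds held by the platform, per buyer
public_market: list[Contract] = []   # the time-varying list of open contracts

def submit_contract(c: Contract) -> None:
    """Buyer pays the bounty into escrow; the contract cannot then be rescinded."""
    assert c.bounty > 0
    escrow[c.buyer_name] = escrow.get(c.buyer_name, 0.0) + c.bounty
    public_market.append(c)
```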
The game is played by proposing factorizations of existing conjectures. A factorization can be visualized as a tree: the conjecture T at the root, with sub-conjectures T1 and T2 as children that together imply it.
Here, T might be a conjecture such as A & B ⇒ C & D. Another party, say Alice, might propose that T1 and T2 imply T, where T1 is A ⇒ C & E and T2 is E & B ⇒ D. Now T1 and T2 are smaller gaps that could be filled by someone else. This is Alice’s “alpha”, i.e. her private intellectual property. So contributions of this sort constitute a second factor we need to include in the market.
The market itself consists of a public market and private ledger:
MarketState := PublicMarket × PrivateLedger
Let’s start with the public market. It is just a time-varying list of contracts:
PublicMarket := List(Contract)
For example, in 1904 the public market state was perhaps as follows:
[...,
(
Fermat,
“∀(a,b,c : ℕ≥1). ∀(n : ℕ≥3). (aⁿ + bⁿ = cⁿ) ⇒ False”,
$357
),
(
Poincaré,
“Every simply connected, closed 3-manifold is homeomorphic to a 3-dimensional sphere”,
$99
),
...]
meaning that someone named Fermat had a conjecture about natural numbers on the books, for which the bounty was $357, and someone named Poincaré had a conjecture about 3-manifolds on the books, for which the bounty was $99. In fact, what we wrote for Poincaré’s contract is not quite valid: it should compile as a type according to a given proof assistant. Let’s refer to the compiling type as PoincareConj.
Similarly, the private ledger is just a list of private entries
PrivateLedger := List(PrivateEntry)
where a private entry is the following data type:
PrivateEntry := {
ContributorName : String,
Premises : List(Type),
Conclusion : Type,
⍺ : Premises -> Conclusion }
The market data structure maintains the private ledger, but does not publish it; we’ll discuss reasons for privacy below in the Analysis section. The proof term ⍺ is verified by the theorem prover as a term of the given type, Premises -> Conclusion. When the Premises list is empty, we say that this entry is closed, because we have a closed term (in the sense of Type theory) of the Conclusion type.
At any time, a user can either submit a contract or a private entry to the API. Upon submission, the market processes the entry and triggers the appropriate actions. The growing public Knowledge Base also functions like a hint database, as in Coq or Lean. As it grows, the power of trivial mechanistic automations grows with it. So, for example, basic inference rules like modus ponens should be in the hint database at initialization, whereas more involved knowledge like Thurston’s GeomConjecture -> PoincareConj, seen below, can only enter the lexicon after some market activity.
Each private entry acts as a trigger. Once all its premises are in the associated Knowledge Base, its Conclusion has been proven, and two events are triggered. First, any contract for which PleaseProve = Conclusion is cleared: its Bounty is paid to ContributorName and it is removed from the PublicMarket list. Second, ⍺ is itself added to the Knowledge Base and thereby becomes public. (If multiple contributors have private entries whose last premise is satisfied at the same time, and these entries simultaneously satisfy the contract, the bounty is split equally among all such contributors.)
A very special kind of contribution is that with an empty list of premises (a “thunk”, or a proof from unit). This is in fact a theorem of type Conclusion. As above, any contract with PleaseProve=Conclusion is cleared, and Conclusion is added to the knowledge base. In this clearing action, truth “bubbles up” through the market triggering payouts.
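Continuing the Python sketch above, here is a minimal version of this trigger-and-clear loop. Again, the names (`PrivateEntry`, `pay`, `step`) are ours; a real implementation would have the proof assistant verify ⍺ against its premises and conclusion at submission time.

```python
from dataclasses import dataclass

@dataclass
class PrivateEntry:
    contributor_name: str
    premises: list[str]   # types that still need proofs
    conclusion: str       # the type this entry proves once all premises are proven
    alpha: str            # proof term of Premises -> Conclusion (assumed verified)

knowledge_base: set[str] = set()
private_ledger: list[PrivateEntry] = []

def pay(buyer: str, contributor: str, amount: float) -> None:
    escrow[buyer] -= amount
    print(f"{contributor} receives ${amount:.2f} from {buyer}")

def step() -> None:
    """Let truth 'bubble up': fire every entry whose premises are all proven."""
    changed = True
    while changed:
        changed = False
        fired = [e for e in private_ledger
                 if e.conclusion not in knowledge_base
                 and all(p in knowledge_base for p in e.premises)]
        for conclusion in {e.conclusion for e in fired}:
            winners = [e for e in fired if e.conclusion == conclusion]
            for c in [c for c in public_market if c.please_prove == conclusion]:
                # Clear the contract, splitting the bounty on simultaneous wins.
                for w in winners:
                    pay(c.buyer_name, w.contributor_name, c.bounty / len(winners))
                public_market.remove(c)
            knowledge_base.add(conclusion)  # alpha becomes public at this point
            changed = True
```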
For example, Thurston could submit a private entry
(Thurston, [GeomConjecture], PoincareConj, 3DMKGHG)
where 3DMKGHG refers to his work in Three-Dimensional Manifolds, Kleinian Groups and Hyperbolic Geometry, or a Lean/Coq compiling version thereof. He could simultaneously offer a contract:
(Thurston, GeomConjecture, $90).
Then Perelman comes along and proves GeomConjecture (let’s assume that Perelman proves it completely, not needing any help from Xiping and Huaidong, etc.) in a private entry
(Perelman, [], GeomConjecture, EFRFGA)
where EFRFGA consists of some compiling theorems from The Entropy Formula for the Ricci Flow and its Geometric Applications and other papers. At this point, this entry is triggered and the Geometrization Conjecture is considered proven. It is added to the Knowledge Base, and Perelman is paid $90 from Thurston’s contract. Immediately following this, Thurston’s private entry is triggered, since its only premise is proven, and the Poincaré Conjecture is also considered proven. It is added to the Knowledge Base and Thurston is paid $99 from Poincaré’s contract.
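Tracing the story through the sketch above (the proof-term strings simply name the papers, as in the text):

```python
submit_contract(Contract("Poincaré", "PoincareConj", 99.0))
submit_contract(Contract("Thurston", "GeomConjecture", 90.0))
private_ledger.append(PrivateEntry("Thurston", ["GeomConjecture"], "PoincareConj", "3DMKGHG"))

# Perelman's closed entry (no premises) sets off the cascade:
private_ledger.append(PrivateEntry("Perelman", [], "GeomConjecture", "EFRFGA"))
step()
# -> Perelman receives $90.00 from Thurston  (GeomConjecture enters the Knowledge Base)
# -> Thurston receives $99.00 from Poincaré  (PoincareConj enters the Knowledge Base)
```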
As should be clear, any party can be a counterparty to some contracts and principal to others; that is, they may receive money from some and pay money to others. This is just like in a category: an object can both be the domain of some arrow and codomain of others.
A question we promised to address is why we need privacy in the private entries, where proofs ⍺ : P1 ∧ ⋯ ∧ Pn → Q are contributed, e.g. by Alice. The reason is that without it, someone else could find an equivalent P1′ ≅ P1, modify Alice’s proof to ⍺′ : P1′ ∧ ⋯ ∧ Pn → Q, not have to pay a bounty on P2, ..., Pn, and, if P1′ is proven before P1, get all the reward. You might say that the system should constantly check for such equivalences, but doing so for every pair of premises in the system would require a huge amount of overhead. Moreover, as David Jaz Myers points out, every two true statements are equivalent.
One last issue that we should mention is that splitting a conjunction is expensive. For example, suppose (Alice, P&Q, $10) is a $10 bounty on P&Q. Bob, that clever rascal, contributes a private entry (Bob, [P, Q], P&Q, auto) and two public contracts (Bob, P, $3) and (Bob, Q, $4). In fact, Bob has taken a significant risk; his $7 is in escrow and he may never see the $10. Even with a time-bound on his bounty, he could be in trouble. Suppose that P is easily provable but conjecture Q turns out to be false. Tomorrow, Carl submits the private entry (Carl, [], P, prf), triggering the payout of Bob’s $3; Bob is out that money, and since Q will never be proven, the $10 payout he was counting on never arrives.
In general, factoring a conjecture and offering a bounty on the factors is a risk. It remains to be seen if this is bearable by a market—i.e. whether our fictional platform is plausible—or not. If not, we hope others will fill this gap: this is how plausible fiction works.
A crucial design decision is to offload the whole problem of estimating the plausibility of a lemma and its eventual usefulness to each user’s implicit pricing function. The first desideratum that users ought to aim for when implementing this function is that the cost should be higher for fictions that are less plausible, and lower for more plausible or “easier” fictions. As counterparties filling gaps, one should price the filling of a wider gap higher than the filling of a slimmer gap. The second desideratum is that one should feel free to price higher those lemmas that could plausibly pay back in multiple ways: a lemma that is useful for a multiplicity of theorems should have that multiplicity reflected in its price.
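For instance, one hypothetical shape for such an implicit pricing function; the functional form, names, and constants here are purely illustrative, not part of the protocol:

```python
def price_gap(plausibility: float, expected_uses: float, base: float = 10.0) -> float:
    """Price a conjectural gap: wider (less plausible) gaps cost more, and
    lemmas that plausibly feed multiple theorems cost proportionally more."""
    assert 0.0 < plausibility <= 1.0
    return base * (1.0 / plausibility) * max(1.0, expected_uses)

price_gap(0.9, 1.0)   # an easy, single-use lemma: ~$11
price_gap(0.2, 3.0)   # a speculative lemma useful in three proofs: ~$150
```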
A crucial feature of this game is that it’s agnostic about whether players are some agent architecture based on an LLM or humans. In some contexts, compensating AI agents for their labor may be appropriate, and this system could set a useful precedent.
We think the kinds of proofs we might see coming out of the Plausibly Factored game may be more interpretable and elucidating than those emitted from a pure next-token predictor. Running this market could offer a new way to solve Lean programming benchmarks like miniF2F, and we could compare its outputs with what LLMs do to solve miniF2F now to check that hypothesis. Indeed, if a program specification is given as a type, then finding a term of that type (a program that satisfies the spec) is again the sort of endeavor that could be factored.
As DaemonicSigil pointed out, current mathematical incentives may do a bad job at eliciting intermediary work like important lemmas and cruxy (pivotal) conjectures. The Plausibly Factored protocol shares reward for major theorems with people who provided intermediary lemmas.
Plausible fiction was originally intended to be a public and transparent process, where each collaborator fills gaps openly. In contrast, the current iteration of our Plausibly Factored platform is a bit of a bizarro version, introducing competition and private submissions.
How did these reversals happen? In the original plausible fiction framework, the incentive was an abstract, collective good—a "good future" that would hopefully naturally motivate individuals to participate. In the mathematical context, however, financial incentives may help stimulate participation in this formal and precise process, so that contributions can be transparently verified and rewarded through formal proof systems. That said, financial incentives are just one way to stimulate participation, intended to complement existing models such as academic grants and institutional support. The ethical implications of using financial incentives are still in question for us, and we invite comment along those lines.
One gap in our protocol is the lack of a “royalties” system to incentivize “test of time” lemmas that get reused over and over again. Users only get paid once, by the bounties that use their lemma at insertion time. This is a downside of the current system, but many naive implementations of royalties would come with their own downsides to mitigate (for example, we’d have to make sure no one could collect royalties on modus ponens).
In conclusion, we want to stress that the Plausibly Factored platform is a work in progress. The system, particularly in its current form, is designed for smaller-scale projects or informal games between friends and colleagues, and is not meant to scale into a market-driven overhaul of mathematical research. To address concerns about wealth concentration, future iterations could include mechanisms like capped bounties or community-driven funding pools. These steps would ensure that the system remains balanced, avoiding the concentration of power among a few wealthy participants. If readers have suggestions on how to improve the platform, particularly to cultivate more open collaboration, please let us know in the comments.
Published on November 22, 2024 7:01 PM GMT
This is a crosspost from https://chillphysicsenjoyer.substack.com/p/pursuing-physics-research-part-time.
Disclaimer - I’m a part-time research associate doing biophysics with a uni research group in the UK. But I have a day job in an unrelated field that pays the bills.
Whilst I’ve read many personal accounts of research from full-time research students in the academic system, I haven’t heard as much from those pursuing research part-time - independently or otherwise.
I’ve always found this weird. Out of the set of people who are really interested in this stuff, most can’t, or don’t want to, go into academia full time. There are loads of valid reasons - financial, skill, or geographical constraints. So doing unpaid research on the weekends seems like the only way for this kind of person to sate their interests meaningfully. And yet I haven’t read much by people doing this kind of thing, and I wonder why.
So as someone doing research part time alongside their day job, I wanted to reflect a bit on my priors about the likelihood of success, and about trying to do two things well. The main thing I want to argue is that one's effectiveness doing research part-time is probably a lot higher than the time-adjusted effectiveness of a comparable full-time researcher. Specifically, I think there are loads of arguments for why it's a lot larger than just (effectiveness of a comparable researcher) * (part-time hours / full-time hours). And it's more fun!
For the past year, I’ve worked in finance whilst doing biophysics research part-time at a university. I work on spectroscopy.
It took me around four years to get to the place where I could comfortably hold a job in finance and also find a supervisor. After I graduated I worked for big corporations for several years. It got to the point where I could manage my working hours so that I could reliably leave around 5pm, leaving a few hours in the day to work on science. Whilst I was doing this, I published about physics and continued to study it independently from textbooks. Then I cold-emailed supervisors for around two years until a research group at a university was willing to spare me some time to teach me about a field and have me help out.
First the obvious - I think that part-time scientific research could be a great setup for working people who are still interested in science but don’t want the downsides of academia, or of industry alone. In terms of downsides, academia doesn’t pay as much as white-collar jobs (think tech, finance, consulting) on average; conversely, the problems in industry jobs can be more aesthetically dull. By doing both, you hedge the non-naturalness of one field against the material rewards of the other.
But here’s the novel part. I used to think that a part-timer working perhaps 20% of the hours a full-timer spends would only be 20% as effective. But actually, I’m willing to argue that this isn’t true, and that there is probably a lot of ‘boost’ that jacks their effectiveness up much higher - my market for this is maybe 30%-90%. Wide range, I know. But bear with me!
Here are the boosts. I actually think that you can get great results doing research as a hobby, because you have slack and because you are under far less pressure.
I think these two things are crucial for success. The slack allows you to look at risky and niche ideas, which are more likely to yield big research rewards if they turn out to be true, since surprising results trigger further questions.
Also, since you are more likely to do better at topics you enjoy, getting money from a day job allows you to actually pursue your interests or deviate from your supervisor’s wishes. Conversely, it also allows you to give up when you’re not enjoying something.
On pressure, Richard Feynman wrote that the pressure to do great work in a formal academic system was stifling, and that it was the freedom to play with physics that really led to results. When you’re working a day job, for the most part, you’re not pressured on funding. Considering that PhD stipends in the UK are well below median income, if you have a day job you’re probably more comfortable and slightly happier, because you’re not worried about money.
Then there’s the fact that before the 1900s, public science funding wasn’t even a thing at all, and a bunch of great science was done by amateur enthusiasts like Darwin. Einstein part-timed it as well. I think this is summarised in a great comment by Anna Salamon on a LW post that was similar in spirit to this one:
‘Maybe. But a person following up on threads in their leisure time, and letting the threads slowly congeal until they turn out to turn into a hobby, is usually letting their interests lead them initially without worrying too much about "whether it's going anywhere," whereas when people try to "found" something they're often trying to make it big, trying to make it something that will be scalable and defensible. I like that this post is giving credit to the first process, which IMO has been historically pretty useful pretty often. I'd also point to the old tradition of "gentlemen scientists" back before the era of publicly funded science, who performed very well per capita; I would guess that high performance was at least partly because there was more low-hanging fruit back then, but my personal guess is that that wasn't the only cause. ‘
Still, the main drawback is that you get less time, which feels like a huge disadvantage. But is it really? Sometimes I wonder. Considering that most of modern quantum physics was discovered in a six-month span by scientists mostly under the age of twenty-five, I doubt that time is really the binding constraint on great research. Given that research is mostly just pot shots, and that the likelihood of discovering something cool is very small, going full-time might change your probability of finding something cool from 0.001% to 0.1%. But that’s still really small in both cases! So what’s the difference?
I have worried a lot that this setup is slightly suboptimal in terms of ‘being exceptional at something’. But for one - I think being exceptional is overrated. And even if you do want to go down that road, I actually think this is the best way to get good at things if you are the type to get bored easily.
Whilst I agree that focusing on ‘one thing at a time’ (like the career you are in, or the sport you play) is generally a great strategy for results - I do think that the success of this strategy really hinges on the type of person you are. And I think people should be optimising conditional on who they are, not on what the average person needs.
Mathematically, I see a great argument for not focusing on one thing if you’re the type that gets bored easily. If the way to get good at something is to maximise the total hours spent on it across your whole lifespan, then you want a strategy that achieves that, no matter how ugly it might look. Focusing hard at the beginning might be a bad strategy, since you will probably burn out in a short time; whereas if you do slightly less every day but keep at it for longer, you are likely to rack up many more hours cumulatively.
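As a toy calculation (all numbers made up for illustration):

```python
# Intense pace that burns out vs. a lighter pace sustained for years.
burnout_hours   = 8.0 * 365 * 1.5   # 8 h/day, quitting after 18 months: ~4,380 h
sustained_hours = 1.5 * 365 * 15    # 1.5 h/day kept up for 15 years:    ~8,200 h
print(burnout_hours, sustained_hours)
```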
As for the doubts - I’ve often thought that it was suboptimal for me to split my time in two; hence the quote at the top of this page. As time has passed, though, I’ve started to doubt this. Provided you’re the type of person who gets bored easily, and you’re willing to keep at both your work and your research for a long period of time, you might end up in a better spot long term!
This is all well and good, provided it’s possible for someone to get a part-time research position and to have enough spare time from their day job to have a solid go at it. But it kind of took a lot out of me to achieve that.
Published on November 22, 2024 6:46 PM GMT
Imagine a sequence of binary outcomes generated independently and identically by some stochastic process. After observing N outcomes, with n successes, Laplace's Rule of Succession suggests that our confidence in another success should be (n+1)/(N+2). This corresponds to a uniform prior over [0,1] for the underlying probability. But should we really be uniform about probabilities?
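For concreteness, a quick implementation of the rule:

```python
def laplace(n: int, N: int) -> float:
    """Posterior predictive probability of another success under a uniform prior."""
    return (n + 1) / (N + 2)

print(laplace(0, 0))    # 0.5    -- no data yet: even odds
print(laplace(10, 10))  # ~0.917 -- ten out of ten, yet still far from certainty
```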
I think a uniform prior is wrong for three reasons:
I propose this mixture distribution:
w1 * logistic-normal(0, sigma^2) + w2 * 0.5(dirac(0) + dirac(1)) + w3 * thomae_{100}(α) + w4 * uniform(0,1)
where:

- w1, w2, w3, w4 are mixture weights summing to 1,
- logistic-normal(0, sigma^2) is the distribution of sigmoid(X) for X ~ Normal(0, sigma^2),
- dirac(0) and dirac(1) are point masses at 0 and 1, covering deterministic processes,
- thomae_{100}(α) puts mass on rationals with denominator at most 100, downweighting larger denominators via the exponent α, and
- uniform(0,1) recovers Laplace's original prior as a catch-all.
Ideally, our prior should be a mixture of every possible probabilistic program, weighted by 2^(-K) where K is its Kolmogorov complexity. This would properly capture our preference for simple mechanisms. However, such a distribution is impossible to represent, compute, or apply. Instead, I propose my prior as a tractable distribution that resolves what I think are the most egregious problems with Laplace's law of succession.
Now that I've found the appropriate approximation for the universal prior over binary outcomes, the path to solving induction is clear. First, we'll extend this to pairs of binary outcomes, then triples, and so on. I expect to have sequences of length 10 nailed by Tuesday, and full Solomonoff Induction by Q1 2025.
I've built an interactive demo to explore this distribution. The default parameters (w1=0.3, w2=0.1, w3=0.3, w4=0.3, sigma=5, alpha=2) reflect my intuition about the relative frequency of these different types of programs in practice. This gives a more realistic prior for many real-world scenarios where we're trying to infer the behavior of unknown processes that might be deterministic, fair, or genuinely random in various ways. What do you think? Is there a simple model which serves as a better prior?
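As a rough sketch of how this prior behaves, here is a Monte Carlo estimate of the posterior predictive under the mixture, using the default parameters above. Note one assumption about the intended definition: thomae_{100}(α) is implemented as mass on rationals p/q with q ≤ 100, weighted proportionally to q^(-α).

```python
import math
import random

W1, W2, W3, W4, SIGMA, ALPHA = 0.3, 0.1, 0.3, 0.3, 5.0, 2.0

def sample_p() -> float:
    """Draw one success probability from the mixture prior."""
    u = random.random()
    if u < W1:                             # logistic-normal(0, SIGMA^2)
        return 1 / (1 + math.exp(-random.gauss(0, SIGMA)))
    if u < W1 + W2:                        # point masses at 0 and 1
        return random.choice([0.0, 1.0])
    if u < W1 + W2 + W3:                   # Thomae-style rationals p/q, q <= 100
        qs = list(range(1, 101))
        q = random.choices(qs, [q ** -ALPHA for q in qs])[0]
        return random.randint(0, q) / q
    return random.random()                 # uniform(0, 1)

def predictive(n: int, N: int, samples: int = 200_000) -> float:
    """P(next success | n successes in N trials), via prior draws weighted by likelihood."""
    num = den = 0.0
    for _ in range(samples):
        p = sample_p()
        lik = p ** n * (1 - p) ** (N - n)  # binomial likelihood (constant cancels)
        num += lik * p
        den += lik
    return num / den

print(predictive(10, 10))  # near 1: the deterministic 'always true' component dominates
print(predictive(5, 10))   # near 1/2: simple fair-coin-like rationals dominate
```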