EST = Extrospect, Sein & Tao, backend engineer.
EST's RSS preview

CBOR and MsgPack are the same thing

2025-09-27 20:01:00

I stumbled on this while digging through old threads. mdhb, who maintains a MessagePack implementation, says:

Disclaimer: I wrote and maintain a MessagePack implementation.
CBOR is MessagePack. The story is that Carsten Bormann wanted to create an IETF standardized MP version, the creators asked him not to (after he acted in pretty bad faith), he forked off a version, added some very ill-advised tweaks, named it after himself, and submitted it anyway.
I wrote this up years ago (https://news.ycombinator.com/item?id=14072598), and since then the only thing they've addressed is undefined behavior when a decoder encounters an unknown simple value.

And also:

There's no reason an MP implementation has to be slower than a CBOR implementation. If a given library wanted to be very fast it could be. If anything, the fact that CBOR more or less requires you to allocate should put a ceiling on how fast it can really be. Or, put another way, benchmarks of dynamic language implementations of a serialization format aren't a high signal indication of its speed ceiling. If you use a dynamic language and speed is a concern to this degree, you'd write an adapter yourself, probably building on one of the low level implementations.
That said, people are usually disappointed by MP's speed over JSON. A lot of engineering hours have gone into making JSON fast, to the point where I don't think it ever made sense to choose MP over it for speed reasons (there are other good reasons). Other posters here have pointed out that your metrics are usually dominated by something else.
But finally, CBOR is fine! The implementations are good and it's widely used. Users of CBOR and MP alike will probably have very similar experiences unless you have a niche use case (on an embedded device that can't allocate, you really need bignums, etc).

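Looks like he dug up a whole pile of old grievances again... hmmmm... fine.

I've been reading about ATProto lately and noticed that it can use either CBOR or JSON.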
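To make the family resemblance concrete, here is a minimal hand-rolled sketch (not a real library) that encodes a tiny subset of values in both formats. It assumes only the short forms: integers 0..23 and strings, lists, and maps with at most 15 entries. The names `encode_small` and `PREFIX` are made up for illustration. Within that subset both formats pack the length into the low bits of a single header byte, and only the type bits differ. It says nothing about the provenance dispute quoted above.

```python
# Header prefixes for the short forms; the low bits of the byte hold the length.
PREFIX = {
    "msgpack": {"str": 0xA0, "list": 0x90, "dict": 0x80},  # fixstr, fixarray, fixmap
    "cbor": {"str": 0x60, "list": 0x80, "dict": 0xA0},     # major types 3, 4, 5
}


def _short_len(container):
    """Return len(container), refusing anything beyond this sketch's subset."""
    if len(container) > 15:
        raise ValueError("only lengths 0..15 are covered by this sketch")
    return len(container)


def encode_small(value, fmt):
    """Encode a tiny subset of values as 'msgpack' or 'cbor' bytes."""
    prefix = PREFIX[fmt]
    if isinstance(value, bool):
        raise TypeError("booleans are outside this sketch")
    if isinstance(value, int):
        if not 0 <= value <= 23:
            raise ValueError("only ints 0..23 are covered by this sketch")
        return bytes([value])  # identical single byte in both formats
    if isinstance(value, str):
        raw = value.encode("utf-8")
        return bytes([prefix["str"] | _short_len(raw)]) + raw
    if isinstance(value, list):
        items = b"".join(encode_small(v, fmt) for v in value)
        return bytes([prefix["list"] | _short_len(value)]) + items
    if isinstance(value, dict):
        items = b"".join(
            encode_small(k, fmt) + encode_small(v, fmt) for k, v in value.items()
        )
        return bytes([prefix["dict"] | _short_len(value)]) + items
    raise TypeError(f"unsupported type: {type(value).__name__}")


doc = {"a": [1, 2, 3]}
print("msgpack:", encode_small(doc, "msgpack").hex())
print("cbor:   ", encode_small(doc, "cbor").hex())
```

The two hex strings differ only in the header bytes for the map, the string, and the array; the payload bytes are the same.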

Large conglomerates, small organizations, and atomized individuals

2025-09-25 10:27:00

From Terence Tao. The discussion on HN is lively.

Some loosely organized thoughts on the current Zeitgeist. They were inspired by the response to my recent meta-project mentioned in my previous post https://mathstodon.xyz/@tao/115254145226514817, where within 24 hours I became aware of a large number of ongoing small-scale collaborative math projects with their own modest but active community (now listed at https://mathoverflow.net/questions/500720/list-of-crowdsourced-math-projects-actively-seeking-participants ); but they are from the perspective of a human rather than a mathematician.

As a crude first approximation, one can think of human society as the interaction between entities at four different scales:

  1. Individual humans

  2. Small organized groups of humans (e.g., close or extended family; friends; local social or religious organizations; informal sports clubs; small businesses and non-profits; ad hoc collaborations on small projects; small online communities)

  3. Large organized groups of humans (e.g., large companies; governments; global institutions; professional sports clubs; large political parties or movements; large social media sites)

  4. Large complex systems (e.g., the global economy; the environment; the geopolitical climate; popular culture and "viral" topics; the collective state of science and technology).

An individual human without any of the support provided by larger organized groups is only able to exist at quite primitive levels, as any number of pieces of post-apocalyptic fiction can portray. Both small and large organized groups offer significant economies of scale and division of labor that provide most of the material conveniences that we take for granted in the modern world: abundant food, access to power, clean water, internet; cheap, safe and affordable long distance travel; and so forth. It is also only through such groups that one can meaningfully interact with (and even influence) the largest scale systems that humans are part of.

But the benefits and dynamics of small and large groups are quite different. Small organized groups offer some economy of scale, but - being essentially below Dunbar's number https://en.wikipedia.org/wiki/Dunbar%27s_number in size - also fill social and emotional needs, and the average participant in such groups can feel connected to such groups and able to have real influence on their direction. Their dynamics can range anywhere from extremely healthy to extremely dysfunctional and toxic, or anything in between; but in the latter cases there is real possibility of individuals able to effect change in the organization (or at least to escape it and leave it to fail on its own).

Large organized groups can offer substantially more economies of scale, and so can outcompete small organizations based on the economic goods they offer. They also have more significant impact on global systems than either average individuals or small organizations. But the social and emotional services they provide are significantly less satisfying and authentic. And unless an individual is extremely wealthy, well-connected, or popular, they are unlikely to have any influence on the direction of such a large organization, except possibly through small organizations acting as intermediaries. In particular, when a large organization becomes dysfunctional, it can be an extremely frustrating task to try to correct its course (and if it is extremely large, other options such as escaping it or leaving it to fail are also highly problematic).

My tentative theory is that the systems, incentives, and technologies in modern world have managed to slightly empower the individual, and massively empower large organizations, but at the significant expense of small organizations, whose role in the human societal ecosystem has thus shrunk significantly, with many small organizations either weakening in influence or transitioning to (or absorbed by) large organizations. While this imbalanced system does provide significant material comforts (albeit distributed rather unequally) and some limited feeling of agency, it has led at the level of the individual to feelings of disconnection, alienation, loneliness, and cynicism or pessimism about the ability to influence future events or meet major challenges, except perhaps through the often ruthless competition to become wealthy or influential enough to gain, as an individual, a status comparable to a small or even large organization. And larger organizations have begun to imperfectly step in the void formed by the absence of small communities, providing synthetic social or emotional goods that are, roughly speaking, to more authentic such products as highly processed "junk" food is to more nutritious fare, due to the inherently impersonal nature of such organizations (particularly in the modern era of advanced algorithms and AI, which when left to their own devices tend to exacerbate the trends listed above).

Much of the current debate on societal issues is then framed as conflicts between large organizations (e.g., opposing political parties, or extremely powerful or wealthy individuals with a status comparable to such organizations), conflicts between large organizations and average individuals, or a yearning for a return to a more traditional era where legacy small organizations recovered their former role. While these are valid framings, I think one aspect we could highlight more is the valuable (though usually non-economic) roles played by emerging grassroots organizations, both in providing "softer" benefits to individuals (such as a sense of purpose, and belonging) and as a way to meaningfully connect with larger organizations and systems; and be more aware of what the tradeoffs are when converting such an organization to a larger one (or component of a larger organization).

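Reading it left me melancholy. The line from Michelle Obama that I quoted in "Two parents plus children make up the nuclear family" still rings in my ears.

What is the meaning and value of a person's existence? Most of the time, it is determined by their standing (in some form) within the social group they belong to.

Large conglomerates are usually pure sources of economic value, machines for profit.

Small organizations give people a sense of belonging.

The individual has only one ending: loneliness.

Thinking this far, my fingers itch for some armchair politics again. The Qin-Han citizen-soldier system collapsed and was absorbed and digested by the great aristocratic clans. The Sui and Tang, with the fubing militia system inherited from the Six Garrisons, were only a last flicker before the end, and eventually had to switch to mercenaries and regional military governors. By the Song dynasty (the "Great Wimp" dynasty, as the pun on its name goes) it feels like the Han had completely forgotten how to fight. That has a great deal to do with the total atomization of the four "Shan-He" provinces (Shandong, Shanxi, Henan, Hebei) and with the imperial examination, the channel of upward mobility, degenerating into a score-grinding machine for "individuals". Small organizations (households) and large groups (clans) were both finished. Add the suppression of industry and commerce, and no new guilds or merchant associations could form, let alone the arts and science societies that pooled wealth can lift up. The whole social structure was either the extremely atomized smallholder family, where men plow and women weave, or the colossus of imperial power. The civil bureaucracy was thoroughly wrecked by factional strife driven by individual interest.

What was the general state of popular morale? Every individual in the nation and the country could get neither material benefit nor a sense of belonging from any organization. Life was bitter and exhausting anyway, so drink today while there is wine; whoever sits on the throne, it is all the same damn thing. The conclusion follows easily: the "natural degree of organization" of the entire Han heartland was roughly zero. The military household system of Zhu Yuanzhang ("Zhu 88") was likewise a measure of desperation, forcing up the degree of organization by administrative and bureaucratic means, but in the end it was still trounced by the more highly organized Eight Banners.

I have to say, the Western notion of civil society really does make a lot of sense.

So what is the root of all this evil? Thinking it over: the iron plow plus the spinning wheel, starting with the collapse of the well-field system?

Why did Confucius revere the rites of Zhou? Because agriculture back then ran purely on human labor, and survival required division of labor and collective work. The ruler's sacrifice to Heaven was a serious assembly for allocating production, not the empty ceremony it later became. Why could nomadic tribes always find a chance to break through the defensive line at some point? Because rotational grazing was their foundation: as soon as the tribes put aside their feuds and turned outward together, their inherently high degree of organization let them pull off more complex tactical and strategic goals.

So what is the iron plow of the 21st century? AI?

I don't dare think about it. Watching my kid with Doubao, Qwen, DeepSeek, and a bunch of other apps installed on the pad, I wonder whether even the bond of upbringing between parents and children might be severed at some point in the future.

Well, that is actually already the mood online: the "sins" of one's family of origin are too many to list, and the old geezers are only good for one thing, dropping gold coins.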

xHTML5

2025-09-15 00:12:00

With ChatGPT, picking up new tricks is just fast.

<!DOCTYPE html [
  <!ENTITY myEntity "Hello World">
  <!ENTITY wow "alert('hi')">
]>
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
  &myEntity;
  <script>
    &wow;
  </script>

</body>
</html>

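Save it locally as 1.xhtml and double-click to open it in a browser. Who would have thought this works? You can play with XML entities like this?

As long as the MIME type is application/xhtml+xml, the browser parses and renders the HTML5 document as XML.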
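Entity expansion from the internal DTD subset is ordinary XML behavior, not a browser quirk. As a rough check outside the browser, the sketch below feeds the same document to Python's standard-library `xml.etree.ElementTree` (an assumption on my part that this parser is a fair stand-in; it is expat-based and knows nothing about HTML) and reads back the expanded text.

```python
import xml.etree.ElementTree as ET

XHTML = """<!DOCTYPE html [
  <!ENTITY myEntity "Hello World">
  <!ENTITY wow "alert('hi')">
]>
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
  &myEntity;
  <script>
    &wow;
  </script>
</body>
</html>"""

# Elements live in the XHTML namespace because of the xmlns attribute.
NS = {"x": "http://www.w3.org/1999/xhtml"}

root = ET.fromstring(XHTML)
body = root.find("x:body", NS)
script = body.find("x:script", NS)

# Both entities were replaced by their declared values during parsing.
body_text = body.text.strip()
script_text = script.text.strip()
print(body_text)
print(script_text)
```

The parser only substitutes text; it is the browser, treating the file as XHTML, that then executes the expanded script content.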

I deployed an online demo: https://lab.est.im/svg-text/entity.xhtml

I couldn't believe my eyes, so I confirmed that it really is HTML5:

  • document.compatMode returns CSS1Compat, no problem
  • document.doctype returns <!DOCTYPE html>, all good

In the old compatibility mode it would be:

  • document.compatMode: BackCompat
  • document.doctype.publicId: -//W3C//DTD HTML 4.01//EN
  • document.doctype.systemId: http://www.w3.org/TR/html4/strict.dtd

That opened up my thinking. Haha.

Disqus uninstalled, hand-rolled a blog comment system - req4cmt

2025-09-10 22:38:00

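I had stuck with Disqus mainly because it was convenient (I'm lazy). But ever since it got blocked by the firewall, it has been a hassle. And since it recently started force-inserting ads, I dislike it even more.

I wanted a replacement. Many are GitHub-based and stack comments up in issues, which I find inconvenient to back up. I want comments written straight into the repo, so that a single git clone moves everything. Not to mention that almost all of them use OAuth-based GitHub login, which can actually obtain a scope covering all your repos, meaning it can read and write the contents of all your public (and even private) repositories. I generally don't click through.

So I decided to hand-roll one.

The first thing to consider was how to write content into a git repo. I had already researched and tried this while building gitweets.

I wanted to freeload on Cloudflare Workers, and a serverless environment like that certainly won't let you install the git command line or a .so such as libgit2, so I considered the pure-Python dulwich. After fiddling with the Python binding I found that Python on CF actually runs through WASM. Might as well use JS then. Luckily the Node.js ecosystem is rich too, and there is a pure-JS implementation, isomorphic-git.

Then, while hacking on it, I found that I needed memfs to simulate file I/O in memory.

Next was how to read content. The comment list is saved as plain text in the format domain/path.jsonl, one comment JSON per line, which is easy to diff. In the future, pull requests could serve as moderation.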
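A minimal sketch of that storage layout, assuming nothing about how req4cmt actually implements it: the function names, the `index` fallback for the site root, and the comment fields are all hypothetical. The point is only that one JSON object per line keeps each new comment a one-line diff.

```python
import json
from urllib.parse import urlsplit


def comment_path(page_url):
    """Map a page URL to '<domain>/<path>.jsonl'."""
    parts = urlsplit(page_url)
    # Using 'index' for the site root is an assumption of this sketch.
    path = parts.path.strip("/") or "index"
    return f"{parts.netloc}/{path}.jsonl"


def append_comment(existing_text, comment):
    """Return the file content with one more comment appended as a single line."""
    # json.dumps escapes newlines inside strings, so each comment stays on one line.
    line = json.dumps(comment, ensure_ascii=False, sort_keys=True)
    if existing_text and not existing_text.endswith("\n"):
        existing_text += "\n"
    return existing_text + line + "\n"


def parse_comments(text):
    """Parse JSONL content back into a list of comment dicts."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


content = append_comment("", {"name": "alice", "text": "first!"})
content = append_comment(content, {"name": "", "text": "two\nlines"})
print(comment_path("https://example.com/blog/2025/hello/"))
print(content, end="")
```

Because appending never rewrites earlier lines, a moderation pull request would show exactly the added comments and nothing else.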

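I originally wanted to read the files over git-http, but found it too slow. So I took a shortcut and simply reverse-proxied https://raw.githubusercontent.com/. A proxy would not even be necessary, except that, first, this domain is DNS-blocked, and second, a damned CORS header has to be added before cross-origin requests work.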
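The shape of such a proxy can be sketched in a few lines. This is not the Worker code from the project, just an illustration under my own assumptions: the helper names are made up, and the allowed origin defaults to `*`, which is only suitable for anonymous reads.

```python
from urllib.request import urlopen

RAW_HOST = "https://raw.githubusercontent.com"


def upstream_url(owner, repo, branch, path):
    """Build the raw-file URL for a path inside a public repository."""
    return f"{RAW_HOST}/{owner}/{repo}/{branch}/{path.lstrip('/')}"


def with_cors(headers, allow_origin="*"):
    """Copy upstream headers and set the CORS header the browser needs."""
    out = {k: v for k, v in headers.items()
           if k.lower() != "access-control-allow-origin"}
    out["Access-Control-Allow-Origin"] = allow_origin
    return out


def proxy(owner, repo, branch, path, fetch=urlopen):
    """Fetch the upstream file and return (status, headers, body) with CORS added."""
    with fetch(upstream_url(owner, repo, branch, path)) as resp:
        headers = dict(resp.headers.items())
        return resp.status, with_cors(headers), resp.read()


print(upstream_url("est", "req4cmt", "main", "/example.com/hello.jsonl"))
```

The `fetch` parameter exists so the network call can be swapped out; the branch name and file path in the usage line are placeholders.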

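Once that worked, I imitated Disqus: a .js snippet embedded in the page renders the form, shows the comment list, and so on. My JS/CSS skills are rather weak, so I put together only the most basic version.

As long as it works!

I didn't think much about spam prevention either. I've seen many people say that a hidden input blocks the vast majority of it, so I'll run with that for now. Apart from the comment textarea, even the name is optional.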
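The hidden-input trick is usually called a honeypot: humans never see the field, while naive bots fill in every input they find. A minimal server-side check might look like the sketch below. The field names `website`, `text`, and `name`, and the `Anonymous` fallback, are assumptions for illustration, not the project's actual form.

```python
HONEYPOT_FIELD = "website"  # hypothetical name of the hidden input


def looks_like_spam(form):
    """A real visitor leaves the hidden field empty; a form-filling bot usually doesn't."""
    return bool(form.get(HONEYPOT_FIELD, "").strip())


def validate_comment(form):
    """Return a cleaned comment dict, or None if the submission should be dropped."""
    if looks_like_spam(form):
        return None
    text = form.get("text", "").strip()
    if not text:  # the comment body is the only required field
        return None
    return {"name": form.get("name", "").strip() or "Anonymous", "text": text}


print(validate_comment({"text": "nice post"}))
print(validate_comment({"text": "buy pills", "website": "http://spam.example"}))
```

This stops only unsophisticated bots; anything that renders the page or special-cases the form will get through.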

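I tried to stay as compatible as possible, so the form can be submitted even without JavaScript. Of course, that means adding a <noscript> block when embedding it in the page.

This project depends heavily on the free tiers of two cyber-bodhisattvas, Cloudflare and GitHub, so too many requests will get rate-limited. If it comes to that, I can add a KV queue or something similar. But my blog is so quiet, am I overthinking it?

Finally, I exported all the old Disqus comments. They are years of memories, after all.

The project lives at https://github.com/est/req4cmt and comments are welcome.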

The War of Resistance Against Japan

2025-09-03 11:34:00

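After watching the live broadcast of the military parade, I'm noting down some details of the war that I only learned in the last few years:

  • April 13, 1941: the Soviet Union recognizes the puppet state of Manchukuo
  • February 8, 1945: at the Yalta Conference, Roosevelt agrees with Stalin to safeguard Soviet interests in the port of Dalian, the Chinese Eastern Railway, and the South Manchuria Railway, and to restore the Russian naval lease of Lüshunkou (Port Arthur)
  • June 24, 1945: the Soviet Union holds the "Great Patriotic War" victory parade on Red Square in Moscow. 🇯🇵 Imperial Army attaché 矢部忠太 and Imperial Navy attaché 臼井淑郎 attend as invited guests and watch the Red Army pass in review
  • August 6, 1945: the US military nukes Hiroshima with "Little Boy"
  • August 8, 1945: the Soviet Union tears up the Soviet-Japanese Neutrality Pact and declares war on 🇯🇵
  • August 9, 1945: the US military nukes Nagasaki with "Fat Man"
  • September 2, 1945: 🇯🇵 surrenders to the Allies aboard the battleship USS Missouri in Tokyo Bay
  • May 26, 1955: the last of the 120,000 Soviet troops leave Lüshun and Dalian. With that, the number of foreign troops stationed on the mainland drops to zero.

Is Northeast China the Poland of East Asia?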

gitweets: a standalone microblog in a single HTML file, tweeting with git history as the feed

2025-08-17 00:00:00

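Twitter has been mired in controversy for years. First came all the cancel-culture uproar, and it got worse after Elon Musk bought it. The community splintered into tributaries such as mstdn, nostr, and bsky, with every topic hyped to the sky. Among the many alternatives, in 2022 I saw the most distinctive one of all:

Use git as a microblog

  • Tweet: git commit --allow-empty
  • Follow: git remote add <alias> <their fork url>
  • Retweet: git cherry-pick <their "tweet">

Mind-blowing. And because git is built on a Merkle tree, it is P2P and its history is tamper-evident. It has that web3 flavor.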
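Why the history is tamper-evident can be shown with nothing but hashlib. A git object ID is the hash of a small header plus the object body, and each commit body names its parent's ID, so editing an old "tweet" changes every ID after it. The sketch below assumes the default SHA-1 object format; the author line is a placeholder and `commit_body` is a simplified helper, not git's full commit format.

```python
import hashlib


def git_object_id(kind, body):
    """Hash an object the way git does: '<kind> <size>\\0' followed by the body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree, parent, message,
                who="A U Thor <author@example.com> 0 +0000"):
    """Build a minimal commit object body (placeholder identity and timestamp)."""
    lines = [f"tree {tree}"]
    if parent:
        lines.append(f"parent {parent}")
    lines += [f"author {who}", f"committer {who}", "", message, ""]
    return "\n".join(lines).encode()


# An empty commit reuses the same tree; here, the well-known empty tree.
EMPTY_TREE = git_object_id("tree", b"")

first = git_object_id("commit", commit_body(EMPTY_TREE, None, "hello world"))
second = git_object_id("commit", commit_body(EMPTY_TREE, first, "second tweet"))

# Tamper with the first tweet: its ID changes, and so does every descendant's.
forged_first = git_object_id("commit", commit_body(EMPTY_TREE, None, "hell0 world"))
forged_second = git_object_id("commit", commit_body(EMPTY_TREE, forged_first, "second tweet"))

print(first, second)
print(forged_first, forged_second)
```

The second tweet's text is identical in both chains, yet its ID differs, because the parent it points to differs.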

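I was intrigued at the time and planned to build a web UI for it. But with my scattered attention span, and being hopeless at abstract layout work like CSS, I kept procrastinating and never finished.

On a whim this weekend, with AI tools helping, progress was lightning fast. It is now basically usable.

The project is called gitweets, meaning tweets sent with git, and it lives at https://f.est.im/ . The subdomain f stands for feed; f.est, that is, fest, means... a festival.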

feature list

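  1. Render any GitHub repo as a microblog
  2. Tweet to any GitHub repo. Under the hood, it adds a new commit through the REST API.
  3. Post images! If the commit message ends with a colon, and the same commit also adds an image file under static, it tries to load the image and render it as an attachment
  4. Writing commits is implemented with an OAuth app. The browser stores the access_token in a cookie, which in theory is valid for 8 hours. Log in again after it expires
  5. If your commit was signed with the -S flag, it is shown in blue to indicate verified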
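The image rule in item 3 boils down to a small predicate. The sketch below is my reading of that rule rather than the site's code: the function name, the list of extensions, and the `static/` prefix check are assumptions.

```python
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".webp")  # assumed list


def image_attachments(message, added_files):
    """Return added image paths under static/ if the message ends with a colon."""
    if not message.rstrip().endswith(":"):
        return []
    return [
        path for path in added_files
        if path.startswith("static/") and path.lower().endswith(IMAGE_EXTENSIONS)
    ]


print(image_attachments("sunset today:", ["static/sunset.JPG", "README.md"]))
print(image_attachments("sunset today", ["static/sunset.JPG"]))
```

Note that this needs the list of files added by the commit, which the commit-list endpoint does not return, hence the extra per-commit query mentioned in the pitfalls.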

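Some pitfalls I ran into

  1. OAuth app vs GitHub App. The former acts on behalf of a user; the latter is an independent principal with its own identity that behaves like a person, mostly used for CI/CD
  2. If an OAuth app's scope is repo, it can read and write the code of all your repositories, including private ones! Many GitHub-based third-party comment systems out there carry this risk!
  3. Here I use public_repo, which reads and writes all public repositories. GitHub is an open-source community after all, and using public git as a feed is somewhat safer.
  4. A safer approach is to read and write only a single designated repo. To read and write git through an API, GitHub offers the following:
    • REST API. A fine-grained PAT can read and write a single repo
    • GraphQL API.
    • git over SSH. Goes through github.com:22. Can use a deploy key or a personal account SSH key
    • git-http. Goes through https://github.com:443
  5. A Personal Access Token (PAT) can fully control a personal or team account; a fine-grained PAT can be limited to a few designated repositories
  6. A deploy key is read-only by default and can be switched to read-write. It is bound to one repo and cannot be reused
  7. The REST API for listing commits does not tell you which files each commit changed. That takes an extra N+1 round of per-commit detail queries. My guess at the underlying reason is that git strictly separates ref and blob internally. They may even be stored in separate database tables
  8. The site runs on the cyber-bodhisattva Cloudflare Workers. This so-called "serverless" platform is quite powerful by now. It is fully featured with nothing really missing. It can even open TCP connections.
  9. I originally planned to use the git-ssh or git-http protocol, but on reflection handling binary data in JS is too complicated, and pulling in a library like libgit2 would probably be heavy. REST is more convenient
  10. How complicated is adding an empty commit through the REST API? 1. Get the sha of the current branch 2. Get the tree of that sha 3. Create a commit from sha+tree 4. Point the ref at the sha from step 3. Ah, can't it be done in one step? Next time I'll check whether GraphQL can do it in a single call
  11. GitHub's API requires a User-Agent. You can put anything in it, but it cannot be missing
  12. Although GitHub returns Access-Control-Allow-Origin: *, modern browsers damn well refuse to honor that * for credentialed requests. So in the browser you can only call GET anonymously; a POST or PATCH sent with credentials: "include" is rejected outright. The server must explicitly name the specific origin it allows
  13. Cloudflare's Response.redirect('/') just crashes. It turns out that relative paths are not allowed for the 3xx redirect.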
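The four-step empty commit from item 10 can be sketched against GitHub's Git database endpoints as follows. This is an illustration in Python rather than the site's JavaScript; the function names and the User-Agent string are made up, and error handling is omitted. The `call` parameter is injectable so the sequence can be exercised without touching the network.

```python
import json
from urllib.request import Request, urlopen

API = "https://api.github.com"


def github_call(method, path, token, payload=None):
    """Send one JSON request to the GitHub REST API and return the decoded body."""
    data = json.dumps(payload).encode() if payload is not None else None
    req = Request(API + path, data=data, method=method, headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
        "Content-Type": "application/json",
        "User-Agent": "gitweets-sketch",  # any value works, but it must be present
    })
    with urlopen(req) as resp:
        return json.load(resp)


def empty_commit(owner, repo, branch, message, token, call=github_call):
    """Create a commit that reuses the parent's tree, then move the branch to it."""
    base = f"/repos/{owner}/{repo}/git"
    # 1. Current branch head.
    head = call("GET", f"{base}/ref/heads/{branch}", token)["object"]["sha"]
    # 2. Tree of that commit (unchanged, since the commit is empty).
    tree = call("GET", f"{base}/commits/{head}", token)["tree"]["sha"]
    # 3. New commit object pointing at the same tree.
    new = call("POST", f"{base}/commits", token,
               {"message": message, "tree": tree, "parents": [head]})["sha"]
    # 4. Fast-forward the branch ref to the new commit.
    call("PATCH", f"{base}/refs/heads/{branch}", token, {"sha": new})
    return new
```

A call would look like `empty_commit("est", "gitweets", "main", "hello", token)`, where the branch name and token are placeholders. If another commit lands between steps 1 and 4, the final PATCH is no longer a fast-forward and should fail, so a real client would need to retry.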

why?

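Out of habit, I write web pages the old-fashioned way: a single HTML file that contains the CSS and JS. No secondary loads, no third-party libraries. Everything that can be hard-coded is hard-coded 🤣. No build step.

The source is at https://github.com/est/gitweets/ . That repository's commit history is also shown as a feed at https://f.est.im/

Next I plan to use a similar approach to build a website comment system, replacing the Disqus I use now. It is free, but there are too many ads.

Some may ask: why? Bored out of your mind?

I think, first, yes, I really am that bored: because we can. Second, I don't want to live on a platform and be restricted at every turn. Then, most important of all, self-hosting. All the data sits in one repo that you can pack up and take with you, so backups are easy. For example, WordPress used to have a wide audience and every feature, but later everyone moved to static blogs like hexo.

In my mind, gitweets is a kind of "static" microblog. For now it still depends on the GitHub API, though. When I have time I may try generating purely static pages.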

ToDo

  1. How to post video and audio
  2. How to retweet
  3. How to show a mixed feed from multiple repos. Based on pull requests?

Comments and issues are welcome