
I have been blogging under the name “雨帆” (Yufan) since 2007.

RSS Preview of Yufan | 雨帆

A Brief Review of Go 语言设计与实现 (The Design and Implementation of the Go Programming Language)

2025-10-01 20:57:19

Go 语言设计与实现

I first came across Mr. Zuo's blog, 面向信仰编程 (Faith-Oriented Programming), around 2020, when I was working at Tencent Cloud and the company was pushing everyone to standardize on Go. As a Java developer looking for learning materials, I found his blog and was amazed by the detailed articles and illustrations about Go. Later, while chatting with Yingzi, an editor I knew at Turing Press, I learned she happened to be the editor of his book Go 语言设计与实现. So I got a sample copy as soon as the book was published.

Four years have passed in the blink of an eye. Last Monday I made a point of rereading it, and I think it is time to review this book seriously.

Looking at its Douban reviews, you will find the reactions sharply polarized. Those who like it praise it to the skies; those who don't push their language to extremes, criticizing it very harshly.

What impressed me most is the editor once mentioning that Mr. Zuo's language is too terse. Terse language is acceptable in a blog: posts are short, usually around a thousand characters, and mostly drive at the key points. Writing a book is an entirely different discipline. Friends who have written for a publisher will know that by the time the contract is first drawn up, the outline and topic have already passed internal review, and the manuscript due at each stage is largely fixed. So from the very beginning you need a clear framework, rather than writing wherever your thoughts wander.

Once the framework (the table of contents) is set, the next step is to fill it with content and decide the focus of each chapter. This is really the art of organizing and expressing language. Of course, the editor will help review and polish the wording along the way, but the core of the writing is always the author; the editor mostly gives the text a second pass.

The primary problem with Go 语言设计与实现 is exactly this kind of expression, especially within an existing framework. For example, chapter 1 covers debugging the source code and chapter 2 covers compiler principles; together they run about 50 pages yet span the content of a semester-long undergraduate compilers course. Notably, many readers are not computer science majors and have never studied such courses systematically. Even for readers with the background, the organization is somewhat unbalanced. Take Lex, mentioned in the lexical analysis section: I suspect most readers do not understand it. The author himself notes that from Go 1.5 onward the Go compiler no longer depends on C's Lex, but that remark is left as a dry one-liner with no further discussion; bootstrapping does not even get a mention. Then, in the parsing section, concepts such as LL, LA, and LR will leave many readers bewildered. By contrast, the author of ANTLR4, in a comparable book, lays out concepts like closures and DFAs with clear diagrams and flowcharts, and readers grasp them easily.

The code snippets in the book also deserve a mention; you could even call them a disaster. Code should serve the book's content, but Mr. Zuo's approach seems to be pasting in whatever code the current topic touches. The right way is to first explain the macro design ideas and the overall structure of the code, then gradually drill down into the specifics, explaining what each line does. Compared with the books by Yuhen (雨痕) and Zheng Jianxun (郑建勋), Mr. Zuo falls somewhat short here.

Initially, the illustrations were what I admired most about his blog; I even studied his post on how he draws them. In the book, although the illustrations remain beautiful, they are not always a good fit. Often the piling-up of colors is distracting: the reader's attention goes to the color choices rather than to the content of the diagram. Why are some parts red and others green? On closer inspection, the colors do not distinguish anything important; they are there purely for visual effect.

That is all I will say about the flaws. Overall this is not a bad book, or I would not have reread it. It is just a pity that Mr. Zuo could have done better. His knowledge of Go and the depth of his research far exceed mine as a layman. Much of the time we judge a book with an effortless "good" or "bad," but writing one is no simple matter. Still, I feel this book reads more like something the author wrote for himself. Making it accessible to a wider audience would take more work.

Learning to Compromise with the World

2025-09-28 23:58:28

藍切 健さんオリジナル2曲描き下ろし!! - inika

In early September I posted a joke about Fortress Besieged (《围城》). Unsurprisingly, the comments filled with abuse, so fierce you would think the commenters and I were mortal enemies. It was just like earlier this year, when I mentioned the Jingtai Expressway (京台高速) and explained that the "台" refers to Taiwan, and was likewise attacked by a mob.

The most striking feature of the internet age is amplification. A niche subculture can spread into the mainstream; a faint voice can become a viral meme overnight. At the same time, all kinds of noise are amplified without limit, and almost every utterance can send out ripples: sometimes praise, sometimes censure.

I remember in college I often read Ruan Yifeng's blog and frequently spotted small errors in his articles. In the comments, some people would point them out and others would sneer. His way of handling it was distinctive: fix the error, but never delete the comment. At the time I didn't understand; I felt the jarring remarks should simply be removed. I once firmly believed in the "American style of free speech": you have the right to swing your fist, and I have the right not to be hit.

Yet online cold violence, amplified by circulation, is sharp as a knife and wounds easily. When I was young I often couldn't swallow the insult and felt I had to fight to the bitter end; even when losing an argument, I would console myself like Ah Q, muttering that I had "been cursed by my own son." In 2015, when I hung around Zhihu, the site introduced a "friendliness score" to remind people not to attack one another. Even so, I kept clashing with others in arguments, and my score once dropped to the minimum.

Looking back now, most of that quarreling, stubbornness, and anger was meaningless. You can't delete every comment, and you can't win every debate. Learning to compromise is not admitting you are wrong; it is admitting the world will not bend entirely to your will. The way to get along with the world is usually not to "defeat" it but to "let go."

Letting go is not weakness but a kind of wisdom. It lets you keep your inner calm amid the clamor of the internet, and stops you from burning energy on trifles amid life's disturbances. As some say, growing up means making peace with yourself, and with the world.

星と街 - naru

Life Is Short, and Also Long

2025-09-22 09:45:28

RIZ3 - なにこれ幸せ

At the end of 2021 my elder daughter Taotao was born, and I took on a new role: father. The next New Year I took Taotao, just a hundred days old, back to my hometown to see my grandmother: four generations under one roof, a joyful scene. But the baby was tiny and the weather cold, so after a meal we returned to Wuhu. I had planned to bring Taotao to keep her great-grandmother company once she was a bit older, but my grandmother, long bedridden, did not make it through the autumn. A handful of crushed bone; singing and strumming along the road, drums and clappers, firecrackers in chorus. She lived quietly, yet left this world with the noisiest of ceremonies.

At that moment a saying came to mind. Books say a person dies three times. The first is the medical farewell: the heart stops, the breathing ceases. The second is the legal curtain-fall: the death certificate is signed, and the red "cancelled" stamp lands on the household register. The third is when the last person in the world who remembers you forgets you. Taotao surely won't remember this great-grandmother she met only once, but she will never forget her beloved grandparents, just as I will always remember mine.

Losing family is not unfamiliar to me. When I was only five, my fourth uncle died of liver cancer. What I remember most is my father's tears. From inside the house I heard crying far off. Opening the door, I saw my father, a man who rarely showed emotion, walking back alone along the ridge between the fields, wiping his eyes as he came. I didn't understand and was about to ask him what was wrong when my mother stopped me, saying softly: "Your fourth uncle has passed away. Don't disturb your father right now."

My first formal encounter with death came in the spring when I was seven. My mother took me out of school to attend my youngest aunt's funeral. Bewildered, I arrived at the crematorium, was dressed in hemp mourning clothes, a white cloth tied around my head and a black armband on my left arm, and knelt with my cousin by the fire basin burning paper. Stacks of yellow paper and paper ingots and banknotes went onto the flames, the ashes drifting in the wind; all around was silence except for my cousin's soft sobbing. Then we filed into the memorial hall with the adults. In the crystal coffin lay my aunt, who used to visit our home so often, now lying still and bloodless. Grandma walked in front, supported by my eldest aunt and my mother, crying until she nearly collapsed. At that moment, for the first time, I truly understood: a person can never wake again.

Later, my grandparents passed away one after the other while I was in college, and my eldest aunt died of cancer a few years ago. The familiar faces around me left one by one, and with each loss a piece of my heart went missing that could never be filled. I grew quite afraid of death, feeling that life was far too short: a few fleeting decades and we return to heaven and earth. As the Li Sao has it, "Swiftly I sped, as if I could not keep pace, fearing the years would not wait for me."

Perhaps that is why each new life feels all the more like a gift from fate. In early spring, February 2025, the cries of our second daughter, Youyou, bloomed in the delivery room, and I became the father of two girls. Taotao leaned on the edge of the hospital bed, watching the baby in her mother's arms, lips pouting, apparently displeased. But the next moment she turned around and behaved impeccably, even fussing over her mother. Writing this, I suddenly laugh: the curtain on their "rivalry" rose quietly the very day Youyou was born.

After Youyou's first month, small frictions became routine. Taotao would sometimes suddenly smack her sister, make her cry, and run off. When we dug out her old toys for the baby, she would snatch them back, on the grounds that "sister is too little, she can't use them." After we reassembled the old crawling mat and playpen, Taotao would occupy the whole space by herself and keep her sister out. At first I just thought she was being naughty, until one day, holding Youyou, I caught a glimpse of Taotao sitting there forlorn, and suddenly understood: she was not being naughty; the love that used to be hers alone had been split in half.

And between the crying and laughter of two children, I see ever more clearly how my parents are aging. My mother used to love carrying Taotao out for walks; now, after a few minutes holding Youyou, she says her arms are sore, and when she sits down she braces her back and shifts slowly. My father's temples grow whiter, while Taotao hugs his leg, begging to ride on his shoulders. In a daze I seem to see my eight-year-old self, during the New Year when father came home from working away, likewise riding on his shoulders, climbing to our home on the sixth floor…

Day by day, Taotao is learning to be a big sister. She cuddles her sister before bed, helps Mom soothe her when she cries, and even allows her to play with her beloved crawling mat while she is at kindergarten. And I, watching the two children grow and accompanying my parents as they age, have slowly let go of my fear of death.

I am slowly coming to understand that life is "short" because it must end, just as our loved ones must leave. But life is "long" because love and memory carry on. My grandmother's story will live in my mother's telling and pass to Taotao and Youyou. My parents' love for me will become my care for my two girls. And the bond between Taotao and Youyou will become the warm support of their future lives.

So life is never a solitary journey but a relay carried forward with love and attachment. Those who leave become the light in our memories; those who stay beside us become the strength to move on. Seen this way, life is not short at all, because every bit of love and memory stretches time out very, very long.

小さな手のひら - tatsuya

Blue Period

2025-04-08 14:13:32

Blue Period 01

Friends with a passing knowledge of art history and Picasso will know that the "Blue Period" refers to the nearly monochromatic works, dominated by blues and greens, that Picasso produced between 1900 and 1904. The manga that borrows this title tells the story of Yatora, a high schooler who had never considered painting as a direction in life, who falls in love with it after depicting early-morning Shibuya in pure blue during art class, and who finally, through hard work, gets into Tokyo University of the Arts.

On art-school entrance exams, Japan and China are actually quite similar. In my second year of high school, a classmate suddenly took a long leave and did not reappear until just before the gaokao. When I saw him again, he was carrying a case packed with hundreds of sharpened pencils and all kinds of heavy painting gear. Only later did I learn he had chosen the art-exam route. In the studio, students crowd around the model in layer upon layer, practicing sketches, croquis, and watercolor day and night, all for one more point of advantage in the competition.

Later, when I was dating my wife (a printmaking major), I came to understand the hardship behind the art exams: it is practically a "march of suffering." In China the art exam is separate from the gaokao and comes earlier. In the era before provincial unified art exams became widespread, candidates had to lug heavy painting gear from city to city for each school's independent entrance exam. They had to stand out in subjects like figure sketching and watercolor, earn the professional qualification certificate, and then turn around and cram academics to make sure their gaokao score reached the admission line. The double pressure is hard to appreciate unless you have lived it.

So far I have just finished the first six volumes of Blue Period, and the story pauses exactly at Yatora's admission to Tokyo University of the Arts. Although the plot broadly matches the real experiences I described, the manga's portrayal of tension is positively suffocating. Whether it is the last-minute practice before the exam or the confusion and self-doubt in the face of setbacks, anyone who has been through the gaokao will recognize that tangle of anxiety and longing.

What moves me more is how nakedly the manga exposes the gap between "passion" and "reality": Yatora first picks up the brush simply because he likes painting, but once he truly sets foot on this path he finds nothing is as simple as he imagined. Effort does not guarantee reward; some succeed effortlessly on raw talent, while others who look relaxed are in fact straining with everything they have. This cruel realism comes straight off the page and leaves nowhere to hide.

The author also weaves art knowledge deftly into the plot. From color theory to composition and perspective, content that could easily be dry comes alive through Yatora's learning process. For most readers this may be their first systematic contact with these concepts; if it sparks an interest in art, or even an urge to create, that is this manga's extra gift.

Reading six volumes in one night let me relive the passion and persistence of youth, and marvel again at the power of a classic shonen story. But reality is far crueler than fiction: the manga's ratio of "50 admitted out of 2,000" is already startling, while the competition for China's four major art academies is truly a one-in-ten-thousand crucible.

Blue Period 02

A Century of Datong, Two Years of Deep Affection

2025-03-29 09:16:21

The old school gate of Xiamen Datong Middle School

My junior high, Datong Middle School, occupies just over 20 mu (about 1.3 hectares) and is one of the smallest secondary schools in Xiamen. Yet this little school gave us the warmth of a family workshop and taught us what "great love" means. As the new school anthem sings: "Born in 1925, weathering the baptism of war together with the motherland." Datong Middle School has now reached its centenary. Across a history spanning old and new China, countless students have studied here. I was fortunate to enroll in 2005 and witness the celebration of the school's 80th anniversary.

For me, although my time at Datong lasted only two years, it remains an unfading point of light in my memory. Many of my teachers from those years are still in touch with me. Ye Xiaojing, who first taught me physics, made me love the subject with her vivid teaching; my chemistry teacher Lin Shunqing used her spare time to run chemistry competition training for the whole grade, opening the door to the world of chemistry for me. To this day I remain deeply grateful for their devoted guidance.

Audio of the old school anthem is hard to find on the internet, but this song, adapted from the "Great Harmony" chapter of the Book of Rites (《礼记·大同篇》), I can still sing from beginning to end:

When the Great Way prevails, the world belongs to all; the young are nurtured, the old live out their days.

To perfect this teaching is what our school upholds; its many students are molded and refined here.

Like gold in the furnace, like jade on the grindstone; true to its name, this is Datong, the Great Harmony.

Here stands Datong at the foot of Mount Jing, with Tiger Creek among its rocks and White Deer in its cave.

Where worthies of old left their traces, we carry on the joy of nurture; drawing on their pure grace, the pine wind soughs.

A hundred years to cultivate a person, ten years to grow a tree; strive with vigor, and lift up our nation.

Phoenix flowers in bloom: the Lu Jiaxi statue at Xiamen Datong Middle School

Vector DB Research: Comparing Milvus with Elasticsearch

2025-01-16 00:13:32

Background

In Elasticsearch application scenarios, storing large amounts of data may significantly impact Elasticsearch's read and write performance, so indexes need to be split according to certain data types. Through the relevant technical research, this article examines whether splitting data across Elasticsearch indexes will affect query results in AI search scenarios. It also compares the implementation principles of other vector databases currently available in the industry with our current use of Elasticsearch.

Goals

  1. Elasticsearch vs. Milvus: Comparison in AIC use cases

    Investigate the data storage mechanisms and query processes of mainstream vector databases in the current industry (Qdrant, Milvus). Conduct an in-depth analysis of how they handle data updates (such as incremental updates and deletion operations) and compare them with Elasticsearch.

  2. The impact of single-table and multi-table design on similarity calculation in the Elasticsearch BM25 model

Study the differences between single-index and multi-index structures in Elasticsearch's BM25 calculation, particularly their impact on efficiency and accuracy.

Elasticsearch vs. Milvus: Comparison in storage, query, etc.

Overall Architecture

Elasticsearch Architecture

The Elasticsearch architecture is straightforward. Each node in a cluster can handle requests and redirect them to the appropriate data nodes for searching. We use blue-green deployment for scaling up or down, which improves stability during such changes.

Cons: currently we use only two types of Elasticsearch nodes, master nodes and data nodes. Every data node serves all the remaining roles, which may not be as clear-cut as Milvus's architecture.

Multiple Milvus Architectures

Milvus Lite is the core search-engine component with embedded storage, intended for local prototype verification. It is written in Python and can be integrated into any Python AI project.

Milvus Standalone is based on Docker Compose, with a Milvus instance, a MinIO instance, and an etcd instance. Milvus Distributed is used in the cloud and in production, with all the required modules. In most cases, this report is talking about Milvus Distributed.

Milvus Distributed Architecture

Milvus has a shared-storage, massively parallel processing (MPP) architecture, with storage and computing resources independent of one another. The data plane and the control plane are disaggregated, and the architecture comprises four layers: access layer, coordinator services, worker nodes, and storage. Each layer is independent of the others, for better disaster recovery and scalability.

  • Access Layer: This layer serves as the endpoint for the users. Composed of stateless proxies, the access layer validates client requests before returning the final results to the client. The proxy uses load-balancing components like Nginx and NodePort to provide a unified service address.
  • Coordinator Service: This layer serves as the system’s brain, assigning tasks to worker nodes. The coordinator service layer performs critical operations, including data management, load balancing, data declaration, cluster topology management, and timestamp generation.
  • Worker Nodes: The worker nodes follow the instructions from the coordinator service layer and execute data manipulation language (DML) commands. Due to the separation of computing and storage, these nodes are stateless in nature. When deployed on Kubernetes, the worker nodes facilitate disaster recovery and system scale-out.
  • Storage: Responsible for data persistence, the storage layer consists of meta storage, log broker, and object storage. Meta storage stores snapshots of metadata, such as message consumption checkpoints and node status. On the other hand, object storage stores snapshots of index files, logs, and intermediate query results. The log broker functions as a pub-sub system supporting data playback and recovery.

Even a minimal standalone Milvus deployment needs an object storage service such as MinIO or S3, a standalone etcd cluster, and a Milvus instance. It is quite a complex architecture, mainly deployed and used on Kubernetes.

Summary

  • Complexity: Elasticsearch is simple, with only master nodes and data nodes. Milvus is complex, requiring object storage, etcd, and several types of Milvus nodes, though it can be deployed with Amazon EKS.
  • Potential bottleneck: as an Elasticsearch cluster grows, more replicas may be needed to balance queries and avoid hot zones. In Milvus, etcd needs high-performance disks to serve metadata well and can become a bottleneck as query volume increases; files on object storage must also be pulled to the local disk and loaded into memory for querying, and if this swapping happens frequently, performance may suffer.
  • Scaling: Elasticsearch requires a blue-green deployment to scale an online cluster. Milvus is easy to scale on Kubernetes; the number of compute node instances can be changed on demand.
  • Storage: Elasticsearch stores data on each data node's hard disk, so adding storage means adding data nodes; S3 is only used as backup storage. Milvus is object-storage based; S3 can be used to store all the data.
  • AA switch: Elasticsearch requires two identical clusters. Milvus needs no active-active switch; just reload the query nodes or add more of them.
  • Upgrade: Elasticsearch upgrades work like scaling (blue-green). Milvus upgrades with a Helm command on the Kubernetes cluster.

Data Writing Flow

Index Flow in Elasticsearch

In this diagram, we can see how a new document is stored by Elasticsearch. As soon as it “arrives”, it is committed to a transaction log called “translog” and to a memory buffer. The translog is how Elasticsearch can recover data that was only in memory in case of a crash.

All the documents in the memory buffer will generate a single in-memory Lucene segment when the “refresh” operation happens. This operation is used to make new documents available for search.

Depending on different triggers, all of those segments are eventually merged into a single segment and saved to disk, and the translog is cleared.
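As a concrete illustration, here is a minimal sketch using the official elasticsearch Python client; the index name and document are made up for the example. It shows that a freshly indexed document only becomes searchable after a refresh.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

# The document is appended to the translog and to the in-memory buffer.
es.index(index="articles", id="1", document={"title": "Vector DB research"})

# Force a refresh so the buffered documents form a searchable Lucene segment.
# (Normally this happens automatically every index.refresh_interval, 1s by default.)
es.indices.refresh(index="articles")

resp = es.search(index="articles", query={"match": {"title": "vector"}})
print(resp["hits"]["total"])
```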

This diagram shows the whole routine for a simple index request.

Data Writing Flow in Milvus

The picture above shows all the modules used in data writing. All data-writing requests are triggered from the SDK, which sends the request through the load balancer to a proxy node (the number of proxy instances can vary). The proxy caches the data and requests segment information in order to write the data into the message storage.

The message storage is mainly a Pulsar-based platform for persisting the data, playing the same role as the translog in Elasticsearch. The main difference is that Milvus doesn't need an MQ service in front: you can write data directly through its interface, and there is no need for the bulk requests used in Elasticsearch.

The data node consumes the data from the message storage and finally flushes it into the object storage.
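From the client's perspective this whole write path hides behind a single call. A minimal pymilvus sketch (collection name, dimension, and data are assumptions for illustration):

```python
from pymilvus import MilvusClient

client = MilvusClient(uri="http://localhost:19530")  # assumed local Milvus

# Quick-start collection: an "id" primary key plus a "vector" field.
client.create_collection(collection_name="articles", dimension=4)

# The proxy routes this insert into the message storage (Pulsar); data nodes
# consume it from there and eventually flush segments to object storage.
client.insert(
    collection_name="articles",
    data=[{"id": 1, "vector": [0.1, 0.2, 0.3, 0.4]}],
)
```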

Data Model in Vector Databases

Data Model in Elasticsearch

As we can see from the diagram, Elasticsearch shards each Lucene index across the available nodes. A shard can be a primary or a replica shard. Each shard is a Lucene index, each of those indexes can have multiple segments, and each segment is a complete HNSW graph.

Data Model in Milvus

Milvus provides users with a top-level concept called a Collection, which maps to a table in a traditional database and is equivalent to an Index in Elasticsearch. Each Collection is divided into multiple Shards, two by default. The number of Shards depends on how much data you need to write and how many nodes you want to distribute the writes across.

Each Shard contains many Partitions, which have their own data attributes. A Shard itself is divided based on the hash of the primary key, while Partitions are often divided by fields or Partition Tags that you specify. Common partitioning schemes include dividing by the date of data entry, by user gender, or by user age. One major advantage of Partitions during queries is that specifying a Partition Tag can filter out a lot of data.

Shards are more about helping you scale write operations, while Partitions help improve read performance. Each Partition within a Shard corresponds to many small Segments. A Segment is the smallest scheduling unit in the whole system and is either a Growing Segment or a Sealed Segment. A Growing Segment is subscribed to by the Query Node, and users keep writing into it until it becomes large enough; once it reaches the default limit of 512 MB, writing is prohibited and it becomes a Sealed Segment, at which point vector indexes are built for it.

The storage is organized by segment in a columnar layout, where each primary key, column, and vector is stored in a separate file.
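To make the hierarchy concrete, here is a hedged pymilvus sketch using the ORM-style API; the collection name, field layout, and partition name are assumptions. The collection is created with two shards, and a partition is added so queries can skip unrelated data.

```python
from pymilvus import (
    connections, Collection, CollectionSchema, FieldSchema, DataType,
)

connections.connect(uri="http://localhost:19530")  # assumed local Milvus

schema = CollectionSchema(fields=[
    FieldSchema("id", DataType.INT64, is_primary=True),
    FieldSchema("vector", DataType.FLOAT_VECTOR, dim=768),
])

# Two shards spread writes across data nodes (hashed on the primary key).
coll = Collection("articles_by_date", schema, shards_num=2)

# Partitions split data by an attribute you choose, e.g. ingestion date,
# so a partition tag at query time can filter out most of the data.
coll.create_partition("2025_01")
```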

Vector Query

Index Types

Both Elasticsearch and Milvus require memory to load vector files and perform queries, but Milvus offers a file-based index type named DiskANN for large datasets, which doesn't require loading all the data into memory, only the index, reducing memory consumption.

As for Elasticsearch, dense_vector with HNSW is the only solution. The default element type is float, but Elasticsearch provides optimized HNSW variants to reduce the index size or increase performance. To use a quantized index, you can set the index type to int8_hnsw, int4_hnsw, or bbq_hnsw.
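A minimal sketch of such a mapping with the elasticsearch Python client (index name, field name, and dimension are assumptions):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

# A dense_vector field indexed as an HNSW graph. "int8_hnsw" additionally
# stores int8-quantized vectors, shrinking memory at a small cost in accuracy.
es.indices.create(
    index="vectors",
    mappings={
        "properties": {
            "embedding": {
                "type": "dense_vector",
                "dims": 768,
                "index": True,
                "similarity": "cosine",
                "index_options": {"type": "int8_hnsw"},
            }
        }
    },
)
```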

Milvus supported index types, their classification, and target scenarios:

  • FLAT (no special classification): relatively small dataset; requires a 100% recall rate.
  • IVF_FLAT (no special classification): high-speed query; requires a recall rate as high as possible.
  • IVF_SQ8 (quantization-based): very high-speed query; limited memory resources; accepts a minor compromise in recall rate.
  • IVF_PQ (quantization-based): high-speed query; limited memory resources; accepts a minor compromise in recall rate.
  • HNSW (graph-based): very high-speed query; requires a recall rate as high as possible; large memory resources.
  • HNSW_SQ (quantization-based): very high-speed query; limited memory resources; accepts a minor compromise in recall rate.
  • HNSW_PQ (quantization-based): medium-speed query; very limited memory resources; accepts a minor compromise in recall rate.
  • HNSW_PRQ (quantization-based): medium-speed query; very limited memory resources; accepts a minor compromise in recall rate.
  • SCANN (quantization-based): very high-speed query; requires a recall rate as high as possible; large memory resources.
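As an illustration, a hedged pymilvus sketch that builds a graph-based HNSW index on the collection from the earlier example (parameter values are illustrative, not tuned recommendations):

```python
from pymilvus import MilvusClient

client = MilvusClient(uri="http://localhost:19530")  # assumed local Milvus

# M and efConstruction trade build time and memory for recall.
index_params = client.prepare_index_params()
index_params.add_index(
    field_name="vector",
    index_type="HNSW",
    metric_type="COSINE",
    params={"M": 16, "efConstruction": 200},
)
client.create_index(collection_name="articles_by_date", index_params=index_params)
```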

Query Flow in Elasticsearch

The query phase above consists of the following three steps:

  1. The client sends a search request to Node 3, which creates an empty priority queue of size from + size.
  2. Node 3 forwards the search request to a primary or replica copy of every shard in the index. Each shard executes the query locally and adds the results into a local sorted priority queue of size from + size.
  3. Each shard returns the doc IDs and sort values of all the docs in its priority queue to the coordinating node, Node 3, which merges these values into its own priority queue to produce a globally sorted list of results.

The distributed fetch phase consists of the following steps:

  1. The coordinating node identifies which documents need to be fetched and issues a multi GET request to the relevant shards.
  2. Each shard loads the documents and enriches them, if required, and then returns the documents to the coordinating node.
  3. Once all documents have been fetched, the coordinating node returns the results to the client.
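A hedged sketch of the from + size search that drives those per-shard priority queues and the subsequent fetch (index name and query are assumptions):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

# Each shard builds a priority queue of from + size = 20 entries; the
# coordinating node merges them and fetches only results 10..19.
resp = es.search(
    index="articles",
    query={"match": {"title": "vector"}},
    from_=10,
    size=10,
)
for hit in resp["hits"]["hits"]:
    print(hit["_id"], hit["_score"])
```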

Query Flow in Milvus

In the reading path, query requests are broadcast through DqRequestChannel, and query results are aggregated to the proxy via gRPC.

As a producer, the proxy writes query requests into DqRequestChannel. The way Query Nodes consume DqRequestChannel is quite special: each Query Node subscribes to the channel, so every message in it is broadcast to all Query Nodes.

After receiving a request, the Query Node performs a local query and aggregates at the Segment level before sending the aggregated result back to the corresponding Proxy via gRPC. Note that each query request carries a unique ProxyID identifying its originator; based on it, Query Nodes route the different query results to their respective Proxies.

Once it determines that it has collected all of the Query Nodes' results, the Proxy performs a global aggregation to obtain the final query result and returns it to the client. Note that both queries and results carry the same unique RequestID marking each individual query; based on this ID, the Proxy distinguishes which set of results belongs to which request.
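All of this machinery is invisible to the client: a single pymilvus search call (collection and vector are the assumptions from the earlier write example) triggers the broadcast and the two-level aggregation described above.

```python
from pymilvus import MilvusClient

client = MilvusClient(uri="http://localhost:19530")  # assumed local Milvus

client.load_collection("articles")  # segments must be loaded before querying

# The proxy broadcasts this request to the query nodes; each aggregates at
# the segment level, then the proxy performs the global top-K aggregation.
results = client.search(
    collection_name="articles",
    data=[[0.1, 0.2, 0.3, 0.4]],           # query vector(s)
    limit=3,
    search_params={"metric_type": "COSINE"},
)
print(results)
```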

Comparing BM25 between Elasticsearch and Milvus

Why we still care about BM25 in RAG

Hybrid search has long been an important method for improving the quality of Retrieval-Augmented Generation (RAG). Despite the remarkable performance of dense embedding-based search techniques, which have made significant progress in building deep semantic interactions between queries and documents as model scale and pre-training datasets have grown, notable limitations remain. These include poor interpretability and suboptimal performance on long-tail queries and rare terms.

For many RAG applications, pre-trained models often lack domain-specific corpus support, and in some scenarios, their performance is even inferior to BM25-based keyword matching retrieval. Against this backdrop, Hybrid Search combines the semantic understanding capabilities of dense vector search with the precision of keyword matching, offering a more efficient solution to address these challenges. It has become a key technology for enhancing search effectiveness.

How to calculate BM25

BM25 (Best Matching 25) is a ranking function used by search engines to estimate the relevance of documents to a given search query.

$$\mathrm{score}(D,Q)=\sum_{i=1}^{n}\mathrm{IDF}(q_i)\cdot\frac{f(q_i,D)\cdot(k_1+1)}{f(q_i,D)+k_1\cdot\left(1-b+b\cdot\frac{|D|}{\mathrm{avgdl}}\right)}$$

Here is the BM25 formula for a query Q on a document D, where Q contains the keywords q1, q2, …, qn.

  1. f(qi,D) is the number of times the keyword qi occurs in the document D.
  2. |D| is the length of the document D in words.
  3. avgdl (average document length) is the average document length in the text collection from which documents are drawn.
  4. k1 and b are free parameters used for advanced optimization. Commonly, k1 is between 1.2 and 2.0, and b = 0.75.

$$\mathrm{IDF}(q_i)=\ln\!\left(\frac{N-n(q_i)+0.5}{n(q_i)+0.5}+1\right)$$

This is the IDF (inverse document frequency) weight of the query term qi, where N is the total number of documents in the collection and n(qi) is the number of documents containing qi.
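A direct transcription of the two formulas into Python, for readers who want to check the math; the toy corpus at the bottom is made up.

```python
import math

def idf(N: int, n_q: int) -> float:
    """Inverse document frequency: N docs in total, n_q contain the term."""
    return math.log((N - n_q + 0.5) / (n_q + 0.5) + 1)

def bm25(query: list[str], doc: list[str], corpus: list[list[str]],
         k1: float = 1.2, b: float = 0.75) -> float:
    """Score one tokenized document against a tokenized query."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N   # average document length
    score = 0.0
    for term in query:
        f = doc.count(term)                   # term frequency f(q_i, D)
        n_q = sum(term in d for d in corpus)  # documents containing q_i
        score += idf(N, n_q) * (f * (k1 + 1)) / (
            f + k1 * (1 - b + b * len(doc) / avgdl))
    return score

corpus = [["vector", "search"], ["keyword", "search"], ["vector", "db"]]
print(bm25(["vector"], corpus[0], corpus))
```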

Why TF-IDF (BM25) as the main calculation

A term that appears in many documents does not provide as much information about the relevance of a document. Using a logarithmic scale ensures that as the document frequency of a term increases, its influence on the BM25 score grows more slowly. Without the logarithm, common terms would disproportionately affect the score. For example, with N = 1,000,000 documents, a rare term appearing in 100 documents gets an IDF of about 9.2, while a common term appearing in 500,000 documents gets only about 0.69.

How Elasticsearch calculates BM25

By default, Elasticsearch calculates scores on a per-shard basis, leveraging Lucene's built-in org.apache.lucene.search.similarities.BM25Similarity, which is also the default similarity in Lucene's IndexSearcher. If we want index-level score calculation, we need to change the search_type from query_then_fetch to dfs_query_then_fetch.

In a dfs_query_then_fetch search, Elasticsearch adds org.elasticsearch.search.dfs.DfsPhase to the search. It collects the statistics into a DfsSearchResult, which contains each shard's document information, hits, and so on. The SearchPhaseController then aggregates all the DFS results into an AggregatedDfs to calculate the score. We can use this search type to get a consistent BM25 score across multiple indexes.
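A hedged sketch of toggling this behavior from the Python client (index pattern and query are placeholders):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

# Default: per-shard IDF (query_then_fetch). With dfs_query_then_fetch,
# a pre-query DFS phase gathers global term statistics first, so BM25
# scores are consistent across shards and indexes.
resp = es.search(
    index="articles-*",                      # cross-index query
    query={"match": {"title": "vector"}},
    search_type="dfs_query_then_fetch",
)
print(resp["hits"]["max_score"])
```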

Do we need to use dfs_query_then_fetch in cross-index queries

The only difference between a multi-index and a per-shard BM25 calculation is the IDF. If the data are well distributed across all the indexes and the document count in every shard is large enough, the difference in IDF will be tiny because of the logarithmic scale (see the IDF example above). In that scenario we don't need dfs_query_then_fetch to calculate a global BM25, which requires more resources to cache and compute.

Sparse-BM25 in Milvus

Starting from version 2.4, Milvus supports sparse vectors, and from version 2.5 it provides BM25 retrieval capabilities based on sparse vectors. With the built-in Sparse-BM25, Milvus offers native support for lexical retrieval. The specific features include (a setup sketch follows the list):

  1. Tokenization and Data Preprocessing: Implemented based on the open-source search library Tantivy, including features such as stemming, lemmatization, and stop-word filtering.
  2. Distributed Vocabulary and Term Frequency Management: Efficient support for managing and calculating term frequencies in large-scale corpora.
  3. Sparse Vector Generation and Similarity Calculation: Sparse vectors are constructed using the term frequency (Corpus TF) of the corpus, and query sparse vectors are built based on the query term frequency (Query TF) and global inverse document frequency (IDF). Similarity is then calculated using a specific BM25 distance function.
  4. Inverted Index Support: Implements an inverted index based on the WAND algorithm, with support for the Block-Max WAND algorithm and graph indexing currently under development.
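Putting it together, here is a sketch of a Milvus 2.5-style full-text search setup with pymilvus. The API shape follows the Milvus docs as I understand them, so treat the collection name, field names, and parameters as assumptions.

```python
from pymilvus import MilvusClient, DataType, Function, FunctionType

client = MilvusClient(uri="http://localhost:19530")  # assumed local Milvus

schema = MilvusClient.create_schema()
schema.add_field(field_name="id", datatype=DataType.INT64,
                 is_primary=True, auto_id=True)
schema.add_field(field_name="text", datatype=DataType.VARCHAR,
                 max_length=65535, enable_analyzer=True)  # tokenized (Tantivy)
schema.add_field(field_name="sparse", datatype=DataType.SPARSE_FLOAT_VECTOR)

# Milvus derives the sparse BM25 vector from the raw text automatically.
schema.add_function(Function(
    name="text_bm25",
    function_type=FunctionType.BM25,
    input_field_names=["text"],
    output_field_names=["sparse"],
))

index_params = client.prepare_index_params()
index_params.add_index(field_name="sparse",
                       index_type="SPARSE_INVERTED_INDEX",
                       metric_type="BM25")

client.create_collection("fulltext_docs", schema=schema,
                         index_params=index_params)
```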

Pros and Cons of Sparse-BM25 in Milvus

  • Full-text search in Milvus is still under heavy development; there are plenty of open bugs on GitHub.
  • Full-text search requires creating an extra sparse index on the collection (the document set), so it isn't out of the box as in Elasticsearch.
  • Hybrid search on a collection can rank ANN and BM25 results in a single request and return the top K, similar to Elasticsearch's reciprocal rank fusion (RRF), available since 8.8.0. A sketch of this follows.
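For completeness, a hedged sketch of such a single-request hybrid search in pymilvus; the collection is assumed to combine a dense "vector" field with the sparse BM25 field from the previous example, and the query values are toy placeholders.

```python
from pymilvus import MilvusClient, AnnSearchRequest, RRFRanker

client = MilvusClient(uri="http://localhost:19530")  # assumed local Milvus

dense_req = AnnSearchRequest(
    data=[[0.1, 0.2, 0.3, 0.4]],             # query embedding (toy dims)
    anns_field="vector",
    param={"metric_type": "COSINE"},
    limit=10,
)
sparse_req = AnnSearchRequest(
    data=["how to compare milvus with elasticsearch"],  # raw text for BM25
    anns_field="sparse",
    param={"metric_type": "BM25"},           # assumed to match the index
    limit=10,
)

# Reciprocal rank fusion merges both rankings into one top-K result,
# much like Elasticsearch's RRF.
results = client.hybrid_search(
    collection_name="hybrid_docs",
    reqs=[dense_req, sparse_req],
    ranker=RRFRanker(60),
    limit=5,
)
print(results)
```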