diff --git a/doc/en/SUMMARY.md b/doc/en/SUMMARY.md
index dd26514..99ca460 100644
--- a/doc/en/SUMMARY.md
+++ b/doc/en/SUMMARY.md
@@ -1,21 +1,21 @@
 # Table of contents
 
-- [Welcome](README.md)
+* [Welcome](README.md)
 
 ## About
 
-- [About CVSA Project](about/this-project.md)
-- [Scope of Inclusion](about/scope-of-inclusion.md)
+* [About CVSA Project](about/this-project.md)
+* [Scope of Inclusion](about/scope-of-inclusion.md)
 
 ## Architecure
 
-- [Overview](architecure/overview.md)
-- [Database Structure](architecure/database-structure/README.md)
-  - [Type of Song](architecure/database-structure/type-of-song.md)
-- [Message Queue](architecure/message-queue.md)
-- [Artificial Intelligence](architecure/artificial-intelligence.md)
+* [Overview](architecure/overview.md)
+* [Crawler](architecure/crawler.md)
+* [Database Structure](architecure/database-structure/README.md)
+  * [Type of Song](architecure/database-structure/type-of-song.md)
+* [Artificial Intelligence](architecure/artificial-intelligence.md)
 
 ## API Doc
 
-- [Catalog](api-doc/catalog.md)
-- [Songs](api-doc/songs.md)
+* [Catalog](api-doc/catalog.md)
+* [Songs](api-doc/songs.md)
diff --git a/doc/en/architecure/crawler.md b/doc/en/architecure/crawler.md
new file mode 100644
index 0000000..e60f132
--- /dev/null
+++ b/doc/en/architecure/crawler.md
@@ -0,0 +1,4 @@
+# Crawler
+
+A central aspect of CVSA's technical design is its emphasis on automation. The data collection process within the `crawler` is orchestrated using a message queue powered by [BullMQ](https://bullmq.io/). This enables concurrent processing of various tasks involved in the data lifecycle. State management and data persistence are handled by a combination of Redis for caching and real-time data, and PostgreSQL as the primary database.
+
diff --git a/doc/en/architecure/message-queue.md b/doc/en/architecure/message-queue.md
deleted file mode 100644
index a48f058..0000000
--- a/doc/en/architecure/message-queue.md
+++ /dev/null
@@ -1,7 +0,0 @@
-# Message Queue
-
-We rely on message queues to manage the various tasks that [the cralwer](overview.md#crawler)needs to perform.
-
-### Code Path
-
-Currently, the code related to message queues are located at `lib/mq` and `src`.
diff --git a/doc/en/architecure/overview.md b/doc/en/architecure/overview.md
index c17bc0c..fc694fe 100644
--- a/doc/en/architecure/overview.md
+++ b/doc/en/architecure/overview.md
@@ -14,18 +14,29 @@ layout:
 
 # Overview
 
-The whole CVSA system can be sperate into three different parts:
+The CVSA is a [monorepo](https://en.wikipedia.org/wiki/Monorepo) codebase, mainly using TypeScript as the development language. With [Deno workspace](https://docs.deno.com/runtime/fundamentals/workspaces/), the major part of the codebase is under `packages/`.
 
-- Frontend
-- API
-- Crawler
+**Project structure:**
 
-The frontend is driven by [Astro](https://astro.build/) and is used to display the final CVSA page. The API is driven by
-[Hono](https://hono.dev) and is used to query the database and provide REST/GraphQL APIs that can be called by out
-website, applications, or third parties. The crawler is our automatic data collector, used to automatically collect new
-songs from bilibili, track their statistics, etc.
+```
+cvsa
+├── deno.json
+├── packages
+│   ├── backend
+│   ├── core
+│   ├── crawler
+│   └── frontend
+└── README.md
+```
+
+**Package Breakdown:**
+
+* **`backend`**: This package houses the server-side logic, built with the [Hono](https://hono.dev/) web framework. It's responsible for interacting with the database and exposing data through REST and GraphQL APIs for consumption by the frontend, internal applications, and third-party developers.
+* **`frontend`**: The user-facing web interface of CVSA is developed using [Astro](https://astro.build/). This package handles the presentation layer, displaying information fetched from the database.
+* **`crawler`**: This automated data collection system is a key component of CVSA. It's designed to automatically discover and gather new song data from bilibili, as well as track relevant statistics over time.
+* **`core`**: This package contains reusable and generic code that is utilized across multiple workspaces within the CVSA monorepo.
 
 ### Crawler
 
-Automation is the biggest highlight of CVSA's technical design. To achieve this, we use a message queue powered by
-[BullMQ](https://bullmq.io/) to concurrently process various tasks in the data collection life cycle.
+Automation is the biggest highlight of CVSA's technical design. The data collection process within the `crawler` is orchestrated using a message queue powered by [BullMQ](https://bullmq.io/). This enables concurrent processing of various tasks involved in the data collection lifecycle. State management and data persistence are handled by a combination of Redis for caching and real-time data, and PostgreSQL as the primary database.
+
diff --git a/doc/zh/.gitbook/assets/1.yaml b/doc/zh/.gitbook/assets/1.yaml
new file mode 100644
index 0000000..29eb6dc
--- /dev/null
+++ b/doc/zh/.gitbook/assets/1.yaml
@@ -0,0 +1,106 @@
+openapi: 3.0.0
+info:
+  title: CVSA API
+  version: v1
+
+servers:
+  - url: https://api.projectcvsa.com
+
+paths:
+  /video/{id}/snapshots:
+    get:
+      summary: 获取视频快照列表
+      description: 根据视频 ID 获取视频的快照列表。视频 ID 可以是以 "av" 开头的数字,以 "BV" 开头的 12 位字母数字,或者一个正整数。
+      parameters:
+        - in: path
+          name: id
+          required: true
+          schema:
+            type: string
+          description: "视频 ID (如: av78977256, BV1KJ411C7CW, 78977256)"
+        - in: query
+          name: ps
+          schema:
+            type: integer
+            minimum: 1
+          description: 每页返回的快照数量 (pageSize),默认为 1000。
+        - in: query
+          name: pn
+          schema:
+            type: integer
+            minimum: 1
+          description: 页码 (pageNumber),用于分页查询。offset 与 pn 只能选择一个。
+        - in: query
+          name: offset
+          schema:
+            type: integer
+            minimum: 1
+          description: 偏移量,用于基于偏移量的查询。offset 与 pn 只能选择一个。
+        - in: query
+          name: reverse
+          schema:
+            type: boolean
+          description: 是否反向排序(从旧到新),默认为 false。
+      responses:
+        '200':
+          description: 成功获取快照列表
+          content:
+            application/json:
+              schema:
+                type: array
+                items:
+                  type: object
+                  properties:
+                    id:
+                      type: integer
+                      description: 快照 ID
+                    aid:
+                      type: integer
+                      description: 视频的 av 号
+                    views:
+                      type: integer
+                      description: 视频播放量
+                    coins:
+                      type: integer
+                      description: 视频投币数
+                    likes:
+                      type: integer
+                      description: 视频点赞数
+                    favorites:
+                      type: integer
+                      description: 视频收藏数
+                    shares:
+                      type: integer
+                      description: 视频分享数
+                    danmakus:
+                      type: integer
+                      description: 视频弹幕数
+                    replies:
+                      type: integer
+                      description: 视频评论数
+        '400':
+          description: 无效的查询参数
+          content:
+            application/json:
+              schema:
+                type: object
+                properties:
+                  message:
+                    type: string
+                    description: 错误消息
+                  errors:
+                    type: object
+                    description: 详细的错误信息
+        '500':
+          description: 服务器内部错误
+          content:
+            application/json:
+              schema:
+                type: object
+                properties:
+                  message:
+                    type: string
+                    description: 错误消息
+                  error:
+                    type: object
+                    description: 详细的错误信息
\ No newline at end of file
diff --git a/doc/zh/SUMMARY.md b/doc/zh/SUMMARY.md
index da69740..c44766c 100644
--- a/doc/zh/SUMMARY.md
+++ b/doc/zh/SUMMARY.md
@@ -1,11 +1,11 @@
 # Table of contents
 
-- [欢迎](README.md)
+* [欢迎](README.md)
 
 ## 关于
 
-- [关于本项目](about/this-project.md)
-- [收录范围](about/scope-of-inclusion.md)
+* [关于本项目](about/this-project.md)
+* [收录范围](about/scope-of-inclusion.md)
 
 ## 技术架构
 
@@ -18,5 +18,5 @@
 
 ## API 文档
 
-- [目录](api-doc/catalog.md)
-- [歌曲](api-doc/songs.md)
+* [目录](api-doc/catalog.md)
+* [视频快照](api-doc/video-snapshot.md)
diff --git a/doc/zh/api-doc/catalog.md b/doc/zh/api-doc/catalog.md
index b76ea7a..5298b49 100644
--- a/doc/zh/api-doc/catalog.md
+++ b/doc/zh/api-doc/catalog.md
@@ -1,3 +1,4 @@
 # 目录
 
-- [歌曲](songs.md)
+* [视频快照](video-snapshot.md)
+
diff --git a/doc/zh/api-doc/songs.md b/doc/zh/api-doc/songs.md
deleted file mode 100644
index fd3d99c..0000000
--- a/doc/zh/api-doc/songs.md
+++ /dev/null
@@ -1,3 +0,0 @@
-# 歌曲
-
-暂未实现。
diff --git a/doc/zh/api-doc/video-snapshot.md b/doc/zh/api-doc/video-snapshot.md
new file mode 100644
index 0000000..c143151
--- /dev/null
+++ b/doc/zh/api-doc/video-snapshot.md
@@ -0,0 +1,6 @@
+# 视频快照
+
+{% openapi src="../.gitbook/assets/1.yaml" path="/video/{id}/snapshots" method="get" %}
+[1.yaml](../.gitbook/assets/1.yaml)
+{% endopenapi %}
+
diff --git a/doc/zh/architecture/database-structure/README.md b/doc/zh/architecture/database-structure/README.md
index 14605b0..44a5b5d 100644
--- a/doc/zh/architecture/database-structure/README.md
+++ b/doc/zh/architecture/database-structure/README.md
@@ -2,6 +2,8 @@
 
 CVSA 使用 [PostgreSQL](https://www.postgresql.org/) 作为数据库。
 
+CVSA 设计了两个
+
 CVSA 的所有公开数据(不包括用户的个人数据)都存储在名为 `cvsa_main` 的数据库中,该数据库包含以下表:
 
 - songs:存储歌曲的主要信息
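
Note on the `GET /video/{id}/snapshots` endpoint documented in `doc/zh/.gitbook/assets/1.yaml`: the spec accepts an `id` of the form `av` plus digits, a 12-character `BV` identifier, or a bare positive integer, and states that `pn` and `offset` cannot be combined. The TypeScript sketch below shows how a client might check those rules before issuing a request. It is illustrative only: `parseVideoId`, `buildSnapshotsUrl`, and `SnapshotQuery` are hypothetical names that do not appear in the repository, and the reading of "12 characters" as `BV` plus 10 alphanumerics is an assumption based on the example `BV1KJ411C7CW`. Only the host, path, and parameter names come from the spec.

```typescript
// Hypothetical client-side helpers; none of these names exist in the CVSA codebase.
type VideoId =
  | { kind: "av"; aid: number }
  | { kind: "bv"; bvid: string };

// Classify the `id` path parameter: "av" + digits, a bare positive integer,
// or "BV" followed by 10 alphanumerics (assumed from the example BV1KJ411C7CW).
function parseVideoId(raw: string): VideoId | null {
  const av = /^(?:av)?([1-9][0-9]*)$/.exec(raw);
  if (av) return { kind: "av", aid: Number(av[1]) };
  if (/^BV[0-9A-Za-z]{10}$/.test(raw)) return { kind: "bv", bvid: raw };
  return null;
}

// Query parameters as named in the OpenAPI document.
interface SnapshotQuery {
  ps?: number; // page size
  pn?: number; // page number
  offset?: number; // offset-based paging
  reverse?: boolean; // reverse the sort order
}

// Build the request URL, rejecting inputs the spec describes as invalid.
function buildSnapshotsUrl(id: string, query: SnapshotQuery = {}): string {
  if (parseVideoId(id) === null) {
    throw new Error(`invalid video id: ${id}`);
  }
  if (query.pn !== undefined && query.offset !== undefined) {
    throw new Error("pn and offset are mutually exclusive");
  }
  const url = new URL(`/video/${id}/snapshots`, "https://api.projectcvsa.com");
  for (const key of ["ps", "pn", "offset"] as const) {
    const value = query[key];
    if (value === undefined) continue;
    // The spec declares `minimum: 1` for each integer parameter.
    if (!Number.isInteger(value) || value < 1) {
      throw new Error(`${key} must be an integer >= 1`);
    }
    url.searchParams.set(key, String(value));
  }
  if (query.reverse !== undefined) {
    url.searchParams.set("reverse", String(query.reverse));
  }
  return url.toString();
}
```

A call such as `buildSnapshotsUrl("av78977256", { ps: 100 })` yields a URL that can be passed to `fetch`. Validating on the client is optional, since the spec says the server answers invalid query parameters with a `400` response.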