merge: branch 'gitbook' into ref/structure

alikia2x (寒寒) 2025-04-06 02:01:05 +08:00
commit a8292d7b6b
Signed by: alikia2x
GPG Key ID: 56209E0CCD8420C6
10 changed files with 156 additions and 36 deletions

View File

@@ -1,21 +1,21 @@
 # Table of contents
-- [Welcome](README.md)
+* [Welcome](README.md)
 ## About
-- [About CVSA Project](about/this-project.md)
-- [Scope of Inclusion](about/scope-of-inclusion.md)
+* [About CVSA Project](about/this-project.md)
+* [Scope of Inclusion](about/scope-of-inclusion.md)
 ## Architecture
-- [Overview](architecure/overview.md)
-- [Database Structure](architecure/database-structure/README.md)
-- [Type of Song](architecure/database-structure/type-of-song.md)
-- [Message Queue](architecure/message-queue.md)
-- [Artificial Intelligence](architecure/artificial-intelligence.md)
+* [Overview](architecure/overview.md)
+* [Crawler](architecure/crawler.md)
+* [Database Structure](architecure/database-structure/README.md)
+* [Type of Song](architecure/database-structure/type-of-song.md)
+* [Artificial Intelligence](architecure/artificial-intelligence.md)
 ## API Doc
-- [Catalog](api-doc/catalog.md)
-- [Songs](api-doc/songs.md)
+* [Catalog](api-doc/catalog.md)
+* [Songs](api-doc/songs.md)

View File

@@ -0,0 +1,4 @@
# Crawler
A central aspect of CVSA's technical design is its emphasis on automation. The data collection process within the `crawler` is orchestrated using a message queue powered by [BullMQ](https://bullmq.io/). This enables concurrent processing of various tasks involved in the data lifecycle. State management and data persistence are handled by a combination of Redis for caching and real-time data, and PostgreSQL as the primary database.
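
As a rough sketch of this queue/worker pattern (hypothetical names and wiring, not the project's actual `crawler` code), a BullMQ setup pairs a producer queue with a worker that shares a Redis connection:

```typescript
// Illustrative BullMQ sketch; the queue name, job payload, and connection
// settings are hypothetical and not taken from the CVSA codebase.
import { Queue, Worker } from "bullmq"; // resolved via npm under Deno

const connection = { host: "127.0.0.1", port: 6379 }; // Redis holds the queue state

// Producer: enqueue a task for one video.
const snapshotQueue = new Queue("videoSnapshot", { connection });
await snapshotQueue.add("takeSnapshot", { aid: 78977256 });

// Consumer: process queued tasks concurrently.
new Worker(
  "videoSnapshot",
  async (job) => {
    const { aid } = job.data;
    // Fetch current stats for `aid` here and persist them to PostgreSQL.
    console.log(`snapshotting av${aid}`);
  },
  { connection, concurrency: 10 },
);
```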

View File

@@ -1,7 +0,0 @@
# Message Queue
We rely on message queues to manage the various tasks that [the crawler](overview.md#crawler) needs to perform.
### Code Path
Currently, the code related to message queues is located at `lib/mq` and `src`.

View File

@@ -14,18 +14,29 @@ layout:
 # Overview
-The whole CVSA system can be separated into three different parts:
-
-- Frontend
-- API
-- Crawler
-
-The frontend is driven by [Astro](https://astro.build/) and is used to display the final CVSA page. The API is driven by [Hono](https://hono.dev) and is used to query the database and provide REST/GraphQL APIs that can be called by our CVSA website, applications, or third parties. The crawler is our automatic data collector, used to automatically collect new songs from bilibili, track their statistics, etc.
+The CVSA is a [monorepo](https://en.wikipedia.org/wiki/Monorepo) codebase, mainly using TypeScript as the development language. With [Deno workspace](https://docs.deno.com/runtime/fundamentals/workspaces/), the major part of the codebase is under `packages/`.
+
+**Project structure:**
+
+```
+├── deno.json
+├── packages
+│   ├── backend
+│   ├── core
+│   ├── crawler
+│   └── frontend
+└── README.md
+```
+
+**Package Breakdown:**
+
+* **`backend`**: This package houses the server-side logic, built with the [Hono](https://hono.dev/) web framework. It's responsible for interacting with the database and exposing data through REST and GraphQL APIs for consumption by the frontend, internal applications, and third-party developers.
+* **`frontend`**: The user-facing web interface of CVSA is developed using [Astro](https://astro.build/). This package handles the presentation layer, displaying information fetched from the database.
+* **`crawler`**: This automated data collection system is a key component of CVSA. It's designed to automatically discover and gather new song data from bilibili, as well as track relevant statistics over time.
+* **`core`**: This package contains reusable and generic code that is utilized across multiple workspaces within the CVSA monorepo.
 ### Crawler
-Automation is the biggest highlight of CVSA's technical design. To achieve this, we use a message queue powered by [BullMQ](https://bullmq.io/) to concurrently process various tasks in the data collection life cycle.
+Automation is the biggest highlight of CVSA's technical design. The data collection process within the `crawler` is orchestrated using a message queue powered by [BullMQ](https://bullmq.io/). This enables concurrent processing of various tasks involved in the data collection lifecycle. State management and data persistence are handled by a combination of Redis for caching and real-time data, and PostgreSQL as the primary database.
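
To make the `backend` package's role more concrete, here is a minimal Hono route sketch in TypeScript. It is illustrative only: the `/songs/:id` path and the response shape are invented for this example and are not taken from the CVSA codebase.

```typescript
// Hypothetical Hono endpoint sketch (not the actual CVSA backend).
import { Hono } from "hono"; // resolved via npm/JSR under Deno

const app = new Hono();

// A read-only route; a real handler would query PostgreSQL instead of
// returning this placeholder payload.
app.get("/songs/:id", (c) => {
  const id = c.req.param("id");
  return c.json({ id, name: "example song" });
});

// With Deno, the app can be served directly:
Deno.serve(app.fetch);
```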

View File

@@ -0,0 +1,106 @@
openapi: 3.0.0
info:
  title: CVSA API
  version: v1
servers:
  - url: https://api.projectcvsa.com
paths:
  /video/{id}/snapshots:
    get:
      summary: Get the snapshot list of a video
      description: Returns the list of snapshots for the given video ID. The ID can be a number prefixed with "av", a 12-character alphanumeric string starting with "BV", or a plain positive integer.
      parameters:
        - in: path
          name: id
          required: true
          schema:
            type: string
          description: "Video ID (e.g. av78977256, BV1KJ411C7CW, 78977256)"
        - in: query
          name: ps
          schema:
            type: integer
            minimum: 1
          description: Number of snapshots per page (pageSize); defaults to 1000.
        - in: query
          name: pn
          schema:
            type: integer
            minimum: 1
          description: Page number (pageNumber) for paginated queries. Only one of pn and offset may be given.
        - in: query
          name: offset
          schema:
            type: integer
            minimum: 1
          description: Offset for offset-based queries. Only one of pn and offset may be given.
        - in: query
          name: reverse
          schema:
            type: boolean
          description: Whether to sort in reverse order (oldest to newest); defaults to false.
      responses:
        '200':
          description: Snapshot list retrieved successfully
          content:
            application/json:
              schema:
                type: array
                items:
                  type: object
                  properties:
                    id:
                      type: integer
                      description: Snapshot ID
                    aid:
                      type: integer
                      description: The video's av number
                    views:
                      type: integer
                      description: View count
                    coins:
                      type: integer
                      description: Coin count
                    likes:
                      type: integer
                      description: Like count
                    favorites:
                      type: integer
                      description: Favorite count
                    shares:
                      type: integer
                      description: Share count
                    danmakus:
                      type: integer
                      description: Danmaku (bullet comment) count
                    replies:
                      type: integer
                      description: Reply count
        '400':
          description: Invalid query parameters
          content:
            application/json:
              schema:
                type: object
                properties:
                  message:
                    type: string
                    description: Error message
                  errors:
                    type: object
                    description: Detailed error information
        '500':
          description: Internal server error
          content:
            application/json:
              schema:
                type: object
                properties:
                  message:
                    type: string
                    description: Error message
                  error:
                    type: object
                    description: Detailed error information
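
For reference, a client call against this endpoint could look like the following sketch. It uses the example video ID from the spec above; the query parameters and response fields come from the schema, while the error handling is reduced to the essentials.

```typescript
// Fetch the ten oldest snapshots of av78977256 from the public API.
const url = new URL("/video/av78977256/snapshots", "https://api.projectcvsa.com");
url.searchParams.set("ps", "10");        // page size
url.searchParams.set("pn", "1");         // page number (mutually exclusive with `offset`)
url.searchParams.set("reverse", "true"); // sort from oldest to newest

const res = await fetch(url);
if (!res.ok) {
  // 400 and 500 responses carry a JSON body with a `message` field.
  const err = await res.json();
  throw new Error(`API error ${res.status}: ${err.message}`);
}

const snapshots: { aid: number; views: number; likes: number }[] = await res.json();
for (const s of snapshots) {
  console.log(`av${s.aid}: ${s.views} views, ${s.likes} likes`);
}
```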

View File

@@ -1,11 +1,11 @@
 # Table of contents
-- [Welcome](README.md)
+* [Welcome](README.md)
 ## About <a href="#about" id="about"></a>
-- [About This Project](about/this-project.md)
-- [Scope of Inclusion](about/scope-of-inclusion.md)
+* [About This Project](about/this-project.md)
+* [Scope of Inclusion](about/scope-of-inclusion.md)
 ## Architecture <a href="#architecture" id="architecture"></a>
@@ -18,5 +18,5 @@
 ## API Documentation <a href="#api-doc" id="api-doc"></a>
-- [Catalog](api-doc/catalog.md)
-- [Songs](api-doc/songs.md)
+* [Catalog](api-doc/catalog.md)
+* [Video Snapshots](api-doc/video-snapshot.md)

View File

@@ -1,3 +1,4 @@
 # Catalog
-- [Songs](songs.md)
+* [Video Snapshots](video-snapshot.md)

View File

@@ -1,3 +0,0 @@
# Songs
Not yet implemented.

View File

@@ -0,0 +1,6 @@
# Video Snapshots
{% openapi src="../.gitbook/assets/1.yaml" path="/video/{id}/snapshots" method="get" %}
[1.yaml](../.gitbook/assets/1.yaml)
{% endopenapi %}

View File

@@ -2,6 +2,8 @@
 CVSA uses [PostgreSQL](https://www.postgresql.org/) as its database.
+CVSA has designed two
 All of CVSA's public data (excluding users' personal data) is stored in a database named `cvsa_main`, which contains the following tables:
 - songs: stores the main information about songs