merge: branch 'gitbook' into ref/structure

alikia2x (寒寒) 2025-04-06 02:01:05 +08:00
commit a8292d7b6b
Signed by: alikia2x
GPG Key ID: 56209E0CCD8420C6
10 changed files with 156 additions and 36 deletions

@ -1,21 +1,21 @@
# Table of contents
- [Welcome](README.md)
* [Welcome](README.md)
## About
- [About CVSA Project](about/this-project.md)
- [Scope of Inclusion](about/scope-of-inclusion.md)
* [About CVSA Project](about/this-project.md)
* [Scope of Inclusion](about/scope-of-inclusion.md)
## Architecture
- [Overview](architecure/overview.md)
- [Database Structure](architecure/database-structure/README.md)
- [Type of Song](architecure/database-structure/type-of-song.md)
- [Message Queue](architecure/message-queue.md)
- [Artificial Intelligence](architecure/artificial-intelligence.md)
* [Overview](architecure/overview.md)
* [Crawler](architecure/crawler.md)
* [Database Structure](architecure/database-structure/README.md)
* [Type of Song](architecure/database-structure/type-of-song.md)
* [Artificial Intelligence](architecure/artificial-intelligence.md)
## API Doc
- [Catalog](api-doc/catalog.md)
- [Songs](api-doc/songs.md)
* [Catalog](api-doc/catalog.md)
* [Songs](api-doc/songs.md)

@ -0,0 +1,4 @@
# Crawler
A central aspect of CVSA's technical design is its emphasis on automation. The data collection process within the `crawler` is orchestrated using a message queue powered by [BullMQ](https://bullmq.io/). This enables concurrent processing of various tasks involved in the data lifecycle. State management and data persistence are handled by a combination of Redis for caching and real-time data, and PostgreSQL as the primary database.
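
The concurrent processing described above can be illustrated without BullMQ or Redis. The following is a minimal in-memory sketch, not the crawler's actual code: the `Job` shape, the `processQueue` helper, and the `concurrency` parameter are all hypothetical names chosen for illustration. In the real system, BullMQ fills this role and keeps its queue state in Redis.

```typescript
// Hypothetical job payload; the real crawler's job shapes are not shown in this commit.
interface Job {
  name: string;
  aid: number;
}

type Handler = (job: Job) => Promise<string>;

// Drain `jobs` with at most `concurrency` handlers in flight at once.
// Results are returned in the same order as the input jobs.
async function processQueue(
  jobs: Job[],
  handler: Handler,
  concurrency: number,
): Promise<string[]> {
  const results = new Array<string>(jobs.length);
  let next = 0;

  async function worker(): Promise<void> {
    while (next < jobs.length) {
      // Claiming an index is safe here because JavaScript runs this
      // synchronous step to completion before another worker resumes.
      const index = next++;
      results[index] = await handler(jobs[index]);
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, jobs.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Each worker pulls the next unclaimed job as soon as it finishes the previous one, so a slow task does not block the others. A persistent queue such as BullMQ adds what this sketch lacks: retries, scheduling, and survival across process restarts.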

@ -1,7 +0,0 @@
# Message Queue
We rely on message queues to manage the various tasks that [the crawler](overview.md#crawler) needs to perform.
### Code Path
Currently, the code related to message queues is located in `lib/mq` and `src`.

@ -14,18 +14,29 @@ layout:
# Overview
The whole CVSA system can be separated into three parts:
CVSA is a [monorepo](https://en.wikipedia.org/wiki/Monorepo) written mainly in TypeScript. It is organized as a [Deno workspace](https://docs.deno.com/runtime/fundamentals/workspaces/), and most of the codebase lives under `packages/`.
- Frontend
- API
- Crawler
**Project structure:**
The frontend is driven by [Astro](https://astro.build/) and is used to display the final CVSA page. The API is driven by
[Hono](https://hono.dev) and is used to query the database and provide REST/GraphQL APIs that can be called by our
website, applications, or third parties. The crawler is our automated data collector: it gathers new
songs from bilibili, tracks their statistics, and so on.
```
cvsa
├── deno.json
├── packages
│ ├── backend
│ ├── core
│ ├── crawler
│ └── frontend
└── README.md
```
**Package Breakdown:**
* **`backend`**: This package houses the server-side logic, built with the [Hono](https://hono.dev/) web framework. It's responsible for interacting with the database and exposing data through REST and GraphQL APIs for consumption by the frontend, internal applications, and third-party developers.
* **`frontend`**: The user-facing web interface of CVSA is developed using [Astro](https://astro.build/). This package handles the presentation layer, displaying information fetched from the database.
* **`crawler`**: This automated data collection system is a key component of CVSA. It's designed to automatically discover and gather new song data from bilibili, as well as track relevant statistics over time.
* **`core`**: This package contains reusable and generic code that is utilized across multiple workspaces within the CVSA monorepo.
### Crawler
Automation is the biggest highlight of CVSA's technical design. To achieve this, we use a message queue powered by
[BullMQ](https://bullmq.io/) to concurrently process various tasks in the data collection life cycle.
Automation is the biggest highlight of CVSA's technical design. The data collection process within the `crawler` is orchestrated using a message queue powered by [BullMQ](https://bullmq.io/). This enables concurrent processing of various tasks involved in the data collection lifecycle. State management and data persistence are handled by a combination of Redis for caching and real-time data, and PostgreSQL as the primary database.

@ -0,0 +1,106 @@
openapi: 3.0.0
info:
  title: CVSA API
  version: v1
servers:
  - url: https://api.projectcvsa.com
paths:
  /video/{id}/snapshots:
    get:
      summary: Get the list of video snapshots
      description: Returns the list of snapshots for a video, looked up by video ID. The video ID may be a number prefixed with "av", a 12-character alphanumeric string beginning with "BV", or a positive integer.
      parameters:
        - in: path
          name: id
          required: true
          schema:
            type: string
          description: "Video ID (e.g. av78977256, BV1KJ411C7CW, 78977256)"
        - in: query
          name: ps
          schema:
            type: integer
            minimum: 1
          description: Number of snapshots returned per page (pageSize). Defaults to 1000.
        - in: query
          name: pn
          schema:
            type: integer
            minimum: 1
          description: Page number (pageNumber), used for paginated queries. Only one of offset and pn may be specified.
        - in: query
          name: offset
          schema:
            type: integer
            minimum: 1
          description: Offset, used for offset-based queries. Only one of offset and pn may be specified.
        - in: query
          name: reverse
          schema:
            type: boolean
          description: Whether to reverse the sort order (oldest to newest). Defaults to false.
      responses:
        '200':
          description: Snapshot list retrieved successfully
          content:
            application/json:
              schema:
                type: array
                items:
                  type: object
                  properties:
                    id:
                      type: integer
                      description: Snapshot ID
                    aid:
                      type: integer
                      description: The video's av number
                    views:
                      type: integer
                      description: Number of views
                    coins:
                      type: integer
                      description: Number of coins
                    likes:
                      type: integer
                      description: Number of likes
                    favorites:
                      type: integer
                      description: Number of favorites
                    shares:
                      type: integer
                      description: Number of shares
                    danmakus:
                      type: integer
                      description: Number of danmaku comments
                    replies:
                      type: integer
                      description: Number of replies
        '400':
          description: Invalid query parameters
          content:
            application/json:
              schema:
                type: object
                properties:
                  message:
                    type: string
                    description: Error message
                  errors:
                    type: object
                    description: Detailed error information
        '500':
          description: Internal server error
          content:
            application/json:
              schema:
                type: object
                properties:
                  message:
                    type: string
                    description: Error message
                  error:
                    type: object
                    description: Detailed error information
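
The three accepted forms of the `id` path parameter can be told apart with a few pattern checks. The sketch below is illustrative only and is not taken from the backend: `parseVideoId` and the `VideoId` type are hypothetical names, and it assumes that the 12 characters of a BV identifier include the `BV` prefix, which matches the example `BV1KJ411C7CW`.

```typescript
// Hypothetical result type: either a numeric av number or a BV identifier.
type VideoId =
  | { kind: "aid"; aid: number }
  | { kind: "bvid"; bvid: string };

// Classify the `id` path parameter. Returns null for anything that matches
// none of the three documented forms.
function parseVideoId(id: string): VideoId | null {
  // Form 1: "av" followed by digits, e.g. av78977256
  const av = /^av(\d+)$/.exec(id);
  if (av) {
    const aid = Number(av[1]);
    return aid > 0 ? { kind: "aid", aid } : null;
  }
  // Form 2: "BV" plus 10 alphanumerics (12 characters in total; assumption)
  if (/^BV[0-9A-Za-z]{10}$/.test(id)) {
    return { kind: "bvid", bvid: id };
  }
  // Form 3: a bare positive integer, e.g. 78977256
  if (/^[1-9]\d*$/.test(id)) {
    return { kind: "aid", aid: Number(id) };
  }
  return null;
}
```

A server could answer with the documented `400` response whenever such a check returns `null`; whether the real backend validates IDs this way is not stated in the spec.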

@ -1,11 +1,11 @@
# Table of contents
- [Welcome](README.md)
* [Welcome](README.md)
## About <a href="#about" id="about"></a>
- [About This Project](about/this-project.md)
- [Scope of Inclusion](about/scope-of-inclusion.md)
* [About This Project](about/this-project.md)
* [Scope of Inclusion](about/scope-of-inclusion.md)
## Technical Architecture <a href="#architecture" id="architecture"></a>
@ -18,5 +18,5 @@
## API Docs <a href="#api-doc" id="api-doc"></a>
- [Catalog](api-doc/catalog.md)
- [Songs](api-doc/songs.md)
* [Catalog](api-doc/catalog.md)
* [Video Snapshots](api-doc/video-snapshot.md)

@ -1,3 +1,4 @@
# Catalog
- [Songs](songs.md)
* [Video Snapshots](video-snapshot.md)

@ -1,3 +0,0 @@
# Songs
Not yet implemented.

@ -0,0 +1,6 @@
# Video Snapshots
{% openapi src="../.gitbook/assets/1.yaml" path="/video/{id}/snapshots" method="get" %}
[1.yaml](../.gitbook/assets/1.yaml)
{% endopenapi %}
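
A client-side sketch of how a request URL for this endpoint might be assembled from the documented query parameters. `buildSnapshotsUrl` and `SnapshotQuery` are hypothetical names, not part of any CVSA package. Rejecting `pn` together with `offset` before sending is a convenience assumed here; the spec only states that one of the two may be chosen.

```typescript
// Query parameters documented for GET /video/{id}/snapshots.
interface SnapshotQuery {
  ps?: number; // page size
  pn?: number; // page number
  offset?: number; // offset, mutually exclusive with pn
  reverse?: boolean; // oldest-to-newest when true
}

// Build the request URL, checking the documented constraints up front.
function buildSnapshotsUrl(id: string, query: SnapshotQuery = {}): string {
  if (query.pn !== undefined && query.offset !== undefined) {
    throw new Error("pn and offset are mutually exclusive");
  }
  const url = new URL(
    `/video/${encodeURIComponent(id)}/snapshots`,
    "https://api.projectcvsa.com",
  );
  for (const key of ["ps", "pn", "offset"] as const) {
    const value = query[key];
    if (value === undefined) continue;
    // The spec declares each of these as an integer with minimum 1.
    if (!Number.isInteger(value) || value < 1) {
      throw new Error(`${key} must be an integer >= 1`);
    }
    url.searchParams.set(key, String(value));
  }
  if (query.reverse !== undefined) {
    url.searchParams.set("reverse", String(query.reverse));
  }
  return url.toString();
}

// Example: second page of 100 snapshots for a BV identifier.
const exampleUrl = buildSnapshotsUrl("BV1KJ411C7CW", { ps: 100, pn: 2 });
// exampleUrl === "https://api.projectcvsa.com/video/BV1KJ411C7CW/snapshots?ps=100&pn=2"
```

The resulting string can be passed to `fetch` or any other HTTP client.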

@ -2,6 +2,8 @@
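CVSA uses [PostgreSQL](https://www.postgresql.org/) as its database.
CVSA has designed two
All of CVSA's public data (excluding users' personal data) is stored in a database named `cvsa_main`, which contains the following tables:
- songs: stores the main information about songs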