merge: branch 'gitbook' into ref/structure
commit
a8292d7b6b
@ -1,21 +1,21 @@
|
||||
# Table of contents
|
||||
|
||||
- [Welcome](README.md)
|
||||
* [Welcome](README.md)
|
||||
|
||||
## About
|
||||
|
||||
- [About CVSA Project](about/this-project.md)
|
||||
- [Scope of Inclusion](about/scope-of-inclusion.md)
|
||||
* [About CVSA Project](about/this-project.md)
|
||||
* [Scope of Inclusion](about/scope-of-inclusion.md)
|
||||
|
||||
## Architecture
|
||||
|
||||
- [Overview](architecure/overview.md)
|
||||
- [Database Structure](architecure/database-structure/README.md)
|
||||
- [Type of Song](architecure/database-structure/type-of-song.md)
|
||||
- [Message Queue](architecure/message-queue.md)
|
||||
- [Artificial Intelligence](architecure/artificial-intelligence.md)
|
||||
* [Overview](architecure/overview.md)
|
||||
* [Crawler](architecure/crawler.md)
|
||||
* [Database Structure](architecure/database-structure/README.md)
|
||||
* [Type of Song](architecure/database-structure/type-of-song.md)
|
||||
* [Artificial Intelligence](architecure/artificial-intelligence.md)
|
||||
|
||||
## API Doc
|
||||
|
||||
- [Catalog](api-doc/catalog.md)
|
||||
- [Songs](api-doc/songs.md)
|
||||
* [Catalog](api-doc/catalog.md)
|
||||
* [Songs](api-doc/songs.md)
|
||||
|
4
doc/en/architecure/crawler.md
Normal file
@ -0,0 +1,4 @@
|
||||
# Crawler
|
||||
|
||||
A central aspect of CVSA's technical design is its emphasis on automation. The data collection process within the `crawler` is orchestrated using a message queue powered by [BullMQ](https://bullmq.io/). This enables concurrent processing of various tasks involved in the data lifecycle. State management and data persistence are handled by a combination of Redis for caching and real-time data, and PostgreSQL as the primary database.
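
To illustrate the idea of processing queued tasks concurrently, here is a minimal in-memory sketch. It is an assumption-laden stand-in, not the crawler's actual code: it uses neither BullMQ nor Redis, and the name `runWithConcurrency` is hypothetical. It only shows how a fixed number of workers can drain a shared list of tasks.

```typescript
// Sketch only: an in-memory stand-in for a queue drained by a worker pool.
// It does NOT use BullMQ or Redis; all names here are hypothetical.
type Task<T> = () => Promise<T>;

async function runWithConcurrency<T>(
  tasks: Task<T>[],
  concurrency: number,
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0; // index of the next task that has not been claimed yet

  // Each worker repeatedly claims the next unclaimed task until none remain.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  // Start at most `concurrency` workers and wait for all of them to finish.
  const workerCount = Math.min(concurrency, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Results are stored by task index, so the output order matches the input order even when tasks finish out of order. A real message queue adds persistence, retries, and scheduling on top of this basic pattern.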
|
||||
|
@ -1,7 +0,0 @@
|
||||
# Message Queue
|
||||
|
||||
We rely on message queues to manage the various tasks that [the crawler](overview.md#crawler) needs to perform.
|
||||
|
||||
### Code Path
|
||||
|
||||
Currently, the code related to message queues is located in `lib/mq` and `src`.
|
@ -14,18 +14,29 @@ layout:
|
||||
|
||||
# Overview
|
||||
|
||||
The whole CVSA system can be separated into three parts:
|
||||
CVSA is a [monorepo](https://en.wikipedia.org/wiki/Monorepo) written mainly in TypeScript. It uses a [Deno workspace](https://docs.deno.com/runtime/fundamentals/workspaces/), and most of the codebase lives under `packages/`.
|
||||
|
||||
- Frontend
|
||||
- API
|
||||
- Crawler
|
||||
**Project structure:**
|
||||
|
||||
The frontend is driven by [Astro](https://astro.build/) and is used to display the final CVSA page. The API is driven by
|
||||
[Hono](https://hono.dev) and is used to query the database and provide REST/GraphQL APIs that can be called by our
|
||||
website, applications, or third parties. The crawler is our automatic data collector, used to automatically collect new
|
||||
songs from bilibili, track their statistics, etc.
|
||||
```
|
||||
cvsa
|
||||
├── deno.json
|
||||
├── packages
|
||||
│ ├── backend
|
||||
│ ├── core
|
||||
│ ├── crawler
|
||||
│ └── frontend
|
||||
└── README.md
|
||||
```
|
||||
|
||||
**Package Breakdown:**
|
||||
|
||||
* **`backend`**: This package houses the server-side logic, built with the [Hono](https://hono.dev/) web framework. It's responsible for interacting with the database and exposing data through REST and GraphQL APIs for consumption by the frontend, internal applications, and third-party developers.
|
||||
* **`frontend`**: The user-facing web interface of CVSA is developed using [Astro](https://astro.build/). This package handles the presentation layer, displaying information fetched from the database.
|
||||
* **`crawler`**: This automated data collection system is a key component of CVSA. It's designed to automatically discover and gather new song data from bilibili, as well as track relevant statistics over time.
|
||||
* **`core`**: This package contains reusable and generic code that is utilized across multiple workspaces within the CVSA monorepo.
|
||||
|
||||
### Crawler
|
||||
|
||||
Automation is the biggest highlight of CVSA's technical design. To achieve this, we use a message queue powered by
|
||||
[BullMQ](https://bullmq.io/) to concurrently process various tasks in the data collection life cycle.
|
||||
Automation is the biggest highlight of CVSA's technical design. The data collection process within the `crawler` is orchestrated using a message queue powered by [BullMQ](https://bullmq.io/). This enables concurrent processing of various tasks involved in the data collection lifecycle. State management and data persistence are handled by a combination of Redis for caching and real-time data, and PostgreSQL as the primary database.
|
||||
|
||||
|
106
doc/zh/.gitbook/assets/1.yaml
Normal file
@ -0,0 +1,106 @@
|
||||
openapi: 3.0.0
info:
  title: CVSA API
  version: v1

servers:
  - url: https://api.projectcvsa.com

paths:
  /video/{id}/snapshots:
    get:
      summary: Get the list of video snapshots
      description: Returns the list of snapshots for a video by its ID. The video ID can be a number prefixed with "av", a 12-character alphanumeric string starting with "BV", or a positive integer.
      parameters:
        - in: path
          name: id
          required: true
          schema:
            type: string
          description: "Video ID (e.g. av78977256, BV1KJ411C7CW, 78977256)"
        - in: query
          name: ps
          schema:
            type: integer
            minimum: 1
          description: Number of snapshots returned per page (pageSize). Defaults to 1000.
        - in: query
          name: pn
          schema:
            type: integer
            minimum: 1
          description: Page number (pageNumber), used for paginated queries. Only one of offset and pn may be specified.
        - in: query
          name: offset
          schema:
            type: integer
            minimum: 1
          description: Offset, used for offset-based queries. Only one of offset and pn may be specified.
        - in: query
          name: reverse
          schema:
            type: boolean
          description: Whether to sort in reverse order (oldest to newest). Defaults to false.
      responses:
        '200':
          description: Snapshot list retrieved successfully
          content:
            application/json:
              schema:
                type: array
                items:
                  type: object
                  properties:
                    id:
                      type: integer
                      description: Snapshot ID
                    aid:
                      type: integer
                      description: The video's av number
                    views:
                      type: integer
                      description: Video view count
                    coins:
                      type: integer
                      description: Video coin count
                    likes:
                      type: integer
                      description: Video like count
                    favorites:
                      type: integer
                      description: Video favorite count
                    shares:
                      type: integer
                      description: Video share count
                    danmakus:
                      type: integer
                      description: Video danmaku count
                    replies:
                      type: integer
                      description: Video reply count
        '400':
          description: Invalid query parameters
          content:
            application/json:
              schema:
                type: object
                properties:
                  message:
                    type: string
                    description: Error message
                  errors:
                    type: object
                    description: Detailed error information
        '500':
          description: Internal server error
          content:
            application/json:
              schema:
                type: object
                properties:
                  message:
                    type: string
                    description: Error message
                  error:
                    type: object
                    description: Detailed error information
|
@ -1,11 +1,11 @@
|
||||
# Table of contents
|
||||
|
||||
- [Welcome](README.md)
|
||||
* [Welcome](README.md)
|
||||
|
||||
## About <a href="#about" id="about"></a>
|
||||
|
||||
- [About This Project](about/this-project.md)
|
||||
- [Scope of Inclusion](about/scope-of-inclusion.md)
|
||||
* [About This Project](about/this-project.md)
|
||||
* [Scope of Inclusion](about/scope-of-inclusion.md)
|
||||
|
||||
## Architecture <a href="#architecture" id="architecture"></a>
|
||||
|
||||
@ -18,5 +18,5 @@
|
||||
|
||||
## API Doc <a href="#api-doc" id="api-doc"></a>
|
||||
|
||||
- [Catalog](api-doc/catalog.md)
|
||||
- [Songs](api-doc/songs.md)
|
||||
* [Catalog](api-doc/catalog.md)
|
||||
* [Video Snapshots](api-doc/video-snapshot.md)
|
||||
|
@ -1,3 +1,4 @@
|
||||
# Catalog
|
||||
|
||||
- [Songs](songs.md)
|
||||
* [Video Snapshots](video-snapshot.md)
|
||||
|
||||
|
@ -1,3 +0,0 @@
|
||||
# Songs
|
||||
|
||||
Not yet implemented.
|
6
doc/zh/api-doc/video-snapshot.md
Normal file
@ -0,0 +1,6 @@
|
||||
# Video Snapshots
|
||||
|
||||
{% openapi src="../.gitbook/assets/1.yaml" path="/video/{id}/snapshots" method="get" %}
|
||||
[1.yaml](../.gitbook/assets/1.yaml)
|
||||
{% endopenapi %}
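
As a rough illustration of how a client might build a request URL for this endpoint, the sketch below validates the video ID and query parameters described in the spec. The helper names (`isValidVideoId`, `buildSnapshotsUrl`) are hypothetical and not part of the CVSA codebase, and reading "12-character BV ID" as "BV" plus 10 alphanumeric characters is an assumption based on the example `BV1KJ411C7CW`.

```typescript
// Sketch only: helper names are hypothetical, not part of the CVSA codebase.
const BASE_URL = "https://api.projectcvsa.com"; // taken from `servers` in the spec

interface SnapshotQuery {
  ps?: number; // page size
  pn?: number; // page number (mutually exclusive with offset)
  offset?: number; // offset (mutually exclusive with pn)
  reverse?: boolean; // sort from oldest to newest when true
}

// Accepts the three documented ID shapes: "av" followed by digits,
// "BV" followed by 10 alphanumeric characters (assumed: 12 in total),
// or a positive integer.
function isValidVideoId(id: string): boolean {
  return (
    /^av\d+$/.test(id) ||
    /^BV[0-9A-Za-z]{10}$/.test(id) ||
    /^[1-9]\d*$/.test(id)
  );
}

function buildSnapshotsUrl(id: string, query: SnapshotQuery = {}): string {
  if (!isValidVideoId(id)) {
    throw new Error(`Invalid video ID: ${id}`);
  }
  if (query.pn !== undefined && query.offset !== undefined) {
    throw new Error("pn and offset are mutually exclusive");
  }
  const url = new URL(`/video/${id}/snapshots`, BASE_URL);
  // Only parameters that were explicitly provided are added to the query string.
  for (const [key, value] of Object.entries(query)) {
    if (value !== undefined) url.searchParams.set(key, String(value));
  }
  return url.toString();
}
```

For example, `buildSnapshotsUrl("BV1KJ411C7CW", { ps: 100, pn: 2 })` produces `https://api.projectcvsa.com/video/BV1KJ411C7CW/snapshots?ps=100&pn=2`. The result could then be passed to `fetch`.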
|
||||
|
@ -2,6 +2,8 @@
|
||||
|
||||
CVSA uses [PostgreSQL](https://www.postgresql.org/) as its database.
|
||||
|
||||
CVSA designed two
|
||||
|
||||
All of CVSA's public data (excluding users' personal data) is stored in a database named `cvsa_main`, which contains the following tables:
|
||||
|
||||
- songs: stores the main information about songs
|
||||
|