doc: GitBook - added about page

alikia2x (寒寒) 2025-02-10 23:06:14 +00:00 committed by gitbook-bot
parent 9e569572cd
commit 869ac1ad3d
GPG Key ID: 07D2180C7B12D0FF
7 changed files with 78 additions and 0 deletions

@ -10,6 +10,9 @@
## Architecture
* [Overview](architecure/overview.md)
* [Database Structure](architecure/database-structure/README.md)
* [Type of Song](architecure/database-structure/type-of-song.md)
* [Artificial Intelligence](architecure/artificial-intelligence.md)
## API Doc

@ -1,2 +1,19 @@
# Scope of Inclusion
CVSA covers many aspects of Chinese vocal synthesis, including songs, albums, artists (publishers, manipulators, arrangers, etc.), singers, and voice engines / voicebanks.
For a **song**, it must meet the following conditions to be included in CVSA:
### Category 30
In principle, a song featured in CVSA must appear in a video posted on Bilibili under the VOCALOID·UTAU category (ID 30). In some special cases, this rule may not be enforced.
### At Least One Line of Chinese
The lyrics of the song must contain at least one line in Chinese. This means that even if the song uses a voicebank that only supports Chinese, it will not be included in CVSA unless its lyrics contain Chinese.
### Using Vocal Synthesizer
To be included in CVSA, at least one line of the song (harmony vocals count) must be produced by a vocal synthesizer.
We define a vocal synthesizer as software or a system that generates synthesized singing voices by algorithmically modeling vocal characteristics and producing audio from input parameters such as lyrics, pitch, and dynamics. This covers both waveform-concatenation-based (e.g., VOCALOID, UTAU) and AI-based (e.g., Synthesizer V, ACE Studio) approaches, **but excludes voice conversion tools that solely alter the timbre of pre-existing recordings** (e.g., [so-vits svc](https://github.com/svc-develop-team/so-vits-svc)).
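
The three conditions above can be read as a single predicate. The following Python sketch is only an illustration and not CVSA code: the `SongCandidate` record and its fields are hypothetical, the Han-character regex is a rough heuristic (the same range also matches kanji in Japanese lyrics), and the special-case exceptions to the category rule are not modeled.

```python
import re
from dataclasses import dataclass

# Category ID of VOCALOID·UTAU on Bilibili, as stated in the text.
VOCALOID_UTAU_CATEGORY = 30

# Heuristic only: CJK Unified Ideographs. Japanese kanji fall in the
# same range, so a real check would need language detection.
_HAN = re.compile(r"[\u4e00-\u9fff]")


@dataclass
class SongCandidate:
    """Hypothetical record used for illustration; not a CVSA type."""
    category_id: int
    lyrics: str
    # Lines produced by a vocal synthesizer (harmony vocals count).
    # Assumed to exclude output of voice conversion tools.
    synthesized_line_count: int


def has_chinese_line(lyrics: str) -> bool:
    """Return True if at least one lyric line contains a Han character."""
    return any(_HAN.search(line) for line in lyrics.splitlines())


def in_scope(song: SongCandidate) -> bool:
    """Apply the three inclusion conditions, ignoring special cases."""
    return (
        song.category_id == VOCALOID_UTAU_CATEGORY
        and has_chinese_line(song.lyrics)
        and song.synthesized_line_count >= 1
    )
```

A song with one Chinese lyric line and one synthesized line in category 30 passes; dropping any one of the three conditions makes `in_scope` return `False`.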

@ -1,2 +1,11 @@
# About CVSA Project
CVSA (Chinese Vocal Synthesis Archive) aims to collect as much content as possible about the Chinese vocal synthesis community, relying heavily on automation.
Unlike existing projects such as [VocaDB](https://vocadb.net), CVSA collects and displays the following content through a combination of automation and manual editing:
* Metadata of songs (name, duration, publisher, singer, etc.)
* Descriptive information of songs (content introduction, creation background, lyrics, etc.)
* Historical snapshots of each song's engagement data (views, favorites, likes, etc.) on the [Bilibili](https://en.wikipedia.org/wiki/Bilibili) website.
* Information about artists, albums, vocal synthesizers, and voicebanks.

@ -1,2 +1,3 @@
# Songs
Not implemented yet.

@ -0,0 +1,13 @@
# Artificial Intelligence
CVSA's automated workflow relies heavily on artificial intelligence for information extraction and classification.
The AI systems we currently use are:
### The Filter
Located at `/filter/` under the project root directory, it classifies each video in [category 30](../about/scope-of-inclusion.md#category-30) into one of the following categories:
* 0: Not related to Chinese vocal synthesis
* 1: An original song with Chinese vocal synthesis
* 2: A cover/remix song with Chinese vocal synthesis
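
As a small illustration, the three categories can be represented as an integer enum so that downstream code does not pass bare numbers around. This is a sketch, not the filter's actual code: the `FilterLabel` name and the `is_related` helper are assumptions introduced here.

```python
from enum import IntEnum


class FilterLabel(IntEnum):
    """Hypothetical enum mirroring the filter's three output categories."""
    UNRELATED = 0       # not related to Chinese vocal synthesis
    ORIGINAL = 1        # original song with Chinese vocal synthesis
    COVER_OR_REMIX = 2  # cover/remix song with Chinese vocal synthesis

    @property
    def is_related(self) -> bool:
        # Everything except category 0 counts as related.
        return self is not FilterLabel.UNRELATED
```

Constructing the enum from a raw value, e.g. `FilterLabel(2)`, raises `ValueError` for any number outside the three categories, which catches malformed labels early.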

@ -0,0 +1,11 @@
# Database Structure
CVSA uses [PostgreSQL](https://www.postgresql.org/) as its database.
All public data of CVSA (excluding users' personal data) is stored in a database named `cvsa_main`, which contains the following tables:
* songs: stores the main information of songs
* bili\_user: stores snapshots of Bilibili user information
* all\_data: stores metadata of all videos in [category 30](../../about/scope-of-inclusion.md#category-30)
* labelling\_result: stores the labels assigned to videos in `all_data` by our [AI system](../artificial-intelligence.md#the-filter)

@ -0,0 +1,24 @@
# Type of Song
The **Unrelated** type refers specifically to videos that fall outside our [Scope of Inclusion](../../about/scope-of-inclusion.md).
### Table: `songs`
The `type` column of the `songs` table takes the following values:
| Type | Description |
| ---- | ------------ |
| 0 | Unrelated |
| 1 | Original |
| 2 | Cover |
| 3 | Remix |
| 4 | Instrumental |
| 10 | Others |
### Table: `labelling_result`
| Label | Description |
| ----- | ---------------------- |
| 0 | AI tagged: Unrelated |
| 1 | AI tagged: Original |
| 2 | AI tagged: Cover/Remix |
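
The two tables do not line up one-to-one: label 2 in `labelling_result` covers both Cover (2) and Remix (3) in `songs`, and no label corresponds to Instrumental or Others. The sketch below makes that relationship explicit. It is an assumption-laden illustration: the documentation does not describe how labels are turned into `songs.type` values, and `SongType`, `CANDIDATE_TYPES`, and `candidate_song_types` are hypothetical names.

```python
from enum import IntEnum


class SongType(IntEnum):
    """Values of the `type` column in the `songs` table."""
    UNRELATED = 0
    ORIGINAL = 1
    COVER = 2
    REMIX = 3
    INSTRUMENTAL = 4
    OTHERS = 10


# Hypothetical mapping from a `labelling_result` label to the song
# types it could correspond to. Label 2 is ambiguous by design, so the
# final choice would need another source (for example, manual review).
CANDIDATE_TYPES = {
    0: (SongType.UNRELATED,),
    1: (SongType.ORIGINAL,),
    2: (SongType.COVER, SongType.REMIX),
}


def candidate_song_types(label: int) -> tuple:
    """Return the possible `songs.type` values for an AI label."""
    try:
        return CANDIDATE_TYPES[label]
    except KeyError:
        raise ValueError(f"unknown labelling_result label: {label}") from None
```

For instance, `candidate_song_types(2)` returns both `SongType.COVER` and `SongType.REMIX`, reflecting that the AI label alone cannot distinguish the two.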