From f34633dc3598869fefe29fbadccf60135de14c78 Mon Sep 17 00:00:00 2001
From: alikia2x
Date: Sat, 5 Apr 2025 17:59:42 +0000
Subject: [PATCH] doc: GitBook - No subject

---
 doc/en/SUMMARY.md                   | 12 ++++++------
 doc/en/architecure/crawler.md       |  4 ++++
 doc/en/architecure/message-queue.md |  7 -------
 doc/en/architecure/overview.md      | 27 +++++++++++++++++++++------
 4 files changed, 31 insertions(+), 19 deletions(-)
 create mode 100644 doc/en/architecure/crawler.md
 delete mode 100644 doc/en/architecure/message-queue.md

diff --git a/doc/en/SUMMARY.md b/doc/en/SUMMARY.md
index 1724ba3..99ca460 100644
--- a/doc/en/SUMMARY.md
+++ b/doc/en/SUMMARY.md
@@ -1,21 +1,21 @@
 # Table of contents
 
-- [Welcome](README.md)
+* [Welcome](README.md)
 
 ## About
 
-- [About CVSA Project](about/this-project.md)
-- [Scope of Inclusion](about/scope-of-inclusion.md)
+* [About CVSA Project](about/this-project.md)
+* [Scope of Inclusion](about/scope-of-inclusion.md)
 
 ## Architecure
 
 * [Overview](architecure/overview.md)
+* [Crawler](architecure/crawler.md)
 * [Database Structure](architecure/database-structure/README.md)
 * [Type of Song](architecure/database-structure/type-of-song.md)
-* [Message Queue](architecure/message-queue.md)
 * [Artificial Intelligence](architecure/artificial-intelligence.md)
 
 ## API Doc
 
-- [Catalog](api-doc/catalog.md)
-- [Songs](api-doc/songs.md)
+* [Catalog](api-doc/catalog.md)
+* [Songs](api-doc/songs.md)
diff --git a/doc/en/architecure/crawler.md b/doc/en/architecure/crawler.md
new file mode 100644
index 0000000..e60f132
--- /dev/null
+++ b/doc/en/architecure/crawler.md
@@ -0,0 +1,4 @@
+# Crawler
+
+A central aspect of CVSA's technical design is its emphasis on automation. The data collection process within the `crawler` is orchestrated using a message queue powered by [BullMQ](https://bullmq.io/). This enables concurrent processing of various tasks involved in the data lifecycle.
State management and data persistence are handled by a combination of Redis for caching and real-time data, and PostgreSQL as the primary database.
+
diff --git a/doc/en/architecure/message-queue.md b/doc/en/architecure/message-queue.md
deleted file mode 100644
index 4fa4877..0000000
--- a/doc/en/architecure/message-queue.md
+++ /dev/null
@@ -1,7 +0,0 @@
-# Message Queue
-
-We rely on message queues to manage the various tasks that [the cralwer ](overview.md#crawler)needs to perform.
-
-### Code Path
-
-Currently, the code related to message queues are located at `lib/mq` and `src`.
diff --git a/doc/en/architecure/overview.md b/doc/en/architecure/overview.md
index e46c887..fc694fe 100644
--- a/doc/en/architecure/overview.md
+++ b/doc/en/architecure/overview.md
@@ -14,14 +14,29 @@ layout:
 
 # Overview
 
-The whole CVSA system can be sperate into three different parts:
+The CVSA is a [monorepo](https://en.wikipedia.org/wiki/Monorepo) codebase, mainly using TypeScript as the development language. With [Deno workspace](https://docs.deno.com/runtime/fundamentals/workspaces/), the major part of the codebase is under `packages/`.
 
-* Frontend
-* API
-* Crawler
+**Project structure:**
 
-The frontend is driven by [Astro](https://astro.build/) and is used to display the final CVSA page. The API is driven by [Hono](https://hono.dev) and is used to query the database and provide REST/GraphQL APIs that can be called by out website, applications, or third parties. The crawler is our automatic data collector, used to automatically collect new songs from bilibili, track their statistics, etc.
+```
+cvsa
+├── deno.json
+├── packages
+│   ├── backend
+│   ├── core
+│   ├── crawler
+│   └── frontend
+└── README.md
+```
+
+**Package Breakdown:**
+
+* **`backend`**: This package houses the server-side logic, built with the [Hono](https://hono.dev/) web framework.
It's responsible for interacting with the database and exposing data through REST and GraphQL APIs for consumption by the frontend, internal applications, and third-party developers.
+* **`frontend`**: The user-facing web interface of CVSA is developed using [Astro](https://astro.build/). This package handles the presentation layer, displaying information fetched from the database.
+* **`crawler`**: This automated data collection system is a key component of CVSA. It's designed to automatically discover and gather new song data from bilibili, as well as track relevant statistics over time.
+* **`core`**: This package contains reusable and generic code that is utilized across multiple workspaces within the CVSA monorepo.
 
 ### Crawler
 
-Automation is the biggest highlight of CVSA's technical design. To achieve this, we use a message queue powered by [BullMQ](https://bullmq.io/) to concurrently process various tasks in the data collection life cycle.
+Automation is the biggest highlight of CVSA's technical design. The data collection process within the `crawler` is orchestrated using a message queue powered by [BullMQ](https://bullmq.io/). This enables concurrent processing of various tasks involved in the data collection lifecycle. State management and data persistence are handled by a combination of Redis for caching and real-time data, and PostgreSQL as the primary database.
+