ref: rename NetScheduler to NetworkDelegate
This commit is contained in:
parent
20668609dd
commit
94e19690d1
@ -9,11 +9,11 @@
|
||||
|
||||
## Architecure
|
||||
|
||||
* [Overview](architecure/overview.md)
|
||||
* [Database Structure](architecure/database-structure/README.md)
|
||||
* [Type of Song](architecure/database-structure/type-of-song.md)
|
||||
* [Message Queue](architecure/message-queue.md)
|
||||
* [Artificial Intelligence](architecure/artificial-intelligence.md)
|
||||
- [Overview](architecure/overview.md)
|
||||
- [Database Structure](architecure/database-structure/README.md)
|
||||
- [Type of Song](architecure/database-structure/type-of-song.md)
|
||||
- [Message Queue](architecure/message-queue.md)
|
||||
- [Artificial Intelligence](architecure/artificial-intelligence.md)
|
||||
|
||||
## API Doc
|
||||
|
||||
|
@ -7,23 +7,34 @@ For a **song**, it must meet the following conditions to be included in CVSA:
|
||||
|
||||
### Category 30
|
||||
|
||||
In principle, the songs must be featured in a video that is categorized under the VOCALOID·UTAU (ID 30) category in [Bilibili](https://en.wikipedia.org/wiki/Bilibili) in order to be observed by our [automation program](../architecure/overview.md#crawler). We welcome editors to manually add songs that have not been uploaded to bilibili / categorized under this category.
|
||||
In principle, the songs must be featured in a video that is categorized under the VOCALOID·UTAU (ID 30) category in
|
||||
[Bilibili](https://en.wikipedia.org/wiki/Bilibili) in order to be observed by our
|
||||
[automation program](../architecure/overview.md#crawler). We welcome editors to manually add songs that have not been
|
||||
uploaded to bilibili / categorized under this category.
|
||||
|
||||
#### NEWS
|
||||
|
||||
Recently, Bilibili seems to be offlining the sub-category. This means the VOCALOID·UTAU category can no longer be entered from the frontend, and producers can no longer upload videos to this category (instead, they can only choose the parent category "Music"). 
|
||||
Recently, Bilibili seems to be offlining the sub-category. This means the VOCALOID·UTAU category can no longer be
|
||||
entered from the frontend, and producers can no longer upload videos to this category (instead, they can only choose the
|
||||
parent category "Music"). 
|
||||
|
||||
According to our experiments, Bilibili still retains the code logic of sub-categories in the backend, and newly published songs may still be in the VOCALOID·UTAU sub-category, and the related APIs can still work normally. However, there are [reports](https://www.bilibili.com/opus/1041223385394184199) that some of the new songs have been placed under the "Music General" sub-category.\
|
||||
We are still waiting for Bilibili's follow-up actions, and in the future, we may adjust the scope of our automated program's crawling.
|
||||
According to our experiments, Bilibili still retains the code logic of sub-categories in the backend, and newly
|
||||
published songs may still be in the VOCALOID·UTAU sub-category, and the related APIs can still work normally. However,
|
||||
there are [reports](https://www.bilibili.com/opus/1041223385394184199) that some of the new songs have been placed under
|
||||
the "Music General" sub-category.\
|
||||
We are still waiting for Bilibili's follow-up actions, and in the future, we may adjust the scope of our automated
|
||||
program's crawling.
|
||||
|
||||
### At Leats One Line of Chinese / Chinese Virtual Singer
|
||||
|
||||
The lyrics of the song must contain at least one line in Chinese. Otherwise, if the lyrics of the song do not contain Chinese, it will only be included in the CVSA only if a Chinese virtual singer has been used.
|
||||
The lyrics of the song must contain at least one line in Chinese. Otherwise, if the lyrics of the song do not contain
|
||||
Chinese, it will only be included in the CVSA only if a Chinese virtual singer has been used.
|
||||
|
||||
We define a **Chinese virtual singer** as follows:
|
||||
|
||||
1. The singer primarily uses Chinese voicebank (i.e. the most widely used voickbank for the singer is Chinese)
|
||||
2. The singer is operated by a company, organization, individual or group located in Mainland China, Hong Kong, Macau or Taiwan.
|
||||
2. The singer is operated by a company, organization, individual or group located in Mainland China, Hong Kong, Macau or
|
||||
Taiwan.
|
||||
|
||||
### Using Vocal Synthesizer
|
||||
|
||||
|
@ -9,10 +9,13 @@ The AI systems we currently use are:
|
||||
Located at `/filter/` under project root dir, it classifies a video in the
|
||||
[category 30](../about/scope-of-inclusion.md#category-30) into the following categories:
|
||||
|
||||
* 0: Not related to Chinese vocal synthesis
|
||||
* 1: A original song with Chinese vocal synthesis
|
||||
* 2: A cover/remix song with Chinese vocal synthesis
|
||||
- 0: Not related to Chinese vocal synthesis
|
||||
- 1: A original song with Chinese vocal synthesis
|
||||
- 2: A cover/remix song with Chinese vocal synthesis
|
||||
|
||||
### The Predictor
|
||||
|
||||
Located at `/pred/`under the project root dir, it predicts the future views of a video. This is a regression model that takes historical view trends of a video, other contextual information (such as the current time), and future time points to be predicted as feature inputs, and outputs the increment in the video's view count from "now" to the specified future time point.
|
||||
Located at `/pred/`under the project root dir, it predicts the future views of a video. This is a regression model that
|
||||
takes historical view trends of a video, other contextual information (such as the current time), and future time points
|
||||
to be predicted as feature inputs, and outputs the increment in the video's view count from "now" to the specified
|
||||
future time point.
|
||||
|
@ -5,10 +5,11 @@ CVSA uses [PostgreSQL](https://www.postgresql.org/) as our database.
|
||||
All public data of CVSA (excluding users' personal data) is stored in a database named `cvsa_main`, which contains the
|
||||
following tables:
|
||||
|
||||
* songs: stores the main information of songs
|
||||
* bili\_user: stores snapshots of Bilibili user information
|
||||
* all\_data: metadata of all videos in [category 30](../../about/scope-of-inclusion.md#category-30).
|
||||
* labelling\_result: Contains label of videos in `all_data`tagged by our [AI system](../artificial-intelligence.md#the-filter).
|
||||
* video\_snapshot: Statistical data of videos that are fetched regularly (e.g., number of views, etc.), we call this fetch process as "snapshot".
|
||||
* snapshot\_schedule: The scheduling information for video snapshots.
|
||||
|
||||
- songs: stores the main information of songs
|
||||
- bili\_user: stores snapshots of Bilibili user information
|
||||
- all\_data: metadata of all videos in [category 30](../../about/scope-of-inclusion.md#category-30).
|
||||
- labelling\_result: Contains label of videos in `all_data`tagged by our
|
||||
[AI system](../artificial-intelligence.md#the-filter).
|
||||
- video\_snapshot: Statistical data of videos that are fetched regularly (e.g., number of views, etc.), we call this
|
||||
fetch process as "snapshot".
|
||||
- snapshot\_schedule: The scheduling information for video snapshots.
|
||||
|
@ -1,6 +1,6 @@
|
||||
# Message Queue
|
||||
|
||||
We rely on message queues to manage the various tasks that [the cralwer ](overview.md#crawler)needs to perform.
|
||||
We rely on message queues to manage the various tasks that [the cralwer](overview.md#crawler)needs to perform.
|
||||
|
||||
### Code Path
|
||||
|
||||
|
@ -16,12 +16,16 @@ layout:
|
||||
|
||||
The whole CVSA system can be sperate into three different parts:
|
||||
|
||||
* Frontend
|
||||
* API
|
||||
* Crawler
|
||||
- Frontend
|
||||
- API
|
||||
- Crawler
|
||||
|
||||
The frontend is driven by [Astro](https://astro.build/) and is used to display the final CVSA page. The API is driven by [Hono](https://hono.dev) and is used to query the database and provide REST/GraphQL APIs that can be called by out website, applications, or third parties. The crawler is our automatic data collector, used to automatically collect new songs from bilibili, track their statistics, etc.
|
||||
The frontend is driven by [Astro](https://astro.build/) and is used to display the final CVSA page. The API is driven by
|
||||
[Hono](https://hono.dev) and is used to query the database and provide REST/GraphQL APIs that can be called by out
|
||||
website, applications, or third parties. The crawler is our automatic data collector, used to automatically collect new
|
||||
songs from bilibili, track their statistics, etc.
|
||||
|
||||
### Crawler
|
||||
|
||||
Automation is the biggest highlight of CVSA's technical design. To achieve this, we use a message queue powered by [BullMQ](https://bullmq.io/) to concurrently process various tasks in the data collection life cycle.
|
||||
Automation is the biggest highlight of CVSA's technical design. To achieve this, we use a message queue powered by
|
||||
[BullMQ](https://bullmq.io/) to concurrently process various tasks in the data collection life cycle.
|
||||
|
@ -9,12 +9,12 @@
|
||||
|
||||
## 技术架构 <a href="#architecture" id="architecture"></a>
|
||||
|
||||
* [概览](architecture/overview.md)
|
||||
* [数据库结构](architecture/database-structure/README.md)
|
||||
* [歌曲类型](architecture/database-structure/type-of-song.md)
|
||||
* [人工智能](architecture/artificial-intelligence.md)
|
||||
* [消息队列](architecture/message-queue/README.md)
|
||||
* [LatestVideosQueue 队列](architecture/message-queue/latestvideosqueue-dui-lie.md)
|
||||
- [概览](architecture/overview.md)
|
||||
- [数据库结构](architecture/database-structure/README.md)
|
||||
- [歌曲类型](architecture/database-structure/type-of-song.md)
|
||||
- [人工智能](architecture/artificial-intelligence.md)
|
||||
- [消息队列](architecture/message-queue/README.md)
|
||||
- [LatestVideosQueue 队列](architecture/message-queue/latestvideosqueue-dui-lie.md)
|
||||
|
||||
## API 文档 <a href="#api-doc" id="api-doc"></a>
|
||||
|
||||
|
@ -4,11 +4,10 @@ CVSA 使用 [PostgreSQL](https://www.postgresql.org/) 作为数据库。
|
||||
|
||||
CVSA 的所有公开数据(不包括用户的个人数据)都存储在名为 `cvsa_main` 的数据库中,该数据库包含以下表:
|
||||
|
||||
* songs:存储歌曲的主要信息
|
||||
* bilibili\_user:存储 Bilibili 用户信息快照
|
||||
* bilibili\_metadata:[分区 30](../../about/scope-of-inclusion.md#vocaloiduatu-fen-qu) 中所有视频的元数据
|
||||
* labelling\_result:包含由我们的 AI 系统 标记的 `all_data` 中视频的标签。
|
||||
* latest\_video\_snapshot:存储视频最新的快照
|
||||
* video\_snapshot:存储视频的快照,包括特定时间下视频的统计信息(播放量、点赞数等)
|
||||
* snapshot\_schedule:视频快照的规划信息,为辅助表
|
||||
|
||||
- songs:存储歌曲的主要信息
|
||||
- bilibili\_user:存储 Bilibili 用户信息快照
|
||||
- bilibili\_metadata:[分区 30](../../about/scope-of-inclusion.md#vocaloiduatu-fen-qu) 中所有视频的元数据
|
||||
- labelling\_result:包含由我们的 AI 系统 标记的 `all_data` 中视频的标签。
|
||||
- latest\_video\_snapshot:存储视频最新的快照
|
||||
- video\_snapshot:存储视频的快照,包括特定时间下视频的统计信息(播放量、点赞数等)
|
||||
- snapshot\_schedule:视频快照的规划信息,为辅助表
|
||||
|
@ -1,2 +1 @@
|
||||
# LatestVideosQueue 队列
|
||||
|
||||
|
@ -20,8 +20,7 @@ layout:
|
||||
|
||||
位于项目目录`packages/crawler` 下,它负责以下工作:
|
||||
|
||||
* 抓取新的视频并收录作品
|
||||
* 持续监控视频的播放量等统计信息
|
||||
- 抓取新的视频并收录作品
|
||||
- 持续监控视频的播放量等统计信息
|
||||
|
||||
整个 crawler 由 BullMQ 消息队列驱动,使用 Redis 和 PostgreSQL 管理状态。
|
||||
|
||||
|
@ -20,7 +20,7 @@ export const dbCredMiddleware = createMiddleware(async (c, next) => {
|
||||
c.set("dbCred", connection);
|
||||
await next();
|
||||
connection.release();
|
||||
})
|
||||
});
|
||||
|
||||
declare module "hono" {
|
||||
interface ContextVariableMap {
|
||||
|
@ -6,13 +6,13 @@ import { registerHandler } from "./register.ts";
|
||||
|
||||
export const app = new Hono();
|
||||
|
||||
app.use('/video/*', dbMiddleware);
|
||||
app.use('/user', dbCredMiddleware);
|
||||
app.use("/video/*", dbMiddleware);
|
||||
app.use("/user", dbCredMiddleware);
|
||||
|
||||
app.get("/", ...rootHandler);
|
||||
|
||||
app.get('/video/:id/snapshots', ...getSnapshotsHanlder);
|
||||
app.post('/user', ...registerHandler);
|
||||
app.get("/video/:id/snapshots", ...getSnapshotsHanlder);
|
||||
app.post("/user", ...registerHandler);
|
||||
|
||||
const fetch = app.fetch;
|
||||
|
||||
|
@ -19,7 +19,7 @@ export const userExists = async (username: string, client: Client) => {
|
||||
`;
|
||||
const result = await client.queryObject(query, [username]);
|
||||
return result.rows.length > 0;
|
||||
}
|
||||
};
|
||||
|
||||
export const registerHandler = createHandlers(async (c: ContextType) => {
|
||||
const client = c.get("dbCred");
|
||||
|
@ -5,27 +5,25 @@ import { createHandlers } from "./utils.ts";
|
||||
export const rootHandler = createHandlers((c) => {
|
||||
let singer: Singer | Singer[] | null = null;
|
||||
const shouldShowSpecialSinger = Math.random() < 0.016;
|
||||
if (getSingerForBirthday().length !== 0){
|
||||
if (getSingerForBirthday().length !== 0) {
|
||||
singer = getSingerForBirthday();
|
||||
for (const s of singer) {
|
||||
delete s.birthday;
|
||||
s.message = `祝${s.name}生日快乐~`
|
||||
s.message = `祝${s.name}生日快乐~`;
|
||||
}
|
||||
}
|
||||
else if (shouldShowSpecialSinger) {
|
||||
} else if (shouldShowSpecialSinger) {
|
||||
singer = pickSpecialSinger();
|
||||
}
|
||||
else {
|
||||
} else {
|
||||
singer = pickSinger();
|
||||
}
|
||||
return c.json({
|
||||
"project": {
|
||||
"name": "中V档案馆",
|
||||
"motto": "一起唱吧,心中的歌!"
|
||||
"motto": "一起唱吧,心中的歌!",
|
||||
},
|
||||
"status": 200,
|
||||
"version": VERSION,
|
||||
"time": Date.now(),
|
||||
"singer": singer
|
||||
})
|
||||
})
|
||||
"singer": singer,
|
||||
});
|
||||
});
|
||||
|
@ -46,8 +46,7 @@ export const getSnapshotsHanlder = createHandlers(async (c: ContextType) => {
|
||||
let videoId: string | number = idParam as string;
|
||||
if (videoId.startsWith("av")) {
|
||||
videoId = parseInt(videoId.slice(2));
|
||||
}
|
||||
else if (await number().isValid(videoId)) {
|
||||
} else if (await number().isValid(videoId)) {
|
||||
videoId = parseInt(videoId);
|
||||
}
|
||||
const queryParams = await SnapshotQueryParamsSchema.validate(c.req.query());
|
||||
|
@ -1,4 +1,4 @@
|
||||
import { createFactory } from 'hono/factory'
|
||||
import { createFactory } from "hono/factory";
|
||||
|
||||
const factory = createFactory();
|
||||
|
||||
|
@ -1,8 +1,15 @@
|
||||
import type { Client } from "https://deno.land/x/postgres@v0.19.3/mod.ts";
|
||||
import type { VideoSnapshotType } from "./schema.d.ts";
|
||||
|
||||
export async function getVideoSnapshots(client: Client, aid: number, limit: number, pageOrOffset: number, reverse: boolean, mode: 'page' | 'offset' = 'page') {
|
||||
const offset = mode === 'page' ? (pageOrOffset - 1) * limit : pageOrOffset;
|
||||
export async function getVideoSnapshots(
|
||||
client: Client,
|
||||
aid: number,
|
||||
limit: number,
|
||||
pageOrOffset: number,
|
||||
reverse: boolean,
|
||||
mode: "page" | "offset" = "page",
|
||||
) {
|
||||
const offset = mode === "page" ? (pageOrOffset - 1) * limit : pageOrOffset;
|
||||
const queryDesc: string = `
|
||||
SELECT *
|
||||
FROM video_snapshot
|
||||
@ -23,8 +30,15 @@ export async function getVideoSnapshots(client: Client, aid: number, limit: numb
|
||||
return queryResult.rows;
|
||||
}
|
||||
|
||||
export async function getVideoSnapshotsByBV(client: Client, bv: string, limit: number, pageOrOffset: number, reverse: boolean, mode: 'page' | 'offset' = 'page') {
|
||||
const offset = mode === 'page' ? (pageOrOffset - 1) * limit : pageOrOffset;
|
||||
export async function getVideoSnapshotsByBV(
|
||||
client: Client,
|
||||
bv: string,
|
||||
limit: number,
|
||||
pageOrOffset: number,
|
||||
reverse: boolean,
|
||||
mode: "page" | "offset" = "page",
|
||||
) {
|
||||
const offset = mode === "page" ? (pageOrOffset - 1) * limit : pageOrOffset;
|
||||
const queryAsc = `
|
||||
SELECT vs.*
|
||||
FROM video_snapshot vs
|
||||
@ -41,7 +55,7 @@ export async function getVideoSnapshotsByBV(client: Client, bv: string, limit: n
|
||||
WHERE bm.bvid = $1
|
||||
ORDER BY vs.created_at DESC
|
||||
LIMIT $2 OFFSET $3
|
||||
`
|
||||
`;
|
||||
const query = reverse ? queryAsc : queryDesc;
|
||||
const queryResult = await client.queryObject<VideoSnapshotType>(query, [bv, limit, offset]);
|
||||
return queryResult.rows;
|
||||
|
@ -1,5 +1,5 @@
|
||||
import { Pool } from "https://deno.land/x/postgres@v0.19.3/mod.ts";
|
||||
import { postgresConfig } from "@core/db/pgConfig.ts";
|
||||
import { postgresConfig } from "@core/db/pgConfig";
|
||||
|
||||
const pool = new Pool(postgresConfig, 12);
|
||||
|
||||
|
32
packages/crawler/db/withConnection.ts
Normal file
32
packages/crawler/db/withConnection.ts
Normal file
@ -0,0 +1,32 @@
|
||||
import { Client } from "https://deno.land/x/postgres@v0.19.3/mod.ts";
|
||||
import { db } from "db/init.ts";
|
||||
|
||||
/**
|
||||
* Executes a function with a database connection.
|
||||
* @param operation The function that accepts the `client` as the parameter.
|
||||
* @param errorHandling Optional function to handle errors.
|
||||
* If no error handling function is provided, the error will be re-thrown.
|
||||
* @param cleanup Optional function to execute after the operation.
|
||||
* @returns The result of the operation or undefined if an error occurred.
|
||||
*/
|
||||
export async function withDbConnection<T>(
|
||||
operation: (client: Client) => Promise<T>,
|
||||
errorHandling?: (error: unknown) => void,
|
||||
cleanup?: () => void,
|
||||
): Promise<T | undefined> {
|
||||
const client = await db.connect();
|
||||
try {
|
||||
return await operation(client);
|
||||
} catch (error) {
|
||||
if (errorHandling) {
|
||||
errorHandling(error);
|
||||
return;
|
||||
}
|
||||
throw error;
|
||||
} finally {
|
||||
client.release();
|
||||
if (cleanup) {
|
||||
cleanup();
|
||||
}
|
||||
}
|
||||
}
|
@ -38,7 +38,8 @@
|
||||
"src/": "./src/",
|
||||
"onnxruntime": "npm:onnxruntime-node@1.19.2",
|
||||
"chalk": "npm:chalk",
|
||||
"@core/db/schema": "../core/db/schema.d.ts"
|
||||
"@core/db/schema": "../core/db/schema.d.ts",
|
||||
"@core/db/pgConfig": "../core/db/pgConfig.ts"
|
||||
},
|
||||
"exports": "./main.ts"
|
||||
}
|
||||
|
@ -11,7 +11,6 @@ import {
|
||||
getLatestSnapshot,
|
||||
getSnapshotsInNextSecond,
|
||||
getVideosWithoutActiveSnapshotSchedule,
|
||||
hasAtLeast2Snapshots,
|
||||
scheduleSnapshot,
|
||||
setSnapshotStatus,
|
||||
snapshotScheduleExists,
|
||||
@ -22,12 +21,13 @@ import { HOUR, MINUTE, SECOND, WEEK } from "$std/datetime/constants.ts";
|
||||
import logger from "log/logger.ts";
|
||||
import { SnapshotQueue } from "mq/index.ts";
|
||||
import { insertVideoSnapshot } from "mq/task/getVideoStats.ts";
|
||||
import { NetSchedulerError } from "mq/scheduler.ts";
|
||||
import { NetSchedulerError } from "net/delegate.ts";
|
||||
import { getBiliVideoStatus, setBiliVideoStatus } from "db/allData.ts";
|
||||
import { truncate } from "utils/truncate.ts";
|
||||
import { lockManager } from "mq/lockManager.ts";
|
||||
import { getSongsPublihsedAt } from "db/songs.ts";
|
||||
import { bulkGetVideoStats } from "net/bulkGetVideoStats.ts";
|
||||
import { getAdjustedShortTermETA } from "../scheduling.ts";
|
||||
|
||||
const priorityMap: { [key: string]: number } = {
|
||||
"milestone": 1,
|
||||
@ -101,52 +101,6 @@ export const closetMilestone = (views: number) => {
|
||||
return 10000000;
|
||||
};
|
||||
|
||||
const log = (value: number, base: number = 10) => Math.log(value) / Math.log(base);
|
||||
|
||||
/*
|
||||
* Returns the minimum ETA in hours for the next snapshot
|
||||
* @param client - Postgres client
|
||||
* @param aid - aid of the video
|
||||
* @returns ETA in hours
|
||||
*/
|
||||
export const getAdjustedShortTermETA = async (client: Client, aid: number) => {
|
||||
const latestSnapshot = await getLatestSnapshot(client, aid);
|
||||
// Immediately dispatch a snapshot if there is no snapshot yet
|
||||
if (!latestSnapshot) return 0;
|
||||
const snapshotsEnough = await hasAtLeast2Snapshots(client, aid);
|
||||
if (!snapshotsEnough) return 0;
|
||||
|
||||
const currentTimestamp = new Date().getTime();
|
||||
const timeIntervals = [3 * MINUTE, 20 * MINUTE, 1 * HOUR, 3 * HOUR, 6 * HOUR, 72 * HOUR];
|
||||
const DELTA = 0.00001;
|
||||
let minETAHours = Infinity;
|
||||
|
||||
for (const timeInterval of timeIntervals) {
|
||||
const date = new Date(currentTimestamp - timeInterval);
|
||||
const snapshot = await findClosestSnapshot(client, aid, date);
|
||||
if (!snapshot) continue;
|
||||
const hoursDiff = (latestSnapshot.created_at - snapshot.created_at) / HOUR;
|
||||
const viewsDiff = latestSnapshot.views - snapshot.views;
|
||||
if (viewsDiff <= 0) continue;
|
||||
const speed = viewsDiff / (hoursDiff + DELTA);
|
||||
const target = closetMilestone(latestSnapshot.views);
|
||||
const viewsToIncrease = target - latestSnapshot.views;
|
||||
const eta = viewsToIncrease / (speed + DELTA);
|
||||
let factor = log(2.97 / log(viewsToIncrease + 1), 1.14);
|
||||
factor = truncate(factor, 3, 100);
|
||||
const adjustedETA = eta / factor;
|
||||
if (adjustedETA < minETAHours) {
|
||||
minETAHours = adjustedETA;
|
||||
}
|
||||
}
|
||||
|
||||
if (isNaN(minETAHours)) {
|
||||
minETAHours = Infinity;
|
||||
}
|
||||
|
||||
return minETAHours;
|
||||
};
|
||||
|
||||
export const collectMilestoneSnapshotsWorker = async (_job: Job) => {
|
||||
const client = await db.connect();
|
||||
try {
|
||||
|
51
packages/crawler/mq/scheduling.ts
Normal file
51
packages/crawler/mq/scheduling.ts
Normal file
@ -0,0 +1,51 @@
|
||||
/*
|
||||
* Returns the minimum ETA in hours for the next snapshot
|
||||
* @param client - Postgres client
|
||||
* @param aid - aid of the video
|
||||
* @returns ETA in hours
|
||||
*/
|
||||
import { findClosestSnapshot, getLatestSnapshot, hasAtLeast2Snapshots } from "db/snapshotSchedule.ts";
|
||||
import { truncate } from "utils/truncate.ts";
|
||||
import { closetMilestone } from "./exec/snapshotTick.ts";
|
||||
import { Client } from "https://deno.land/x/postgres@v0.19.3/mod.ts";
|
||||
import { HOUR, MINUTE } from "$std/datetime/constants.ts";
|
||||
|
||||
const log = (value: number, base: number = 10) => Math.log(value) / Math.log(base);
|
||||
|
||||
export const getAdjustedShortTermETA = async (client: Client, aid: number) => {
|
||||
const latestSnapshot = await getLatestSnapshot(client, aid);
|
||||
// Immediately dispatch a snapshot if there is no snapshot yet
|
||||
if (!latestSnapshot) return 0;
|
||||
const snapshotsEnough = await hasAtLeast2Snapshots(client, aid);
|
||||
if (!snapshotsEnough) return 0;
|
||||
|
||||
const currentTimestamp = new Date().getTime();
|
||||
const timeIntervals = [3 * MINUTE, 20 * MINUTE, 1 * HOUR, 3 * HOUR, 6 * HOUR, 72 * HOUR];
|
||||
const DELTA = 0.00001;
|
||||
let minETAHours = Infinity;
|
||||
|
||||
for (const timeInterval of timeIntervals) {
|
||||
const date = new Date(currentTimestamp - timeInterval);
|
||||
const snapshot = await findClosestSnapshot(client, aid, date);
|
||||
if (!snapshot) continue;
|
||||
const hoursDiff = (latestSnapshot.created_at - snapshot.created_at) / HOUR;
|
||||
const viewsDiff = latestSnapshot.views - snapshot.views;
|
||||
if (viewsDiff <= 0) continue;
|
||||
const speed = viewsDiff / (hoursDiff + DELTA);
|
||||
const target = closetMilestone(latestSnapshot.views);
|
||||
const viewsToIncrease = target - latestSnapshot.views;
|
||||
const eta = viewsToIncrease / (speed + DELTA);
|
||||
let factor = log(2.97 / log(viewsToIncrease + 1), 1.14);
|
||||
factor = truncate(factor, 3, 100);
|
||||
const adjustedETA = eta / factor;
|
||||
if (adjustedETA < minETAHours) {
|
||||
minETAHours = adjustedETA;
|
||||
}
|
||||
}
|
||||
|
||||
if (isNaN(minETAHours)) {
|
||||
minETAHours = Infinity;
|
||||
}
|
||||
|
||||
return minETAHours;
|
||||
};
|
@ -1,4 +1,4 @@
|
||||
import netScheduler from "mq/scheduler.ts";
|
||||
import networkDelegate from "./delegate.ts";
|
||||
import { MediaListInfoData, MediaListInfoResponse } from "net/bilibili.d.ts";
|
||||
import logger from "log/logger.ts";
|
||||
|
||||
@ -16,7 +16,7 @@ export async function bulkGetVideoStats(aids: number[]): Promise<MediaListInfoDa
|
||||
for (const aid of aids) {
|
||||
url += `${aid}:2,`;
|
||||
}
|
||||
const data = await netScheduler.request<MediaListInfoResponse>(url, "bulkSnapshot");
|
||||
const data = await networkDelegate.request<MediaListInfoResponse>(url, "bulkSnapshot");
|
||||
const errMessage = `Error fetching metadata for aid list: ${aids.join(",")}:`;
|
||||
if (data.code !== 0) {
|
||||
logger.error(errMessage + data.code + "-" + data.message, "net", "fn:getVideoInfo");
|
||||
|
@ -19,7 +19,7 @@ interface ProxiesMap {
|
||||
[name: string]: Proxy;
|
||||
}
|
||||
|
||||
type NetSchedulerErrorCode =
|
||||
type NetworkDelegateErrorCode =
|
||||
| "NO_PROXY_AVAILABLE"
|
||||
| "PROXY_RATE_LIMITED"
|
||||
| "PROXY_NOT_FOUND"
|
||||
@ -28,9 +28,9 @@ type NetSchedulerErrorCode =
|
||||
| "ALICLOUD_PROXY_ERR";
|
||||
|
||||
export class NetSchedulerError extends Error {
|
||||
public code: NetSchedulerErrorCode;
|
||||
public code: NetworkDelegateErrorCode;
|
||||
public rawError: unknown | undefined;
|
||||
constructor(message: string, errorCode: NetSchedulerErrorCode, rawError?: unknown) {
|
||||
constructor(message: string, errorCode: NetworkDelegateErrorCode, rawError?: unknown) {
|
||||
super(message);
|
||||
this.name = "NetSchedulerError";
|
||||
this.code = errorCode;
|
||||
@ -59,7 +59,7 @@ function shuffleArray<T>(array: T[]): T[] {
|
||||
return newArray;
|
||||
}
|
||||
|
||||
class NetScheduler {
|
||||
class NetworkDelegate {
|
||||
private proxies: ProxiesMap = {};
|
||||
private providerLimiters: LimiterMap = {};
|
||||
private proxyLimiters: OptionalLimiterMap = {};
|
||||
@ -69,15 +69,6 @@ class NetScheduler {
|
||||
this.proxies[proxyName] = { type, data };
|
||||
}
|
||||
|
||||
removeProxy(proxyName: string): void {
|
||||
if (!this.proxies[proxyName]) {
|
||||
throw new Error(`Proxy ${proxyName} not found`);
|
||||
}
|
||||
delete this.proxies[proxyName];
|
||||
// Clean up associated limiters
|
||||
this.cleanupProxyLimiters(proxyName);
|
||||
}
|
||||
|
||||
private cleanupProxyLimiters(proxyName: string): void {
|
||||
for (const limiterId in this.proxyLimiters) {
|
||||
if (limiterId.startsWith(`proxy-${proxyName}`)) {
|
||||
@ -294,7 +285,7 @@ class NetScheduler {
|
||||
}
|
||||
}
|
||||
|
||||
const netScheduler = new NetScheduler();
|
||||
const networkDelegate = new NetworkDelegate();
|
||||
const videoInfoRateLimiterConfig: RateLimiterConfig[] = [
|
||||
{
|
||||
window: new SlidingWindow(redis, 0.3),
|
||||
@ -368,14 +359,14 @@ but both should come after addProxy and addTask to ensure proper setup and depen
|
||||
*/
|
||||
|
||||
const regions = ["shanghai", "hangzhou", "qingdao", "beijing", "zhangjiakou", "chengdu", "shenzhen", "hohhot"];
|
||||
netScheduler.addProxy("native", "native", "");
|
||||
networkDelegate.addProxy("native", "native", "");
|
||||
for (const region of regions) {
|
||||
netScheduler.addProxy(`alicloud-${region}`, "alicloud-fc", region);
|
||||
networkDelegate.addProxy(`alicloud-${region}`, "alicloud-fc", region);
|
||||
}
|
||||
netScheduler.addTask("getVideoInfo", "bilibili", "all");
|
||||
netScheduler.addTask("getLatestVideos", "bilibili", "all");
|
||||
netScheduler.addTask("snapshotMilestoneVideo", "bilibili", regions.map((region) => `alicloud-${region}`));
|
||||
netScheduler.addTask("snapshotVideo", "bili_test", [
|
||||
networkDelegate.addTask("getVideoInfo", "bilibili", "all");
|
||||
networkDelegate.addTask("getLatestVideos", "bilibili", "all");
|
||||
networkDelegate.addTask("snapshotMilestoneVideo", "bilibili", regions.map((region) => `alicloud-${region}`));
|
||||
networkDelegate.addTask("snapshotVideo", "bili_test", [
|
||||
"alicloud-qingdao",
|
||||
"alicloud-shanghai",
|
||||
"alicloud-zhangjiakou",
|
||||
@ -383,7 +374,7 @@ netScheduler.addTask("snapshotVideo", "bili_test", [
|
||||
"alicloud-shenzhen",
|
||||
"alicloud-hohhot",
|
||||
]);
|
||||
netScheduler.addTask("bulkSnapshot", "bili_strict", [
|
||||
networkDelegate.addTask("bulkSnapshot", "bili_strict", [
|
||||
"alicloud-qingdao",
|
||||
"alicloud-shanghai",
|
||||
"alicloud-zhangjiakou",
|
||||
@ -391,13 +382,13 @@ netScheduler.addTask("bulkSnapshot", "bili_strict", [
|
||||
"alicloud-shenzhen",
|
||||
"alicloud-hohhot",
|
||||
]);
|
||||
netScheduler.setTaskLimiter("getVideoInfo", videoInfoRateLimiterConfig);
|
||||
netScheduler.setTaskLimiter("getLatestVideos", null);
|
||||
netScheduler.setTaskLimiter("snapshotMilestoneVideo", null);
|
||||
netScheduler.setTaskLimiter("snapshotVideo", null);
|
||||
netScheduler.setTaskLimiter("bulkSnapshot", null);
|
||||
netScheduler.setProviderLimiter("bilibili", biliLimiterConfig);
|
||||
netScheduler.setProviderLimiter("bili_test", bili_test);
|
||||
netScheduler.setProviderLimiter("bili_strict", bili_strict);
|
||||
networkDelegate.setTaskLimiter("getVideoInfo", videoInfoRateLimiterConfig);
|
||||
networkDelegate.setTaskLimiter("getLatestVideos", null);
|
||||
networkDelegate.setTaskLimiter("snapshotMilestoneVideo", null);
|
||||
networkDelegate.setTaskLimiter("snapshotVideo", null);
|
||||
networkDelegate.setTaskLimiter("bulkSnapshot", null);
|
||||
networkDelegate.setProviderLimiter("bilibili", biliLimiterConfig);
|
||||
networkDelegate.setProviderLimiter("bili_test", bili_test);
|
||||
networkDelegate.setProviderLimiter("bili_strict", bili_strict);
|
||||
|
||||
export default netScheduler;
|
||||
export default networkDelegate;
|
@ -1,6 +1,6 @@
|
||||
import { VideoListResponse } from "net/bilibili.d.ts";
|
||||
import logger from "log/logger.ts";
|
||||
import netScheduler from "mq/scheduler.ts";
|
||||
import networkDelegate from "./delegate.ts";
|
||||
|
||||
export async function getLatestVideoAids(page: number = 1, pageSize: number = 10): Promise<number[]> {
|
||||
const startFrom = 1 + pageSize * (page - 1);
|
||||
@ -8,7 +8,7 @@ export async function getLatestVideoAids(page: number = 1, pageSize: number = 10
|
||||
const range = `${startFrom}-${endTo}`;
|
||||
const errMessage = `Error fetching latest aid for ${range}:`;
|
||||
const url = `https://api.bilibili.com/x/web-interface/newlist?rid=30&ps=${pageSize}&pn=${page}`;
|
||||
const data = await netScheduler.request<VideoListResponse>(url, "getLatestVideos");
|
||||
const data = await networkDelegate.request<VideoListResponse>(url, "getLatestVideos");
|
||||
if (data.code != 0) {
|
||||
logger.error(errMessage + data.message, "net", "getLastestVideos");
|
||||
return [];
|
||||
|
@ -1,10 +1,10 @@
|
||||
import netScheduler from "mq/scheduler.ts";
|
||||
import networkDelegate from "./delegate.ts";
|
||||
import { VideoDetailsData, VideoDetailsResponse } from "net/bilibili.d.ts";
|
||||
import logger from "log/logger.ts";
|
||||
|
||||
export async function getVideoDetails(aid: number): Promise<VideoDetailsData | null> {
|
||||
const url = `https://api.bilibili.com/x/web-interface/view/detail?aid=${aid}`;
|
||||
const data = await netScheduler.request<VideoDetailsResponse>(url, "getVideoInfo");
|
||||
const data = await networkDelegate.request<VideoDetailsResponse>(url, "getVideoInfo");
|
||||
const errMessage = `Error fetching metadata for ${aid}:`;
|
||||
if (data.code !== 0) {
|
||||
logger.error(errMessage + data.code + "-" + data.message, "net", "fn:getVideoInfo");
|
||||
|
@ -1,4 +1,4 @@
|
||||
import netScheduler from "mq/scheduler.ts";
|
||||
import networkDelegate from "./delegate.ts";
|
||||
import { VideoInfoData, VideoInfoResponse } from "net/bilibili.d.ts";
|
||||
import logger from "log/logger.ts";
|
||||
|
||||
@ -17,7 +17,7 @@ import logger from "log/logger.ts";
|
||||
*/
|
||||
export async function getVideoInfo(aid: number, task: string): Promise<VideoInfoData | number> {
|
||||
const url = `https://api.bilibili.com/x/web-interface/view?aid=${aid}`;
|
||||
const data = await netScheduler.request<VideoInfoResponse>(url, task);
|
||||
const data = await networkDelegate.request<VideoInfoResponse>(url, task);
|
||||
const errMessage = `Error fetching metadata for ${aid}:`;
|
||||
if (data.code !== 0) {
|
||||
logger.error(errMessage + data.code + "-" + data.message, "net", "fn:getVideoInfo");
|
||||
|
@ -19,6 +19,6 @@ export default defineConfig({
|
||||
allow: [".", "../../"],
|
||||
},
|
||||
},
|
||||
plugins: [tsconfigPaths()]
|
||||
plugins: [tsconfigPaths()],
|
||||
},
|
||||
});
|
||||
|
@ -1,13 +1,25 @@
|
||||
const N_1024 = BigInt("129023318876534346704360951712586568674758913224876821534686030409476129469193481910786173836188085930974906857867802234113909470848523288588793477904039083513378341278558405407018889387577114155572311708428733260891448259786041525189132461448841652472631435226032063278124857443496954605482776113964107326943")
|
||||
const N_1024 = BigInt(
|
||||
"129023318876534346704360951712586568674758913224876821534686030409476129469193481910786173836188085930974906857867802234113909470848523288588793477904039083513378341278558405407018889387577114155572311708428733260891448259786041525189132461448841652472631435226032063278124857443496954605482776113964107326943",
|
||||
);
|
||||
|
||||
const N_2048 = BigInt("23987552118069940970878653610463005981599204778388399885550631951871084945075866571231062435627294546200946516668493107358732376187241747090707087544153108117326163500579370560400058549184722138636116585329496684877258304519458316233517215780035360354808658620079068489084797380781488445517430961701007542207001544091884001098497324624368085682074645221148086075871342544591022944384890014176612259729018968864426602901247715051556212559854689574013699665035317257438297910516976812428036717668766321871780963854649899276251822244719887233041422346429752896925499321431273560130952088238625622570366815755926694833109")
|
||||
const N_2048 = BigInt(
|
||||
"23987552118069940970878653610463005981599204778388399885550631951871084945075866571231062435627294546200946516668493107358732376187241747090707087544153108117326163500579370560400058549184722138636116585329496684877258304519458316233517215780035360354808658620079068489084797380781488445517430961701007542207001544091884001098497324624368085682074645221148086075871342544591022944384890014176612259729018968864426602901247715051556212559854689574013699665035317257438297910516976812428036717668766321871780963854649899276251822244719887233041422346429752896925499321431273560130952088238625622570366815755926694833109",
|
||||
);
|
||||
|
||||
const N_1792 = BigInt("23987552118069940970878653610463005981599204778388399885550631951871084945075866571231062435627294546200946516668493107358732376187241747090707087544153108117326163500579370560400058549184722138636116585329496684877258304519458316233517215780035360354808658620079068489084797380781488445517430961701007542207001544091884001098497324624368085682074645221148086075871342544591022944384890014176612259729018968864426602901247715051556212559854689574013699665035317257438297910516976812428036717668766321871780963854649899276251822244719887233041422346429752896925499321431273560130952088238625622570366815755926694833109")
|
||||
const N_1792 = BigInt(
|
||||
"23987552118069940970878653610463005981599204778388399885550631951871084945075866571231062435627294546200946516668493107358732376187241747090707087544153108117326163500579370560400058549184722138636116585329496684877258304519458316233517215780035360354808658620079068489084797380781488445517430961701007542207001544091884001098497324624368085682074645221148086075871342544591022944384890014176612259729018968864426602901247715051556212559854689574013699665035317257438297910516976812428036717668766321871780963854649899276251822244719887233041422346429752896925499321431273560130952088238625622570366815755926694833109",
|
||||
);
|
||||
|
||||
const N_1536 = BigInt("1694330250214463438908848400950857073137355630337290254958754184668036770489801447652464038218330711288158361242955860326168191830448553710492926795708495297280933502917598985378231124113971732841791156356676046934277122699383776036675381503510992810963611269045078440132744168908318454891211962146563551929591147663448816841024591820348784855441153716551049843185172472891407933214238000452095646085222944171689449292644270516031799660928056315886939284985905227")
|
||||
const N_1536 = BigInt(
|
||||
"1694330250214463438908848400950857073137355630337290254958754184668036770489801447652464038218330711288158361242955860326168191830448553710492926795708495297280933502917598985378231124113971732841791156356676046934277122699383776036675381503510992810963611269045078440132744168908318454891211962146563551929591147663448816841024591820348784855441153716551049843185172472891407933214238000452095646085222944171689449292644270516031799660928056315886939284985905227",
|
||||
);
|
||||
|
||||
const N_3072 = BigInt("4432919939296042464443862503456460073874727648022810391370558006281079088795179408238989283371442564716849343712703672836423961818025813387453469700639513190304802553045342607888612037304066433501317127429264242784608682213025490491212489901736408833027611579294436675682774458141490718959615677971745638214649336218217578937534746160749039668886450447773018369168258067682196337978245372237157696236362344796867228581553446331915147012787367438751646936429739232247148712001806846526947508445039707404287951727838234648917450736371192435665040644040487427986702098273581288935278964444790007953559851323281510927332862225214878776790605026472021669614552481167977412450477230442015077669503312683966631454347169703030544483487968842349634064181183599641180349414682042575010303056241481622837185325228233789954078775053744988023738762706404546546146837242590884760044438874357295029411988267287001033032827035809135092270843")
|
||||
const N_3072 = BigInt(
|
||||
"4432919939296042464443862503456460073874727648022810391370558006281079088795179408238989283371442564716849343712703672836423961818025813387453469700639513190304802553045342607888612037304066433501317127429264242784608682213025490491212489901736408833027611579294436675682774458141490718959615677971745638214649336218217578937534746160749039668886450447773018369168258067682196337978245372237157696236362344796867228581553446331915147012787367438751646936429739232247148712001806846526947508445039707404287951727838234648917450736371192435665040644040487427986702098273581288935278964444790007953559851323281510927332862225214878776790605026472021669614552481167977412450477230442015077669503312683966631454347169703030544483487968842349634064181183599641180349414682042575010303056241481622837185325228233789954078775053744988023738762706404546546146837242590884760044438874357295029411988267287001033032827035809135092270843",
|
||||
);
|
||||
|
||||
const N_4096 = BigInt("703671044356805218391078271512201582198770553281951369783674142891088501340774249238173262580562112786670043634665390581120113644316651934154746357220932310140476300088580654571796404198410555061275065442553506658401183560336140989074165998202690496991174269748740565700402715364422506782445179963440819952745241176450402011121226863984008975377353558155910994380700267903933205531681076494639818328879475919332604951949178075254600102192323286738973253864238076198710173840170988339024438220034106150475640983877458155141500313471699516670799821379238743709125064098477109094533426340852518505385314780319279862586851512004686798362431227795743253799490998475141728082088984359237540124375439664236138519644100625154580910233437864328111620708697941949936338367445851449766581651338876219676721272448769082914348242483068204896479076062102236087066428603930888978596966798402915747531679758905013008059396214343112694563043918465373870648649652122703709658068801764236979191262744515840224548957285182453209028157886219424802426566456408109642062498413592155064289314088837031184200671561102160059065729282902863248815224399131391716503171191977463328439766546574118092303414702384104112719959325482439604572518549918705623086363111")
|
||||
const N_4096 = BigInt(
|
||||
"703671044356805218391078271512201582198770553281951369783674142891088501340774249238173262580562112786670043634665390581120113644316651934154746357220932310140476300088580654571796404198410555061275065442553506658401183560336140989074165998202690496991174269748740565700402715364422506782445179963440819952745241176450402011121226863984008975377353558155910994380700267903933205531681076494639818328879475919332604951949178075254600102192323286738973253864238076198710173840170988339024438220034106150475640983877458155141500313471699516670799821379238743709125064098477109094533426340852518505385314780319279862586851512004686798362431227795743253799490998475141728082088984359237540124375439664236138519644100625154580910233437864328111620708697941949936338367445851449766581651338876219676721272448769082914348242483068204896479076062102236087066428603930888978596966798402915747531679758905013008059396214343112694563043918465373870648649652122703709658068801764236979191262744515840224548957285182453209028157886219424802426566456408109642062498413592155064289314088837031184200671561102160059065729282902863248815224399131391716503171191977463328439766546574118092303414702384104112719959325482439604572518549918705623086363111",
|
||||
);
|
||||
|
||||
export const N_ARRAY = [N_1024, N_1536, N_1792, N_2048, N_3072, N_4096];
|
@ -33,7 +33,6 @@
|
||||
|
||||
参见[CVSA文档](https://docs.projectcvsa.com/)。
|
||||
|
||||
|
||||
## 开放许可
|
||||
|
||||
受本文以[CC BY-NC-SA 4.0协议](https://creativecommons.org/licenses/by-nc-sa/4.0/)提供。
|
||||
|
Loading…
Reference in New Issue
Block a user