From 3ebeaf465573953451ffec21f93396b5972836a7 Mon Sep 17 00:00:00 2001
From: alikia2x
Date: Mon, 16 Sep 2024 17:34:13 +0800
Subject: [PATCH] update: readme

---
 README.md                | 12 ++++++++----
 dataset/public/README.md | 16 ++++++++++------
 2 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/README.md b/README.md
index f8bfa84..4dff30d 100644
--- a/README.md
+++ b/README.md
@@ -1,15 +1,19 @@
 # sparkastML
 
-This repository contains the machine learning components for the [sparkast](https://github.com/alikia2x/sparkast) project.
+This repository houses the machine learning components for the [sparkast](https://github.com/alikia2x/sparkast) project.
 
-The primary goal of this lab is to enhance the search functionality of sparkast, enabling users to receive real-time answers as they type their queries.
+The primary objective of this project is to enhance the search functionality of sparkast, allowing users to receive real-time answers as they type their queries.
 
 ## Intention Classification
 
 The model located in the `/intention-classify` directory is designed to categorize user queries into predefined classes.
 
-We employ a Convolutional Neural Network (CNN) architecture combined with an Energy-based Model for open-set recognition.
+We utilize a Convolutional Neural Network (CNN) architecture in conjunction with an Energy-based Model for open-set recognition.
 
 This model is optimized to be lightweight, ensuring it can run on a wide range of devices, including within the browser environment.
 
-A detailed explain of how it works could be found in [this blog post](https://blog.alikia2x.com/en/posts/sparkastml-intention/).
+For a detailed explanation of how it works, you can refer to [this blog post](https://blog.alikia2x.com/en/posts/sparkastml-intention/).
+
+## Dataset
+
+To support the development of Libre Intelligence, we have made a series of datasets publicly available. You can access them [here](./dataset/public/README.md).
diff --git a/dataset/public/README.md b/dataset/public/README.md
index 33be04f..fd4e3b5 100644
--- a/dataset/public/README.md
+++ b/dataset/public/README.md
@@ -1,13 +1,17 @@
 # sparkastML Datasets
 
-Here are the datasets published by sparkastML project.
+This repository contains datasets published by the sparkastML project.
 
 ## Translation ZH-EN
 
-High-quality, fresh synthetic data containing over 100,000 sentences of Chinese-English parallel corpora.
+This dataset features high-quality, fresh synthetic data comprising over 100,000 sentences of Chinese-English parallel corpora.
 
-Version: 1
-Last Update: 2024/09/16
+### Details
 
-[Google Drive](https://drive.google.com/drive/folders/1_ADblZcB5p9BUvawkYDmp1qIUDZgkkoe?usp=sharing)
-[IPFS](https://ipfs.a2x.pub/ipfs/QmYz4ew4nSzPc6TZvoWk6jXpGN82qt3J46nwfb75N2YKc4/)
+- **Version:** 1
+- **Last Update:** 2024/09/16
+
+### Download Links
+
+- **Google Drive:** [Download from Google Drive](https://drive.google.com/drive/folders/1_ADblZcB5p9BUvawkYDmp1qIUDZgkkoe?usp=sharing)
+- **IPFS:** [Download from IPFS](https://ipfs.a2x.pub/ipfs/QmYz4ew4nSzPc6TZvoWk6jXpGN82qt3J46nwfb75N2YKc4/)