update: readme

alikia2x (寒寒) 2024-09-16 17:34:13 +08:00
parent 932cbd4336
commit 2f33ff480a
Signed by: alikia2x
GPG Key ID: 56209E0CCD8420C6
2 changed files with 18 additions and 10 deletions


@@ -1,15 +1,19 @@
 # sparkastML
-This repository contains the machine learning components for the [sparkast](https://github.com/alikia2x/sparkast) project.
+This repository houses the machine learning components for the [sparkast](https://github.com/alikia2x/sparkast) project.
-The primary goal of this lab is to enhance the search functionality of sparkast, enabling users to receive real-time answers as they type their queries.
+The primary objective of this project is to enhance the search functionality of sparkast, allowing users to receive real-time answers as they type their queries.
 ## Intention Classification
 The model located in the `/intention-classify` directory is designed to categorize user queries into predefined classes.
-We employ a Convolutional Neural Network (CNN) architecture combined with an Energy-based Model for open-set recognition.
+We utilize a Convolutional Neural Network (CNN) architecture in conjunction with an Energy-based Model for open-set recognition.
 This model is optimized to be lightweight, ensuring it can run on a wide range of devices, including within the browser environment.
-A detailed explain of how it works could be found in [this blog post](https://blog.alikia2x.com/en/posts/sparkastml-intention/).
+For a detailed explanation of how it works, you can refer to [this blog post](https://blog.alikia2x.com/en/posts/sparkastml-intention/).
+## Dataset
+To support the development of Libre Intelligence, we have made a series of datasets publicly available. You can access them [here](./dataset/public/README.md).
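
The README text in this diff mentions an "Energy-based Model for open-set recognition" without showing how such a rule works. As a reading aid, here is a minimal sketch of one common energy-score rule: compute the free energy of a classifier's logits and reject the query as "unknown" when the energy exceeds a threshold. This is not taken from the repository, and the actual model may work differently. The class names, the threshold value, and the function names are all hypothetical.

```python
import math


def energy_score(logits, temperature=1.0):
    """Free energy E = -T * logsumexp(logits / T).

    Confident predictions (one large logit) give a low (very negative)
    energy; flat, uncertain logits give a higher energy.
    """
    scaled = [v / temperature for v in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    lse = peak + math.log(sum(math.exp(v - peak) for v in scaled))
    return -temperature * lse


def classify_open_set(logits, class_names, threshold):
    """Return the predicted class name, or None if the query looks unknown.

    `threshold` is an assumed, hand-picked value; in practice it would be
    tuned on held-out data.
    """
    if energy_score(logits) > threshold:
        return None  # energy too high: treat as outside the known classes
    best = max(range(len(logits)), key=lambda i: logits[i])
    return class_names[best]


# Hypothetical intent classes and logits, for illustration only.
CLASSES = ["weather", "time", "translate"]
confident = classify_open_set([10.0, 0.0, 0.0], CLASSES, threshold=-5.0)
uncertain = classify_open_set([0.1, 0.0, 0.05], CLASSES, threshold=-5.0)
```

With these made-up logits, the first query is accepted as a known class and the second is rejected, because its nearly flat logits produce an energy above the threshold.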


@@ -1,13 +1,17 @@
 # sparkastML Datasets
-Here are the datasets published by sparkastML project.
+This repository contains datasets published by the sparkastML project.
 ## Translation ZH-EN
-High-quality, fresh synthetic data containing over 100,000 sentences of Chinese-English parallel corpora.
+This dataset features high-quality, fresh synthetic data comprising over 100,000 sentences of Chinese-English parallel corpora.
-Version: 1
-Last Update: 2024/09/16
-[Google Drive](https://drive.google.com/drive/folders/1_ADblZcB5p9BUvawkYDmp1qIUDZgkkoe?usp=sharing)
-[IPFS](https://ipfs.a2x.pub/ipfs/QmYz4ew4nSzPc6TZvoWk6jXpGN82qt3J46nwfb75N2YKc4/)
+### Details
+- **Version:** 1
+- **Last Update:** 2024/09/16
+### Download Links
+- **Google Drive:** [Download from Google Drive](https://drive.google.com/drive/folders/1_ADblZcB5p9BUvawkYDmp1qIUDZgkkoe?usp=sharing)
+- **IPFS:** [Download from IPFS](https://ipfs.a2x.pub/ipfs/QmYz4ew4nSzPc6TZvoWk6jXpGN82qt3J46nwfb75N2YKc4/)