update: README
parent 6500e378be
commit 580753bb6f
18
README.md
@@ -1,28 +1,28 @@
# sparkastML
This repository houses the machine learning components for the [sparkast](https://github.com/alikia2x/sparkast) project.
This repository contains the machine learning components for the [sparkast](https://github.com/alikia2x/sparkast) project.
The primary objective of this project is to enhance the search functionality of sparkast, allowing users to receive real-time answers as they type their queries.
The main goal of this project is to improve the search functionality of sparkast, enabling users to receive real-time answers as they type their queries.
## Intention Classification
The model located in the `/intention-classify` directory is designed to categorize user queries into predefined classes.
The model in the `/intention-classify` directory is designed to categorize user queries into predefined classes.
We utilize a Convolutional Neural Network (CNN) architecture in conjunction with an Energy-based Model for open-set recognition.
We use a Convolutional Neural Network (CNN) architecture combined with an Energy-based Model for open-set recognition.
This model is optimized to be lightweight, ensuring it can run on a wide range of devices, including within the browser environment.
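The README does not say how the energy score is computed or thresholded. As a rough illustration only, a common formulation of energy-based open-set recognition takes the classifier's logits, computes the free energy `-T * logsumexp(logits / T)`, and rejects the query as "unknown" when that energy is above a threshold. The function names, the example labels, and the threshold value below are all hypothetical and are not taken from the sparkastML code.

```python
import math


def energy_score(logits, temperature=1.0):
    """Free energy of a logit vector: -T * logsumexp(logits / T).

    Lower (more negative) energy means the classifier is more confident
    that the input belongs to one of the known classes.
    """
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    lse = m + math.log(sum(math.exp(s - m) for s in scaled))
    return -temperature * lse


def classify_open_set(logits, labels, threshold):
    """Return the top label, or None if the query looks out-of-distribution.

    `threshold` is an assumed hyperparameter; in practice it would be
    tuned on held-out data.
    """
    if energy_score(logits) > threshold:
        return None  # energy too high: treat as an unknown intention
    best = max(range(len(logits)), key=lambda i: logits[i])
    return labels[best]


# Hypothetical class names and logits, for illustration only.
labels = ["weather", "math", "time"]
confident = classify_open_set([10.0, 0.0, 0.0], labels, threshold=-5.0)
uncertain = classify_open_set([0.1, 0.0, 0.2], labels, threshold=-5.0)
```

With these made-up numbers, the first query has one dominant logit and is accepted, while the second has nearly flat logits, so its energy stays above the threshold and it is rejected.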
For a detailed explanation of how it works, you can refer to [this blog post](https://blog.alikia2x.com/en/posts/sparkastml-intention/).
For a detailed explanation of how it works, refer to [this blog post](https://blog.alikia2x.com/en/posts/sparkastml-intention/).
## Translation
Language barriers are one of the biggest obstacles to communication between civilizations. In modern times, with the development of computer science and artificial intelligence, machine translation is bridging this barrier and building a Tower of Babel.
Language barriers are one of the biggest obstacles to communication between civilizations. In modern times, with the development of computer science and artificial intelligence, machine translation is bridging this gap and building a modern Tower of Babel.
Unfortunately, many machine translations are owned by commercial companies, which seriously hinders the development of freedom and innovation.
Unfortunately, many machine translation systems are owned by commercial companies, which seriously hinders the development of freedom and innovation.
Therefore, sparkastML is on the road to challenge commercial machine translation. We decided to tackle the translation between Chinese and English first. These are two languages with a long history and a large number of users. Their writing methods and expression habits are very different, which brings challenges to the project.
Therefore, sparkastML is on a mission to challenge commercial machine translation. We decided to tackle the translation between Chinese and English first. These are two languages with a long history and a large number of users. Their writing methods and expression habits are very different, which brings challenges to the project.
For more details, you can view [this page](./translate/README.md).
For more details, visit [this page](./translate/README.md).
## Dataset
@@ -14,7 +14,9 @@ This dataset features high-quality, fresh synthetic data comprising over 100,000
- **Last Update:** 2024/09/16
- **LICENSE:** [CC-BY 4.0](https://creativecommons.org/licenses/by/4.0/)
### Download Links
### Download
- **Google Drive:** [Download from Google Drive](https://drive.google.com/drive/folders/1_ADblZcB5p9BUvawkYDmp1qIUDZgkkoe?usp=sharing)
- **Google Drive:** [Download from Google Drive](https://drive.google.com/drive/folders/1_ADblZcB5p9BUvawkYDmp1qIUDZgkkoe)
- **IPFS:** [Download from IPFS](https://ipfs.a2x.pub/ipfs/QmYz4ew4nSzPc6TZvoWk6jXpGN82qt3J46nwfb75N2YKc4/)
- CID: `QmYz4ew4nSzPc6TZvoWk6jXpGN82qt3J46nwfb75N2YKc4`
- **GitHub Release:** [Go to Release Page](https://github.com/alikia2x/sparkastML/releases/tag/v1-dataset)
@@ -2,13 +2,18 @@
## News
sparkastML's first translation model has release!
sparkastML's first translation model is now available!
### Details
- Training time: 5 hours, 20k steps
- Training device: RTX 3080 (20GB)
- Corpus size: over 10 million sentences
- Validation Score: BLEU
- **Training Time:** 5 hours, 20,000 steps
- **Training Device:** RTX 3080 (20GB)
- **Corpus Size:** Over 10 million sentences
- **Validation BLEU Score:** 17
[Model]
### Model Download
- **Google Drive:** [Download from Google Drive](https://drive.google.com/file/d/1bJkkqQJLdwTgXFXVeP7fjPawfwelzeIB/view)
- **IPFS:** [Download from IPFS](http://ipfs.a2x.pub/ipfs/QmNw3Mo3N31wwTQPXzNeGD8jPpkGp5VFQcC9gk44bfqW1u/)
- **CID:** `QmNw3Mo3N31wwTQPXzNeGD8jPpkGp5VFQcC9gk44bfqW1u`
- **GitHub Release:** [Go to Release Page](https://github.com/alikia2x/sparkastML/releases/tag/v2-model)