# sparkastML NMT
A set of models that aim to offer the best open-source machine translation, based on [OpenNMT](https://opennmt.net/).
## News
sparkastML's translation model is now updated!
### Details
- **Source Language:** Chinese (Simplified)
- **Target Language:** English
- **Training Time:** 11.3 hours in total, 46,500 steps (~1×10¹⁸ FLOPs)
- **Training Device:**
  - RTX 3080 (20GB): steps 0-20,000
  - RTX 4070: steps 20,000-46,500
- **Corpus Size:** Over 10 million sentences
- **Validation BLEU Score:** 21.28
- **Validation Loss (Cross Entropy):** 3.152
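
As a rough sanity check on the training figures above, the short Python sketch below derives the average throughput from the reported totals. It uses only the numbers in the list (11.3 hours, 46,500 steps, ~1×10¹⁸ FLOPs). The variable names are illustrative, and because training was split across two different GPUs, the results are averages over the whole run rather than per-device measurements.

```python
# Figures copied from the Details list; the FLOPs value is the rough estimate quoted there.
total_hours = 11.3
total_steps = 46_500
approx_flops = 1e18

total_seconds = total_hours * 3600

# Average training speed over the whole run (both GPUs combined).
steps_per_hour = total_steps / total_hours
seconds_per_step = total_seconds / total_steps

# Average sustained compute implied by the FLOPs estimate.
avg_flops_per_second = approx_flops / total_seconds

print(f"steps per hour:   {steps_per_hour:.0f}")
print(f"seconds per step: {seconds_per_step:.3f}")
print(f"average FLOP/s:   {avg_flops_per_second:.2e}")
```

Since the FLOPs figure is itself approximate, the implied FLOP/s value should be read as an order-of-magnitude estimate only.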
### Model Download
Available soon.
### Special Thanks
Thanks to [yumechi](https://github.com/eternal-flame-AD/) for sponsoring an RTX 4070 for training.
## History
### Sep 19, 2024
sparkastML's translation model is now updated!
#### Details
- **Source Language:** Chinese (Simplified)
- **Target Language:** English
- **Training Time:** 5 hours, 20,000 steps
- **Training Device:** RTX 3080 (20GB)
- **Corpus Size:** Over 10 million sentences
- **Validation BLEU Score:** 17
- **Version:** 1.0
#### Model Download
- **Google Drive:** [Download from Google Drive](https://drive.google.com/drive/folders/1-q_AKfQENW-pV6uAleUHPE9ghddfNWKF)
- **IPFS:** [Download from IPFS](http://ipfs.a2x.pub/ipfs/QmUMadzkBwvH5KTpoxfv7TgqzaPpqBzkXtkecV9TXPfZ3F/)
- **CID:** `QmUMadzkBwvH5KTpoxfv7TgqzaPpqBzkXtkecV9TXPfZ3F`
- **GitHub Release:** [Go to Release Page](https://github.com/alikia2x/sparkastML/releases/tag/v2-model)