dataset: v1 Stable
Released 2024-09-16 09:36:20 +00:00 | 17 commits to main since this release
sparkastML Datasets
This repository contains datasets published by the sparkastML project.
Translation ZH-EN
This dataset is a high-quality, recently generated synthetic Chinese-English parallel corpus of more than 100,000 sentences.
Details
- Source Language: Chinese (Simplified)
- Target Language: English
- Version: 1
- Last Update: 2024/09/16
- License: CC BY 4.0
Downloads
- Source Code (ZIP)
- Source Code (TAR.GZ)
- source.txt (9.7 MiB)
- target.txt (13 MiB)
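The two text files in the download list, source.txt and target.txt, can be read together as sentence pairs. The following is a minimal sketch, not an official loader. It assumes that both files are UTF-8 encoded and line-aligned, meaning line N of source.txt is the Chinese sentence and line N of target.txt is its English translation. The release notes do not state this layout, so verify it against the downloaded files. The function name `read_parallel` is hypothetical.

```python
from itertools import zip_longest
from pathlib import Path
from typing import Iterator, Tuple


def read_parallel(source_path: str, target_path: str) -> Iterator[Tuple[str, str]]:
    """Yield (chinese, english) pairs from two text files.

    Assumption (not confirmed by the release notes): the files are
    UTF-8 encoded and line-aligned, one sentence per line.
    """
    with open(source_path, encoding="utf-8") as src, \
            open(target_path, encoding="utf-8") as tgt:
        # zip_longest pads the shorter file with None, which lets us
        # detect a length mismatch instead of silently dropping lines.
        for line_no, (zh, en) in enumerate(zip_longest(src, tgt), start=1):
            if zh is None or en is None:
                raise ValueError(f"files differ in length at line {line_no}")
            yield zh.rstrip("\n"), en.rstrip("\n")


if __name__ == "__main__":
    # Only runs if both files have been downloaded to the working directory.
    if Path("source.txt").exists() and Path("target.txt").exists():
        for i, (zh, en) in enumerate(read_parallel("source.txt", "target.txt")):
            print(f"{zh}\t{en}")
            if i >= 4:  # show the first five pairs only
                break
```

Raising an error on a length mismatch is a deliberate choice here: a parallel corpus with misaligned lines would otherwise produce wrong pairs without any warning.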