# Tatoeba Parser for Kotlin
A parser for [Tatoeba example sentence files](https://tatoeba.org/en/downloads) written in [Kotlin](https://kotlinlang.org).
## Build
To build the project locally, run the following command in a terminal:
```sh
./gradlew build
```
## Installation
_Tatoeba Parser for Kotlin_ is available
from [my self-hosted Gitea instance](https://gitea.marvinelsen.com/marvinelsen/cedict-parser).
First, add the repository to your `build.gradle.kts` file:
```kotlin
repositories {
    maven {
        url = uri("https://gitea.marvinelsen.com/api/packages/marvinelsen/maven")
    }
}
```
Afterwards, add the package dependency to your `build.gradle.kts` file:
```kotlin
dependencies {
    implementation("com.marvinelsen:tatoeba-parser:1.0.0")
}
```
## Usage
```kotlin
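import java.util.zip.GZIPInputStream
// TatoebaParser must also be imported from the tatoeba-parser library.

fun main() {
    // Open the gzip-compressed sentence file from the classpath resources.
    val tatoebaInputStream =
        GZIPInputStream(object {}.javaClass.getResourceAsStream("/cmn_sentences.tsv.gz")!!)

    // `use` closes the stream when the block exits.
    tatoebaInputStream.use {
        val tatoebaParser = TatoebaParser.instance
        val tatoebaSentences = tatoebaParser.parse(tatoebaInputStream)

        tatoebaSentences.forEach { sentence ->
            println(sentence.simplified)
        }
    }
}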
```
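For orientation, the following sketch shows roughly what reading such a file involves, using only the Kotlin and Java standard libraries. It does not use or describe the internals of this library. It assumes the per-language export has one sentence per line with three tab-separated columns (ID, language code, text). `RawSentence`, `readSentences`, and `gzip` are hypothetical names introduced for illustration, and the sample rows are made up.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.InputStream
import java.util.zip.GZIPInputStream
import java.util.zip.GZIPOutputStream

// Hypothetical stand-in for a parsed sentence; not a type from tatoeba-parser.
data class RawSentence(val id: Long, val language: String, val text: String)

// Assumption: each line holds three tab-separated columns: id, language code, text.
fun readSentences(gzipped: InputStream): List<RawSentence> =
    GZIPInputStream(gzipped).bufferedReader(Charsets.UTF_8).useLines { lines ->
        lines
            .filter { it.isNotBlank() }
            .map { line ->
                // limit = 3 keeps any further tabs inside the text column.
                val columns = line.split('\t', limit = 3)
                require(columns.size == 3) { "Expected 3 tab-separated columns: $line" }
                RawSentence(columns[0].toLong(), columns[1], columns[2])
            }
            .toList()
    }

// Compresses a small sample in memory so the sketch runs without a download.
fun gzip(text: String): ByteArray {
    val buffer = ByteArrayOutputStream()
    GZIPOutputStream(buffer).use { it.write(text.toByteArray(Charsets.UTF_8)) }
    return buffer.toByteArray()
}

fun main() {
    val sample = gzip("1\tcmn\t你好。\n2\tcmn\t谢谢你。\n")
    readSentences(ByteArrayInputStream(sample)).forEach { println("${it.id}: ${it.text}") }
}
```

In a real project, prefer `TatoebaParser` as shown above; the sketch only illustrates the assumed input format.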
## License
All source code in this repository is licensed under the [MIT license](LICENSE), unless otherwise noted.
Different licenses apply to the following third-party code, data, and files in the repository:
### Tatoeba Example Sentences
[Tatoeba example sentences](https://tatoeba.org/en/downloads) are licensed under [CC BY 2.0 FR](https://creativecommons.org/licenses/by/2.0/fr/).