# Tatoeba Parser for Kotlin
A parser for [Tatoeba example sentence files](https://tatoeba.org/en/downloads) written in [Kotlin](https://kotlinlang.org).
## Build
To build the project locally, run the following command from the project root:
```sh
./gradlew build
```
## Installation
_Tatoeba Parser for Kotlin_ is available
from [my self-hosted Gitea instance](https://gitea.marvinelsen.com/marvinelsen/cedict-parser).
First, add the repository to your `build.gradle.kts` file:
```kotlin
repositories {
    maven {
        url = uri("https://gitea.marvinelsen.com/api/packages/marvinelsen/maven")
    }
}
```
Afterwards, add the package dependency to your `build.gradle.kts` file:
```kotlin
dependencies {
    implementation("com.marvinelsen:tatoeba-parser:1.0.0")
}
```
## Usage
```kotlin
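import java.util.zip.GZIPInputStream

fun main() {
    // Open the gzip-compressed sentence file bundled as a classpath resource.
    val tatoebaInputStream =
        GZIPInputStream(object {}.javaClass.getResourceAsStream("/cmn_sentences.tsv.gz")!!)

    // `use` closes the stream once parsing is done, even if an exception is thrown.
    tatoebaInputStream.use {
        val tatoebaParser = TatoebaParser.instance
        val tatoebaSentences = tatoebaParser.parse(tatoebaInputStream)

        tatoebaSentences.forEach { sentence ->
            println(sentence.simplified)
        }
    }
}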
```
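For orientation, here is a minimal sketch of what reading such a file by hand might look like. It is not this library's implementation. It assumes the per-language sentence files use three tab-separated columns (sentence id, language code, text), and the names `RawSentence`, `parseLine`, and `parseGzippedTsv` are hypothetical. The sample rows are made up for the example.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.InputStream
import java.util.zip.GZIPInputStream
import java.util.zip.GZIPOutputStream

// Hypothetical record type for illustration; not the library's own model.
data class RawSentence(val id: Long, val language: String, val text: String)

// Parses one line of the assumed `id<TAB>lang<TAB>text` layout.
// Returns null when the line does not have three fields or the id is not numeric.
fun parseLine(line: String): RawSentence? {
    val fields = line.split('\t', limit = 3)
    if (fields.size != 3) return null
    val id = fields[0].toLongOrNull() ?: return null
    return RawSentence(id, fields[1], fields[2])
}

// Decompresses a gzip stream and parses it line by line, skipping malformed lines.
fun parseGzippedTsv(input: InputStream): List<RawSentence> =
    GZIPInputStream(input).bufferedReader(Charsets.UTF_8).useLines { lines ->
        lines.mapNotNull(::parseLine).toList()
    }

fun main() {
    // Build a tiny gzip payload in memory so the sketch needs no external file.
    val sample = "1\tcmn\t你好。\n2\tcmn\t谢谢。\n"
    val compressed = ByteArrayOutputStream().also { buffer ->
        GZIPOutputStream(buffer).use { it.write(sample.toByteArray(Charsets.UTF_8)) }
    }.toByteArray()

    parseGzippedTsv(ByteArrayInputStream(compressed)).forEach { sentence ->
        println("${sentence.id}: ${sentence.text}")
    }
}
```

Splitting with `limit = 3` keeps any tab characters inside the sentence text intact, and `useLines` closes the reader (and the underlying stream) when parsing finishes.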
## License
All source code in this repository is licensed under the [MIT license](LICENSE), unless otherwise noted.
Different licenses apply to the following third-party code, data, and files in this repository:
### Tatoeba Example Sentences
[Tatoeba example sentences](https://tatoeba.org/en/downloads) are licensed under the [CC BY 2.0 FR](https://creativecommons.org/licenses/by/2.0/fr/) license.