# Tatoeba Parser for Kotlin

A parser for [Tatoeba example sentence files](https://tatoeba.org/en/downloads) written in [Kotlin](https://kotlinlang.org).

## Build

To build the project locally, run the following command from the project root:

```sh
./gradlew build
```

## Installation

_Tatoeba Parser for Kotlin_ is available from [my self-hosted Gitea instance](https://gitea.marvinelsen.com/marvinelsen/cedict-parser).

First, add the repository to your `build.gradle.kts` file:

```kotlin
repositories {
    maven {
        url = uri("https://gitea.marvinelsen.com/api/packages/marvinelsen/maven")
    }
}
```

Then add the package dependency to your `build.gradle.kts` file:

```kotlin
dependencies {
    implementation("com.marvinelsen:tatoeba-parser:1.0.0")
}
```

## Usage

```kotlin
import java.util.zip.GZIPInputStream
// Also import TatoebaParser from this library.

fun main() {
    // Open the gzipped sentence file bundled as a classpath resource.
    val tatoebaInputStream =
        GZIPInputStream(object {}.javaClass.getResourceAsStream("/cmn_sentences.tsv.gz")!!)

    // `use` closes the stream once parsing and printing are done.
    tatoebaInputStream.use {
        val tatoebaParser = TatoebaParser.instance
        val tatoebaSentences = tatoebaParser.parse(tatoebaInputStream)

        tatoebaSentences.forEach { sentence ->
            println(sentence.simplified)
        }
    }
}
```

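For orientation, here is a minimal sketch of what the input to the example above might look like. It does not use this library at all. It assumes the sentence file is gzipped and holds one tab-separated `id`, language code, and sentence text per line, and it builds a tiny made-up sample in memory. `RawSentence`, `readRawSentences`, and `gzip` are hypothetical names introduced only for this illustration.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.InputStream
import java.util.zip.GZIPInputStream
import java.util.zip.GZIPOutputStream

// Hypothetical holder for one raw line; not a type from this library.
data class RawSentence(val id: Long, val language: String, val text: String)

// Reads gzipped lines of the assumed form: id <TAB> language <TAB> text.
fun readRawSentences(gzipped: InputStream): List<RawSentence> =
    GZIPInputStream(gzipped).bufferedReader(Charsets.UTF_8).useLines { lines ->
        lines
            .map { it.split('\t', limit = 3) }
            .filter { it.size == 3 } // skip lines without three columns
            .mapNotNull { (id, language, text) ->
                // skip lines whose first column is not a number
                id.toLongOrNull()?.let { RawSentence(it, language, text) }
            }
            .toList()
    }

// Compresses a string in memory so the sketch runs without a download.
fun gzip(text: String): ByteArray {
    val buffer = ByteArrayOutputStream()
    GZIPOutputStream(buffer).use { it.write(text.toByteArray(Charsets.UTF_8)) }
    return buffer.toByteArray()
}

fun main() {
    // Made-up sample data in the assumed format.
    val sample = gzip("1\tcmn\t你好。\n2\tcmn\t谢谢。\n")

    readRawSentences(ByteArrayInputStream(sample)).forEach { sentence ->
        println("${sentence.id}: ${sentence.text}")
    }
}
```

The parser in this repository takes care of this step for you and exposes richer fields such as `simplified`; the sketch is only meant to show the general shape of the data flowing through the `GZIPInputStream`.
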
## License

All source code in this repository is licensed under the [MIT license](LICENSE), unless otherwise noted.

Different licenses apply to the following third-party code, data, and files in this repository:

### Tatoeba Example Sentences

[Tatoeba example sentences](https://tatoeba.org/en/downloads) are licensed under [CC BY 2.0 FR](https://creativecommons.org/licenses/by/2.0/fr/).