CC-CEDICT Parser for Kotlin

A parser for the CC-CEDICT Chinese-to-English dictionary written in Kotlin.

Installation

CC-CEDICT Parser for Kotlin is available from my self-hosted Gitea instance.

First, add the repository to your build.gradle.kts file:

repositories {
    maven {
        url = uri("https://gitea.marvinelsen.com/api/packages/marvinelsen/maven")
    }
}

Then add the package dependency to the same build.gradle.kts file:

dependencies {
    implementation("com.marvinelsen:cedict-parser:1.0-SNAPSHOT")
}

Usage

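import java.util.zip.GZIPInputStream
// CedictParser must also be imported from this library's package.

fun main() {
    // Load the gzipped CC-CEDICT file from the classpath and decompress it while reading.
    // The !! operator throws if the resource is missing.
    val cedictInputStream =
        GZIPInputStream(object {}.javaClass.getResourceAsStream("/cedict_1_0_ts_utf-8_mdbg.txt.gz")!!)

    val cedictParser = CedictParser.instance
    val cedictEntries = cedictParser.parseCedict(cedictInputStream)

    for (entry in cedictEntries) {
        println(entry.traditional)
        println(entry.simplified)
        // Pinyin syllables are printed separated by spaces.
        println(entry.pinyinSyllables.joinToString(" "))
        // Glosses within a definition are joined by ";", definitions by "/".
        println(entry.definitions.joinToString("/") { it.glosses.joinToString(";") })
    }
}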
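
The fields printed above correspond to the parts of a line in the CC-CEDICT text format, which has the layout `Traditional Simplified [pin1 yin1] /gloss 1/gloss 2/`. The following is a minimal, self-contained sketch of that mapping. It is not this library's implementation: `SketchEntry`, `parseLineSketch`, and `LINE_PATTERN` are hypothetical names introduced only for illustration, and the sketch flattens definitions into a plain list of glosses.

```kotlin
// Hypothetical illustration type; NOT the entry class provided by this library.
data class SketchEntry(
    val traditional: String,
    val simplified: String,
    val pinyinSyllables: List<String>,
    val glosses: List<String>,
)

// CC-CEDICT line layout: Traditional Simplified [pin1 yin1] /gloss 1/gloss 2/
val LINE_PATTERN = Regex("""^(\S+)\s+(\S+)\s+\[([^\]]*)\]\s+/(.*)/\s*$""")

// Assumption: a simplified reading of the format, good enough for well-formed lines.
fun parseLineSketch(line: String): SketchEntry? {
    // Comment lines in the dictionary file start with '#'.
    if (line.isBlank() || line.startsWith("#")) return null
    val match = LINE_PATTERN.matchEntire(line) ?: return null
    val (traditional, simplified, pinyin, glosses) = match.destructured
    return SketchEntry(
        traditional = traditional,
        simplified = simplified,
        // Syllables inside the brackets are separated by whitespace.
        pinyinSyllables = pinyin.trim().split(Regex("\\s+")).filter { it.isNotEmpty() },
        // Glosses are separated by '/'.
        glosses = glosses.split("/").filter { it.isNotEmpty() },
    )
}

fun main() {
    val entry = parseLineSketch("中國 中国 [Zhong1 guo2] /China/")
    println(entry)
}
```

Lines that do not match the layout, as well as comment and blank lines, yield `null` in this sketch, so a caller could skip them with `mapNotNull`.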

License

All source code in this repository is licensed under the MIT License, unless otherwise noted.

Different licenses apply to the following third-party code, data, and files in this repository:

CC-CEDICT

CC-CEDICT is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.