
How to retrieve the matching macrolanguage locale from a given ISO language code?

Given an ISO 639-2/T language code of scope individual, how can I programmatically find the matching macrolanguage code, if such a match exists?

For example, how to go from "nob" (Norwegian Bokmål, scope individual) to "nor" (Norwegian, scope macrolanguage)?

In general, there can be multiple individual languages that are not part of the same macrolanguage in the same country, so grouping by country alone will give false positives.

java.util.Locale knows about ISO 639 three-letter language codes and recognizes both codes in the example above, but it has no concept of scope or macrolanguage.

A heuristic without false positives would also be helpful in my case.

You could build your own table of macrolanguages and their corresponding individual languages.

Here's the list: https://iso639-3.sil.org/code_tables/639/data/all?title=&field_iso639_cd_st_mmbrshp_639_1_tid=All&name_3=&field_iso639_element_scope_tid=76&field_iso639_language_type_tid=All&items_per_page=200

Here's a selection I made some time ago:

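// Maps an ISO 639-3 individual language code to its macrolanguage code.
// Requires java.util.HashMap and java.util.Map.
public static final Map<String, String> macroLanguages = new HashMap<>();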
static {
    macroLanguages.put("aao", "ara"); //https://iso639-3.sil.org/code/ara
    macroLanguages.put("abh", "ara");
    macroLanguages.put("abv", "ara");
    macroLanguages.put("acm", "ara");
    macroLanguages.put("acq", "ara");
    macroLanguages.put("acw", "ara");
    macroLanguages.put("acx", "ara");
    macroLanguages.put("acy", "ara");
    macroLanguages.put("adf", "ara");
    macroLanguages.put("aeb", "ara");
    macroLanguages.put("aec", "ara");
    macroLanguages.put("afb", "ara");
    macroLanguages.put("ajp", "ara");
    macroLanguages.put("apc", "ara");
    macroLanguages.put("apd", "ara");
    macroLanguages.put("arb", "ara");
    macroLanguages.put("arq", "ara");
    macroLanguages.put("ars", "ara");
    macroLanguages.put("ary", "ara");
    macroLanguages.put("arz", "ara");
    macroLanguages.put("auz", "ara");
    macroLanguages.put("avl", "ara");
    macroLanguages.put("ayh", "ara");
    macroLanguages.put("ayl", "ara");
    macroLanguages.put("ayn", "ara");
    macroLanguages.put("ayp", "ara");
    macroLanguages.put("bbz", "ara");
    macroLanguages.put("pga", "ara");
    macroLanguages.put("shu", "ara");
    macroLanguages.put("ssh", "ara");

    macroLanguages.put("ekk", "est"); //https://iso639-3.sil.org/code/est
    macroLanguages.put("vro", "est");

    macroLanguages.put("bos", "hbs"); //https://iso639-3.sil.org/code/hbs
    macroLanguages.put("hrv", "hbs");
    macroLanguages.put("srp", "hbs");
    macroLanguages.put("cnr", "hbs");

    macroLanguages.put("ltg", "lav"); //https://iso639-3.sil.org/code/lav
    macroLanguages.put("lvs", "lav");

    macroLanguages.put("nno", "nor"); //https://iso639-3.sil.org/code/nor
    macroLanguages.put("nob", "nor");

    macroLanguages.put("aae", "sqi"); //https://iso639-3.sil.org/code/sqi
    macroLanguages.put("aat", "sqi");
    macroLanguages.put("aln", "sqi");
    macroLanguages.put("als", "sqi");

    macroLanguages.put("ydd", "yid"); //https://iso639-3.sil.org/code/yid
    macroLanguages.put("yih", "yid");

    macroLanguages.put("ccx", "zha"); //https://iso639-3.sil.org/code/zha
    macroLanguages.put("ccy", "zha");
    macroLanguages.put("zch", "zha");
    macroLanguages.put("zeh", "zha");
    macroLanguages.put("zgb", "zha");
    macroLanguages.put("zgm", "zha");
    macroLanguages.put("zgn", "zha");
    macroLanguages.put("zhd", "zha");
    macroLanguages.put("zhn", "zha");
    macroLanguages.put("zlj", "zha");
    macroLanguages.put("zln", "zha");
    macroLanguages.put("zlq", "zha");
    macroLanguages.put("zqe", "zha");
    macroLanguages.put("zyb", "zha");
    macroLanguages.put("zyg", "zha");
    macroLanguages.put("zyj", "zha");
    macroLanguages.put("zyn", "zha");
    macroLanguages.put("zzj", "zha");

    macroLanguages.put("cdo", "zho"); //https://iso639-3.sil.org/code/zho
    macroLanguages.put("cjy", "zho");
    macroLanguages.put("cmn", "zho");
    macroLanguages.put("cpx", "zho");
    macroLanguages.put("czh", "zho");
    macroLanguages.put("czo", "zho");
    macroLanguages.put("gan", "zho");
    macroLanguages.put("hak", "zho");
    macroLanguages.put("hsn", "zho");
    macroLanguages.put("lzh", "zho");
    macroLanguages.put("mnp", "zho");
    macroLanguages.put("nan", "zho");
    macroLanguages.put("wuu", "zho");
    macroLanguages.put("yue", "zho");
    macroLanguages.put("cnp", "zho");
    macroLanguages.put("csp", "zho");

    macroLanguages.put("pes", "fas"); //https://iso639-3.sil.org/code/fas
    macroLanguages.put("prs", "fas");
}
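
For illustration, here is a minimal sketch of how such a table might be used together with java.util.Locale. The class name MacroLanguageLookup and the methods macroLanguageOf and toMacroLocale are hypothetical names chosen for this example, and the table is only a small excerpt of the one above. It also assumes the locale carries a three-letter language code (as "nob" does); a locale built from a two-letter code such as "nb" would not match without an extra conversion step.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper class, not part of the JDK.
final class MacroLanguageLookup {

    // Excerpt of the individual -> macrolanguage table above, for illustration only.
    private static final Map<String, String> MACRO_LANGUAGES = Map.of(
            "nob", "nor",
            "nno", "nor",
            "cmn", "zho",
            "yue", "zho",
            "ekk", "est");

    // Returns the macrolanguage code for an individual language code,
    // or an empty Optional when the code is not listed in the table.
    static Optional<String> macroLanguageOf(String iso3Code) {
        if (iso3Code == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(MACRO_LANGUAGES.get(iso3Code.toLowerCase(Locale.ROOT)));
    }

    // Returns a copy of the locale with its language replaced by the macrolanguage.
    // The locale is returned unchanged when no mapping is listed.
    // Assumes locale.getLanguage() yields a three-letter code.
    static Locale toMacroLocale(Locale locale) {
        return macroLanguageOf(locale.getLanguage())
                .map(macro -> new Locale.Builder().setLocale(locale).setLanguage(macro).build())
                .orElse(locale);
    }

    public static void main(String[] args) {
        Locale bokmal = Locale.forLanguageTag("nob-NO");
        System.out.println(macroLanguageOf("nob"));              // mapped to "nor"
        System.out.println(macroLanguageOf("eng"));              // not in the table, so empty
        System.out.println(toMacroLocale(bokmal).getLanguage()); // language part after mapping
    }
}
```

Because unknown codes yield an empty Optional rather than a guess, a lookup like this cannot produce false positives; it can only miss codes that are absent from the table.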

