---
language:
- mt
datasets:
- MLRS/korpus_malti
model-index:
- name: BERTu
  results:
  - task:
      type: dependency-parsing
      name: Dependency Parsing
    dataset:
      type: universal_dependencies
      args: mt_mudt
      name: Maltese Universal Dependencies Treebank (MUDT)
    metrics:
    - type: uas
      value: 92.31
      name: Unlabelled Attachment Score
    - type: las
      value: 88.14
      name: Labelled Attachment Score
  - task:
      type: part-of-speech-tagging
      name: Part-of-Speech Tagging
    dataset:
      type: mlrs_pos
      name: MLRS POS dataset
    metrics:
    - type: accuracy
      value: 98.58
      name: UPOS Accuracy
      args: upos
    - type: accuracy
      value: 98.54
      name: XPOS Accuracy
      args: xpos
  - task:
      type: named-entity-recognition
      name: Named Entity Recognition
    dataset:
      type: wikiann
      name: WikiAnn (Maltese)
      args: mt
    metrics:
    - type: f1
      args: span
      value: 86.77
      name: Span-based F1
  - task:
      type: sentiment-analysis
      name: Sentiment Analysis
    dataset:
      type: mt-sentiment-analysis
      name: Maltese Sentiment Analysis Dataset
    metrics:
    - type: f1
      args: macro
      value: 78.96
      name: Macro-averaged F1
license: cc-by-nc-sa-4.0
widget:
- text: "Malta hija gżira fil-[MASK]."
---

# BERTu

A Maltese monolingual model pre-trained from scratch on the Korpus Malti v4.0 using the BERT (base) architecture.

## License

This work is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License][cc-by-nc-sa]. Permissions beyond the scope of this license may be available at [https://mlrs.research.um.edu.mt/](https://mlrs.research.um.edu.mt/).

[![CC BY-NC-SA 4.0][cc-by-nc-sa-image]][cc-by-nc-sa]

[cc-by-nc-sa]: http://creativecommons.org/licenses/by-nc-sa/4.0/
[cc-by-nc-sa-image]: https://licensebuttons.net/l/by-nc-sa/4.0/88x31.png

## Citation

This work was first presented in [Pre-training Data Quality and Quantity for a Low-Resource Language: New Corpus and BERT Models for Maltese](https://aclanthology.org/2022.deeplo-1.10/). Cite it as follows:

```bibtex
@inproceedings{BERTu,
    title = "Pre-training Data Quality and Quantity for a Low-Resource Language: New Corpus and {BERT} Models for {M}altese",
    author = "Micallef, Kurt  and
              Gatt, Albert  and
              Tanti, Marc  and
              van der Plas, Lonneke  and
              Borg, Claudia",
    booktitle = "Proceedings of the Third Workshop on Deep Learning for Low-Resource Natural Language Processing",
    month = jul,
    year = "2022",
    address = "Hybrid",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.deeplo-1.10",
    doi = "10.18653/v1/2022.deeplo-1.10",
    pages = "90--101",
}
```
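
## Usage

The widget sentence above doubles as a quick smoke test. The sketch below loads the model with 🤗 Transformers for masked-token prediction; note that the Hub ID `MLRS/BERTu` is an assumption inferred from the model name and the `MLRS/korpus_malti` dataset entry on this card, so substitute the actual checkpoint path if it differs.

```python
from transformers import pipeline

# Assumed Hub ID, inferred from this card (not stated on it explicitly).
MODEL_ID = "MLRS/BERTu"

# BERT-style masked language modelling. The card's widget sentence
# "Malta hija gżira fil-[MASK]." means "Malta is an island in the [MASK]."
fill_mask = pipeline("fill-mask", model=MODEL_ID)

# Print the top predicted tokens for the masked position with their scores.
for prediction in fill_mask("Malta hija gżira fil-[MASK]."):
    print(f"{prediction['token_str']}\t{prediction['score']:.4f}")
```

For the downstream results reported above (dependency parsing, POS tagging, NER, sentiment analysis), the encoder would typically be fine-tuned with a task-specific head, e.g. `AutoModelForTokenClassification.from_pretrained(MODEL_ID, num_labels=...)` for the tagging tasks.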
