加入 Gitee
与超过 1200万 开发者一起发现、参与优秀开源项目,私有仓库也完全免费 :)
免费加入
文件
该仓库未声明开源许可证文件(LICENSE),使用请关注具体项目描述及其代码上游依赖。
克隆/下载
uninorms.h 1.73 KB
一键复制 编辑 原始数据 按行查看 历史
wscjxky 提交于 2021-11-23 10:24 . update
// This file is part of UniLib <http://github.com/ufal/unilib/>.
//
// Copyright 2014 Institute of Formal and Applied Linguistics, Faculty of
// Mathematics and Physics, Charles University in Prague, Czech Republic.
//
// This Source Code Form is subject to the terms of the Mozilla Public
// License, v. 2.0. If a copy of the MPL was not distributed with this
// file, You can obtain one at http://mozilla.org/MPL/2.0/.
//
// UniLib version: 3.1.2-devel
// Unicode version: 8.0.0
#pragma once
#include <cstdint>
#include <string>
namespace ufal {
namespace unilib {
class uninorms {
public:
static void nfc(std::u32string& str);
static void nfd(std::u32string& str);
static void nfkc(std::u32string& str);
static void nfkd(std::u32string& str);
private:
static void compose(std::u32string& str);
static void decompose(std::u32string& str, bool kanonical);
static const char32_t CHARS = 0x110000;
struct Hangul {
// Hangul decomposition and composition
static const char32_t SBase = 0xAC00, LBase = 0x1100, VBase = 0x1161,
TBase = 0x11A7;
static const char32_t LCount = 19, VCount = 21, TCount = 28,
NCount = VCount * TCount,
SCount = LCount * NCount;
};
static const uint8_t ccc_index[CHARS >> 8];
static const uint8_t ccc_block[][256];
static const uint8_t composition_index[CHARS >> 8];
static const uint16_t composition_block[][257];
static const char32_t composition_data[];
static const uint8_t decomposition_index[CHARS >> 8];
static const uint16_t decomposition_block[][257];
static const char32_t decomposition_data[];
};
} // namespace unilib
} // namespace ufal
马建仓 AI 助手
尝试更多
代码解读
代码找茬
代码优化