
Introduction

MindSpeed is a large language model acceleration library for Huawei Ascend devices.

Training a large model is a highly complex process that involves many technologies and challenges. One major difficulty is the large amount of device memory it requires, which places considerable demands on compute cards. To let multiple compute cards share the work when a single card's memory is insufficient, the industry has developed third-party acceleration libraries such as Megatron and DeepSpeed, which partition the model and input data across different compute cards and then aggregate the results through collective communication.

Ascend provides the MindSpeed acceleration library to help customers quickly migrate large-model workloads to Ascend devices. It also supports Ascend-specific algorithms, ensuring the library works out of the box.

Installation

1. Install dependencies

Before installing MindSpeed, consult the version compatibility table and install the latest Ascend software stack.

| Dependency | Installation guide |
| --- | --- |
| Ascend NPU driver | Driver and firmware installation guide |
| Ascend NPU firmware | |
| Toolkit (development kit) | CANN software installation guide |
| Kernel (operator package) | |
| NNAL (Ascend Transformer Boost acceleration library) | |
| PyTorch | Ascend Extension for PyTorch configuration and installation guide |
| torch_npu plugin | |
| apex | |

2. Install MindSpeed

Install from source:

```shell
git clone https://gitee.com/ascend/MindSpeed.git
pip install -e MindSpeed
```

To use operators from the Ascend Transformer Boost (ATB) acceleration library, first install CANN-NNAL and initialize its environment, for example:

```shell
# The default CANN-NNAL installation path is /usr/local/Ascend/nnal
# Source the environment setup script set_env.sh from the atb folder under that path
source /usr/local/Ascend/nnal/atb/set_env.sh
```

3. Get Megatron-LM and check out the matching branch

```shell
# The current version is based on the core_r0.8.0 release
git clone https://github.com/NVIDIA/Megatron-LM.git
cd Megatron-LM
git checkout core_r0.8.0
```

Quick Start

Using the GPT model as an example:

  1. In the Megatron-LM directory, edit pretrain_gpt.py and add `import mindspeed.megatron_adaptor` on a new line below `import torch`:

     ```diff
      import os
      import torch
     +import mindspeed.megatron_adaptor
      from functools import partial
      from typing import Union
     ```

  2. In the Megatron-LM directory, edit pretrain_gpt.py and, in the `model_provider` function, delete the assertion `assert(args.context_parallel_size == 1), "Context parallelism is only supported with Megatron Core!"`:

     ```python
     else:
         assert (
             args.context_parallel_size == 1
         ), "Context parallelism is only supported with Megatron Core!"

         model = megatron.legacy.model.GPTModel(
             config,
             num_tokentypes=0,
             parallel_output=True,
             pre_process=pre_process,
             post_process=post_process,
         )
     ```

  3. In the Megatron-LM directory, prepare the training data, fill in the corresponding paths in the example script, and run it:

     ```shell
     bash examples/pretrain_gpt_distributed.sh
     ```
    

Custom Optimization Levels

MindSpeed provides a layered optimization solution with three levels; users can enable any level according to their needs. Each higher level includes the capabilities of the levels below it, ensuring stability and extensibility across the system. The level is selected with the `--optimization-level {level}` argument, which accepts the following values:

  • 0: basic compatibility layer L0. Provides Megatron-LM framework support for NPUs, ensuring seamless integration. This layer contains the basic patch set, guaranteeing reliability and stability and laying the foundation for higher-level optimizations.
  • 1: affinity enhancement layer L1 (includes L0 capabilities). Integrates a library of high-performance fused operators combined with Ascend-affine compute optimizations to fully exploit Ascend compute power and significantly improve efficiency.
  • 2 (default): in-house acceleration algorithm layer L2 (includes L1 and L0 capabilities). Integrates multiple core in-house techniques to deliver comprehensive performance optimization.
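As a minimal sketch, the flag is simply appended to the training arguments in the launch script. The variable names below are illustrative, not part of MindSpeed; in a real Megatron-LM launch script the flag goes into the argument list passed to pretrain_gpt.py:

```shell
# Sketch: selecting the L1 affinity level instead of the default L2.
# GPT_ARGS is a hypothetical variable; in practice the flag is appended to the
# argument list of the Megatron-LM training script.
OPT_LEVEL=1                                  # 0 = L0, 1 = L1, 2 = L2 (default)
GPT_ARGS="--optimization-level ${OPT_LEVEL}"
echo "pretrain_gpt.py ${GPT_ARGS}"           # prints: pretrain_gpt.py --optimization-level 1
```

Because L2 is the default, omitting the flag entirely is equivalent to `--optimization-level 2`.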

Features

MindSpeed features are organized into six modules: Megatron feature support, parallelism strategies, memory optimization, affinity computation, communication optimization, and key-scenario features. [Prototype] marks a prototype feature that has not yet been released for commercial use.

Megatron Feature Support

| Feature | Introduction |
| --- | --- |
| Megatron data parallelism | link |
| Megatron tensor parallelism | link |
| Megatron pipeline parallelism | link |
| Megatron virtual pipeline parallelism | link |
| Megatron distributed optimizer | link |
| Megatron sequence parallelism | link |
| Megatron asynchronous DDP | link |
| Megatron weight-update communication overlapping | link |
| Megatron recomputation | link |

Parallelism Strategy Features

| Feature | Introduction |
| --- | --- |
| Ulysses long-sequence parallelism | link |
| Ascend Ring Attention long-sequence parallelism | link |
| Ascend hybrid long-sequence parallelism | link |
| [Prototype] Ascend custom no-op layers | link |
| [Prototype] Adaptive-CP: sequence-parallel distributed FA with adaptive load balancing for generalized masks | link |
| [Prototype] Dynamic shapes for pipeline parallelism | link |

Memory Optimization Features

| Feature | Introduction |
| --- | --- |
| Ascend adaptive recomputation selection | link |
| Ascend activation-function recomputation | link |
| Ascend independent pipeline scheduling for recomputation | link |
| Ascend unified mask | link |
| Ascend BF16 parameter-copy reuse | link |
| Ascend swap_attention | link |
| [Prototype] Ascend Norm recomputation | link |

Affinity Computation Features

| Feature | Introduction |
| --- | --- |
| Ascend rms_norm fused operator | link |
| Ascend swiglu fused operator | link |
| Ascend rotary_embedding fused operator | link |
| Ascend flash attention fused operator | link |
| [Prototype] Ascend compute-communication parallel optimization | link |
| [Prototype] Ascend MoE Token Permute and Unpermute fused operators | link |
| [Prototype] Ascend ring_attention_update fused operator | link |
| [Prototype] Ascend npu_matmul_add_fp32 gradient-accumulation fused operator | link |
| [Prototype] Ascend npu_groupmatmul_add_fp32 gradient-accumulation fused operator | link |
| [Prototype] Ascend MC2 | link |

Communication Optimization Features

| Feature | Introduction |
| --- | --- |
| Ascend nano-pipe pipeline parallelism | link |
| [Prototype] Ascend high-dimensional tensor parallelism | link |
| Gloo checkpoint disk-write optimization | link |

Key Scenario Features

| Feature | Introduction |
| --- | --- |
| Megatron Mcore MoE | link |
| DeepSpeed MoE | link |
| Ascend shared experts | link |
| [Prototype] Ascend alibi | link |
| [Prototype] Ascend EOD Reset training scenario | link |

Other Features

| Feature | Introduction |
| --- | --- |
| Ascend TFLOPS calculation | link |
| [Prototype] Ascend deterministic computation | link |
| Auto Tuning automatic parallel-strategy search | link |

Custom Operators

Some custom operators are exposed as public interfaces. For the rules governing public interfaces, see the public interface declaration in the MindSpeed security statement; for the details of each interface, see the manual link for the corresponding operator below.

| Operator | Introduction |
| --- | --- |
| npu_dropout_add_layer_norm | link |
| npu_rotary_position_embedding | link |
| fusion_attention | link |
| rms_norm | link |
| swiglu | link |
| npu_mm_all_reduce_add_rms_norm | link |
| npu_mm_all_reduce_add_rms_norm_ | link |
| npu_gmm | link |
| npu_grouped_mat_mul_all_reduce | link |
| [Prototype] lcal_coc | link |
| [Prototype] ffn | link |
| [Prototype] npu_fused_moe_token_permute | link |
| [Prototype] npu_fused_moe_token_unpermute | link |
| [Prototype] npu_ring_attention_update | link |
| [Prototype] npu_matmul_add_fp32 | link |
| [Prototype] npu_groupmatmul_add_fp32 | link |
| [Prototype] npu_all_to_all_all_gather_bmm | link |
| [Prototype] npu_bmm_reduce_scatter_all_to_all | link |
| [Prototype] quant_gmm | link |

Collecting Profiling Data in MindSpeed

MindSpeed supports enabling profiling from the command line with the following options:

| Option | Meaning |
| --- | --- |
| --profile | Enable profiling |
| --profile-step-start | First training step to profile; defaults to 10. Example: --profile-step-start 30 |
| --profile-step-end | Step at which profiling stops; defaults to 12. Example: --profile-step-end 35 |
| --profile-level | Collection level; defaults to level0. Options: level0, level1, level2. Example: --profile-level level1 |
| --profile-with-cpu | Collect CPU information |
| --profile-with-stack | Collect call-stack information |
| --profile-with-memory | Collect memory information; requires --profile-with-cpu |
| --profile-record-shapes | Record tensor shape information |
| --profile-save-path | Directory in which to save the collected data; defaults to ./profile_dir. Example: --profile-save-path ./result_dir |
| --profile-ranks | Ranks to profile; defaults to 0. Example: --profile-ranks 0 1 2 3. Note: these are each rank's global rank within the node/cluster |
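Putting the switches together, a profiling run might add an argument group like the one below to the training command. The values are illustrative, taken from the examples in the table; only the flag names themselves come from MindSpeed:

```shell
# Example argument group built from the profiling switches documented above.
# Values are illustrative; note that --profile-with-memory requires
# --profile-with-cpu to be enabled as well.
PROFILE_ARGS="--profile \
    --profile-step-start 30 \
    --profile-step-end 35 \
    --profile-level level1 \
    --profile-with-cpu \
    --profile-with-memory \
    --profile-save-path ./result_dir \
    --profile-ranks 0 1 2 3"
echo "${PROFILE_ARGS}"
```

This group would then be appended to the launch command alongside the regular training arguments.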

Version Compatibility

PyTorch Extension version numbers follow the {PyTorch version}-{Ascend version} naming scheme: the former is the PyTorch version the extension matches, and the latter matches the CANN version. Details:

| MindSpeed version | Megatron version | PyTorch version | torch_npu version | CANN version | Python version | Hardware |
| --- | --- | --- | --- | --- | --- | --- |
| master (mainline) | Core 0.8.0 | 2.1.0 | in development | in development | Python3.8.x, Python3.9.x, Python3.10.x | Atlas 200T A2 Box16, Atlas 800T A2, Atlas 900 A2 PODc |
| core_r0.7.0 (mainline) | Core 0.7.0 | 2.1.0 | in development | in development | Python3.8.x, Python3.9.x, Python3.10.x | Atlas 200T A2 Box16, Atlas 800T A2, Atlas 900 A2 PODc |
| core_r0.6.0 (mainline) | Core 0.6.0 | 2.1.0 | in development | in development | Python3.8.x, Python3.9.x, Python3.10.x | Atlas 200T A2 Box16, Atlas 800T A2, Atlas 900 A2 PODc |
| 1.0.0_core_r0.7.0 (commercial) | Core 0.7.0 | 2.1.0 | 6.0.0 | 8.0.0 | Python3.8.x, Python3.9.x, Python3.10.x | Atlas 200T A2 Box16, Atlas 800T A2, Atlas 900 A2 PODc |
| 1.0.0_core_r0.6.0 (commercial) | Core 0.6.0 | 2.1.0 | 6.0.0 | 8.0.0 | Python3.8.x, Python3.9.x, Python3.10.x | Atlas 200T A2 Box16, Atlas 800T A2, Atlas 900 A2 PODc |
| 1.0.RC3_core_r0.7.0 (commercial) | Core 0.7.0 | 2.1.0 | 6.0.RC3 | 8.0.RC3 | Python3.8.x, Python3.9.x, Python3.10.x | Atlas 200T A2 Box16, Atlas 800T A2, Atlas 900 A2 PODc |
| 1.0.RC3_core_r0.6.0 (commercial) | Core 0.6.0 | 2.1.0 | 6.0.RC3 | 8.0.RC3 | Python3.8.x, Python3.9.x, Python3.10.x | Atlas 200T A2 Box16, Atlas 800T A2, Atlas 900 A2 PODc |
| 1.0.RC2 (commercial) | Core 0.6.0 | 2.1.0 | 6.0.RC2 | 8.0.RC2 | Python3.8.x, Python3.9.x, Python3.10.x | Atlas 200T A2 Box16, Atlas 800T A2, Atlas 900 A2 PODc |
| 1.0.RC1 (commercial) | commit bcce6f | 2.1.0 | 6.0.RC1 | 8.0.RC1 | Python3.8.x, Python3.9.x, Python3.10.x | Atlas 200T A2 Box16, Atlas 800T A2, Atlas 900 A2 PODc |

See the Ascend auxiliary software page for more version information about PyTorch and CANN.

Branch Maintenance Policy

MindSpeed version branches go through the following maintenance phases:

| Status | Duration | Description |
| --- | --- | --- |
| Planning | 1-3 months | Feature planning |
| Development | 3 months | Feature development |
| Maintained | 6-12 months | Incorporate all resolved issues and release new versions; regular releases are maintained for 6 months and long-term-support releases for 12 months |
| Unmaintained | 0-3 months | Incorporate all resolved issues; no dedicated maintainers and no releases |
| End of Life (EOL) | N/A | The branch no longer accepts any changes |

MindSpeed Version Maintenance

| MindSpeed version | Maintenance policy | Current status | Release date | Subsequent status | EOL date |
| --- | --- | --- | --- | --- | --- |
| 1.0.0_core_r0.7.0 | Regular release | Development | 2024/12/30 | Expected to become unmaintained on 2025/6/30 | |
| 1.0.0_core_r0.6.0 | Regular release | Development | 2024/12/30 | Expected to become unmaintained on 2025/6/30 | |
| 1.0.RC3_core_r0.7.0 | Regular release | Maintained | 2024/09/30 | Expected to become unmaintained on 2025/3/30 | |
| 1.0.RC3_core_r0.6.0 | Regular release | Maintained | 2024/09/30 | Expected to become unmaintained on 2025/3/30 | |
| 1.0.RC2 | Regular release | Maintained | 2024/06/30 | Expected to become unmaintained on 2024/12/30 | |
| 1.0.RC1 | Regular release | End of maintenance | 2024/03/30 | Unmaintained since 2024/9/30 | |

Security Statement

MindSpeed security statement

FAQ

| Symptom | Reference |
| --- | --- |
| Data helpers preprocessing error | link |
| Torch extensions build hangs | link |
| grad norm becomes NaN in Megatron 0.7.0 long-run stability tests | link |
megatron0.7.0版本长稳测试出现grad norm为nan link
