When will support for converting Megatron models to Hugging Face models with pipeline parallelism enabled be available for Qwen1.5 models? (pai-megatron-patch, closed)

alibaba commented on August 28, 2024

When will support for converting Megatron models to Hugging Face models with pipeline parallelism enabled be available for Qwen1.5 models?

from pai-megatron-patch.

Comments (5)

jerryli1981 commented on August 28, 2024

Please pull the latest code and try again: https://github.com/alibaba/Pai-Megatron-Patch

kaiwang13 commented on August 28, 2024

> Please pull the latest code and try again: https://github.com/alibaba/Pai-Megatron-Patch

hf2mcore still does not seem to support pp>1.

raise ValueError('not support pp convert')

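jerryli1981 commented on August 28, 2024

The hf2mcore_1.5_v1.py version actually does support it. The difference between v2 and v1 is that the v2 conversion code is easier to maintain and includes an accuracy check.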

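kaiwang13 commented on August 28, 2024

> The hf2mcore_1.5_v1.py version actually does support it. The difference between v2 and v1 is that the v2 conversion code is easier to maintain and includes an accuracy check.

OK, thanks. I'll try v1 then.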

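kaiwang13 commented on August 28, 2024

> The hf2mcore_1.5_v1.py version actually does support it. The difference between v2 and v1 is that the v2 conversion code is easier to maintain and includes an accuracy check.

@jerryli1981 hf2mcore_1.5_v1.py does not seem to support PP>1 for models that also use GQA, such as Qwen1.5-32B. It fails here: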

Traceback (most recent call last):
  File "/Pai-Megatron-Patch/toolkits/model_checkpoints_convertor/qwen/hf2mcore_1.5_v1.py", line 546, in <module>
    main()
  File "/Pai-Megatron-Patch/toolkits/model_checkpoints_convertor/qwen/hf2mcore_1.5_v1.py", line 542, in main
    convert_checkpoint_from_transformers_to_megatron(args)
  File "/Pai-Megatron-Patch/toolkits/model_checkpoints_convertor/qwen/hf2mcore_1.5_v1.py", line 448, in convert_checkpoint_from_transformers_to_megatron
    params = transformers_to_megatron_fix_query_key_value_ordering(
  File "/Pai-Megatron-Patch/toolkits/model_checkpoints_convertor/qwen/hf2mcore_1.5_v1.py", line 242, in transformers_to_megatron_fix_query_key_value_ordering
    param = param.view(*current_shape)
RuntimeError: shape '[3, 40, 128, 5120]' is invalid for input of size 36700160
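
The sizes in that error can be checked by hand. The sketch below is not taken from hf2mcore_1.5_v1.py. It assumes a Qwen1.5-32B attention config of hidden size 5120, 40 query heads and 8 key/value heads (the KV head count is not stated in this thread, but it is consistent with the reported input size), and the helper names are hypothetical. It suggests the view `[3, 40, 128, 5120]` assumes Q, K and V each have 40 heads, while a GQA checkpoint holds only 40 + 2 * 8 heads in the fused QKV weight.

```python
# Assumed Qwen1.5-32B attention config (illustrative, not stated in the thread).
HIDDEN_SIZE = 5120
NUM_HEADS = 40                        # query heads
NUM_KV_HEADS = 8                      # assumption: GQA with 8 key/value heads
HEAD_DIM = HIDDEN_SIZE // NUM_HEADS   # 128


def mha_qkv_numel(num_heads, head_dim, hidden_size):
    """Element count implied by a [3, num_heads, head_dim, hidden_size] view."""
    return 3 * num_heads * head_dim * hidden_size


def gqa_qkv_numel(num_heads, num_kv_heads, head_dim, hidden_size):
    """Element count of a fused QKV weight when K and V use fewer heads than Q."""
    return (num_heads + 2 * num_kv_heads) * head_dim * hidden_size


def gqa_grouped_shape(num_heads, num_kv_heads, head_dim, hidden_size):
    """One possible GQA-aware shape: per KV group, its query heads plus one K and one V head."""
    q_per_kv = num_heads // num_kv_heads
    return (num_kv_heads, q_per_kv + 2, head_dim, hidden_size)


# What the failing view expects versus what the tensor actually holds.
print(mha_qkv_numel(NUM_HEADS, HEAD_DIM, HIDDEN_SIZE))                # 78643200
print(gqa_qkv_numel(NUM_HEADS, NUM_KV_HEADS, HEAD_DIM, HIDDEN_SIZE))  # 36700160
print(gqa_grouped_shape(NUM_HEADS, NUM_KV_HEADS, HEAD_DIM, HIDDEN_SIZE))  # (8, 7, 128, 5120)
```

The second number matches the "input of size 36700160" in the traceback. Whether the grouped shape is the right target depends on how the script concatenates Q, K and V before reordering, so treat it as a starting point rather than a fix.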
