LCQMC.zip + albert_xlarge_zh_183k.zip ValueError: Shape of variable bert/embeddings/LayerNorm/beta:0 ((312,)) doesn't match with shape of tensor bert/embeddings/LayerNorm/beta ([2048]) from checkpoint reader. (albert_zh issue, open)

brightmart commented on May 11, 2024
LCQMC.zip + albert_xlarge_zh_183k.zip ValueError: Shape of variable bert/embeddings/LayerNorm/beta:0 ((312,)) doesn't match with shape of tensor bert/embeddings/LayerNorm/beta ([2048]) from checkpoint reader.

from albert_zh.

Comments (4)

brightmart commented on May 11, 2024

312 is a dimension from the tiny config file.
If you use xlarge, the config file name has to change as well. Look in the albert_config folder: there is a config file for xlarge.
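
A minimal sketch of how one might read this kind of error: it pulls the two shapes out of the message, so the size built from the config (the graph variable) can be compared with the size stored in the checkpoint. The name `parse_shape_mismatch` is hypothetical, and the regular expression assumes the message format shown in this issue; other TensorFlow versions may word it differently.

```python
import re

# Assumption: the message looks like the one quoted in this issue.
_PATTERN = re.compile(
    r"Shape of variable (?P<name>\S+) \(\((?P<graph>[\d, ]*)\)\) "
    r"doesn't match with shape of tensor \S+ \(\[(?P<ckpt>[\d, ]*)\]\)"
)


def _to_tuple(text):
    """Turn '312,' or '2048' into a tuple of ints."""
    return tuple(int(part) for part in text.split(",") if part.strip())


def parse_shape_mismatch(message):
    """Return the variable name and both shapes, or None if nothing matches."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    return {
        "variable": match.group("name"),
        # Shape created from the config file passed to the script.
        "graph_shape": _to_tuple(match.group("graph")),
        # Shape actually stored in the checkpoint.
        "checkpoint_shape": _to_tuple(match.group("ckpt")),
    }


message = (
    "ValueError: Shape of variable bert/embeddings/LayerNorm/beta:0 ((312,)) "
    "doesn't match with shape of tensor bert/embeddings/LayerNorm/beta ([2048]) "
    "from checkpoint reader."
)
info = parse_shape_mismatch(message)
print(info)
```

Here the graph side is 312 (the tiny config) and the checkpoint side is 2048 (the xlarge checkpoint), which is the mismatch described above.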

easywaytodo commented on May 11, 2024

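OK, thanks.
One more question: does ALBERT have the same GPU memory requirements as BERT? Have you compared them?

When I run BERT and ALBERT, GPU memory consumption is about the same, and with the same parameters I still get OOM.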

parkourcx commented on May 11, 2024

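312 is a dimension from the tiny config file.
If you use xlarge, the config file name has to change as well. Look in the albert_config folder: there is a config file for xlarge.
@brightmart
I ran into a similar problem. The error message is:
ValueError: Shape of variable bert/pooler/dense/bias:0 ((128,)) doesn't match with shape of tensor bert/pooler/dense/bias ([768]) from checkpoint reader.
I pre-trained a Chinese ALBERT model myself and reused the config file from pre-training unchanged. My config file contains: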
"attention_probs_dropout_prob": 0.1,
"directionality": "bidi",
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 128,
"initializer_range": 0.02,
"intermediate_size": 3072,
"max_position_embeddings": 512,
"num_attention_heads": 8,
"num_hidden_layers": 12,
"pooler_fc_size": 768,
"pooler_num_attention_heads": 12,
"pooler_num_fc_layers": 3,
"pooler_size_per_head": 128,
"pooler_type": "first_token_transform",
"type_vocab_size": 2,
"vocab_size": 20974,
"embedding_size": 128,
"ln_type":"postln"
How should I change it?
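
One way to catch this before fine-tuning, sketched under assumptions: compare `hidden_size` from the config with the shape of a checkpoint variable whose length should equal it. The sketch assumes, as in the original BERT code, that `bert/pooler/dense/bias` has length `hidden_size`. `check_hidden_size` is a hypothetical helper, and the checkpoint shapes are passed in as a plain dict (with TensorFlow they could be listed with `tf.train.list_variables`).

```python
import json


def check_hidden_size(config, checkpoint_shapes, probe="bert/pooler/dense/bias"):
    """Return None if config and checkpoint agree, else a description of the mismatch."""
    # Assumption: the probe variable is a vector of length hidden_size.
    expected = (config["hidden_size"],)
    actual = tuple(checkpoint_shapes[probe])
    if actual == expected:
        return None
    return (
        f"config hidden_size={config['hidden_size']} "
        f"but checkpoint {probe} has shape {list(actual)}"
    )


# Values taken from the config and the error message in this comment.
config = json.loads(
    '{"hidden_size": 128, "embedding_size": 128, "num_attention_heads": 8}'
)
checkpoint_shapes = {"bert/pooler/dense/bias": [768]}

problem = check_hidden_size(config, checkpoint_shapes)
print(problem)
```

With these values the check reports a mismatch: the checkpoint being loaded stores a pooler bias of length 768, while the config asks for 128. Whether the config or the checkpoint path is the wrong one cannot be determined from this thread.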

TingNLP commented on May 11, 2024

@parkourcx Did you manage to solve this?
