
Attention implementation cannot work together with config in AutoModel [transformers, CLOSED]

hiyouga commented on June 2, 2024


Comments (2)

hiyouga commented on June 2, 2024

Given the logic below, we cannot force the model to use eager attention when a config is passed to from_pretrained, because config._attn_implementation falls back to "eager" whenever config._attn_implementation_internal is None [1]. With attn_implementation="eager", the condition config._attn_implementation != kwarg_attn_imp is therefore always false, so config._attn_implementation_internal is never set and stays None. A value of None is later treated as "no preference", and the model ends up with SDPA attention [2].

transformers/src/transformers/modeling_utils.py

Lines 3138 to 3150 in e4ea19b

    # In case one passes a config to `from_pretrained` + "attn_implementation"
    # override the `_attn_implementation` attribute to `attn_implementation` of the kwargs
    # Please see: https://github.com/huggingface/transformers/issues/28038

    # Overwrite `config._attn_implementation` by the one from the kwargs --> in auto-factory
    # we pop attn_implementation from the kwargs but this handles the case where users
    # passes manually the config to `from_pretrained`.
    config = copy.deepcopy(config)

    kwarg_attn_imp = kwargs.pop("attn_implementation", None)
    if kwarg_attn_imp is not None and config._attn_implementation != kwarg_attn_imp:
        config._attn_implementation = kwarg_attn_imp
    model_kwargs = kwargs

I think the check should compare against the raw internal value instead, i.e. config._attn_implementation_internal != kwarg_attn_imp.
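To make the difference concrete, here is a minimal sketch that does not import transformers. ToyConfig, apply_kwarg_current and apply_kwarg_proposed are hypothetical names invented for this illustration; ToyConfig only copies the property and setter quoted in [1], and the two functions copy the check from the snippet above with and without the suggested change.

```python
import copy


class ToyConfig:
    """Hypothetical stand-in for the config; it only mirrors the property in [1]."""

    def __init__(self):
        # Nothing has been requested yet, as for a config passed in by the user.
        self._attn_implementation_internal = None

    @property
    def _attn_implementation(self):
        # Falls back to "eager" when the internal value is None.
        if self._attn_implementation_internal is None:
            return "eager"
        return self._attn_implementation_internal

    @_attn_implementation.setter
    def _attn_implementation(self, value):
        self._attn_implementation_internal = value


def apply_kwarg_current(config, kwarg_attn_imp):
    """Current check: compares against the property, including its fallback."""
    config = copy.deepcopy(config)
    if kwarg_attn_imp is not None and config._attn_implementation != kwarg_attn_imp:
        config._attn_implementation = kwarg_attn_imp
    return config


def apply_kwarg_proposed(config, kwarg_attn_imp):
    """Proposed check: compares against the raw internal value."""
    config = copy.deepcopy(config)
    if kwarg_attn_imp is not None and config._attn_implementation_internal != kwarg_attn_imp:
        config._attn_implementation = kwarg_attn_imp
    return config


current = apply_kwarg_current(ToyConfig(), "eager")
proposed = apply_kwarg_proposed(ToyConfig(), "eager")
print(current._attn_implementation_internal)   # None: the "eager" request was dropped
print(proposed._attn_implementation_internal)  # eager
```

With the current check the internal value is still None after an explicit "eager" request, which is what lets the later dispatch pick SDPA. Other values such as "sdpa" are unaffected, since they already differ from the "eager" fallback.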

[1] transformers/src/transformers/configuration_utils.py

Lines 406 to 420 in e4ea19b

    @property
    def _attn_implementation(self):
        # This property is made private for now (as it cannot be changed and a PreTrainedModel.use_attn_implementation method needs to be implemented.)
        if hasattr(self, "_attn_implementation_internal"):
            if self._attn_implementation_internal is None:
                # `config.attn_implementation` should never be None, for backward compatibility.
                return "eager"
            else:
                return self._attn_implementation_internal
        else:
            return "eager"

    @_attn_implementation.setter
    def _attn_implementation(self, value):
        self._attn_implementation_internal = value

[2] transformers/src/transformers/modeling_utils.py

Lines 1461 to 1466 in e4ea19b

    elif requested_attn_implementation in [None, "sdpa"] and not is_torch_xla_available():
        # use_flash_attention_2 takes priority over SDPA, hence SDPA treated in this elif.
        config = cls._check_and_enable_sdpa(
            config,
            hard_check_only=False if requested_attn_implementation is None else True,
        )
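The effect of the branch in [2] can be sketched with a toy resolver. This is not the library's code: resolve_attention and its sdpa_available flag are hypothetical, and the sketch assumes that the requested implementation is read from config._attn_implementation_internal and that _check_and_enable_sdpa switches the config to SDPA whenever SDPA is usable.

```python
def resolve_attention(requested, sdpa_available=True):
    """Toy version of the dispatch in [2]; a simplified assumption, not the real logic."""
    if requested in [None, "sdpa"]:
        if sdpa_available:
            # None means "no preference", so SDPA is picked when it can be used.
            return "sdpa"
        if requested == "sdpa":
            # Mirrors hard_check_only=True: an explicit request must be satisfiable.
            raise ValueError("SDPA was requested but is not available")
        return "eager"
    # Any other explicit request is passed through unchanged in this sketch.
    return requested


print(resolve_attention(None))     # sdpa
print(resolve_attention("eager"))  # eager
```

Under these assumptions, an "eager" request only survives if it actually reaches the internal value. When the check in from_pretrained leaves the internal value at None, the resolver sees None and selects SDPA.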


amyeroberts commented on June 2, 2024

cc @fxmarty

