adapter-hub / adapters

A Unified Library for Parameter-Efficient and Modular Transfer Learning
Home Page: https://docs.adapterhub.ml
License: Apache License 2.0
I get this key error when loading an adapter (see code below).
Inspecting the adapter_config shows that it contains no mh_adapter key, but an MH_adapter key instead.
Steps to reproduce the behavior:
model = BertModelWithHeads.from_pretrained('bert-base-uncased', cache_dir=transformers_cache_dir)
model.load_adapter("sentiment/sst@example-org", cache_dir=transformers_cache_dir)
Traceback (most recent call last):
File "C:\Users\Gregor\AppData\Roaming\JetBrains\IdeaIC2020.1\plugins\python-ce\helpers\pydev\pydevd.py", line 1438, in _exec
pydev_imports.execfile(file, globals, locals) # execute the script
File "C:\Users\Gregor\AppData\Roaming\JetBrains\IdeaIC2020.1\plugins\python-ce\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "C:/Users/Gregor/Documents/Programming/efficient-adapters/evaluation/timing/measure_inference.py", line 59, in <module>
results = measure_inference_gpu(True, [10, 512], [1, 2, 128], cache_dir, repetitions=5)
File "C:/Users/Gregor/Documents/Programming/efficient-adapters/evaluation/timing/measure_inference.py", line 16, in measure_inference_gpu
model.load_adapter("sentiment/sst@example-org", cache_dir=transformers_cache_dir)
File "C:\Users\Gregor\Anaconda3\envs\adapter\lib\site-packages\transformers\adapter_model_mixin.py", line 748, in load_adapter
super().load_adapter(
File "C:\Users\Gregor\Anaconda3\envs\adapter\lib\site-packages\transformers\adapter_model_mixin.py", line 650, in load_adapter
load_dir, load_name = loader.load(adapter_name_or_path, config, version, model_name, load_as, **kwargs)
File "C:\Users\Gregor\Anaconda3\envs\adapter\lib\site-packages\transformers\adapter_model_mixin.py", line 375, in load
self.model.add_adapter(adapter_name, config["type"], config=config["config"])
File "C:\Users\Gregor\Anaconda3\envs\adapter\lib\site-packages\transformers\adapter_model_mixin.py", line 705, in add_adapter
self.base_model.add_adapter(adapter_name, adapter_type, config)
File "C:\Users\Gregor\Anaconda3\envs\adapter\lib\site-packages\transformers\adapter_bert.py", line 471, in add_adapter
self.encoder.add_adapter(adapter_name, adapter_type)
File "C:\Users\Gregor\Anaconda3\envs\adapter\lib\site-packages\transformers\adapter_bert.py", line 413, in add_adapter
layer.add_adapter(adapter_name, adapter_type)
File "C:\Users\Gregor\Anaconda3\envs\adapter\lib\site-packages\transformers\adapter_bert.py", line 389, in add_adapter
self.attention.output.add_adapter(adapter_name, adapter_type)
File "C:\Users\Gregor\Anaconda3\envs\adapter\lib\site-packages\transformers\adapter_bert.py", line 38, in add_adapter
if adapter_config and adapter_config["mh_adapter"]:
KeyError: 'mh_adapter'
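As a temporary workaround (a sketch on my side, not the library's fix; the local path is hypothetical and the nested "config" dict is assumed from the traceback above), the keys of the downloaded adapter_config.json could be lowercased before loading the adapter:

import json

# Hypothetical local copy of the downloaded adapter; normalize config keys
# such as "MH_adapter" to the "mh_adapter" spelling that add_adapter() expects.
path = "path/to/downloaded/adapter/adapter_config.json"
with open(path) as f:
    cfg = json.load(f)
cfg["config"] = {k.lower(): v for k, v in cfg["config"].items()}
with open(path, "w") as f:
    json.dump(cfg, f, indent=2)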
When training e.g. fusion with multiple heads, a warning is issued for each adapter that no prediction head was stored.
Not storing a head per adapter is intended here, so no warning should be issued.
Model I am using (Bert, XLNet ...):
any model
Language I am using the model on (English, Chinese ...):
any language
Adapter setup I am using (if any):
The problem arises when using:
The tasks I am working on are:
Steps to reproduce the behavior:
transformers version:

Add adapters to the BART model.
Hi
I am trying out this example colab:
https://colab.research.google.com/github/Adapter-Hub/website/blob/master/app/static/notebooks/Adapter_Quickstart_Training.ipynb#scrollTo=Lbwb3NRf8mBF
and am getting this error:
Traceback (most recent call last):
File "test.py", line 11, in <module>
from transformers import AutoTokenizer, EvalPrediction, GlueDataset, GlueDataTrainingArguments, AutoModelWithHeads, AdapterType
ImportError: cannot import name 'AutoModelWithHeads' from 'transformers' (/idiap/user/rkarimi/libs/anaconda3/envs/adapter/lib/python3.7/site-packages/transformers/__init__.py)
versions
(adapter) rkarimi@italix17:/idiap/user/rkarimi/dev/internship/seq2seq/adapter-transformers$ conda list | grep transformers
adapter-transformers 1.0.1 <pip>
transformers 3.5.1 <pip>
(adapter) rkarimi@italix17:/idiap/user/rkarimi/dev/internship/seq2seq/adapter-transformers$ conda list | grep pytorch
pytorch-lightning 1.0.4 <pip>
adapter-transformers from GitHub is installed.
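One likely cause (an assumption on my side): adapter-transformers installs itself under the transformers package name, so having the plain transformers 3.5.1 wheel installed alongside it can overwrite the adapter-enabled modules. A quick diagnostic:

import transformers

# If this prints 3.5.1 and a path belonging to the plain transformers install,
# the plain package has shadowed adapter-transformers, and AutoModelWithHeads
# is indeed missing from transformers/__init__.py.
print(transformers.__version__)
print(transformers.__file__)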
Old versions of the adapters initialized *adapter_attention* parameters which were never used but still stored.
I propose a two-stage fix:
hot fix: do not log the warning that the parameters were not instantiated
long term: remove the parameters from all adapters
Model I am using (Bert, XLNet ...): e.g. RoBERTa-Base
Language I am using the model on (English, Chinese ...): English
Adapter setup I am using (if any):
many, but e.g.:
model = AutoModel.from_pretrained("roberta-base")
model.load_adapter("comsense/csqa@ukp", "text_task", config="{'using': 'pfeiffer'}")
The problem arises when using:
The tasks I am working on are:
Steps to reproduce the behavior:
Expected behavior: no warning.
transformers version:

Hi,
I am using a custom model, so I had to read the library and implement the adapter layers myself. Here is how I did it:
I defined an adapter class like below:
"""Implements an Adapter block.
Code is adapted from:
https://github.com/Adapter-Hub/adapter-transformers/blob/master/src/transformers/adapter_modeling.py
"""
import torch.nn as nn

from .adapter_utils import Activations


class Adapter(nn.Module):
    """Implementation of a single Adapter block."""

    def __init__(self, input_size, config):
        super().__init__()
        self.input_size = input_size
        self.add_layer_norm_before = config.add_layer_norm_before
        self.add_layer_norm_after = config.add_layer_norm_after
        self.residual_before_layer_norm = config.residual_before_layer_norm

        # List of all modules of the adapter, passed into nn.Sequential().
        seq_list = []

        # If we want a layer norm on the input, add it to seq_list.
        if self.add_layer_norm_before:
            seq_list.append(nn.LayerNorm(self.input_size))

        # If no downsample size is passed, just halve the size of the original input.
        reduction_factor = config.reduction_factor if config.reduction_factor is not None else 2
        self.down_sample_size = self.input_size // reduction_factor
        seq_list.append(nn.Linear(self.input_size, self.down_sample_size))

        self.non_linearity = Activations(config.non_linearity.lower())
        seq_list.append(self.non_linearity)

        # Sequential adapter: first down-project, then non-linearity, then up-sample.
        # The residual connection is applied in the forward pass.
        self.adapter_down = nn.Sequential(*seq_list)

        # Up-projection back to the input size.
        self.adapter_up = nn.Linear(self.down_sample_size, self.input_size)

        # If we want a layer norm on the output, we apply it later, after a
        # separate residual connection. This means that we learn a new output
        # layer norm, which replaces the layer norm learned in the BERT layer.
        if self.add_layer_norm_after:
            self.adapter_norm_after = nn.LayerNorm(self.input_size)

    def forward(self, x):  # , residual_input):
        down = self.adapter_down(x)
        up = self.adapter_up(down)
        output = up
        # The residual can be added either before or after the post-layer-norm,
        # depending on residual_before_layer_norm; here the residual is applied
        # by the calling layer instead, so both branches stay commented out.
        # if self.residual_before_layer_norm:
        #     output = output + residual_input
        if self.add_layer_norm_after:
            output = self.adapter_norm_after(output)
        # if not self.residual_before_layer_norm:
        #     output = output + residual_input
        return output  # , down, up
Then I add them between the layers of my model:

class LayerFF(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.DenseReluDense = T5DenseReluDense(config)
        self.layer_norm = T5LayerNorm(config.d_model, eps=config.layer_norm_epsilon)
        self.dropout = nn.Dropout(config.dropout_rate)
        # TODO: remove it later.
        self.add_adapters = config.add_adapters
        if self.add_adapters:
            # TODO: should the adapter config be part of the model config, or do we want a separate config?
            adapter_config = AdapterConfig()
            self.adapter = Adapter(config.d_model, adapter_config)

    def forward(self, hidden_states):
        norm_x = self.layer_norm(hidden_states)
        y = self.DenseReluDense(norm_x)
        if self.add_adapters:
            y = self.adapter(y)
        layer_output = hidden_states + self.dropout(y)
        return layer_output
Then I freeze all model parameters by setting requires_grad=False and set only the adapter parameters to True. I see a large memory requirement, around the same as for the full model; could you assist me in case anything is missing? Thanks.
Additionally, I get accuracy close to that of an untrained model, i.e. very low; could you give me possible suggestions to improve the accuracy? Thank you.
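For reference, a minimal freezing sketch; note the attribute is requires_grad (a misspelled require_grad is silently accepted as a new attribute and changes nothing), and the "adapter" substring filter is an assumption about the module names. Also, even with correct freezing, forward-pass activation memory stays roughly that of the full model; only gradient and optimizer-state memory shrinks:

# Freeze everything, then re-enable only the adapter parameters.
for param in model.parameters():
    param.requires_grad = False
for name, param in model.named_parameters():
    if "adapter" in name:  # assumes adapter submodules carry "adapter" in their name
        param.requires_grad = True

# Sanity check: count the trainable parameters.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable params: {trainable}")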
Example code to produce the (super cool!) AdapterFusion inter-adapter attention plots in Figure 5 of the paper AdapterFusion: Non-Destructive Task Composition for Transfer Learning.
Checking out the attention scores in the AdapterFusion module for analysis is exciting, but I didn't find it easy to create these plots. The challenge was accessing the relevant tensors and then getting them into the right format. Hence my request for some help ;).
Creating square attention plots from the attention tensor saved in BertFusion.recent_attention
(https://github.com/Adapter-Hub/adapter-transformers/blob/master/src/transformers/adapter_modeling.py#L218). As far as I understand, this tensor is of shape [batch_size, seq_len, num_adapters], and when I average over the first two dimensions (mean(0).mean(0), which I will do for all batches in the prediction data) I get a tensor of num_adapters floats that sums to 1.
Should I understand this to be the attention displayed in the above figure? But how do I get something of shape [num_adapters, num_adapters]?
Accessing the stored attention tensors from the Bert encoder during the prediction forward passes. I have been trying to trace the BertFusion module through transformers.adapter_bert to understand where this module ends up in the Bert model, and thus how I can access it from the top down. My guess from https://github.com/Adapter-Hub/adapter-transformers/blob/master/src/transformers/adapter_bert.py#L80 would be that
model.encoder.adapter_fusion_layer[adapter_fusion_name]
should give me the BertFusion module, which in turn would allow access to recent_attention after a prediction forward pass. But that does not seem to work (if I understand correctly, because model.encoder has no attribute adapter_fusion_layer).
How should I do this?
All I have to contribute are the incomplete findings I shared above. But my guess is that the authors of AdapterFusion have some snippets lying around. I could turn those into an example snippet, in a notebook or something. Whatever you prefer!
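For context, here is roughly what I have so far (a sketch under my own assumptions, not working code from the authors): averaging the collected recent_attention tensors gives one row per fusion model, and my guess is that the square figure stacks one such row per target task.

import torch

def average_fusion_attention(recent_attentions):
    # recent_attentions: list of [batch_size, seq_len, num_adapters] tensors
    # collected over the prediction data; returns a [num_adapters] vector
    # that sums to 1.
    flat = torch.cat([a.reshape(-1, a.shape[-1]) for a in recent_attentions], dim=0)
    return flat.mean(dim=0)

# Stacking one averaged row per target task (one fusion model each) would give
# the square [num_adapters, num_adapters] matrix from Figure 5:
# rows = [average_fusion_attention(collect_for_task(t)) for t in tasks]  # collect_for_task is hypothetical
# matrix = torch.stack(rows)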
Thanks for this project. I have a query which has not been discussed in the docs: when we use the run_glue.py example from this repo, which type of adapter is added? Is it the AdapterFusion method by default?

Currently we need to manually pass is_training_adapter to the Trainer.
See: https://github.com/Adapter-Hub/adapter-transformers/blob/d24649cea108baa2f33c4f3ac9c040b88a43abc0/src/transformers/trainer.py#L180
I did not know about this option and wondered why my script never exported adapters. Further, this only has an effect on checkpointing, not on the training. So we might want to rename it. Or better: automatically determine whether to export adapters or the full model.
transformers is now at version 3.x, with cleaner data processing, improved stability, and multiple bug fixes.
This makes it much easier to create new adapters on custom datasets which are not managed automatically by the GLUE scripts. E.g. the __call__ API of AutoTokenizer reduces the separate tokenize, pad, encode, and attention-mask creation steps to a single API call.
If the developers can point out places, e.g. classes or function calls, which could act as a starting point for this upgrade, I'd be happy to start a PR. I might need a little help to warm up and get comfortable with the code flow here, though.
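For illustration, the kind of call I mean (the model name is just an example):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# One call replaces the separate tokenize / encode / pad / attention-mask steps:
batch = tokenizer(
    ["The first sentence.", "A second, somewhat longer sentence."],
    padding=True,
    truncation=True,
    return_tensors="pt",
)
# batch now holds input_ids and attention_mask (plus token_type_ids for BERT).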
BertEncoderAdaptersMixin.add_adapter() checks for "leave_out" with hasattr. This does not work if the config is a dict, because then leave_out is not an attribute.
adapter_config = resolve_adapter_config("pfeiffer")
adapter_config["leave_out"] = [0, 1]
#adapter_config = AdapterConfig.from_dict(adapter_config)
model.add_adapter(name, AdapterType.text_task, config=adapter_config)
If the third line stays commented out, the 0th and 1st layers will not be skipped in BertEncoderAdaptersMixin.add_adapter().
A dict config should work with "leave_out", especially since resolve_adapter_config returns a dict.
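A minimal sketch of the kind of check that would make both representations work (illustrative, not necessarily the right fix):

def get_leave_out(config):
    # Support both a plain dict (as returned by resolve_adapter_config)
    # and an AdapterConfig object with a leave_out attribute.
    if isinstance(config, dict):
        return config.get("leave_out", [])
    return getattr(config, "leave_out", [])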
Model I am using (Bert, XLNet ...):
mBERT
Language I am using the model on (English, Chinese ...):
Arabic, but the issue is language/dataset independent
Adapter setup I am using (if any):
Arabic lang adapter from adapterhub, new squad task adapter
The problem arises when using:
The tasks I am working on are:
Steps to reproduce the behavior:
Run the example with --load_lang_adapter [adapter name]: the script defines the flag as --load_language_adapter instead, while the setup_task_adapter_training function of adapter_training.py expects it to be --load_lang_adapter.
I should be able to define the language adapter with the --load_lang_adapter flag and its config with the --lang_adapter_config flag.
When using adapters to finetune my model, I would usually like to store the adapters, not the full model.
transformers version: 2.11.0

Hi,
I see that in some implementations the layer norms are set to requires_grad=True. Could you tell me whether all layer norms of the model need requires_grad=True, or only the ones inside the adapter layers?
Thanks.
Model I am using (Bert, XLNet ...): bert-base-uncased
Language I am using the model on (English, Chinese ...): English
Adapter setup I am using (if any): Default
The problem arises when using:
run_glue_wh.py
The tasks I am working on are:
Steps to reproduce the behavior:
The HF default Trainer wraps the model in DataParallel if training_args.n_gpu > 1. As a result, the wrapped model doesn't have the config attribute, and when I use your modified Trainer I get the error: 'DataParallel' object has no attribute 'config'.
It should not raise the above error.
I am using the master version of this repo. I had to use the CUDA_VISIBLE_DEVICES flag to restrict training to one GPU.
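A common workaround (a sketch of the general pattern, not necessarily how the Trainer should fix it) is to unwrap the DataParallel wrapper before accessing model attributes:

import torch.nn as nn

def unwrap_model(model):
    # DataParallel and DistributedDataParallel keep the real model in .module.
    wrappers = (nn.DataParallel, nn.parallel.DistributedDataParallel)
    return model.module if isinstance(model, wrappers) else model

config = unwrap_model(model).config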
Model I am using (Bert, XLNet ...): Bert-base
Language I am using the model on (English, Chinese ...): English
Adapter setup I am using (if any): AdapterFusion
The problem arises when using:
The tasks I am working on are:
Steps to reproduce the behavior:
model.load_adapter_fusion("qqp,snli")
I get the following error message:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/om304/anaconda3/lib/python3.7/site-packages/transformers/adapter_model_mixin.py", line 837, in load_adapter_fusion
load_dir, load_name = loader.load(adapter_fusion_name_or_path, load_as)
File "/home/om304/anaconda3/lib/python3.7/site-packages/transformers/adapter_model_mixin.py", line 485, in load
self.model.add_fusion(adapter_fusion_name, config["config"])
File "/home/om304/anaconda3/lib/python3.7/site-packages/transformers/adapter_model_mixin.py", line 716, in add_fusion
self.base_model.add_fusion_layer(adapter_names)
File "/home/om304/anaconda3/lib/python3.7/site-packages/transformers/adapter_bert.py", line 585, in add_fusion_layer
self.encoder.add_fusion_layer(adapter_names)
File "/home/om304/anaconda3/lib/python3.7/site-packages/transformers/adapter_bert.py", line 479, in add_fusion_layer
layer.add_fusion_layer(adapter_names)
File "/home/om304/anaconda3/lib/python3.7/site-packages/transformers/adapter_bert.py", line 461, in add_fusion_layer
self.attention.output.add_fusion_layer(adapter_names)
File "/home/om304/anaconda3/lib/python3.7/site-packages/transformers/adapter_bert.py", line 69, in add_fusion_layer
adapter_config = self.config.adapters.common_config(adapter_names)
File "/home/om304/anaconda3/lib/python3.7/site-packages/transformers/adapter_config.py", line 243, in common_config
adapter_config = AdapterConfig.from_dict(adapter_config)
File "/home/om304/anaconda3/lib/python3.7/site-packages/transformers/adapter_config.py", line 87, in from_dict
return cls(**config)
TypeError: ABCMeta object argument after ** must be a mapping, not NoneType
I would expect to be able to load the trained adapter-fusion from the directory to which it was saved.
transformers version: 2.11.0

Hi,
looking into adapter_bert, inside the adapter_stack_layer function you first call self.get_adapter_preparams; there, for the Houlsby adapter config, the residual is changed to hidden_states. After this function call, in line 178, both hidden_states and residual hold the same value (hidden_states) and are fed into adapter_layer(). Is feeding the same input twice into this layer the expected behavior? Thanks.
Merge this into the original transformers library.
This library is awesome, so thanks a lot! But it would be much more convenient to have this merged into the original transformers library. The Hugging Face team seems to be focused on adding lightweight options for their models, and adapters are huge time and memory savers for multitask use cases, so they would be a great addition to transformers.
You've already done the integration here, so it should be straightforward, but I'm happy to help. I've posted an issue on Hugging Face's end as well.
A description of how to manually pass the adapter composition to model.forward() is missing from the docs.
I am doing this right now and cannot remember the exact naming. I checked the docs and didn't find it; now I need to check the code.
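For reference, a sketch of the call I mean, assuming the adapter_names keyword that forward() accepts (the exact nesting for stacked or fused setups is precisely the part I cannot remember):

# Single task adapter (my assumption about the argument format):
outputs = model(input_ids, adapter_names=["sst-2"])
# For fusing several adapters, a nested list is supposedly used, e.g.:
outputs = model(input_ids, adapter_names=[["qqp", "snli"]])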
Hi guys,
Hope you are all well!
I was wondering if adapter-transformers can handle multi-label classification with 1560 labels.
More precisely, I would like to apply it to the paperswithcode dataset, where the labels are called tasks.
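For what it's worth, multi-label classification itself is straightforward in plain PyTorch (nothing adapter-specific assumed here): a linear head with 1560 outputs trained with BCEWithLogitsLoss, i.e. one independent sigmoid per label:

import torch
import torch.nn as nn

num_labels = 1560
head = nn.Linear(768, num_labels)   # 768 = hidden size of a BERT-base encoder
criterion = nn.BCEWithLogitsLoss()  # one independent sigmoid per label

pooled = torch.randn(8, 768)        # stand-in for the encoder's pooled output
targets = torch.randint(0, 2, (8, num_labels)).float()
loss = criterion(head(pooled), targets)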
Refs:
Thanks for any insights or inputs on that.
Cheers,
X
When checkpointing at more than one step, we seem to be storing the same adapter and head multiple times (in a loop).
I think I was able to zero in on the problem:
https://github.com/Adapter-Hub/adapter-transformers/blob/a994914cbb5290a633e0f3e1e6b7cfd7fb91ecbe/src/transformers/adapter_model_mixin.py#L729
custom_weights_loaders.append(PredictionHeadLoader(self, error_on_missing=False))
When storing the model we append the prediction head loader, so this list grows every time we save the model.
The same happens here:
https://github.com/Adapter-Hub/adapter-transformers/blob/a994914cbb5290a633e0f3e1e6b7cfd7fb91ecbe/src/transformers/adapter_model_mixin.py#L767
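A minimal sketch of a possible guard (illustrative; the proper fix might instead be to build the loader list locally per save call):

# Only append the head loader once, instead of on every save:
if not any(isinstance(w, PredictionHeadLoader) for w in custom_weights_loaders):
    custom_weights_loaders.append(PredictionHeadLoader(self, error_on_missing=False))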
Model I am using (Bert, XLNet ...):
mBERT
Language I am using the model on (English, Chinese ...):
English
Adapter setup I am using (if any):
SST-2 in Glue script
The problem arises when using:
The tasks I am working on are:
Steps to reproduce the behavior:
--save_steps 2 --logging_steps 2
{"eval_loss": 0.33147413358775846, "eval_acc": 0.8692660550458715, "epoch": 2.2802850356294537, "step": 4800}
07/08/2020 09:55:17 - INFO - transformers.trainer - Saving model checkpoint to data_models/glue_testing/checkpoint-4800
07/08/2020 09:55:17 - INFO - transformers.adapter_model_mixin - Configuration saved in data_models/glue_testing/checkpoint-4800/sst-2/adapter_config.json
07/08/2020 09:55:17 - INFO - transformers.adapter_model_mixin - Module weights saved in data_models/glue_testing/checkpoint-4800/sst-2/pytorch_adapter.bin
07/08/2020 09:55:17 - INFO - transformers.adapter_model_mixin - Configuration saved in data_models/glue_testing/checkpoint-4800/sst-2/head_config.json
07/08/2020 09:55:17 - INFO - transformers.adapter_model_mixin - Module weights saved in data_models/glue_testing/checkpoint-4800/sst-2/pytorch_model_head.bin
07/08/2020 09:55:17 - INFO - transformers.adapter_model_mixin - Configuration saved in data_models/glue_testing/checkpoint-4800/sst-2/head_config.json
07/08/2020 09:55:17 - INFO - transformers.adapter_model_mixin - Module weights saved in data_models/glue_testing/checkpoint-4800/sst-2/pytorch_model_head.bin
[... the same head_config.json / pytorch_model_head.bin save messages repeat over a hundred more times for this single checkpoint ...]
07/08/2020 09:55:18 - INFO - transformers.adapter_model_mixin - Configuration saved in data_models/glue_testing/checkpoint-4800/sst-2/head_config.json
07/08/2020 09:55:18 - INFO - transformers.adapter_model_mixin - Module weights saved in data_models/glue_testing/checkpoint-4800/sst-2/pytorch_model_head.bin
07/08/2020 09:55:18 - INFO - transformers.adapter_model_mixin - Configuration saved in data_models/glue_testing/checkpoint-4800/sst-2/head_config.json
07/08/2020 09:55:18 - INFO - transformers.adapter_model_mixin - Module weights saved in data_models/glue_testing/checkpoint-4800/sst-2/pytorch_model_head.bin
07/08/2020 09:55:18 - INFO - transformers.adapter_model_mixin - Configuration saved in data_models/glue_testing/checkpoint-4800/sst-2/head_config.json
07/08/2020 09:55:18 - INFO - transformers.adapter_model_mixin - Module weights saved in data_models/glue_testing/checkpoint-4800/sst-2/pytorch_model_head.bin
07/08/2020 09:55:18 - INFO - transformers.adapter_model_mixin - Configuration saved in data_models/glue_testing/checkpoint-4800/sst-2/head_config.json
07/08/2020 09:55:18 - INFO - transformers.adapter_model_mixin - Module weights saved in data_models/glue_testing/checkpoint-4800/sst-2/pytorch_model_head.bin
07/08/2020 09:55:18 - INFO - transformers.adapter_model_mixin - Configuration saved in data_models/glue_testing/checkpoint-4800/sst-2/head_config.json
07/08/2020 09:55:18 - INFO - transformers.adapter_model_mixin - Module weights saved in data_models/glue_testing/checkpoint-4800/sst-2/pytorch_model_head.bin
07/08/2020 09:55:18 - INFO - transformers.adapter_model_mixin - Configuration saved in data_models/glue_testing/checkpoint-4800/sst-2/head_config.json
07/08/2020 09:55:18 - INFO - transformers.adapter_model_mixin - Module weights saved in data_models/glue_testing/checkpoint-4800/sst-2/pytorch_model_head.bin
07/08/2020 09:55:18 - INFO - transformers.adapter_model_mixin - Configuration saved in data_models/glue_testing/checkpoint-4800/sst-2/head_config.json
07/08/2020 09:55:18 - INFO - transformers.adapter_model_mixin - Module weights saved in data_models/glue_testing/checkpoint-4800/sst-2/pytorch_model_head.bin
07/08/2020 09:55:18 - INFO - transformers.adapter_model_mixin - Configuration saved in data_models/glue_testing/checkpoint-4800/sst-2/head_config.json
07/08/2020 09:55:18 - INFO - transformers.adapter_model_mixin - Module weights saved in data_models/glue_testing/checkpoint-4800/sst-2/pytorch_model_head.bin
07/08/2020 09:55:18 - INFO - transformers.adapter_model_mixin - Configuration saved in data_models/glue_testing/checkpoint-4800/sst-2/head_config.json
07/08/2020 09:55:18 - INFO - transformers.adapter_model_mixin - Module weights saved in data_models/glue_testing/checkpoint-4800/sst-2/pytorch_model_head.bin
07/08/2020 09:55:18 - INFO - transformers.adapter_model_mixin - Configuration saved in data_models/glue_testing/checkpoint-4800/sst-2/head_config.json
07/08/2020 09:55:18 - INFO - transformers.adapter_model_mixin - Module weights saved in data_models/glue_testing/checkpoint-4800/sst-2/pytorch_model_head.bin
07/08/2020 09:55:18 - INFO - transformers.adapter_model_mixin - Configuration saved in data_models/glue_testing/checkpoint-4800/en/adapter_config.json
07/08/2020 09:55:19 - INFO - transformers.adapter_model_mixin - Module weights saved in data_models/glue_testing/checkpoint-4800/en/pytorch_adapter.bin
07/08/2020 09:55:19 - INFO - transformers.adapter_model_mixin - Configuration saved in data_models/glue_testing/checkpoint-4800/en/head_config.json
07/08/2020 09:55:19 - INFO - transformers.adapter_model_mixin - Module weights saved in data_models/glue_testing/checkpoint-4800/en/pytorch_model_head.bin
[... the same two log lines repeat well over a hundred times for the en head ...]
The adapter and head should only be stored once per checkpoint.
transformers version: latest
Model I am using (Bert, XLNet ...): RoBERTa
Language I am using the model on (English, Chinese ...): English
Adapter setup I am using (if any): Pfeiffer
The problem arises when using:
The task I am working on is:
Steps to reproduce the behavior:
from transformers import AutoConfig, RobertaModelWithHeads

# save the fine-tuned model (including adapters and heads), then reload it
model.save_pretrained(output_dir)
config = AutoConfig.from_pretrained(load_path)
model = RobertaModelWithHeads.from_pretrained(load_path, config=config)
The config returned by the AutoConfig call shows the head:
"prediction_heads": {
"sts-b": {
"activation_function": "tanh",
"head_type": "classification",
"layers": 2,
"num_labels": 1
}
}
The config of the RobertaModelWithHeads removes the head from the config:
"prediction_heads": {},
Apparently the weights of the head are also not loaded:
>>> [(k,v) for (k,v) in model.named_parameters() if 'bert' not in k]
[]
However, they are in the pickled checkpoint file:
>>> import torch
>>> x = torch.load(in_dir + '/pytorch_model.bin')
>>> [k for k in x.keys() if 'bert' not in k]
['heads.sts-b.1.weight', 'heads.sts-b.1.bias', 'heads.sts-b.4.weight', 'heads.sts-b.4.bias']
The model should also load the prediction head from the checkpoint.
transformers version: adapter-transformers pre-release version
Hi,
Could you point me to how I can use adapters with seq2seq models in the Hugging Face repo? Thanks.
Hi
I have added adapter layers to my custom model and am currently getting very low performance. Could you tell me whether adapter layers need special pretraining, and how I can do this pretraining? I am defining them from scratch, freezing the model, and then training the adapters. Thanks.
Hi
I read the description here https://docs.adapterhub.ml/prediction_heads.html
I am not sure why add_classification_head is needed. Could you give more details on why one needs to introduce it and how it works? Thanks.
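For what it's worth, my understanding (a sketch under assumed names, not an authoritative answer): the adapter only modifies the transformer body and outputs hidden states, while add_classification_head attaches the task-specific output layer that maps those hidden states to labels:

from transformers import AdapterType, BertModelWithHeads

model = BertModelWithHeads.from_pretrained("bert-base-uncased")
# The adapter adjusts the encoder's hidden representations for the task ...
model.add_adapter("sst-2", AdapterType.text_task)
# ... while the head turns the final hidden state into class logits.
model.add_classification_head("sst-2", num_labels=2)
model.train_adapter(["sst-2"])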
What is the suggested adapter configuration for transformers with pre-layer normalization? That is, where should layer normalization be kept within the adapters?
Thanks
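For reference, the bottleneck configs expose flags for layer-norm placement; whether the defaults suit pre-LN blocks is exactly the open question here, but a sketch of overriding them (assuming the ln_before/ln_after fields of the config dataclasses) looks like:

from transformers import AdapterType, BertModel, HoulsbyConfig

model = BertModel.from_pretrained("bert-base-uncased")
# Place the adapter-internal LayerNorm before the bottleneck instead of after it.
config = HoulsbyConfig(ln_before=True, ln_after=False)
model.add_adapter("my_task", AdapterType.text_task, config.__dict__)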
Hi
could you clarify the number of training iterations? I could not find it in the paper
thanks
Hi
I see that part of the adapter-layer computation is currently implemented inside get_adapter_preparams (see https://github.com/Adapter-Hub/adapter-transformers/blob/master/src/transformers/adapter_bert.py, line 107). This is confusing; in my view it would be best to put all of the computation in one place, namely inside the Adapter class (https://github.com/Adapter-Hub/adapter-transformers/blob/master/src/transformers/adapter_modeling.py, line 43), to make the method easier to follow.
thanks.
Best
Rabeeh
Model I am using (Bert, XLNet ...): XLM-RoBERTa-base
Language I am using the model on (English, Chinese ...): Korean
Adapter setup I am using (if any):
The problem arises when using:
The task I am working on is:
What I'm doing:
Sorry that I'm not familiar with the adapter-transformers codebase.
Here are some questions about the AdapterFusion framework.
transformers version: 1.0.1
Who can help:
@LysandreJik @patrickvonplaten
Model I am using: Bert
Language I am using the model on: English
Adapter setup I am using (if any): HoulsbyConfig
The problem arises when using:
My own modified scripts:
I want to use adapters for a project of mine, which will require fine-tuning BERT multiple times. In order to get an understanding of how much speedup I shall get from using adapters, I profiled the various steps in the training loop of BERT, both with and without the use of adapters.
The task I am working on is:
Stanford Natural Language Inference (SNLI)
Steps to reproduce the behavior:
The following function is executed for a period of 4 hours on identical GPUs (via an LSF batch system), once with UseAdapter set to True and once with it set to False. The path contains a preloaded and tokenized version of the SNLI training set (as well as the test and dev sets, dropped here via underscores).
from pickle import load
from time import time

import torch
from torch.utils.data import DataLoader, RandomSampler, TensorDataset
from transformers import (
    AdamW,
    AdapterType,
    BertForSequenceClassification,
    HoulsbyConfig,
    get_linear_schedule_with_warmup,
)

EPOCHS = 15

def load_and_train(path, UseAdapter):
    # pickled SNLI splits: inputs, labels, attention masks, token type ids
    # (test and dev sets are dropped via underscores)
    x_train, y_train, a_train, t_train, _, _, _, _, _, _, _, _ = load(open(path, "rb"))
    train_inst = torch.tensor(x_train)
    train_att = torch.tensor(a_train)
    train_types = torch.tensor(t_train)
    train_targ = torch.tensor(y_train)
    train_data = TensorDataset(train_inst, train_att, train_types, train_targ)
    train_sampler = RandomSampler(train_data)
    train_dataloader = DataLoader(train_data, sampler=train_sampler, batch_size=32)
    model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=3)
    if UseAdapter:
        model.add_adapter("SNLI", AdapterType.text_task, HoulsbyConfig().__dict__)
        model.train_adapter(["SNLI"])  # freeze everything except the adapter
        model.set_active_adapters(["SNLI"])
    model.cuda()
    optimizer = AdamW(model.parameters(), lr=1e-4)
    scheduler = get_linear_schedule_with_warmup(optimizer, 0, len(train_dataloader) * EPOCHS)
    iter = 0
    time_load = 0  # moving batches to the GPU
    time_cler = 0  # zeroing gradients
    time_forw = 0  # forward pass
    time_back = 0  # backward pass
    time_updt = 0  # optimizer/scheduler step
    for e in range(EPOCHS):
        model.train()
        for batch in train_dataloader:
            last = time()
            x = batch[0].cuda()
            a = batch[1].cuda()
            t = batch[2].cuda()
            y = batch[3].cuda()
            time_load += time() - last
            last = time()
            model.zero_grad()
            time_cler += time() - last
            last = time()
            outputs = model(x, token_type_ids=t, attention_mask=a, labels=y)
            time_forw += time() - last
            last = time()
            loss = outputs[0]
            loss.backward()
            time_back += time() - last
            last = time()
            optimizer.step()
            scheduler.step()
            time_updt += time() - last
            iter += 1
    print(time_load, time_cler, time_forw, time_back, time_updt)
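For completeness, a sketch of the two profiling runs (the pickle filename here is assumed, not from the original report):

load_and_train("snli_tokenized.pkl", UseAdapter=False)  # full fine-tuning baseline
load_and_train("snli_tokenized.pkl", UseAdapter=True)   # adapter-only training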
My expectations were that:
- time_load is identical for both cases
- time_cler is slightly lower with adapters due to the presence of fewer gradients
- time_forw is slightly higher with adapters due to the extra layers that are introduced
- time_back is significantly lower with adapters since it needs to save fewer gradients
- time_updt is lower with adapters due to having fewer parameters to update

Overall times (seconds):
Adapter | Load Time | Clear Time | Forward Prop | Backward Prop | Update | Total | No of Batches |
---|---|---|---|---|---|---|---|
No | 9.141064644 | 349.405822 | 873.8870151 | 11770.82554 | 1159.772 | 14163.03 | 69022 |
Yes | 2721.683394 | 394.4980106 | 1652.686945 | 3192.402303 | 6304.335 | 14265.61 | 95981 |
Per Batch Times (seconds):
Adapter | Load Time | Clear Time | Forward Prop | Backward Prop | Update |
---|---|---|---|---|---|
No | 0.000132437 | 0.005062238 | 0.012660992 | 0.1705373 | 0.016803 |
Yes | 0.028356481 | 0.004110168 | 0.017218897 | 0.033260774 | 0.065683 |
As is evident from above, the expectations for load time and update time are not satisfied in this output: both are considerably higher per batch with adapters.
Note that similar observations were made in 2 reruns of the experiment.
It is unclear to me if there is an explanation I am missing or if this is an implementation issue.
Adapter saving raises an exception because PfeifferConfig and HoulsbyConfig are not JSON serializable. The code assumes the configuration is either a string or a dict, but not a dataclass.
See https://colab.research.google.com/drive/1ql343s22txh8q63w_Dfk25JoIj7pGdJB?usp=sharing
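For anyone hitting this before a proper fix lands: a minimal sketch of a workaround, assuming the config classes are Python dataclasses as the error suggests, is to convert them to plain dicts before they reach the JSON encoder.

from dataclasses import asdict, is_dataclass
import json

from transformers import PfeifferConfig  # adapter-transformers fork

def to_serializable(config):
    # Dataclass configs become plain dicts; strings and dicts pass through unchanged.
    return asdict(config) if is_dataclass(config) else config

print(json.dumps(to_serializable(PfeifferConfig())))  # no longer raises TypeError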
Even though all the layers are loaded into the model, the adapter weights are not applied at all.
Is there a version issue, or did I miss something?
model.load_adapter("/home/test/siqa_default", "text_task",config=PfeifferConfig(), with_head=False)
model.load_adapter("/home/test/a", "text_task",config=PfeifferConfig(), with_head=False)
model.load_adapter("/home/test/b", "text_task",config=PfeifferConfig(), with_head=False)
model.load_adapter("/home/test/c", "text_task",config=PfeifferConfig(), with_head=False)
model.load_adapter("/home/test/d", "text_task",config=PfeifferConfig(), with_head=False)
adapter_names = [
[
"siqa_default",
"a",
"b",
"c",
"d"
]
]
# pre-trained fusion_path
fusion_path = ="/home/test/fusion/siqa_defaut,a,b,c,d"
model.load_adapter_fusion(fusion_path)
# test_dataset
test_dataset = (
MultipleChoiceDataset(
data_dir=data_args.data_dir,
tokenizer=tokenizer,
task=task_type,
max_seq_length=data_args.max_seq_length,
overwrite_cache=data_args.overwrite_cache,
mode=Split.test,
)
)
def compute_metrics(p: EvalPrediction) -> Dict:
    preds = np.argmax(p.predictions, axis=1)
    return {"acc": simple_accuracy(preds, p.label_ids)}

# Initialize our Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    compute_metrics=compute_metrics,
    adapter_names=adapter_names,
)
Hi! I have many language adapters, each trained with masked language modeling on different (English) datasets.
I want to be able to load 1 BERT model, load each of the adapters, and then decide which adapter to use on any given forward pass. This would save on memory and loading time, as opposed to loading a separate BERT for each adapter. This seems possible -- here's what I'm doing:
model_name = 'bert-base-cased'
model = BertForMaskedLM.from_pretrained(model_name)
model.load_adapter(ADAPTER1)
model.load_adapter(ADAPTER2)
# In practice, I have many more adapters
output1 = model(input_ids, token_type_ids, attention_mask, adapter_names=[ADAPTER1])
output2 = model(input_ids, token_type_ids, attention_mask, adapter_names=[ADAPTER2])
However, I am getting different outputs than when using one model per adapter. For example, output1 above is different from output1 below, and output2 above is different from output2 below.
model_name = 'bert-base-cased'
model1 = BertForMaskedLM.from_pretrained(model_name)
model1.load_adapter(ADAPTER1)
output1 = model1(input_ids, token_type_ids, attention_mask, adapter_names=[ADAPTER1])
model2 = BertForMaskedLM.from_pretrained(model_name)
model2.load_adapter(ADAPTER2)
output2 = model2(input_ids, token_type_ids, attention_mask, adapter_names=[ADAPTER2])
I'm trying to read https://github.com/Adapter-Hub/adapter-transformers/blob/master/src/transformers/adapter_model_mixin.py#L339 to see why this would be the case. Is this expected behavior? Am I doing something wrong?
Thanks!
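For what it's worth, a small sketch to make the comparison above concrete (model, model1, input_ids etc. as in the snippets above); switching to eval mode removes dropout randomness, which is one common source of such differences:

import torch

# Dropout makes forward passes stochastic, so switch both models to eval mode first.
model.eval()
model1.eval()

with torch.no_grad():
    out_shared = model(input_ids, token_type_ids=token_type_ids,
                       attention_mask=attention_mask, adapter_names=[ADAPTER1])
    out_single = model1(input_ids, token_type_ids=token_type_ids,
                        attention_mask=attention_mask, adapter_names=[ADAPTER1])

# Compare the prediction scores of the two setups
print(torch.allclose(out_shared[0], out_single[0], atol=1e-6))

As a side note, the snippets above pass token_type_ids and attention_mask positionally, while BertForMaskedLM.forward expects attention_mask before token_type_ids, so keyword arguments are safer; this affects both setups equally, though, so it does not by itself explain the difference.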
Hi,
I built my own sentiment analysis adapter, but I found that the inference results differ even though the same text was input.
Is it necessary to freeze the layers during inference, or is there something wrong with my training configuration?
Training code (similar to: https://colab.research.google.com/github/Adapter-Hub/website/blob/master/app/static/notebooks/Adapter_Quickstart_Training.ipynb#scrollTo=M6vjtq3NHtxS)
model_name = "cl-tohoku/bert-base-japanese-whole-word-masking"
# BERT model for Japanese
data_args = GlueDataTrainingArguments(task_name="sst-2", data_dir="./glue_data/yahoo_movie_reviews/")
# yahoo_movie_reviews is a dataset very similar with sst-2 but in Japanese
training_args = TrainingArguments(
    logging_steps=1000,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    save_steps=1000,
    evaluate_during_training=True,
    output_dir="./models/yahoo_movie_reviews",
    overwrite_output_dir=True,
    do_train=True,
    do_eval=True,
    do_predict=True,
    learning_rate=0.0001,
    num_train_epochs=10,
)
set_seed(training_args.seed)
num_labels = glue_tasks_num_labels[data_args.task_name]
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelWithHeads.from_pretrained(model_name)
model.add_adapter("sst-2", AdapterType.text_task)
model.train_adapter(["sst-2"])
model.add_classification_head("sst-2", num_labels=num_labels)
model.set_active_adapters([["sst-2"]])
train_dataset = GlueDataset(data_args, tokenizer=tokenizer)
eval_dataset = GlueDataset(data_args, tokenizer=tokenizer, mode="dev")
def compute_metrics(p: EvalPrediction):
    preds = np.argmax(p.predictions, axis=1)
    return glue_compute_metrics(data_args.task_name, preds, p.label_ids)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics,
)
trainer.train()
trainer.evaluate()
The inference code is almost the same as https://colab.research.google.com/github/Adapter-Hub/website/blob/master/app/static/notebooks/Adapter_Quickstart_Inference.ipynb#scrollTo=2xwdA1sz7eZO
Thanks!
Lai
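For reference, one thing worth ruling out here (a guess, not a confirmed diagnosis): if the model is not switched to evaluation mode, dropout makes repeated forward passes on the same text differ. A minimal sketch, reusing the model and tokenizer from the training code above; the example sentence is illustrative:

import torch

model.eval()  # disable dropout; otherwise repeated forward passes can legitimately differ
model.set_active_adapters([["sst-2"]])

inputs = tokenizer.encode_plus("この映画はとても面白かった。", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs)[0]
print(logits.argmax(dim=-1))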
Model I am using (Bert, XLNet ...):
bert-base-multilingual-cased
Language I am using the model on (English, Chinese ...):
Finnish
Adapter setup I am using (if any):
Fi language adapter (fine-tuned the pre-trained one from AdapterHub) & NER task adapter (newly initialized and fine-tuned)
The problem arises when using:
Using run_ner.py
The tasks I am working on is:
NER on the FiNER dataset
Steps to reproduce the behavior:
- After training, a pytorch_model_head.bin is stored for each adapter.
- Run the run_ner.py script again, with do_train=False and do_evaluate=False in order to predict only, with load_lang_adapter pointing to the dir where the fine-tuned language adapter is stored and load_task_adapter pointing to where the fine-tuned task adapter is stored.
- The prediction head is loaded from the language adapter's pytorch_model_head.bin, whereas the task adapter's pytorch_model_head.bin is not used:
07/20/2020 14:50:17 - INFO - transformers.adapter_model_mixin - Loading module configuration from best_model/ner/adapter_config.json
07/20/2020 14:50:17 - INFO - transformers.adapter_config - Adding adapter 'ner' of type 'text_task'.
07/20/2020 14:50:17 - INFO - transformers.adapter_model_mixin - Loading module weights from best_model/ner/pytorch_adapter.bin
07/20/2020 14:50:17 - INFO - transformers.adapter_model_mixin - Loading module configuration from best_model/fi/adapter_config.json
07/20/2020 14:50:17 - INFO - transformers.adapter_config - Adding adapter 'fi' of type 'text_lang'.
07/20/2020 14:50:17 - INFO - transformers.adapter_model_mixin - Loading module weights from best_model/fi/pytorch_adapter.bin
07/20/2020 14:50:17 - INFO - transformers.adapter_model_mixin - Loading module configuration from best_model/fi/head_config.json
07/20/2020 14:50:17 - INFO - transformers.adapter_model_mixin - Loading module weights from best_model/fi/pytorch_model_head.bin
My intuition here is that the pytorch_model_head.bin in both the fi lang adapter's and the ner task adapter's directory should be the same, since I used both during the training process, but it's unclear to me if that's the case. Since this is an NER script, I would also expect the head to be loaded from the NER task adapter directory. If I wanted to use a different language adapter for Finnish now, I would have to copy the pytorch_model_head.bin from the old language adapter's directory to the new language adapter's directory, because the script by default tries to load it from there. If it loaded it from the NER adapter's directory instead, this would not be an issue. I'm not aware of any additional flags I could set to change this. It may not really be a bug, but it definitely caused some confusion on my side.
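Until this is clarified, a possible workaround (an untested sketch, using the directory names from the log above) is to suppress the head when loading the language adapter and take it from the task adapter's directory instead:

# Load the language adapter without its stored prediction head ...
model.load_adapter("best_model/fi", "text_lang", with_head=False)
# ... and load the head from the task adapter's directory instead.
model.load_adapter("best_model/ner", "text_task", with_head=True)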
transformers version: 2.11.0
Hi, I tried to run the Quickstart Tutorial, so I'm using that exact code. When loading the adapter with model.load_adapter('sst'), I'm getting the error mentioned below. Using "sst-2" instead of "sst" returns another warning.
Model I am using (Bert, XLNet ...):
Language I am using the model on (English, Chinese ...):
Adapter setup I am using (if any):
The problem arises when using:
The tasks I am working on is:
Steps to reproduce the behavior:
Using
# load pre-trained task adapter from Adapter Hub
# with with_head=True given, we also load a pre-trained classification head for this task
model.load_adapter('sst', config='pfeiffer', with_head=True)
# activate the adapter we just loaded, so that it is used in every forward pass
model.set_active_adapters('sst')
Returns
raise EnvironmentError("No adapter with name '{}' was found in the adapter index.".format(specifier))
OSError: No adapter with name 'sst' was found in the adapter index.
Using:
# load pre-trained task adapter from Adapter Hub
# with with_head=True given, we also load a pre-trained classification head for this task
model.load_adapter('sst-2', config='pfeiffer', with_head=True)
# activate the adapter we just loaded, so that it is used in every forward pass
model.set_active_adapters('sst-2')
Returns
INFO:transformers.adapter_bert:No prediction head for task_name 'sst-2' available.
WARNING:transformers.adapter_bert:No prediction head is used.
transformers version: 2.11.0
Hi,
I really like this project, and I am wondering whether adapter-transformers could support the DistilBERT model,
since I got the following errors when I trained an adapter for my own DistilBERT:
model_name = "bandainamco-mirai/distilbert-base-japanese"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelWithHeads.from_pretrained(model_name)
model.add_adapter("yahoo_movie_reviews", AdapterType.text_task)
model.train_adapter(["yahoo_movie_reviews"])
Then I got:
ValueError: Unrecognized configuration class for this kind of AutoModel: AutoModelWithHeads.
Model type should be one of XLMRobertaConfig, RobertaConfig, BertConfig.
model = AutoModelForSequenceClassification.from_pretrained("bandainamco-mirai/distilbert-base-japanese")
tokenizer = AutoTokenizer.from_pretrained("bandainamco-mirai/distilbert-base-japanese")
model.load_adapter("adapter/yahoo_movie_reviews")
Then I got:
AttributeError: 'DistilBertForSequenceClassification' object has no attribute 'load_adapter'
I wonder if it is possible to train an adapter for DistilBERT by changing part of the existing code.
Thanks for any insights on that.
Add code examples for zero-shot cross-lingual transfer, e.g. for a classification problem.
From the description in the Adapter documentation, I understand that adapters can perform zero-shot cross-lingual transfer. I want to build a classification model that is trained on English and tested on another language like Chinese or Spanish, so could you add an example of using adapters for this setting?
I have already gone through the available examples, but it is still not clear to me how to do this; please let me know if I missed something.
Thank you so much
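Until an official example is added, here is a rough sketch of the usual MAD-X-style recipe: train a task adapter with the source-language adapter active, then swap in the target-language adapter at test time. The hub identifiers, adapter names, and stacking syntax below are assumptions based on the documentation, not tested code.

from transformers import AutoModelWithHeads, AdapterType

model = AutoModelWithHeads.from_pretrained("bert-base-multilingual-cased")

# Language adapters: English for training, Chinese for zero-shot testing.
# The hub identifiers are assumptions; check the hub for exact names.
model.load_adapter("en/wiki@ukp", AdapterType.text_lang, load_as="en")
model.load_adapter("zh/wiki@ukp", AdapterType.text_lang, load_as="zh")

# Task adapter plus head, trained on the English data only.
model.add_adapter("sentiment", AdapterType.text_task)
model.add_classification_head("sentiment", num_labels=2)
model.train_adapter(["sentiment"])

# Training: stack the English language adapter under the task adapter.
model.set_active_adapters([["en"], ["sentiment"]])
# ... run the usual Trainer loop on the English dataset here ...

# Zero-shot evaluation: swap in the Chinese language adapter, keep the task adapter.
model.set_active_adapters([["zh"], ["sentiment"]])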
Hi!
I'm trying to use adapters for a task where the input is a set of documents, each containing multiple sentences (hence multiple cls tokens). The goal is to assign a binary label to each sentence of each document. In the case of fine-tuning, I would get the output of a pretrained BertModel, grab the cls embeddings, and feed them through a simple classifier. But with adapters, I'm not sure how I can access the output of the base language model, process it, and feed it to a classification head. Any pointers would be greatly appreciated!
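One possible direction (a sketch, assuming the plain BertModel accepts adapters just like the *WithHeads classes, and that input_ids, attention_mask, and the [CLS] positions are precomputed): run the adapter-equipped base model, gather the per-sentence cls embeddings yourself, and feed them to your own classifier module.

import torch
import torch.nn as nn
from transformers import BertModel, AdapterType

bert = BertModel.from_pretrained("bert-base-uncased")
bert.add_adapter("sent_labels", AdapterType.text_task)
bert.train_adapter(["sent_labels"])        # freeze BERT, train only the adapter
bert.set_active_adapters(["sent_labels"])

classifier = nn.Linear(bert.config.hidden_size, 2)  # one binary decision per sentence

# input_ids / attention_mask: one document whose sentences are each prefixed
# with [CLS]; cls_positions holds the indices of those tokens (all precomputed).
hidden_states = bert(input_ids, attention_mask=attention_mask)[0]  # (1, seq_len, hidden)
cls_embeddings = hidden_states[0, cls_positions]                   # (n_sentences, hidden)
logits = classifier(cls_embeddings)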
Hi,
So AllenNLP has already wrapped many transformer classes from the original transformers library, and these use many of the same names as adapter-transformers (since the latter was forked from the former).
Given the historical baggage associated with the 'Master' branch, I propose moving to the main branch.
https://www.hanselman.com/blog/easily-rename-your-git-default-branch-from-master-to-main
Hinglish: a Romanized version of Hindi, immensely popular in India, where Hindi is spoken by millions of people but quite often typed in Roman script.
Dataset: SemEval 2020 Task 9 Sentiment Analysis: 3 classes, +ve, -ve and neutral.
I have added a HinglishDataset class and other skeleton code -- I'd appreciate a review if I got something wrong. If all is well in the code above, I'd like to continue along and contribute an adapter for Hinglish under the Sentiment task.
Hi there,
I need to reproduce the AdapterFusion paper's results in Table 1, and I have a couple of questions:
Thank you.
Hi,
I'm trying to train an adapter and I'm getting the following error:
RuntimeError: Tensor for 'out' is on CPU, Tensor for argument #1 'self' is on CPU, but expected them to be on GPU (while checking arguments for addmm)
To reproduce this, you can just run any of the colabs provided in the tutorials. For example, when running the cells in this one, the training line (trainer.train()) throws this error.
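A guess at a workaround while this is investigated: the message suggests some weights are still on the CPU, which can happen when adapters are added after the model has been moved. A sketch that moves the model only once everything has been added (device handling is the only point here):

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Add adapters and heads first, then move everything in one go so the freshly
# initialized adapter weights end up on the same device as the base model.
model.add_adapter("sst-2", AdapterType.text_task)
model.train_adapter(["sst-2"])
model.to(device)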
Hi, I just wanted to train an adapter with the token classification example (using CoNLL-2003 NER dataset). I'm using the following json-based configuration:
{
    "data_dir": "./data_en",
    "labels": "./data_en/labels.txt",
    "model_name_or_path": "bert-large-cased",
    "output_dir": "conll2003-en-1",
    "max_seq_length": 128,
    "num_train_epochs": 3,
    "per_device_train_batch_size": 32,
    "save_steps": 750,
    "seed": 1,
    "do_train": true,
    "do_eval": true,
    "do_predict": true,
    "fp16": true,
    "train_adapter": true,
    "adapter_config": "pfeiffer",
    "language": "en"
}
and run it with python3 run_ner.py <config>.json. Then the following error message is thrown:
Traceback (most recent call last):
File "run_ner.py", line 323, in <module>
main()
File "run_ner.py", line 248, in main
model_path=model_args.model_name_or_path if os.path.isdir(model_args.model_name_or_path) else None
File "/mnt/adapter-transformers/src/transformers/trainer.py", line 484, in train
tr_loss += self._training_step(model, inputs, optimizer)
File "/mnt/adapter-transformers/src/transformers/trainer.py", line 592, in _training_step
outputs = model(**inputs, adapter_names=self.adapter_names)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/mnt/adapter-transformers/src/transformers/modeling_bert.py", line 1463, in forward
adapter_names=adapter_names,
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/mnt/adapter-transformers/src/transformers/modeling_bert.py", line 780, in forward
adapter_names=adapter_names,
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/mnt/adapter-transformers/src/transformers/modeling_bert.py", line 437, in forward
adapter_names=adapter_names,
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/mnt/adapter-transformers/src/transformers/modeling_bert.py", line 403, in forward
layer_output = self.output(intermediate_output, attention_output, attention_mask, adapter_names=adapter_names)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/mnt/adapter-transformers/src/transformers/modeling_bert.py", line 368, in forward
hidden_states = self.adapters_forward(hidden_states, input_tensor, attention_mask, adapter_names)
File "/mnt/adapter-transformers/src/transformers/adapter_bert.py", line 443, in adapters_forward
adapter_stack=adapter_stack,
File "/mnt/adapter-transformers/src/transformers/adapter_bert.py", line 374, in adapter_stack_layer
hidden_states, query, residual = self.get_adapter_preparams(adapter_config, hidden_states, input_tensor)
File "/mnt/adapter-transformers/src/transformers/adapter_bert.py", line 320, in get_adapter_preparams
if adapter_config["residual_before_ln"]:
TypeError: 'NoneType' object is not subscriptable
Do I need to provide additional options?
Many thanks in advance,
Stefan
transformers version: latest from master (ee2adad)
Docker image: nvidia/cuda:10.2-cudnn7-devel, with fp16
Model I am using (Bert, XLNet ...): bert-base-uncased
Language I am using the model on (English, Chinese ...): EN
Adapter setup I am using (if any):
The problem arises when using:
The tasks I am working on is:
Steps to reproduce the behavior:
self.bert.load_adapter("sts/qqp@ukp", "text_task", config=PfeifferConfig())
self.bert.load_adapter("nli/rte@ukp", "text_task", config=PfeifferConfig())
self.bert.load_adapter("nli/qnli@ukp", "text_task", config=PfeifferConfig())
self.bert.load_adapter("nli/multinli@ukp", "text_task", config=PfeifferConfig())
self.bert.load_adapter("lingaccept/cola@ukp", "text_task", config=PfeifferConfig())
The AdapterFusion version of QQP works, but it does not work for cola and multinli.
Can load adapters.
Similar to #58, it would be nice to have support for GPT-2.
The adapter fusion config is not JSON serializable. This can result in crashes when trying to save a model with fusion. See a minimal example below.
https://colab.research.google.com/drive/1YRwxVPe3-2QnatpG4fN_aZH00GoBMaqx?usp=sharing
AdapterFusionConfig should be JSON serializable.
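Until the config is fixed, a defensive one-liner may help; this assumes the fusion config object (fusion_config below is hypothetical) is a dataclass, as the error suggests:

import json
from dataclasses import asdict, is_dataclass

# A `default` hook lets json.dumps cope with dataclass configs it would otherwise reject.
serialized = json.dumps(fusion_config,
                        default=lambda o: asdict(o) if is_dataclass(o) else str(o))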
The function will add an adapter to the encoder by default. However, according to the doc, we could add an adapter through the add_adapter() method. Is the initialization function redundant here?
Add our internal implementation of AdapterDrop to the AdapterHub repository.
https://arxiv.org/abs/2010.11918
This consists of several smaller features, which I'll add later to this feature request.
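For context, the core AdapterDrop idea, i.e. not inserting adapters into the first few transformer layers, can be sketched with the leave_out field of the adapter config, assuming the installed version already exposes it (this is an assumption, not a confirmed API):

from transformers import AdapterType, PfeifferConfig

# Hypothetical: skip adapters in the first five transformer layers, as in
# AdapterDrop; assumes a version whose AdapterConfig exposes `leave_out`.
config = PfeifferConfig(leave_out=[0, 1, 2, 3, 4])
model.add_adapter("sst-2", AdapterType.text_task, config=config.__dict__)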
When I download an adapter, I would like to have something like
adapter.get_labels()
adapter.get_labels_dict()
so that I do not need to find the right labels in the README, which also reduces copying errors.
Maybe also
adapter.get_autoclass()
to know whether it is token classification or sequence classification.
I want adapters to be self-explanatory.
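A sketch of what such a helper could look like today, reading the stored head config by hand; the file name matches the head_config.json seen in the logs earlier, but the label2id key is an assumption to verify against an actual adapter download:

import json
import os

def get_labels_dict(adapter_dir):
    # Hypothetical helper: read the label map stored alongside a downloaded adapter.
    with open(os.path.join(adapter_dir, "head_config.json")) as f:
        head_config = json.load(f)
    # The exact key is an assumption; inspect the file to confirm.
    return head_config.get("label2id", {})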