I noticed that the paper says:
We train our models to work with both PSM (Prefix-Suffix-Middle) and SPM (Suffix-Prefix-Middle) modes, with relevant formatting control tokens,
Have you noticed any difference in accuracy between PSM and SPM completion? I have seen PSM used before, but I like the idea of SPM. Since the model would be generating the middle as a direct continuation of the prefix, I wonder if it would produce higher quality completions than PSM. If there is no difference, that would be cool to know as well.
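For anyone else following along, here's a rough sketch of how the two orderings differ at the prompt level. The sentinel token names and exact layout below are just illustrative placeholders (not necessarily what this model's tokenizer actually uses), but they show why in SPM the generated middle follows straight on from the end of the prefix:

```python
def build_fim_prompt(prefix: str, suffix: str, mode: str = "PSM") -> str:
    """Assemble a fill-in-the-middle prompt in PSM or SPM order.

    NOTE: the sentinel tokens here are placeholders for illustration only;
    real models define their own control tokens and exact ordering.
    """
    PRE, SUF, MID = "<|fim_prefix|>", "<|fim_suffix|>", "<|fim_middle|>"

    if mode == "PSM":
        # Prefix-Suffix-Middle: model sees prefix, then suffix,
        # then generates the middle after the middle sentinel.
        return f"{PRE}{prefix}{SUF}{suffix}{MID}"
    if mode == "SPM":
        # Suffix-Prefix-Middle: suffix comes first, prefix last,
        # so the middle is generated as a direct continuation of the prefix.
        return f"{SUF}{suffix}{PRE}{prefix}{MID}"
    raise ValueError(f"unknown FIM mode: {mode}")


prefix = "def add(a, b):\n    "
suffix = "\n    return result\n"
print(build_fim_prompt(prefix, suffix, mode="SPM"))
```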
I'm still waiting on llama.cpp support to fully materialize before I can try out these models, but they look really nice!