
Comments (5)

leejamesss commented on June 11, 2024

> #4688 might solve your issue, can you try that out?

Thank you for your prompt attention to the issue. On closer inspection, however, the problem we are facing appears distinct from the one addressed in #4688: that issue concerns an extra repetition of the Beginning of Sequence (BOS) token, which is mainly relevant to API interactions, whereas our scenario involves offline inference and shows no such pattern of BOS repetition.
Therefore, the solution or workaround suggested in #4688 may not be directly applicable to the problem we are encountering with the `prompt_logprobs` output and the proliferation of unexpected special tokens.
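
For context, here is a minimal sketch of the kind of offline setup involved (the model name and prompt are placeholders, and the exact shape of the logprob entries may vary across vLLM versions):

```python
# Minimal offline-inference sketch; model name and prompt are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct")
params = SamplingParams(max_tokens=1, prompt_logprobs=1)

output = llm.generate(["Hello, world!"], params)[0]
# prompt_logprobs is a list aligned with the prompt tokens;
# the first entry is None (there is no logprob for the first token).
for step in output.prompt_logprobs:
    print(step)
```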

I appreciate your effort to provide a potential fix, and I will continue to explore alternative solutions to resolve the unexpected token issue we are experiencing. Should there be any further insights or suggestions you can offer, they would be most welcome.

Thank you once again for your assistance with this matter.


DreamGenX commented on June 11, 2024

@DarkLight1337 This sounds related to #4577 -- something between 0.4.0.post1 and 0.4.1 changed the way tokenization works. For whatever reason, I am getting back a sequence of partial tokens like `<`, `<|`, `<|im_`, etc. instead of the whole `<|im_start|>` at once.
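
As a quick sanity check on the tokenizer side (a hedged sketch, with a placeholder model name), one can verify whether `<|im_start|>` is encoded as a single registered special token or split into fragments:

```python
# Diagnostic sketch: does the tokenizer treat <|im_start|> as one token?
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("your-org/your-chatml-model")  # placeholder
ids = tok.encode("<|im_start|>", add_special_tokens=False)
print(ids)                             # a single id if registered as special
print(tok.convert_ids_to_tokens(ids))  # one token string vs. fragments
```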


DarkLight1337 commented on June 11, 2024

#4688 might solve your issue, can you try that out?


DarkLight1337 commented on June 11, 2024

I'm currently investigating a similar issue in #4200. It seems that there is something wrong with the detokenizing logic, where `new_decoded_token_text` gets pre-padded with extra whitespace characters:

```python
(new_tokens, new_decoded_token_text, prefix_offset,
```

@Yard1 @njhill do you have any idea about this?

**Edit:** It seems that my particular issue is related to the chat template. @DreamGenX's issue would be more relevant to this case.
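
For readers unfamiliar with these offsets, the general incremental-detokenization pattern looks roughly like the sketch below (a simplification for illustration, not vLLM's actual implementation): decode a sliding window of ids and emit new text only once it no longer ends in an incomplete character.

```python
# Simplified sketch of incremental detokenization (illustrative only).
def detokenize_step(tokenizer, all_ids, prefix_offset, read_offset):
    # Text of the previously confirmed window vs. the extended window.
    prefix_text = tokenizer.decode(all_ids[prefix_offset:read_offset])
    full_text = tokenizer.decode(all_ids[prefix_offset:])
    if len(full_text) > len(prefix_text) and not full_text.endswith("\ufffd"):
        # The suffix is stable: emit it and advance both offsets.
        return full_text[len(prefix_text):], read_offset, len(all_ids)
    # Otherwise hold off until more ids arrive.
    return "", prefix_offset, read_offset
```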


leejamesss commented on June 11, 2024

Hello @DarkLight1337, @DreamGenX, and everyone involved in this discussion,

Thank you for your ongoing investigation into the tokenization and detokenization logic within the vLLM project. I understand that there may be related issues, such as #4200 and #4577, which are being looked into.

However, for the issue at hand, which is #4772, I would like to clarify that our primary concern is not with the detokenization process or the formatting of the output. We are not using the detokenized text and, therefore, any irregularities in the detokenization are not relevant to our use case.

Our focus is on the correctness of the `prompt_logprobs` output. We are encountering a significant number of unexpected special tokens in the log-probability dictionaries, which causes errors in our downstream processing. These tokens are not expected and interfere with the intended functionality of the model.

To reiterate, we need assistance in ensuring that the `prompt_logprobs` output is accurate and free from these unexpected special tokens for both the Llama3 and Llama2-13b-chat-hf models.
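
For completeness, a possible interim workaround on our side (purely a sketch; it assumes `output` is a vLLM `RequestOutput` and `tokenizer` is the matching Hugging Face tokenizer) would be to filter special-token ids out of each logprob dictionary before downstream processing:

```python
# Workaround sketch: drop special-token ids from prompt_logprobs entries.
special_ids = set(tokenizer.all_special_ids)
cleaned = [
    None if step is None
    else {tid: lp for tid, lp in step.items() if tid not in special_ids}
    for step in output.prompt_logprobs
]
```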

If there are any updates, insights, or suggestions on how to address this specific issue with the log probabilities, we would greatly appreciate the guidance.

Thank you for your attention to this matter, and we look forward to a resolution.

Best regards,
@leejamesss

