
Comments (15)

jameswex commented on August 25, 2024

To run LIT in a notebook, create your dataset and model classes, then create a LitWidget object with those objects and call render() on it. An example can be seen here: https://colab.sandbox.google.com/github/PAIR-code/lit/blob/main/lit_nlp/examples/notebooks/LIT_sentiment_classifier.ipynb (in Colab, but the code would be the same in Jupyter).
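
For reference, a minimal sketch of that flow, assuming MyDataset and MyModel are your own lit_nlp Dataset and Model subclasses (the names here are placeholders):

from lit_nlp import notebook

# Dicts map display names to instances; LIT can serve several of each.
datasets = {"my_data": MyDataset()}
models = {"my_model": MyModel()}

# LitWidget serves LIT inside the notebook instead of as a standalone server.
widget = notebook.LitWidget(models, datasets, height=800)
widget.render()  # renders the LIT UI inline in the output cell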

If you want to see gradient-based salience methods in the LIT UI, then your model will need to have the appropriate inputs and outputs to support them. See https://github.com/PAIR-code/lit/wiki/components.md#token-based-salience for details on having your model support the different salience methods.
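
As a rough sketch only (field names and exact types may differ by LIT version; check the components doc above), a classifier's output_spec would need to expose tokens plus per-token gradients, something like:

from lit_nlp.api import types as lit_types

def output_spec(self):
    return {
        "probas": lit_types.MulticlassPreds(parent="label", vocab=["0", "1"]),
        "tokens": lit_types.Tokens(),
        # Gradient-norm salience needs gradients aligned to the tokens field.
        "token_grads": lit_types.TokenGradients(align="tokens"),
        # grad-dot-input salience additionally needs the input embeddings.
        "input_embs": lit_types.TokenEmbeddings(align="tokens"),
    }

And predict_minibatch has to actually compute those gradients, so the forward pass can't run under torch.no_grad().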

jameswex commented on August 25, 2024

To return values from predict_minibatch, you need to convert that tensor([0.6403, 0.3597]) into a raw array of just [0.6403, 0.3597], as opposed to a tensor.
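
For example, a one-line sketch (assuming out is the usual transformers model output):

# .tolist() yields plain Python floats, which serialize to JSON cleanly.
batched_outputs["probas"] = torch.nn.functional.softmax(out.logits, dim=-1).tolist()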

pratikchhapolika commented on August 25, 2024

> To return values from predict_minibatch, you need to convert that tensor([0.6403, 0.3597]) into a raw array of just [0.6403, 0.3597], as opposed to a tensor.

In which line of code?

jameswex commented on August 25, 2024

Not sure, you should check all your entries in batched_output to be sure they are normal python lists and not tensors. It might be the 'probas' entry that is the issue here.

pratikchhapolika commented on August 25, 2024

> Not sure, you should check all your entries in batched_output to be sure they are normal python lists and not tensors. It might be the 'probas' entry that is the issue here.

Updated the code, but now I am getting this warning, not an error.

pratikchhapolika commented on August 25, 2024

@jameswex how can I launch the app inside the Jupyter notebook itself instead of as a web page? How can I modify the above code to do it?

pratikchhapolika commented on August 25, 2024

@jameswex my second question is: how do I get gradient visualization in the salience maps with the above code?

pratikchhapolika commented on August 25, 2024

When I switch to the PCA visualization, it gives TypeError: (-0.7481077572209469+0j) is not JSON serializable.

pratikchhapolika commented on August 25, 2024

> To run LIT in a notebook, create your dataset and model classes, then create a LitWidget object with those objects and call render() on it. An example can be seen here: https://colab.sandbox.google.com/github/PAIR-code/lit/blob/main/lit_nlp/examples/notebooks/LIT_sentiment_classifier.ipynb (in Colab, but the code would be the same in Jupyter).

> If you want to see gradient-based salience methods in the LIT UI, then your model will need to have the appropriate inputs and outputs to support them. See https://github.com/PAIR-code/lit/wiki/components.md#token-based-salience for details on having your model support the different salience methods.

OK. 

Also, how do I overcome TypeError: (-0.7481077572209469+0j) is not JSON serializable?

jameswex commented on August 25, 2024

The model and dataset code shouldn't change for notebooks. It's just that you create a LitWidget with the model and datasets, instead of a Server. Then you call render on the widget object.

I'm not sure about the root cause of that specific error. It's most likely that your predict_minibatch fn is returning some value for one of its fields for each example that isn't a basic, JSON-serializable type.
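
To make the contrast concrete, a sketch (assuming the same models/datasets dicts a dev_server script would use):

# Standalone web app:
#   server = lit_nlp.dev_server.Server(models, datasets, port=5432)
#   server.serve()
# In a notebook, the equivalent is:
from lit_nlp import notebook
widget = notebook.LitWidget(models, datasets)
widget.render()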

pratikchhapolika commented on August 25, 2024

> The model and dataset code shouldn't change for notebooks. It's just that you create a LitWidget with the model and datasets, instead of a Server. Then you call render on the widget object.
>
> I'm not sure about the root cause of that specific error. It's most likely that your predict_minibatch fn is returning some value for one of its fields for each example that isn't a basic, JSON-serializable type.

Converted everything to lists. Still the same error.

def predict_minibatch(self, inputs):
    # Preprocess to ids and masks, and make the input batch.
    encoded_input = self.tokenizer.batch_encode_plus(
        [ex["sentence"] for ex in inputs],
        return_tensors="pt",
        add_special_tokens=True,
        max_length=512,
        padding="longest",
        truncation="longest_first")

    # Check and send to CUDA (GPU) if available.
    if torch.cuda.is_available():
        self.model.cuda()
        for tensor in encoded_input:
            encoded_input[tensor] = encoded_input[tensor].cuda()

    # Run a forward pass. Assumes the model was loaded with
    # output_attentions=True and output_hidden_states=True.
    with torch.no_grad():  # remove this if you need gradients
        out: transformers.modeling_outputs.SequenceClassifierOutput = self.model(**encoded_input)
        unused_attentions = out.attentions

    # Post-process outputs: convert tensors to plain lists where possible.
    batched_outputs = {
        "probas": torch.nn.functional.softmax(out.logits, dim=-1).tolist(),
        "input_ids": encoded_input["input_ids"],  # NOTE: still a torch.Tensor here
        "ntok": torch.sum(encoded_input["attention_mask"], dim=1).tolist(),
        "cls_emb": out.hidden_states[-1][:, 0].tolist(),  # last layer, first ([CLS]) token
    }

    # Attention maps as NumPy arrays; .cpu() is needed if the model ran on GPU.
    for i in range(len(unused_attentions)):
        batched_outputs[f"layer_{i:d}_attention"] = unused_attentions[i].detach().cpu().numpy()

    # Plain shallow copy; any remaining tensors (input_ids) stay tensors.
    detached_outputs = {k: v for k, v in batched_outputs.items()}

    # Unbatch outputs so we get one record per input example.
    for output in utils.unbatch_preds(detached_outputs):
        ntok = output.pop("ntok")
        output["tokens"] = self.tokenizer.convert_ids_to_tokens(
            output.pop("input_ids")[1:ntok - 1])
        yield output

iftenney commented on August 25, 2024

Can you print the contents of batched_outputs, including types?

The error above:

TypeError: (-0.7481077572209469+0j) is not JSON serializable.

Looks like the value is a complex number a+bj, which is probably why it's not able to be serialized. NumPy arrays of floats should be fine, though; they'll be automatically converted to lists here: https://github.com/PAIR-code/lit/blob/main/lit_nlp/lib/serialize.py#L32
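
If one of your returned fields really is complex-valued, a minimal workaround sketch (assuming the offending field is, say, "cls_emb"; adapt the key to whatever your audit turns up):

import numpy as np

v = np.asarray(batched_outputs["cls_emb"])
if np.iscomplexobj(v):
    v = v.real  # keep only the real part so JSON serialization succeeds
batched_outputs["cls_emb"] = v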

pratikchhapolika commented on August 25, 2024

> Can you print the contents of batched_outputs, including types?
>
> The error above:
>
> TypeError: (-0.7481077572209469+0j) is not JSON serializable.
>
> Looks like the value is a complex number a+bj, which is probably why it's not able to be serialized. NumPy arrays of floats should be fine, though; they'll be automatically converted to lists here: https://github.com/PAIR-code/lit/blob/main/lit_nlp/lib/serialize.py#L32

@iftenney here are the outputs.

batched_outputs after the attention loop:

for i in range(len(unused_attentions)):
    batched_outputs[f"layer_{i:d}_attention"] = unused_attentions[i].detach().numpy()

{'probas': [[0.6018652319908142, 0.3981347680091858], [0.5785479545593262, 0.42145204544067383], [0.6183280348777771, 0.3816719651222229], [0.6127758026123047, 0.3872241675853729]],
 'input_ids': tensor([[  101,  2079,  2017,  2031,  1037, 19085,  2030,  1037, 12436, 20876,  1029,   102,     0,  ...,     0],
        [  101,  2026,  5980,  2097,  5091,  2022,  3407,  2085,   999,  ...,  3742,  1057,  1029,   102,     0,  ...,     0],
        [  101,  1045,  2031,  1037, 19085,  1012,  1045,  2215,  2000, 13988,  1012,   102,     0,  ...,     0],
        [  101,  8840,  2140,  8700,  3348, 12436, 20876,  3398,  ...,  2073,  2106,  1057,  2175,  1029,  7592,  1029,   102]]),
 'ntok': [12, 34, 12, 88],
 'cls_emb': [[-0.014076177030801773, -0.0728173702955246, -0.078043133020401, 0.0938369482755661, -0.17423537373542786, ...]],
 'layer_0_attention': array([[[[4.81520668e-02, 4.21391986e-02, 2.80070100e-02, ..., 0.00000000e+00, 0.00000000e+00, 0.00000000e+00], ...]]], dtype=float32),
 'layer_1_attention': array([[[[3.56219172e-01, ...]]], dtype=float32),
 ...}



Per-entry print:

for k, v in batched_outputs.items():
    print("batched_output key and value")
    print(v)
    print(type(v))
    print("*******************************************")

batched_output key and value
[[0.6018652319908142, 0.3981347680091858], [0.5785479545593262, 0.42145204544067383], [0.6183280348777771, 0.3816719651222229], [0.6127758026123047, 0.3872241675853729]]
<class 'list'>
*******************************************
batched_output key and value
tensor([[  101,  2079,  2017,  2031,  1037, 19085,  2030,  1037, 12436, 20876,  1029,   102,     0,  ...,     0],
        [  101,  2026,  5980,  2097,  5091,  2022,  3407,  2085,   999,  ...,  3742,  1057,  1029,   102,     0,  ...,     0],
        [  101,  1045,  2031,  1037, 19085,  1012,  1045,  2215,  2000, 13988,  1012,   102,     0,  ...,     0],
        [  101,  8840,  2140,  8700,  3348, 12436, 20876,  3398,  ...,  2073,  2106,  1057,  2175,  1029,  7592,  1029,   102]])
<class 'torch.Tensor'>
*******************************************
batched_output key and value
[12, 34, 12, 88]
<class 'list'>

detached_outputs (identical to batched_outputs, since the dict comprehension is a plain copy):

detached_outputs = {k: v for k, v in batched_outputs.items()}
print("detached_outputs")

{'probas': [[0.6018652319908142, 0.3981347680091858], [0.5785479545593262, 0.42145204544067383], [0.6183280348777771, 0.3816719651222229], [0.6127758026123047, 0.3872241675853729]], 'input_ids': tensor([[  101,  2079,  2017,  ...,     0,     0,     0],
        [  101,  2026,  5980,  ...,     0,     0,     0],
        [  101,  1045,  2031,  ...,     0,     0,     0],
        [  101,  8840,  2140,  ...,  7592,  1029,   102]]), 'ntok': [12, 34, 12, 88], 'cls_emb': [[-0.014076177030801773, -0.0728173702955246, -0.078043133020401, 0.0938369482755661, ...]]}

iftenney commented on August 25, 2024

Thanks, all of those values look okay, although the indentation is very strange, so I could be missing something.
Can you post the error you're still seeing? You might try running under pdb and seeing which field it's coming from.
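
For instance, a quick type audit you could run (or step through under pdb) right before yielding; find_bad_fields is a hypothetical helper, not part of LIT:

import numpy as np
import torch

def find_bad_fields(batched_outputs):
    # Flag entries that a JSON serializer is likely to reject.
    for k, v in batched_outputs.items():
        if isinstance(v, torch.Tensor):
            print(f"{k}: torch.Tensor -> convert with .detach().cpu().numpy()")
        elif isinstance(v, np.ndarray) and np.iscomplexobj(v):
            print(f"{k}: complex ndarray -> keep only v.real")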

aryan1107 commented on August 25, 2024

@pratikchhapolika
To visualize Hugging Face models, you can start by adding any basic model directly to LIT. Here is one example I did using Hugging Face; the code might help: #691
