Comments (21)

sanchitvj commented on May 14, 2024

from deeplake.

AbhinavTuli avatar AbhinavTuli commented on May 14, 2024 1

That's great @sanchitvj. Here's a tutorial for uploading datasets using Hub that might be helpful for you!

AbhinavTuli commented on May 14, 2024

@sanchitvj No, it's not required, but feel free to take a look if you ever want to understand how something works under the hood!

AbhinavTuli commented on May 14, 2024

@sanchitvj Did you take a look at the tutorial mentioned above? It links to a couple of examples that should help.
Here's an example that includes training as well: https://github.com/activeloopai/Hub/tree/master/examples/fashion-mnist
Let me know if you have any specific questions. I'd be happy to help.

AbhinavTuli commented on May 14, 2024

@sanchitvj Sorry for getting back to you so late; I somehow missed this.
The purpose of the generator class is to take a single item from a list and return a dictionary of NumPy arrays. The dictionary contains a separate key for each feature of the dataset (i.e., one for the images and one for each of the different annotations in MPII). You don't really need to dig into how Hub collections work for this.
Did you get a chance to go through the tutorial? https://github.com/activeloopai/Hub/discussions/125
Also, take a look at this example, which is a little easier to understand than the COCO example: https://github.com/activeloopai/omdena-aerial/blob/master/store_omdena.py
If it's still not clear, do join our dedicated Slack channel and we can set up a call to discuss it in detail.
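
A minimal sketch of that idea in plain Python, without Hub's own base class. The class name `MPIIGeneratorSketch` and the zero-filled placeholder image are assumptions made for illustration; the field names come from the MPII record that appears in the traceback in this thread. It only shows the shape of the contract: one record in, one dictionary of NumPy arrays out.

```python
import numpy as np


class MPIIGeneratorSketch:
    """Hypothetical stand-in: turns one annotation record into a dict of arrays."""

    def __call__(self, record):
        height = int(record["img_height"])
        width = int(record["img_width"])
        # Placeholder pixels; a real generator would load record["img_paths"] from disk.
        image = np.zeros((height, width, 3), dtype=np.uint8)
        return {
            "image": image,
            "joint_self": np.array(record["joint_self"], dtype=np.float32),
            "objpos": np.array(record["objpos"], dtype=np.float32),
            "scale_provided": np.array(record["scale_provided"], dtype=np.float32),
        }


# One record, with joint_self shortened to two joints for brevity.
record = {
    "img_paths": "003353243.jpg",
    "img_width": 1280.0,
    "img_height": 720.0,
    "objpos": [984.0, 97.0],
    "joint_self": [[991.0, 109.0, 0.0], [972.0, 101.0, 0.0]],
    "scale_provided": 2.585,
}

sample = MPIIGeneratorSketch()(record)
for key, value in sample.items():
    print(key, value.shape, value.dtype)
```

Printing the shape and dtype of each key is a quick way to see what the generator produces before wiring it into Hub.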

sanchitvj commented on May 14, 2024

@kristinagrig06 I would like to work on this issue. Please assign it to me.

mikayelh commented on May 14, 2024

Hi @sanchitvj! I've assigned you to this issue. Thanks for your willingness to contribute! Let me know if you have any questions! :)

mikayelh commented on May 14, 2024

Hi, @sanchitvj! Hope this finds you well. Dropping a note to check in on you and ask if you need a hand with uploading the dataset. Feel free to ask us in GitHub Discussions (we have beta access!) or in our dedicated Slack channel. Thanks a mil!

sanchitvj commented on May 14, 2024

I have one question: do I need to know the Hub codebase?

sanchitvj commented on May 14, 2024

@AbhinavTuli Is there an example available of how to use Hub to load a dataset, visualize the data (i.e., see what is present in it), and train on it (using TensorFlow)? The dataset I'm working on is challenging to use, process, and train on.

sanchitvj commented on May 14, 2024

@AbhinavTuli Can you explain what the CocoGenerator class is doing? I'm having difficulty understanding it, in particular what the output of that class looks like. The COCO upload example doesn't make this clear because I can't see the outputs, and it isn't very useful here because the MPII annotations are not structured like COCO's. I've done most of the work and just need to sort out the generator. Can you guide me on how to write a generator function for this purpose, and tell me which code files in Hub collections I should understand to get the basic idea and get past this issue?

sanchitvj commented on May 14, 2024

@AbhinavTuli I'm almost done, but how can I check that the output is as expected? Here is my code. When I try to print the dataset, the output is '<hub.collections.dataset.core.Dataset object at 0x7f55ae0aac50>'. So how can I check that it's working correctly?

AbhinavTuli commented on May 14, 2024

Hey @sanchitvj, you can test the code by using ds.store("./mpii"). This stores the dataset locally instead of uploading it to Hub and should be much faster.
You can then load the saved dataset and try iterating over it:

import hub
ds = hub.load("./mpii")
for item in ds:
    print(item["data"].compute())
    print(item["labels"].compute())

Just replace the keys ("data" and "labels") with your actual ones. Let me know how it goes!
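
Since printing the dataset object only shows its memory address, it can also help to summarize what each key holds. The helper below is a hypothetical sketch in plain NumPy (the name `describe_sample` and the made-up sample are assumptions, not part of Hub); it could be applied to a generator's output dictionary, or to the arrays returned by `.compute()` in the loop above.

```python
import numpy as np


def describe_sample(sample):
    """Return {key: (shape, dtype name)} for a dict of array-like values."""
    if sample is None:
        raise ValueError("expected a dict of arrays, got None")
    summary = {}
    for key, value in sample.items():
        array = np.asarray(value)
        summary[key] = (array.shape, str(array.dtype))
    return summary


# Made-up sample: a tiny image and a scalar label.
sample = {
    "data": np.zeros((4, 4, 3), dtype=np.uint8),
    "labels": np.array(1, dtype=np.int64),
}

summary = describe_sample(sample)
for key, (shape, dtype) in summary.items():
    print(key, shape, dtype)
```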

sanchitvj commented on May 14, 2024

@AbhinavTuli This is the error I get after calling ds.store(). Can you help me with it?

0
Traceback (most recent call last):
  File "", line 45, in __call__
    ds["image"][i] = np.array(Image.open(img_path + all[i]['img_paths']))
KeyError: 0
Stack (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 884, in _bootstrap
    self._bootstrap_inner()
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.6/dist-packages/distributed/threadpoolexecutor.py", line 55, in _worker
    task.run()
  File "/usr/local/lib/python3.6/dist-packages/distributed/_concurrent_futures_thread.py", line 65, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.6/dist-packages/distributed/worker.py", line 3411, in apply_function
    result = function(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/distributed/worker.py", line 3304, in execute_task
    return func(*map(execute_task, args))
  File "/usr/local/lib/python3.6/dist-packages/hub/collections/dataset/__init__.py", line 13, in _generate
    output = generator(input)
  File "", line 65, in __call__
    logger.error(e, exc_info=e, stack_info=True)
distributed.worker - WARNING - Compute Failed
Function: execute_task
args: ((<function _generate at 0x7fdbcd107f28>, <__main__.MPIIGenerator object at 0x7fdb0e5c02e8>, (<class 'dict'>, [['dataset', 'MPI'], ['isValidation', 0.0], ['img_paths', '003353243.jpg'], ['img_width', 1280.0], ['img_height', 720.0], ['objpos', [984.0, 97.0]], ['joint_self', [[991.0, 109.0, 0.0], [972.0, 101.0, 0.0], [1040.0, 47.0, 1.0], [1071.0, 116.0, 1.0], [999.0, 222.0, 1.0], [1033.0, 248.0, 0.0], [1056.0, 82.0, 1.0], [942.0, 96.0, 1.0], [937.583, 95.954, 1.0], [851.417, 95.046, 1.0], [962.0, 39.0, 0.0], [0.0, 0.0, 0.0], [926.0, 52.0, 1.0], [957.0, 139.0, 1.0], [980.0, 211.0, 1.0], [926.0, 257.0, 1.0]]], ['scale_provided', 2.585], ['joint_others', [[672.0, 231.0, 1.0], [677.0, 151.0, 1.0], [672.0, 12.0, 1.0], [745.0, 89.0, 0.0], [757.0, 127.0, 1.0], [651.0, 65.0, 0.0], [709.0, 51.0, 0.0], [800.0, 67.0, 0.0], [780.16, 67.863, 1.0], [865.84, 64.137, 1.0], [707.0, 94.0, 1.0], [673.0, 22.0, 1.0], [763.0, 71.0, 1.0], [837.0, 62.0, 0.0], [814.0, 140.0, 1.0], [790.0, 220.0, 1.0]]], ['scale

kwargs: {}
Exception: AttributeError("'NoneType' object has no attribute 'keys'",)

AbhinavTuli commented on May 14, 2024

I would probably need to look at the code to help you out, but it seems like an issue in the implementation of the __call__ function.
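
One possible reading of the traceback above, not confirmed anywhere in this thread: the generator receives a single record (a dict), indexes it with an integer as if it were a list (hence `KeyError: 0`), logs the exception, and then falls through without returning anything. The caller would then get `None`, which matches the later "'NoneType' object has no attribute 'keys'" error. The sketch below reproduces that pattern in plain Python; both class names are hypothetical and neither is the code from the issue.

```python
import numpy as np


class SwallowingGenerator:
    """Hypothetical: logs the error but returns nothing, so the caller gets None."""

    def __call__(self, record):
        try:
            # Bug: treats the single record (a dict) as a list of records.
            path = record[0]["img_paths"]
            return {"img_path": np.array(path)}
        except Exception as e:
            print("logged:", repr(e))  # error is reported, but nothing is returned


class FixedGenerator:
    """Hypothetical fix: use the record directly and let exceptions propagate."""

    def __call__(self, record):
        return {"img_path": np.array(record["img_paths"])}


record = {"img_paths": "003353243.jpg"}

broken = SwallowingGenerator()(record)
fixed = FixedGenerator()(record)

print(broken)         # None
print(sorted(fixed))  # ['img_path']
```

Letting the exception propagate (or re-raising after logging) makes the first failure visible instead of a confusing follow-on error.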

sanchitvj commented on May 14, 2024

@AbhinavTuli Here is the code. How long do you think it will take to store this 13 GB of data?

sanchitvj commented on May 14, 2024

@AbhinavTuli @kristinagrig06 @davidbuniat The dataset is uploaded and visible in the app. I've loaded it and used it, and it's working fine. Can I send the PR now with example code?

sanchitvj commented on May 14, 2024

@AbhinavTuli I've sent the PR, but one of the checks is failing. Can you help me understand it?

davidbuniat commented on May 14, 2024

@sanchitvj There is a linting error from Black. If you can fix it, we're happy to merge! Thanks for making the dataset!

sanchitvj commented on May 14, 2024

@davidbuniat All build checks passed.

davidbuniat commented on May 14, 2024

@sanchitvj Awesome! Once we've checked that the dataset works, we'll merge the PR. Thanks for the awesome job!
