Comments (5)
Update: Eventually status was updated to Failed due to:
Insufficient notebook instance capacity. Retry later or with a different instance type.
I can no longer select any actions on this instance. I used it a few days ago, but now I cannot access the notebooks on it.
from amazon-sagemaker-examples.
Thanks, @dtsukiyama , and sorry you're experiencing issues. You may try starting the Notebook Instance again to see what happens. It sounds like we may have temporarily just not had any p2 instances available in your region. If you continue to experience issues, I think the quickest way to get help would be to cut a ticket with AWS Support. That way, they can get the information needed for our engineering team to dive in appropriately. Thanks!
Hi @djarpin , thanks for replying. So I cut a ticket to support at the same time I reported this issue. They stated:
I understand that you have attempted to launch a ml.p2.xlarge notebook instance which is reflecting in a pending state. I have thoroughly reviewed your account and can confirm that currently you are approved for 1 ml.p2.xlarge. Upon checks on your account, I can see that there is a ml.p2.xlarge reflecting in a stopped state. This would be the reason that you are unable to launch the new instance as this would be seen as a second instance launch which is currently not available.
In order to launch your ml.p2.xlarge, you will need to terminate the current ml.p2.xlarge that is currently in a stopped state.
So, it isn't quite clear what this means. This instance had been used before; I was attempting to restart it, and it failed. But it looks like they are saying that launching it would be seen as launching a new instance?
A failed notebook instance needs to be terminated, which I take to mean deleted. In this case it is not a huge deal: I only had some code and a couple of Jupyter notebooks on the instance, nothing really important. Also, I am sorry if this is not quite the place to raise this; I was just wondering if it was a bug. Thanks again.
Thanks @dtsukiyama , I'll reach out to the Notebook engineering team on this. However, you can also submit a service limit increase so that you can have more than 1 ml.p2 Notebook Instance running at a time. Thanks.
@dtsukiyama - Just confirmed with the engineering team that this was a temporary issue of not having an ml.p2.xlarge instance available. This can happen intermittently. If you run into this problem in the future, feel free to just retry starting the notebook instance in a few minutes to see if one has become available, or change to a different instance type (if another instance type will meet your needs anyway). Thanks.
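For anyone who wants to script the "retry in a few minutes" advice above, here is a minimal sketch. It assumes a boto3 SageMaker client is passed in, and it uses only `start_notebook_instance` and `describe_notebook_instance`. The function name `start_with_retry`, the default delays, and the poll limit are hypothetical choices, not anything recommended by the engineering team. It also assumes the instance is in a state that accepts a start call; as noted earlier in this thread, a Failed instance may not accept further actions, in which case the call will raise an error.

```python
import time


def start_with_retry(client, name, attempts=3, retry_delay=300,
                     poll_interval=30, max_polls=40, sleep=time.sleep):
    """Start a notebook instance, retrying if it does not reach InService.

    `client` is expected to behave like boto3.client("sagemaker").
    All defaults here are illustrative assumptions.
    """
    status = None
    for attempt in range(1, attempts + 1):
        client.start_notebook_instance(NotebookInstanceName=name)

        # Poll until the instance settles in InService or Failed,
        # giving up on this attempt after max_polls checks.
        for _ in range(max_polls):
            desc = client.describe_notebook_instance(NotebookInstanceName=name)
            status = desc["NotebookInstanceStatus"]
            if status in ("InService", "Failed"):
                break
            sleep(poll_interval)

        if status == "InService":
            return status

        # FailureReason is only present when the service reports one.
        print(f"attempt {attempt}: status={status}, "
              f"reason={desc.get('FailureReason')}")
        if attempt < attempts:
            sleep(retry_delay)  # wait before trying again

    return status
```

With boto3 installed and credentials configured, this could be called as `start_with_retry(boto3.client("sagemaker"), "my-notebook")`, where `my-notebook` is a placeholder name. Switching to a different instance type would be a separate step, via `update_notebook_instance` with the `InstanceType` parameter.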