Comments (4)
I vote for this! I would love to use SageMaker to parallelize some retraining of Caffe models, but I don't want to do all the extra work of setting up my own container and writing my own train/serve functions.
from amazon-sagemaker-examples.
Thanks for your interest in Amazon SageMaker, @dholdaway !
Currently you could use Caffe in SageMaker by bringing your own container, similar to our scikit example. That said, we use customer feedback to help us prioritize which features we add over time. With that in mind, are there any specific aspects of bringing your own Caffe model to SageMaker that have been difficult or where we could make it easier/faster?
So I have a container running Caffe, but I am unsure of the next step.
Thanks, @dholdaway. Is your container set up according to the specifications in the SageMaker AWS docs? At a high level that means, if you intend to use it for training, you have a train function that reads data in from /opt/ml/input/data/ and outputs a trained model to /opt/ml/model. And/or if you intend to use it for hosting, you have a serve function that reads in your model artifact, takes in an HTTP POST request body, parses it, and returns a prediction.
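To make the contract above concrete, here is a minimal sketch of what the two functions might look like. It is not the SageMaker or Caffe implementation: the channel name `training`, the `model.json` file name, the JSON request shape, and the placeholder "training" and "prediction" logic are all assumptions made for illustration. Only the /opt/ml/input/data/ and /opt/ml/model paths come from the description above.

```python
import json
import os


def train(prefix="/opt/ml"):
    """Read the training channel and write a model artifact under <prefix>/model."""
    # "training" is a hypothetical channel name; use whatever channel your job defines.
    input_dir = os.path.join(prefix, "input", "data", "training")
    model_dir = os.path.join(prefix, "model")
    os.makedirs(model_dir, exist_ok=True)

    # Placeholder for real training: record the size of each input file.
    # A Caffe container would run its solver here instead.
    summary = {}
    for name in sorted(os.listdir(input_dir)):
        path = os.path.join(input_dir, name)
        if os.path.isfile(path):
            summary[name] = os.path.getsize(path)

    # Anything written to the model directory is treated as the model artifact.
    model_path = os.path.join(model_dir, "model.json")
    with open(model_path, "w") as f:
        json.dump(summary, f)
    return model_path


def handle_invocation(body, model):
    """Parse an HTTP POST body and return a JSON prediction payload."""
    # Hypothetical request shape: {"instances": [[...], [...]]}
    payload = json.loads(body)
    # Placeholder for real inference: "predict" the sum of each input vector.
    # The model argument is unused here; real code would run it on each instance.
    predictions = [sum(instance) for instance in payload["instances"]]
    return json.dumps({"predictions": predictions})
```

In a real container, `handle_invocation` would sit behind a small web server that loads the artifact from /opt/ml/model once at startup and passes each POST body to it.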
If that's been set up, then all you should need to do is publish that container to Amazon ECR. Then you can either create a training job or a hosted endpoint. At a high level, creating a training job means creating an estimator that points to your container image in ECR, specifies instance type and count, and points to your data in S3. For hosting, if you've trained the model as an estimator in SageMaker you can just .deploy it. If you're hosting a model artifact that was trained elsewhere, you'll need to create a SageMaker model by registering that model artifact, create an endpoint config with instance information, and then create the endpoint from the endpoint config. More details on that process can be found in these examples, or the API reference in the docs.
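For the case of hosting an artifact that was trained elsewhere, the three steps could be sketched roughly as follows. This assumes the boto3 SageMaker client and its `create_model`, `create_endpoint_config`, and `create_endpoint` calls. The resource names, the `-config` and `-endpoint` suffixes, the variant name, and the default instance type are placeholders, not values from this thread.

```python
def build_hosting_requests(name, image_uri, model_data_url, role_arn,
                           instance_type="ml.m4.xlarge", instance_count=1):
    """Build the three request payloads: model, endpoint config, endpoint."""
    config_name = name + "-config"  # hypothetical naming scheme
    model = {
        "ModelName": name,
        # Registers the container image plus the S3 location of the artifact.
        "PrimaryContainer": {"Image": image_uri, "ModelDataUrl": model_data_url},
        "ExecutionRoleArn": role_arn,
    }
    endpoint_config = {
        "EndpointConfigName": config_name,
        # Instance information lives on the endpoint config, not on the model.
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": name,
            "InstanceType": instance_type,
            "InitialInstanceCount": instance_count,
        }],
    }
    endpoint = {"EndpointName": name + "-endpoint", "EndpointConfigName": config_name}
    return model, endpoint_config, endpoint


def create_hosted_endpoint(client, requests):
    """Issue the three calls in order against a SageMaker client."""
    model, endpoint_config, endpoint = requests
    client.create_model(**model)
    client.create_endpoint_config(**endpoint_config)
    return client.create_endpoint(**endpoint)
```

With AWS credentials configured, `client` would typically come from `boto3.client("sagemaker")`. Passing it in as an argument keeps the payload construction easy to inspect before anything is created.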