GithubHelp home page GithubHelp logo

abhishekms1047 / machine-learning-for-telecommunications Goto Github PK

View Code? Open in Web Editor NEW

This project forked from aws-solutions-library-samples/machine-learning-for-telecommunications

0.0 1.0 0.0 7.8 MB

A base solution that helps to generate insights from their data. The solution provides a framework for an end-to-end machine learning process including ad-hoc data exploration, data processing and feature engineering, and modeling training and evaluation. This baseline will provide the foundation for industry specific data to be applied and models created to release industry specific ML solutions.

License: Apache License 2.0

Shell 0.40% Jupyter Notebook 96.49% Python 3.11%

machine-learning-for-telecommunications's Introduction

AWS Machine Learning for All

Machine Learning for All is a solution that helps data scientists in the industry get started using machine learning to generate insights from their data. The solution provides a framework for an end-to-end machine learning process including ad-hoc data exploration, data processing and feature engineering, and modeling training and evaluation.

AWS Machine Learning for Telecommunications

Machine learning (ML) helps Amazon Web Services (AWS) customers use historical data to predict future outcomes, which can lead to better business decisions.

Machine learning techniques are core to the Communications Service Providers (CSPs) industry. CSPs can use ML algorithms to construct and refine mathematical models from their business data, and then use these models to help identify fraudulent use of network services, automate network functions (zero touch), and reduce customer churn. AWS offers several machine learning services and tools tailored for a variety of use cases and levels of expertise, however it can be a challenge to understand the mechanics of model training and tuning, identify relevant data features, and design a workflow that can perform complex extraction, transformation, and loading (ETL) activities, and also scale to accommodate large datasets. To help customers get started with a machine learning workflow for CSPs use cases, AWS offers the Machine Learning for telecom starter kit solution. This solution uses AWS CloudFormation to deploy a scalable, customizable machine learning architecture that leverages Amazon SageMaker, a fully managed machine learning service, and Jupyter Notebook, an open source web application for creating and sharing live code, equations, visualizations and narrative text. The solution package provides a framework for an end-to-end machine learning process including ad-hoc data exploration, data processing and feature engineering, and model training and evaluation. It includes a sample telecom IP Data Record (IPDR) dataset to demonstrate how to use machine learning algorithms to test and train models for predictive analysis in telecom. Customers can use the included models as a starting point to develop their own custom machine learning models, and customize the included notebooks for their own use case.

Getting Started

Prerequisites

The following procedures assumes that all of the OS-level configuration has been completed. They are:

The Machine Learning solution is developed with python notebook that runs on SageMaker using pyspark and python as underlying execution code.

1. Build the machine learning solution

Clone the machine-learning-for-telecommunications from GitHub repository:

git clone https://github.com/awslabs/machine-learning-for-telecommunications

2. Build the machine learning solution for deployment

Build the distributable:

DIST_OUTPUT_BUCKET=my-bucket-name # S3 bucket name where customized code will reside
SOLUTION_NAME=my-solution-name
VERSION=my-version # version number for the customized code
INSDUSTRY=telecom
cd ./deployment \n
chmod +x ./build-s3-dist.sh \n
./build-s3-dist.sh $DIST_OUTPUT_BUCKET $SOLUTION_NAME $VERSION $INSDUSTRY \n

Notes: The build-s3-dist script expects the bucket name as one of its parameters, and this value should not include the region suffix.

3. Upload deployment assets to your Amazon S3 bucket

The CloudFormation template is configured to pull the Lambda deployment packages from Amazon S3 bucket in the region the template is being launched in. Create a bucket in the desired region with the region name appended to the name of the bucket. eg: for us-east-1 create a bucket named: my-bucket-us-east-1.

aws s3 cp ./regional-s3-assets/ s3://my-bucket-us-east-1/machine-learning-for-all/$VERSION --recursive --acl bucket-owner-full-control

4. Upload dataset assets to your Amazon S3 Bucket

This solution includes synthetic demo IP Data Record (IPDR) datasets in Abstract Syntax Notation One (ASN.1) format and call detail record (CDR) format. You can choose to copy these datasets into the source bucket to run the demo of AWS Glue transformations and ML model predictions, or use your own datasets.

In the following example, the dataset is copied from the solutions-us-east-1 S3 bucket to your S3 bucket for your solution named: my-bucket-us-east-1, which also holds your other customized regional code.

DIST_OUTPUT_BUCKET=my-bucket-name # S3 bucket name where customized code will reside
SOLUTION_NAME=my-solution-name
VERSION=my-version # version number for the customized code
INDUSTRY=telecom
REGION=us-east-1 # region for S3 buckets for synthetic data

SYNTHETIC_DATASET=s3://solutions-${REGION}/machine-learning-for-all/$VERSION/industry/$INDUSTRY/data
MY_DATASET=${DIST_OUTPUT_BUCKET}-${REGION}/${SOLUTION_NAME}/$VERSION/industry/$INDUSTRY/data

aws s3 cp $SYNTHETIC_DATASET/cdr-start/cdr_start.csv s3://$MY_DATASET/cdr-start/cdr_start.csv --acl bucket-owner-full-control
aws s3 cp $SYNTHETIC_DATASET/cdr-stop/cdr_stop.csv s3://$MY_DATASET/cdr-stop/cdr_stop.csv   --acl bucket-owner-full-control
aws s3 cp $SYNTHETIC_DATASET/cdr-start-sample/cdr_start_sample.csv  s3://$MY_DATASET/cdr-start-sample/cdr_start_sample.csv  --acl bucket-owner-full-control
aws s3 cp $SYNTHETIC_DATASET/cdr-stop-sample/cdr_stop_sample.csv  s3://$MY_DATASET/cdr-stop-sample/cdr_stop_sample.csv  --acl bucket-owner-full-control

Notes: Choose your desired region by changing region in the above example from us-east-1 to your desired region of the S3 buckets.

5. Launch the CloudFormation template

  • Get the link of the machine-learning-for-all.template uploaded to your Amazon S3 bucket.
  • Deploy the Machine Learning solution to your account by launching a new AWS CloudFormation stack using the link of the machine-learning-for-all.template.
  • Currently, the Machine Learning solution can be deployed in the following regions: [ us-east-1, us-east-2, us-west-2, eu-west-1, eu-central-1, ap-northeast-1, ap-northeast-2, ap-southeast-2 ]

Copyright 2018-2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

machine-learning-for-telecommunications's People

Contributors

jpeddicord avatar shsenior avatar georgebearden avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.