isabella232 / data-factory-copy-blob-to-blob-python

This project forked from azure-samples/data-factory-copy-blob-to-blob-python

Python script for creating a data factory that copies data from one folder to another in Azure Blob Storage

License: MIT License

Python 100.00%

data-factory-copy-blob-to-blob-python's Introduction

services: data-factory
platforms: python
author: spelluru

Sample: copy data from one folder to another folder in Azure Blob Storage

In this sample, you perform the following steps by using the Python SDK:

  1. Create a data factory.
  2. Create a linked service to link your Azure Storage account to the data factory.
  3. Create a dataset that represents input/output data used by the copy activity.
  4. Create a pipeline with a copy activity that copies data.
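To make the relationship between steps 2 to 4 easier to see, here is a minimal sketch that builds plain dictionaries mirroring the JSON shape Data Factory uses for a linked service, a blob dataset, and a copy pipeline. It is not the sample's own code, which uses the Python SDK model classes. The resource names (`storageLinkedService`, `copyBlobToBlob`, `ds_in`, `ds_out`) are hypothetical.

```python
import json

# Hypothetical resource name, used only for illustration.
LINKED_SERVICE = "storageLinkedService"


def linked_service(connection_string):
    """Step 2: link an Azure Storage account to the data factory."""
    return {
        "properties": {
            "type": "AzureStorage",
            "typeProperties": {
                "connectionString": {
                    "type": "SecureString",
                    "value": connection_string,
                }
            },
        }
    }


def blob_dataset(folder_path):
    """Step 3: a dataset that points at one folder in the linked account."""
    return {
        "properties": {
            "type": "AzureBlob",
            "linkedServiceName": {
                "referenceName": LINKED_SERVICE,
                "type": "LinkedServiceReference",
            },
            "typeProperties": {"folderPath": folder_path},
        }
    }


def copy_pipeline(input_dataset, output_dataset):
    """Step 4: a pipeline with one copy activity from input to output."""
    return {
        "properties": {
            "activities": [
                {
                    "name": "copyBlobToBlob",  # hypothetical activity name
                    "type": "Copy",
                    "inputs": [
                        {"referenceName": input_dataset, "type": "DatasetReference"}
                    ],
                    "outputs": [
                        {"referenceName": output_dataset, "type": "DatasetReference"}
                    ],
                    "typeProperties": {
                        "source": {"type": "BlobSource"},
                        "sink": {"type": "BlobSink"},
                    },
                }
            ]
        }
    }


if __name__ == "__main__":
    # Print the pipeline definition to inspect how the datasets are referenced.
    print(json.dumps(copy_pipeline("ds_in", "ds_out"), indent=2))
```

The pipeline refers to datasets only by name, and each dataset refers to the linked service only by name, which is why the resources are created in the order listed above.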

Prerequisites

  • Azure subscription. If you don't have a subscription, you can create a free trial account.
  • Azure Storage account. You use the blob storage as source and sink data store. If you don't have an Azure storage account, see the Create a storage account article for steps to create one.
  • Create an application in Azure Active Directory by following these instructions. Make note of the following values, which you use in later steps: application ID, authentication key, and tenant ID. Assign the application to the "Contributor" role by following the instructions in the same article.
  • Create a blob container in Blob Storage, create an input folder in the container, and upload some files to the folder. The sample code uses inputpy and outputpy as the input and output folder names. If you use different folders, update these values in the source code.
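The application ID, authentication key, and tenant ID noted above are secrets, so you may prefer not to paste them into the script. As one possible approach, the sketch below reads them from environment variables. The variable names and the `load_service_principal` helper are assumptions made for illustration and are not part of the sample.

```python
import os

# Hypothetical environment variable names; choose whatever fits your setup.
REQUIRED_VARS = ("AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET", "AZURE_TENANT_ID")


def load_service_principal(env=None):
    """Return the three service principal values, or raise if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {
        "client_id": env["AZURE_CLIENT_ID"],
        "secret": env["AZURE_CLIENT_SECRET"],
        "tenant": env["AZURE_TENANT_ID"],
    }
```

The keys of the returned dictionary match the keyword arguments that the sample passes to `ServicePrincipalCredentials`, so the result could be unpacked with `**` when constructing the credentials.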

Install the Python package

  1. Open a terminal or command prompt with administrator privileges.

  2. First, install the Python package for Azure management resources:

    pip install azure-mgmt-resource
    
  3. To install the Python package for Data Factory, run the following command:

    pip install azure-mgmt-datafactory
    

    The Python SDK for Data Factory supports Python 2.7, 3.3, 3.4, 3.5 and 3.6.
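After installing, a quick import check can confirm that both packages are visible to the interpreter you plan to run the sample with. This is a minimal sketch; `check_modules` is a hypothetical helper, and the module paths are the ones these two pip packages are expected to provide.

```python
import importlib


def check_modules(module_names):
    """Return {module_name: bool} telling whether each module can be imported."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            # Also covers a missing parent package such as "azure".
            results[name] = False
    return results


if __name__ == "__main__":
    wanted = ["azure.mgmt.resource", "azure.mgmt.datafactory"]
    for name, ok in check_modules(wanted).items():
        print(name, "OK" if ok else "MISSING")
```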

Set values for placeholders

In the source code, set values to replace the following placeholders:

# Imports that the lines below rely on
from azure.common.credentials import ServicePrincipalCredentials
from azure.mgmt.datafactory.models import SecureString

# Specify your Azure subscription ID
subscription_id = '<Azure subscription ID>'

# Specify a name for the Azure resource group.
rg_name = '<Azure resource group name>'

# Specify a name for the data factory. It must be globally unique.
df_name = '<Data factory name>'

# Specify your Active Directory application ID, application authentication key, and tenant ID
credentials = ServicePrincipalCredentials(
    client_id='<AAD client ID>',
    secret='<AAD app authentication key>',
    tenant='<AAD tenant ID>')

# Specify your Azure storage account name and key
storage_string = SecureString(
    'DefaultEndpointsProtocol=https;AccountName=<Azure storage account>;AccountKey=<Azure storage authentication key>')
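A common mistake is to run the script with one of the `<...>` placeholders still in place. The sketch below shows one way to assemble the storage connection string from its two parts and to detect values that were not filled in. Both helpers are hypothetical and are not part of the sample.

```python
import re

# Matches any leftover "<...>" placeholder text.
PLACEHOLDER = re.compile(r"<[^<>]+>")


def storage_connection_string(account_name, account_key):
    """Build a connection string in the format used by the sample."""
    return (
        "DefaultEndpointsProtocol=https;"
        "AccountName={0};AccountKey={1}".format(account_name, account_key)
    )


def unfilled_placeholders(settings):
    """Return the sorted names of settings that still contain a placeholder."""
    return sorted(
        name for name, value in settings.items() if PLACEHOLDER.search(value)
    )


if __name__ == "__main__":
    settings = {
        "rg_name": "<Azure resource group name>",
        "df_name": "myfactory",  # hypothetical value
    }
    print(unfilled_placeholders(settings))
```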

See Also

For step-by-step instructions to create this sample from scratch, see Quickstart: create a data factory and pipeline using Python.
