nicolaus-hee / sql-to-csv-converter

Python script to convert HeidiSQL table exports from SQL format into CSV

sql-to-csv-converter

Purpose

This script converts a SQL dump created by HeidiSQL, consisting of (many) INSERT statements, into CSV so that the data can be imported more easily into other tools. I had a very large dataset (~10 GB, >100 million lines) to export and convert, so for performance reasons neither a direct CSV export from a regular HeidiSQL query nor a table export from phpMyAdmin was an option.

Usage

python sql_to_csv.py -s source.sql -t output.csv
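For orientation, the two options could be parsed roughly as follows. This is a minimal sketch using the standard argparse module, not the script's actual code: the function name parse_args and the long option names --source and --target are assumptions made for illustration; only -s and -t come from the usage line above.

```python
import argparse


def parse_args(argv=None):
    """Parse the -s (source dump) and -t (target CSV) options."""
    parser = argparse.ArgumentParser(
        description="Convert a HeidiSQL SQL dump into CSV"
    )
    # Long option names are hypothetical; the README only documents -s and -t.
    parser.add_argument("-s", "--source", required=True,
                        help="path to the SQL dump to read")
    parser.add_argument("-t", "--target", required=True,
                        help="path of the CSV file to write")
    return parser.parse_args(argv)


# Equivalent to: python sql_to_csv.py -s source.sql -t output.csv
args = parse_args(["-s", "source.sql", "-t", "output.csv"])
print(args.source, args.target)
```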

The script finds the first INSERT statement in source.sql and derives the column names from it. All subsequent ...VALUES... lines are converted into comma-separated lines, stripped of remaining SQL syntax such as brackets. HeidiSQL splits the INSERTs into batches of 10,000 lines; column names repeated in any INSERT after the first are ignored to keep the CSV clean.
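The conversion described above can be sketched as follows. This is an illustration under stated assumptions, not the script's actual implementation: the names HEADER_RE, extract_header, convert_value_line and convert are hypothetical, and the value handling is deliberately naive (it assumes that ", " never occurs inside a quoted string).

```python
import re

# Matches the column list of a HeidiSQL-style INSERT line (assumed layout).
HEADER_RE = re.compile(r"INSERT INTO `[^`]+` \((.*?)\) VALUES")


def extract_header(line):
    """Return the CSV header derived from an INSERT line, or None."""
    match = HEADER_RE.search(line)
    if match is None:
        return None
    # Column names are wrapped in backticks and separated by commas.
    names = match.group(1).split(",")
    return ",".join(name.strip().strip("`") for name in names)


def convert_value_line(line):
    """Turn "(1, 'a', 2)," into "1,'a',2"; return None for other lines."""
    row = line.strip().rstrip(",;")  # drop the trailing ',' or ';'
    if not (row.startswith("(") and row.endswith(")")):
        return None
    # Naive: breaks if a quoted string itself contains ", ".
    return row[1:-1].replace(", ", ",")


def convert(lines):
    """Yield the header once, then one CSV line per value tuple."""
    header_written = False
    for line in lines:
        header = extract_header(line)
        if header is not None:
            if not header_written:  # later INSERT headers are ignored
                yield header
                header_written = True
            continue
        row = convert_value_line(line)
        if row is not None:
            yield row
```

Because convert is a generator, it can be fed a file object directly and never needs to hold the whole dump in memory, which matters for inputs in the multi-gigabyte range.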

Example

Sample input (from HeidiSQL > Tools > Export Database as SQL, see sample.sql for full example):

INSERT INTO `table_name` (`statusID`, `status`, `equipment`, `timestamp`, `user`, `seconds_since_last`) VALUES
        (769506, 'ACTIVE', 10457770, '2018-01-02 16:50:42', '', 720706),
        (769514, 'ACTIVE', 10458220, '2018-01-02 16:50:47', '', 720705);

INSERT INTO `table_name` (`statusID`, `status`, `equipment`, `timestamp`, `user`, `seconds_since_last`) VALUES
        (769506, 'ACTIVE', 10457770, '2018-01-02 16:50:42', '', 720706),
        (769507, 'ACTIVE', 10457810, '2018-01-02 16:50:42', '', 720705);

Sample output (see sample.csv for full example):

statusID,status,equipment,timestamp,user,seconds_since_last
769506,'ACTIVE',10457770,'2018-01-02 16:50:42','',720706
769514,'ACTIVE',10458220,'2018-01-02 16:50:47','',720705
769506,'ACTIVE',10457770,'2018-01-02 16:50:42','',720706
769507,'ACTIVE',10457810,'2018-01-02 16:50:42','',720705

Performance

The output file is written only once every 10,000 lines, which yields a significant performance increase over saving every line individually. Converting 110,000,000 lines took 12 minutes, i.e. approx. 150,000 lines per second.
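The batching idea can be sketched like this. It is a minimal illustration, not the script's actual code: the function name write_batched and its parameters are assumptions, and an in-memory io.StringIO stands in for the real output file.

```python
import io


def write_batched(rows, target, batch_size=10_000):
    """Write rows to the file object `target`, one write call per batch.

    Returns the number of rows written.
    """
    buffer = []
    written = 0
    for row in rows:
        buffer.append(row)
        if len(buffer) >= batch_size:
            # One write call for the whole batch instead of one per row.
            target.write("\n".join(buffer) + "\n")
            written += len(buffer)
            buffer.clear()
    if buffer:  # write the final, possibly partial batch
        target.write("\n".join(buffer) + "\n")
        written += len(buffer)
    return written


# Small demonstration: 25 rows in batches of 10 (two full, one partial).
out = io.StringIO()
count = write_batched((str(i) for i in range(25)), out, batch_size=10)
print(count)

# Throughput implied by the figures above: lines divided by seconds.
lines_per_second = 110_000_000 / (12 * 60)
```

The throughput line simply restates the arithmetic behind the quoted rate: 110,000,000 lines over 720 seconds is a little over 150,000 lines per second.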
