ckurze / simple-xml-import

Simple import of xml files into MongoDB. Configurable mapping of XML Tags to Collections.

License: MIT License

Java 100.00%

Simple XML Importer

This project is a generic importer of XML files into MongoDB. A configurable mapping defines which XML elements are written into which collection. The importer uses a SAX parser, which streams the document and can therefore process large files (in contrast to a DOM parser, where the whole XML document has to fit into memory).
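To illustrate the streaming approach, here is a minimal sketch of a SAX handler that only reacts to mapped tags. It is not the project's actual handler: the class `SaxMappingSketch`, the method `collect`, and the `collection:text` output format are assumptions made for illustration, and the sketch collects strings where the real importer would write documents to MongoDB.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch, not the project's real code.
class SaxMappingSketch {

    /** Streams the XML and returns "collection:text" for every mapped element. */
    static List<String> collect(InputStream xml, Map<String, String> mapping)
            throws ParserConfigurationException, SAXException, IOException {
        List<String> out = new ArrayList<>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(xml, new DefaultHandler() {
            private String openTag;   // mapped element currently open, or null
            private int depth;        // nesting depth below the open mapped element
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (openTag != null) {
                    depth++;                      // child of a mapped element
                } else if (mapping.containsKey(qName)) {
                    openTag = qName;              // start buffering a mapped element
                    depth = 0;
                    text.setLength(0);
                }
                // Unmapped elements outside a mapped one are ignored entirely.
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (openTag != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (openTag == null) {
                    return;
                }
                if (depth > 0) {
                    depth--;                      // closing a child element
                } else {
                    // Closing the mapped element itself: emit and reset.
                    out.add(mapping.get(openTag) + ":" + text.toString().trim());
                    openTag = null;
                }
            }
        });
        return out;
    }
}
```

Only the content of the currently open mapped element is buffered, so memory use depends on the size of a single mapped element and not on the size of the whole file.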

I created this project while playing with product catalogs (like BMEcat, eCl@ass, ETIM, ...) that are used for product data exchange between companies and for supply chain automation.

A good example of a product catalog can be found at https://www.busch-jaeger.de/service-tools/downloads/stammdaten/. It follows the ETIM 6.0 product classification standard.

Build

This is a Maven project; Maven resolves all dependencies:

mvn package

Run

The XML loader takes a JSON file as configuration. An example is given below.

{
        "mongoURI":"mongodb://localhost:27017/test?retryWrites=true",
        "database": "product_catalog",
        "xml_file": "/Users/christian.kurze/Downloads/2018-06_Busch-Jaeger_2018_ETIM_6.0.zip",
        "mapping": {
          "HEADER": "catalog",
          "PRODUCT": "product"
        },
        "dropCollections": true
}

The parameters are as follows:

  • mongoURI: MongoDB URI following the URI connection string syntax
  • database: The MongoDB database to write the data to
  • xml_file: The absolute path to the zipped XML file to be imported
  • mapping: A mapping of XML tags to collection names; every occurrence of a mapped tag is written into the corresponding collection
  • dropCollections: Boolean indicating whether the collections should be dropped before inserting data
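Since xml_file points to a zipped file, the XML has to be read from inside the archive. The following is a minimal sketch of one way to do that with the JDK's `java.util.zip`. It assumes the first entry ending in `.xml` is the one to import; the class `ZippedXmlSketch` and the method `openFirstXmlEntry` are hypothetical and may differ from how the project actually opens the file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical sketch, not the project's real code.
class ZippedXmlSketch {

    /**
     * Returns a stream positioned on the first *.xml entry of the archive,
     * or null if the archive contains no such entry.
     */
    static InputStream openFirstXmlEntry(InputStream zipped) throws IOException {
        ZipInputStream zip = new ZipInputStream(zipped);
        for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
            boolean isXml = e.getName().toLowerCase(Locale.ROOT).endsWith(".xml");
            if (!e.isDirectory() && isXml) {
                // Reading from the ZipInputStream now yields this entry's bytes
                // and signals end of stream at the end of the entry.
                return zip;
            }
        }
        zip.close();
        return null;
    }
}
```

The returned stream can be handed directly to a SAX parser, so the archive never has to be unpacked to disk.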

Start:

java -jar XMLImportBulk.jar -c config/sample.json

Future Improvements

  • Smarter handling of the stack (currently, elements that are not needed are still written to the internal stack and consume memory)
