CSV Merge

This Python script merges two CSV files on a common column. The column defaults to 'email' but can be changed to suit your dataset. The script writes a new CSV file that keeps all columns from both inputs. Rows that have no matching value in the other file, based on the common column, are written to a separate CSV file.

Features

  • Merges two CSV files based on a common column.
  • Outputs two CSVs: one for rows with matching values and the other for rows without a match.
  • The common column can be specified while running the script (defaults to 'email').
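The matched/unmatched split described above can be sketched with pandas as follows. This is a minimal illustration, not the script's actual code: the function name `split_merge` and the sample data are hypothetical, and it assumes an outer join with pandas' `indicator` option is an acceptable way to tell matched rows from unmatched ones.

```python
import pandas as pd


def split_merge(left: pd.DataFrame, right: pd.DataFrame, on: str = "email"):
    """Return (matched, unmatched) frames for two tables joined on `on`.

    Hypothetical helper for illustration; csv_merge.py may work differently.
    """
    # An outer join keeps every row from both inputs. indicator=True adds a
    # `_merge` column marking each row as "both", "left_only" or "right_only".
    combined = left.merge(right, on=on, how="outer", indicator=True)
    matched = combined[combined["_merge"] == "both"].drop(columns="_merge")
    unmatched = combined[combined["_merge"] != "both"].drop(columns="_merge")
    return matched, unmatched


# Made-up sample data: only b@x.com appears in both tables.
left = pd.DataFrame({"email": ["a@x.com", "b@x.com"], "name": ["Ann", "Bob"]})
right = pd.DataFrame({"email": ["b@x.com", "c@x.com"], "plan": ["pro", "free"]})
matched, unmatched = split_merge(left, right)
```

Both result frames keep all columns from both inputs; in the unmatched frame, columns that only exist in the other file are left empty (NaN).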

Requirements

Python 3.x and the pandas library are required.

Installation

Make sure Python 3 and pip (the Python package installer) are installed on your system. Then use pip to install pandas:

pip install pandas

  • If you have both Python 2 and Python 3 installed and Python 3 is not your default version, you may need to run pip3 install pandas instead.

  • If you only have Python 3 installed, or it is your default version, pip install pandas should work.

If you're not sure, run python3 -m pip install pandas, which installs pandas for the interpreter that the python3 command runs.

Usage

To run the script, use the following command structure:

python csv_merge.py csv1 csv2 --name outputfile --path outputdirectory --on column

Where:

  • csv1 : Path to the first CSV file. (required)
  • csv2 : Path to the second CSV file. (required)
  • --name : Name of the output file for matched rows. Defaults to 'merged.csv'. (optional)
  • --path : Directory to save the output files. Defaults to the current directory. (optional)
  • --on : Column to merge on. Defaults to 'email'. (optional)

If the --name argument is given, the unmatched CSV file is named 'unmatched_' followed by that name. If --name is omitted, the unmatched CSV file is named unmatched.csv.
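The arguments and the naming rule above could be wired up roughly like this. It is a sketch using only the standard library; `build_parser` and `output_paths` are hypothetical names, and the real csv_merge.py may define its options differently.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Command-line interface matching the documented arguments (assumed layout)."""
    parser = argparse.ArgumentParser(description="Merge two CSV files on a common column.")
    parser.add_argument("csv1", help="Path to the first CSV file")
    parser.add_argument("csv2", help="Path to the second CSV file")
    # None means "--name was not given", so the unmatched file name can differ.
    parser.add_argument("--name", default=None, help="Output file name for matched rows")
    parser.add_argument("--path", default=".", help="Directory for the output files")
    parser.add_argument("--on", default="email", help="Column to merge on")
    return parser


def output_paths(name, directory):
    """Return (merged_path, unmatched_path) following the documented naming rule."""
    merged_name = name if name else "merged.csv"
    unmatched_name = f"unmatched_{name}" if name else "unmatched.csv"
    return os.path.join(directory, merged_name), os.path.join(directory, unmatched_name)


# Parse a sample command line without touching the file system.
args = build_parser().parse_args(["data1.csv", "data2.csv", "--name", "result.csv"])
merged_path, unmatched_path = output_paths(args.name, args.path)
```

Keeping the default of --name as None, rather than 'merged.csv', is what lets the sketch distinguish "unmatched.csv" from "unmatched_merged.csv" when the option is omitted.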

Example:

python csv_merge.py data1.csv data2.csv --name result.csv --path /home/user/Results/ --on customerId

In the above example, the script will:

  • Merge 'data1.csv' and 'data2.csv' based on the 'customerId' column.
  • Write the merged data to '/home/user/Results/result.csv'.
  • Write rows from either data1.csv or data2.csv that have no matching 'customerId' in the other file to '/home/user/Results/unmatched_result.csv'.
  • Still create the unmatched CSV if every row has a match. In that case it contains only the header row.
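The header-only case can be reproduced in isolation. The sketch below assumes the unmatched rows are written with pandas' `to_csv`; the column names are made up for illustration, and an in-memory buffer stands in for the output file.

```python
import io

import pandas as pd

# An empty frame that still carries its column labels, as would result when
# every row found a match (hypothetical column names).
unmatched = pd.DataFrame(columns=["customerId", "name", "plan"])

buffer = io.StringIO()
unmatched.to_csv(buffer, index=False)  # index=False omits the row-number column
header_only = buffer.getvalue()
```

Here `header_only` holds a single line with the three column names and no data rows.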

Enjoy merging your CSV files with ease!
