csv-common-rows

How to Run

  1. The main function is mutualCustomers, which can be found in mutualCustomers.js along with its inputs (see Adjusting Inputs).
  2. From this directory, run the following in a terminal to get an array of the user objects that are common to your input files:
node mutualCustomers.js

If both files are valid and share any users, the results look like this:

[
  {
    'First Name': 'james',
    'Last Name': 'davis',
    Age: 19,
    State: 'alabama'
  },
  {
    'First Name': 'emily',
    'Last Name': 'kim',
    Age: 72,
    State: 'north dakota'
  },
  ....
]
  • Note: all string values will be lowercase in the results

Unit Tests

Unit tests can be found in the __tests__ folder and can be run in a terminal with npm run test

Adjusting Inputs:

In mutualCustomers.js, the function mutualCustomers(allCSVFileArray) can be executed to get the mutual customers between two customer CSVs.

Path to each CSV file:

const store1 = 'Store1 copy.csv'
const store2 = 'Store2 copy.csv'

Array containing the CSV file paths:

const allCSVFileArray = [store1, store2]

Expected header for all of the files:

const expectedHeaders = ["First Name", "Last Name", "Age", "State"]
  • Note: The first row of every CSV file must contain these headers; otherwise the function will throw an error

Other constants that can optionally be changed:

const totalFiles = 2 // how many files should be expected, set to 2
const ageIndex = 2 // index of "Age" field in our headers

How the Code Works:

1. The main function is mutualCustomers which can be found in mutualCustomers.js. This function has the following inputs:

  • fileArr - array of strings representing the paths to the CSV files
  • headerArr - array of strings representing the headers expected in all of the files
  • fileCount - number representing how many files are expected in fileArr
  • ageIndex - index of "Age" element within headerArr

This file also contains constants: the CSV file paths, the expected file headers, the number of files expected, and the index of the "Age" field.

When mutualCustomers is invoked, it first calls isValidFileArray to check whether its inputs are valid. If they are, getIntersectionOfArr is invoked.

2. The isValidFileArray function can be found in confirmValidFiles.js. Its purpose is to determine whether the fileArr argument is valid, using several helper functions within the same file:

  • checkforFileArr confirms that the fileArray argument is an array
  • checkFileCount confirms that the number of elements in fileArray equals the expectedCount argument
  • checkForCsvExt confirms that the file path passed in is a CSV
  • checkFileExists confirms that the file path passed in is an existing file
  • checkForNoDuplicates confirms that there are no duplicate file paths within the file array
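
The checks above can be sketched as a single function. This is only an illustration of the idea, not the code in confirmValidFiles.js: the name validateFileArray, its signature, and the error messages are all assumptions, and the real helpers may report failures differently (for example by returning false instead of throwing).

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical combined validator (name and error messages are assumptions).
// It mirrors the five checks listed above, in the same order.
function validateFileArray(fileArr, expectedCount) {
  // 1. The argument must be an array
  if (!Array.isArray(fileArr)) {
    throw new TypeError('fileArr must be an array');
  }
  // 2. The array must contain the expected number of paths
  if (fileArr.length !== expectedCount) {
    throw new Error(`Expected ${expectedCount} files, received ${fileArr.length}`);
  }
  for (const filePath of fileArr) {
    // 3. Each path must be a string ending in .csv
    if (typeof filePath !== 'string' || path.extname(filePath).toLowerCase() !== '.csv') {
      throw new Error(`Not a CSV file path: ${filePath}`);
    }
    // 4. Each path must point at an existing file
    if (!fs.existsSync(filePath)) {
      throw new Error(`File does not exist: ${filePath}`);
    }
  }
  // 5. No path may appear twice (a Set drops repeated strings)
  if (new Set(fileArr).size !== fileArr.length) {
    throw new Error('Duplicate file paths are not allowed');
  }
  return true;
}
```

Note that the duplicate check in this sketch compares path strings exactly, so two different spellings of the same file (such as a relative and an absolute path) would not be detected.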

3. The getIntersectionOfArr function is found in getIntersection.js and is invoked if all of the files in the file array are valid. This function has 3 main steps:

  • Creating an array of Sets, one per CSV file, via the readCsvFile function found in convertCSVdata.js
  • Reducing the array of Sets into a single Set of customers common to all files via the reduceUserSets function in the same file
  • Creating a final user array via the createUserObject function, which formats the data from the intersection Set into an array of user objects
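
The reduce and format steps above might look roughly like the sketch below. The function names intersectSets and toUserObjects are hypothetical stand-ins for reduceUserSets and createUserObject, and the sketch assumes each row is stored in its Set as a comma-joined lowercase string such as 'james,davis,19,alabama'; the actual storage format is not documented here.

```javascript
// Hypothetical stand-in for reduceUserSets: keeps only the rows present in every Set.
function intersectSets(sets) {
  if (sets.length === 0) return new Set();
  return sets.reduce(
    (common, current) => new Set([...common].filter((row) => current.has(row)))
  );
}

// Hypothetical stand-in for createUserObject: turns each row string into an
// object keyed by header, converting the field at ageIndex to a number.
function toUserObjects(rowSet, headers, ageIndex) {
  return [...rowSet].map((row) => {
    const fields = row.split(',');
    const user = {};
    headers.forEach((header, i) => {
      user[header] = i === ageIndex ? Number(fields[i]) : fields[i];
    });
    return user;
  });
}

// Usage with two small in-memory Sets (sample data only)
const storeA = new Set(['james,davis,19,alabama', 'ann,lee,30,ohio']);
const storeB = new Set(['emily,kim,72,north dakota', 'james,davis,19,alabama']);
const headers = ['First Name', 'Last Name', 'Age', 'State'];
console.log(toUserObjects(intersectSets([storeA, storeB]), headers, 2));
```

Storing rows as strings is what makes the Set comparison work: two separate objects or arrays with the same contents are never equal as Set members, while two identical strings are.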

4. The readCsvFile function is found in convertCSVdata.js and is invoked in the first step of getIntersectionOfArr to read a CSV file and convert its data into a Set of strings representing this data:

  • Using fs.createReadStream and readline.createInterface, it reads a CSV file line by line
  • The validateHeader helper function is invoked on the first line of the data to determine whether the fields match the expected headers
  • Subsequent lines are validated and cleaned (e.g. no empty fields, leading/trailing whitespace removed) and added to a Set in order to remove repeated rows within a file
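
A minimal sketch of this line-by-line approach follows. The function name readCsvToSet and its error messages are assumptions, not the project's code. It also assumes fields never contain quoted commas (it splits on every comma), skips blank lines, and lowercases each row, which matches the lowercase results shown earlier.

```javascript
const fs = require('fs');
const readline = require('readline');

// Hypothetical sketch of reading one CSV file into a Set of row strings.
async function readCsvToSet(filePath, expectedHeaders) {
  const rl = readline.createInterface({
    input: fs.createReadStream(filePath),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });

  const rows = new Set();
  let isHeaderLine = true;

  for await (const line of rl) {
    // Naive split: assumes no quoted fields containing commas
    const fields = line.split(',').map((field) => field.trim());

    if (isHeaderLine) {
      const headerMatches =
        fields.length === expectedHeaders.length &&
        fields.every((field, i) => field === expectedHeaders[i]);
      if (!headerMatches) {
        throw new Error(`Unexpected headers in ${filePath}: ${line}`);
      }
      isHeaderLine = false;
      continue;
    }

    if (line.trim() === '') continue; // assumption: blank lines are ignored
    if (fields.length !== expectedHeaders.length || fields.some((field) => field === '')) {
      throw new Error(`Invalid row in ${filePath}: ${line}`);
    }

    // Identical rows collapse into one entry because a Set stores unique strings
    rows.add(fields.join(',').toLowerCase());
  }

  return rows;
}
```

Reading through a stream keeps memory use proportional to the number of unique rows rather than to the raw file size, which is presumably why a line-by-line reader is used instead of loading the whole file at once.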
