GithubHelp home page GithubHelp logo

crisil_challenge_he's Introduction

CRISIL Machine-Learning Hiring Challenge-19

CRISIL Machine Learning Hiring Challenge'19 organised by Hackerearth

Extract Financial Statement Values

Financial statement analysis is critical to every organization to enable companies to make better economic decisions that yields more income in the future. For a given financial statement, a rulebook is followed to extract the values associated with accrual, audit status, balance sheet, measurement date and pension plans as shown (refer rulebook & sample data here)

Task:

  • Build an NLP model which analyzes each document, looks for relevant financial terms (described below)

  • The NLP Model should learn on Documents available under Training data folder using Rules listed here

  • Apply the same model on Test Data documents to extract relevant financial information

  • The results need to be updated & uploaded in the sheet provided: ‘Results.csv’ (download dataset)

Note: No training labels are explicitly available for this problem statement. You will be able to test your model's accuracy by submitting values extracted for Test data.

Data Description:

Columns Description of Values
Credit Name, State, Security ID, Org ID, FYE Identifiers from Documents provided to you already in “Results.csv”
Accounting Basis Identify ‘Basis of Accounting’ as [‘Accrual’, ‘Cash’, ‘Modified Accrual’, ‘Modified Cash’, ‘Regulatory’]
Pension Plan 1 Name* Pension Plan Identifier for Pension Plan with highest Total Pension Liability (0 if no date is specified)
Pension Plan 1 Measurement Date Reporting Date for Pension Plan 1 (0 if no date is specified) [DD/MM/YYYY]
Pension Plan 1 Total Pension Liability Total Pension Plan liability for Pension Plan 1 (0 if no liability is specified) [int64]
Balance Sheet Cash Total value of Balance sheet-Governmental funds under ‘Cash & Cash Equivalents’ row. This will include all other row items as well which have been highlighted in the rules for cash and cash equivalent. (0 if no balance sheet amount is specified) [int64]
Pension Plan 2 Name* Pension Plan Identifier for Pension Plan with second highest Total Pension Liability (0 if no date is specified)
Pension Plan 2 Measurement Date Reporting Date for Pension Plan 1 (0 if no date is specified) [DD/MM/YYYY]
Pension Plan 2 Total Pension Liability Total Pension Plan liability for Pension Plan 1 (0 if no date is specified) [int64]

*Note: - Map Pension Plan Names to Industry Standards as specified here

Data Volume:

Train Data: 479 Documents, 1.5GB Test Data: 98 Documents, 373.7MB Submission Format: You need to update Results.csv provided here, with your predictions and upload it online. Data Files: Download Dataset (~1.7GB)

Evaluation Metric:

[ML_Model] accuracy_score

Leaderboard score = 100 * mean(accuracy_score[columns])

[Offline] Source Code, Models/Logic used

Leaderboard

Problem Link

Leaderboard Link

Rank 5

crisil_challenge_he's People

Contributors

shivam13juna avatar

Watchers

James Cloos avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.