arjunvachhani / xlsxhelper

A memory-efficient, fast Excel file reader designed for processing large xlsx files.

C# 100.00%
dotnet dotnet-core excel excelreader xlsx xlsxreader huge-data-files huge-files fast low-memory

XlsxHelper

XlsxHelper is designed to parse large Excel files efficiently. The library is lightweight: it never loads the entire dataset into memory at once. Instead, it reads the file sequentially and returns one row per iteration, which keeps memory overhead low.
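
The one-row-per-iteration approach described above can be illustrated with a short sketch. It is not XlsxHelper's actual implementation: `StreamingRowsSketch` and `ReadRows` are hypothetical names, and the input is assumed to be simplified sheet-like XML in which each `<row>` holds `<c>` cells with `<v>` values. A real xlsx worksheet also involves a ZIP container and a shared-strings table, which this sketch ignores.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class StreamingRowsSketch
{
    // Lazily yields the <v> values of each <row> element in sheet-like XML.
    // Only the current row is held in memory, and nothing is parsed until
    // the caller asks for the next row.
    public static IEnumerable<List<string>> ReadRows(TextReader input)
    {
        using var reader = XmlReader.Create(input);
        var cells = new List<string>();
        bool inValue = false;

        while (reader.Read())
        {
            switch (reader.NodeType)
            {
                case XmlNodeType.Element when reader.LocalName == "row":
                    cells = new List<string>();
                    if (reader.IsEmptyElement) yield return cells; // <row/> has no end tag
                    break;
                case XmlNodeType.Element when reader.LocalName == "v":
                    inValue = !reader.IsEmptyElement;
                    break;
                case XmlNodeType.Text when inValue:
                    cells.Add(reader.Value);
                    break;
                case XmlNodeType.EndElement when reader.LocalName == "v":
                    inValue = false;
                    break;
                case XmlNodeType.EndElement when reader.LocalName == "row":
                    yield return cells; // hand one complete row to the caller
                    break;
            }
        }
    }
}
```

Because `ReadRows` is an iterator, a `foreach` over its result pulls one row at a time, so memory use depends on the size of a single row rather than on the number of rows in the sheet.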

When to use XlsxHelper

  • You need to process large xlsx files.
  • You want to read content with very little RAM usage.
  • You want full control over mapping and parsing fields into your model.

When to not use XlsxHelper

  • You want to read rows in random order rather than sequentially.
  • You want to read things like row/column width, font size, or color.
  • You want to read the xls file format.
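
Since the legacy xls format is out of scope, it can help to check a file's signature before opening it. The sketch below is not part of XlsxHelper; `SpreadsheetFormat` and `SpreadsheetFormatSniffer` are hypothetical names. It relies only on two well-known file signatures: xlsx files are ZIP packages (starting with `PK\x03\x04`), while xls files are OLE2 compound documents. A ZIP signature alone does not prove the file is a valid xlsx workbook, only that it is not the legacy format.

```csharp
using System.IO;

public enum SpreadsheetFormat { Unknown, ZipPackage, LegacyOle }

public static class SpreadsheetFormatSniffer
{
    // ZIP local file header signature (xlsx files are ZIP packages).
    private static readonly byte[] ZipMagic = { 0x50, 0x4B, 0x03, 0x04 };

    // OLE2 compound file signature (used by legacy xls files).
    private static readonly byte[] OleMagic =
        { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    public static SpreadsheetFormat Detect(Stream stream)
    {
        var header = new byte[8];
        int read = 0;

        // Stream.Read may return fewer bytes than requested, so loop.
        while (read < header.Length)
        {
            int n = stream.Read(header, read, header.Length - read);
            if (n == 0) break; // end of stream
            read += n;
        }

        if (StartsWith(header, read, OleMagic)) return SpreadsheetFormat.LegacyOle;
        if (StartsWith(header, read, ZipMagic)) return SpreadsheetFormat.ZipPackage;
        return SpreadsheetFormat.Unknown;
    }

    private static bool StartsWith(byte[] data, int length, byte[] magic)
    {
        if (length < magic.Length) return false;
        for (int i = 0; i < magic.Length; i++)
        {
            if (data[i] != magic[i]) return false;
        }
        return true;
    }
}
```

A caller could run `Detect` on a `FileStream` and reject `LegacyOle` files with a clear message instead of letting the reader fail on an unsupported format.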

Project Status

XlsxHelper is actively maintained. Please feel free to ask questions and raise issues.

How to get started

Install the NuGet package: https://www.nuget.org/packages/XlsxHelper/

Example 1

using (var workbook = XlsxReader.OpenWorkbook(filePath))
{
    foreach (var worksheet in workbook.Worksheets) // iterate over all worksheets
    {
        Console.WriteLine($"Worksheet {worksheet.Name}"); // name of the worksheet
        using var worksheetReader = worksheet.WorksheetReader; // reader for this worksheet
        foreach (var row in worksheetReader) // rows are returned one at a time
        {
            Console.WriteLine($"Content of row {row.RowNumber}"); // current row number
            foreach (var cell in row.Cells) // print the content of every cell
            {
                Console.WriteLine($"[{cell.CellValue} at ({cell.ColumnName}{row.RowNumber})]");
            }
            Console.WriteLine($"Content of row {row.RowNumber} ends.");
        }
    }
}

Example 2

// Requires System.Linq (for First) and System.Collections.Generic (for Dictionary).
using (var workbook = XlsxReader.OpenWorkbook(filePath))
{
    var worksheet = workbook.Worksheets.First();
    using var worksheetReader = worksheet.WorksheetReader; // disposed when done, as in Example 1
    bool headerRow = true;
    Dictionary<string, ColumnName> headerLookup = null;
    foreach (var row in worksheetReader)
    {
        if (headerRow)
        {
            headerLookup = ReadHeader(row); // read all header names from the first row
            headerRow = false;
            continue;
        }
        var student = new Student();
        // Header text must match the property names exactly for these lookups to succeed.
        student.FirstName = row[headerLookup[nameof(Student.FirstName)]].CellValue; // get cell value
        student.LastName = row[headerLookup[nameof(Student.LastName)]].CellValue;
        student.Grade = row[headerLookup[nameof(Student.Grade)]].CellValue;
        student.Marks = new Marks
        {
            Biology = row[headerLookup[nameof(Marks.Biology)]].GetInt32(),
            Chemistry = row[headerLookup[nameof(Marks.Chemistry)]].GetInt32(),
            Mathematics = row[headerLookup[nameof(Marks.Mathematics)]].GetInt32(),
            Physics = row[headerLookup[nameof(Marks.Physics)]].GetInt32()
        };

        // Process the student object.
        // yield return student;
    }

    // Maps each header text in the given row to its column name.
    static Dictionary<string, ColumnName> ReadHeader(Row row)
    {
        var headerLookup = new Dictionary<string, ColumnName>();
        foreach (var cell in row.Cells)
        {
            headerLookup.Add(cell.CellValue, cell.ColumnName);
        }
        return headerLookup;
    }
}
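
Example 2 uses `Student` and `Marks` types that are not shown. A minimal sketch of what they might look like follows. The property names come from the example; the property types are assumptions (`string` wherever `CellValue` is assigned, `int` wherever `GetInt32()` is assigned).

```csharp
// Hypothetical model types for Example 2. Only the property names are
// taken from the example; the types are assumed.
public class Marks
{
    public int Biology { get; set; }
    public int Chemistry { get; set; }
    public int Mathematics { get; set; }
    public int Physics { get; set; }
}

public class Student
{
    public string FirstName { get; set; } = "";
    public string LastName { get; set; } = "";
    public string Grade { get; set; } = "";
    public Marks Marks { get; set; } = new Marks();
}
```

Because Example 2 looks up columns with `nameof(...)`, the header row of the worksheet would need cells named exactly `FirstName`, `LastName`, `Grade`, `Biology`, `Chemistry`, `Mathematics`, and `Physics`.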

XlsxHelper is fast and lightweight for normal files. The results below show the time and memory used to read a 50 MB file with 50,000 records:

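Metric                                         | XlsxHelper | LightweightExcelReader | ExcelDataReader AsDataset | ExcelDataReader
-----------------------------------------------|------------|------------------------|---------------------------|----------------
Time to read first row                         | 5 ms       | 14 ms                  | -                         | -
Time to read all rows (50,000)                 | 3.90 sec   | 7.60 sec               | 13.50 sec                 | 10.10 sec
Memory usage at the time of reading first row  | 31.412 MB  | 32.649 MB              | -                         | 42.057 MB
Memory usage at the time of reading last row   | 38.891 MB  | 901.976 MB             | 471.662 MB                | 42.414 MB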

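XlsxHelper is memory-optimized for reading huge xlsx files. The results below are for reading 1 million employee records. In this test XlsxHelper took longer than ExcelDataReader to read all rows but used far less memory: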

Metric                                         | XlsxHelper | ExcelDataReader
-----------------------------------------------|------------|----------------
Time to read first row                         | 5 ms       | -
Time to read all rows (1,000,000)              | 102.50 sec | 61.10 sec
Memory usage at the time of reading first row  | 31.920 MB  | 668.381 MB
Memory usage at the time of reading last row   | 41.103 MB  | 667.934 MB
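
The source does not say how these figures were collected. As a rough illustration, the sketch below shows one way similar measurements could be taken for any lazily enumerated row sequence. `ReadMeasurement` and `ReadBenchmarkSketch` are hypothetical names, and `GC.GetTotalMemory` reports managed heap size only, which may differ from the memory metric used in the tables above.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public sealed class ReadMeasurement
{
    public TimeSpan TimeToFirstRow { get; set; }
    public TimeSpan TimeToAllRows { get; set; }
    public long RowCount { get; set; }
    public long ManagedBytesAtFirstRow { get; set; }
    public long ManagedBytesAtLastRow { get; set; }
}

public static class ReadBenchmarkSketch
{
    // Enumerates the sequence once, recording elapsed time and managed
    // heap size when the first row arrives and after the last row.
    public static ReadMeasurement Measure<T>(IEnumerable<T> rows)
    {
        var result = new ReadMeasurement();
        var stopwatch = Stopwatch.StartNew();

        foreach (var row in rows)
        {
            if (result.RowCount == 0)
            {
                result.TimeToFirstRow = stopwatch.Elapsed;
                result.ManagedBytesAtFirstRow = GC.GetTotalMemory(false);
            }
            result.RowCount++;
        }

        result.TimeToAllRows = stopwatch.Elapsed;
        result.ManagedBytesAtLastRow = GC.GetTotalMemory(false);
        return result;
    }
}
```

Passing a worksheet reader (or any other `IEnumerable<T>`) to `Measure` gives comparable numbers across libraries, provided each library exposes its rows lazily. A library that loads everything up front will show its full cost in the time to the first row.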
