scribelabsai / amazon-trp-node

Amazon Textract Response Parser library for Node.

License: MIT License

JavaScript 5.54% TypeScript 94.46%
aws-textract-parser typescript aws-textract aws

amazon-trp-node's Introduction

amazon-trp-node

Amazon Textract Response Parser library for TS.


This is a TS port of the Python TRP script from AWS.

Installation

npm install @scribelabsai/amazon-trp

Usage

  1. Load the blocks into a document (e.g. using GetDocumentAnalysisCommand from @aws-sdk/client-textract)
import { TextractClient, GetDocumentAnalysisCommand } from '@aws-sdk/client-textract';
import { Document } from '@scribelabsai/amazon-trp';

import type { BlockStruct } from '@scribelabsai/amazon-trp';

const client = new TextractClient();
const resp = await client.send(new GetDocumentAnalysisCommand({ JobId: 'MY_JOBID' }));
// The SDK response exposes the parsed blocks as `Blocks` (capital B)
const doc = new Document(resp.Blocks as BlockStruct[]);
  2. Do something with the document (e.g. getting tables)
doc.pages.forEach((p) => {
  p.tables.forEach((t) => {
    // Do something with the table
  });
});
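
As a slightly fuller illustration of step 2, the sketch below flattens every table into a grid of strings. It assumes that tables expose `rows`, rows expose `cells`, and cells expose `text`, as in the snippets elsewhere on this page. The `*Like` interfaces and the `tableToGrid` and `allTables` helpers are hypothetical names introduced for this example and are not part of the library.

```typescript
// Minimal structural types covering only the properties used below.
// Assumption: these mirror the shape of the parsed document; they are not
// the library's exported types.
interface CellLike { text?: string }
interface RowLike { cells: CellLike[] }
interface TableLike { rows: RowLike[] }
interface PageLike { tables: TableLike[] }
interface DocumentLike { pages: PageLike[] }

// Convert one table into a grid of trimmed cell strings (hypothetical helper).
function tableToGrid(table: TableLike): string[][] {
  const grid: string[][] = [];
  for (const row of table.rows) {
    const line: string[] = [];
    for (const cell of row.cells) {
      // Cells without text become empty strings.
      line.push((cell.text ?? '').trim());
    }
    grid.push(line);
  }
  return grid;
}

// Collect the grids of every table on every page (hypothetical helper).
function allTables(doc: DocumentLike): string[][][] {
  const grids: string[][][] = [];
  for (const page of doc.pages) {
    for (const table of page.tables) {
      grids.push(tableToGrid(table));
    }
  }
  return grids;
}
```

If the assumption about the shape holds, TypeScript's structural typing lets the `doc` from step 1 be passed straight to `allTables(doc)`.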

License

MIT, see LICENSE file.

amazon-trp-node's People

Contributors

ailinvenerus, alexbostock, dependabot[bot], ehadoux, github-actions[bot]

amazon-trp-node's Issues

Difficulty Accessing Table Content in Textract Output

I'm aware that this question may seem rather basic, but I'm facing an issue with accessing the complete content of tables in my code. In the example below, the console.log("p", p.text) successfully displays all the content of the image. However, console.log("c.text", c.text) appears to access only some elements. Could someone provide an example of how to iterate over the table properly?

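import { AnalyzeDocumentCommand } from "@aws-sdk/client-textract";
import { Document } from "@scribelabsai/amazon-trp";
import type { BlockStruct } from "@scribelabsai/amazon-trp";

// `blob` (the image bytes) and `textractClient` are created elsewhere
const params = {
  Document: {
    Bytes: blob,
  },
  FeatureTypes: ["TABLES"],
};

const command = new AnalyzeDocumentCommand(params);
try {
  const data = await textractClient.send(command);

  const doc = new Document(data.Blocks as BlockStruct[]);
  doc.pages.forEach((p) => {
    // Full text of the page
    console.log("p", p.text);
    p.tables.forEach((t) => {
      t.rows.forEach((l) => {
        console.log("l", l.cells);
        l.cells.forEach((c) => {
          // Text of a single cell
          console.log("c.text", c.text);
        });
      });
    });
  });
} catch (err) {
  // A `try` block needs a `catch` (or `finally`) to be valid
  console.error(err);
}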
I'm looking for assistance in understanding why c.text is not accessing the entire content and how to properly iterate over the table to access all its elements. Any insights or examples would be greatly appreciated.

Thanks, it's not easy to be a noob.
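
One way to cross-check what the parsed cells return is to rebuild each table directly from the raw Textract blocks. This is a minimal sketch and not the library's implementation: it follows the `CHILD` relationships from `TABLE` to `CELL` to `WORD` blocks and places each cell by its 1-based `RowIndex` and `ColumnIndex`. `RawBlock`, `childIds` and `tablesFromBlocks` are hypothetical names introduced for this example. Selection elements (checkboxes) and merged cells are ignored.

```typescript
// Subset of the Textract block fields used below (names as in the response JSON).
interface RawBlock {
  Id?: string;
  BlockType?: string;
  Text?: string;
  RowIndex?: number;
  ColumnIndex?: number;
  Relationships?: { Type?: string; Ids?: string[] }[];
}

// Ids of the blocks linked from `block` through CHILD relationships.
function childIds(block: RawBlock): string[] {
  const ids: string[] = [];
  for (const rel of block.Relationships ?? []) {
    if (rel.Type === 'CHILD') {
      for (const id of rel.Ids ?? []) ids.push(id);
    }
  }
  return ids;
}

// Rebuild every TABLE block as a grid of strings.
function tablesFromBlocks(blocks: RawBlock[]): string[][][] {
  // Index the blocks by Id so relationships can be resolved.
  const byId: { [id: string]: RawBlock } = {};
  for (const b of blocks) {
    if (b.Id) byId[b.Id] = b;
  }

  const tables: string[][][] = [];
  for (const table of blocks) {
    if (table.BlockType !== 'TABLE') continue;
    const grid: string[][] = [];
    for (const cellId of childIds(table)) {
      const cell = byId[cellId];
      if (!cell || cell.BlockType !== 'CELL') continue;
      // RowIndex and ColumnIndex are 1-based in Textract responses.
      const r = (cell.RowIndex ?? 1) - 1;
      const c = (cell.ColumnIndex ?? 1) - 1;
      // Join the WORD children of the cell into one string.
      const words: string[] = [];
      for (const wordId of childIds(cell)) {
        const word = byId[wordId];
        if (word && word.BlockType === 'WORD' && word.Text) words.push(word.Text);
      }
      // Pad rows and columns so the cell can be placed at (r, c).
      while (grid.length <= r) grid.push([]);
      const row = grid[r] as string[];
      while (row.length <= c) row.push('');
      row[c] = words.join(' ');
    }
    tables.push(grid);
  }
  return tables;
}
```

Calling `tablesFromBlocks(data.Blocks ?? [])` on the response from the snippet above and comparing the result with the logged `c.text` values should show whether text is missing from the response itself or lost during parsing.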

Publish to NPM

It looks like the last version published to NPM was v1.0.0, two years ago.
