i need to change davinci model for a new one, because it's deprecated now https://platform.openai.com/docs/deprecations
The B2LABS API support Web and Mobile Applications.
WEB DEMO: https://b2labs-textextractor-web.vercel.app/
Create .env file based on .env.example with the appropriate configurations.
you will need python installed in your machine
-
Install dependencies using pip or other package manager.
-
Run the server after installing the dependencies:
- to execute locally
python api/main.py
Refer to the .example.env file for environment variables.
i try to apply solid to this backend by made some classes and export those classes and their methods
FILE contains TextExtractor
- TextExtractor is an interface representing a strategy for extracting text.
FILE contains PyMuPDFTextExtractor
- PyMuPDFTextExtractor is a concrete implementation of the text extraction strategy using PyMuPDF.
## PyMuPDF Library:
PyMuPDF is used to open and read the PDF document (fitz.open("pdf", file_data)).
The document is processed page by page using a loop (for page_number in range(doc.page_count):).
## Text Extraction:
For each page in the PDF, the text content is extracted using page.get_text().
The extracted text from each page is concatenated to form the complete text content of the PDF.
## Error Handling:
Exception handling is implemented to capture any errors that may occur during the text extraction process.
If an error occurs, it is caught, and an error message is returned in the result dictionary ({'error': f"Error extracting text: {str(e)}"}).
FILE contains OpenAITextGenerator
- OpenAITextGenerator is a concrete implementation of the text generation strategy using OpenAI.
The OpenAITextGenerator class is responsible for interacting with the OpenAI API to generate text based on a given prompt.
The class is initialized with the OpenAI API key, which is typically kept confidential and should be stored securely.
## Generation Method:
The generate_text method of the OpenAITextGenerator class takes a prompt as input.
The prompt is constructed using the extracted text from the PDF content in the TextProcessingService class.
In the provided example, the prompt is constructed with the message "Given the following PDF text:\n{extracted_text}\nGenerate a relevant text:".
## API Request:
The OpenAI API key and prompt are used to make a request to the OpenAI API using the openai.Completion.create method.
The prompt and other relevant parameters are passed to the API, and the response is received.
FILE contains TextProcessingService
- TextProcessingService is a high-level module that uses the extracted text and generates additional text.
Initialization:
The class is initialized with instances of TextExtractor and TextGenerator.
These instances are provided through dependency injection, allowing the class to work with different implementations of text extraction and text generation.
Process Text Method:
The main method of the class is process_text(file_data).
It takes the raw content of a PDF document (file_data) as input.
Text Extraction:
It calls the extract_text method of the injected TextExtractor to extract text from the provided PDF content.
The result of text extraction is stored in the extraction_result variable.
This separation adheres to SRP and DIP, making the code more modular and adherent to SOLID principles. The TextProcessingService can easily switch between different implementations of text extraction and generation without modifying its code, making it more flexible and maintainable.
Benefit: Helps legal professionals quickly identify key terms, obligations, and legal implications. News Article Summarization:
Benefit: Provides a brief summary of news stories, enabling users to stay informed without reading every article.
FRONTEND ARE DEPLOYED AT VERCEL