Hey! Stumbled upon this project, it looks really promising, and the code is very clean

Support for File Upload about claude-api-py HOT 13 CLOSED

ashwinpathi commented on July 28, 2024

Support for File Upload

from claude-api-py.

Comments (13)

AshwinPathi commented on July 28, 2024

@Xceron I think the requests library does something under the hood that makes the Claude API reject any request sent through it, or at least that's what I've experienced so far.

That's why I wrote custom_requests.py on top of urllib, since urllib seemed to bypass whatever the issue with the requests library was (in fact, I used the exact same headers and body for urllib and requests, and requests didn't work while urllib did!).

Node.js fetch likely uses a different underlying set of headers/params/etc. I looked into that a bunch when I initially started reverse engineering, and that was my conclusion.

AshwinPathi commented on July 28, 2024

@Xceron that sounds interesting. I tried hacking at this problem as well, and I basically just made my own FormData class as an input parameter to my custom requests post method.

Form data is basically just a special body + a few extra headers on top of POST, so it shouldn't be too difficult to implement with the current framework.

I can try getting an implementation of this soon.
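
For reference, a minimal sketch of what "a special body plus a few extra headers" can look like using only the standard library. This is not the FormData class from the repo; encode_multipart and build_upload_request are hypothetical helper names, the URL used with them is a placeholder, and the orgUuid/file field names are borrowed from the requests snippet in this thread, so whether the real endpoint expects exactly these is an assumption.

```python
import urllib.request
import uuid
from typing import Dict, Tuple

# Hypothetical helper: (filename, raw bytes, content type) per file field.
FilePart = Tuple[str, bytes, str]


def encode_multipart(
    fields: Dict[str, str], files: Dict[str, FilePart]
) -> Tuple[bytes, str]:
    """Return (body, Content-Type header value) for a multipart/form-data POST."""
    # The boundary only has to be a string that does not occur in the payload.
    boundary = "----FormBoundary" + uuid.uuid4().hex
    delimiter = f"--{boundary}\r\n".encode("ascii")
    body = bytearray()

    # Plain text fields: one part each, with only a Content-Disposition header.
    for name, value in fields.items():
        body += delimiter
        body += f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode("utf-8")
        body += value.encode("utf-8") + b"\r\n"

    # File fields: same idea, plus a filename and a per-part Content-Type.
    for name, (filename, content, content_type) in files.items():
        body += delimiter
        body += (
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n\r\n"
        ).encode("utf-8")
        body += content + b"\r\n"

    # Closing delimiter: the boundary followed by two extra dashes.
    body += f"--{boundary}--\r\n".encode("ascii")
    return bytes(body), f"multipart/form-data; boundary={boundary}"


def build_upload_request(
    url: str, fields: Dict[str, str], files: Dict[str, FilePart]
) -> urllib.request.Request:
    """Build (but do not send) a urllib POST request carrying the form data."""
    body, content_type = encode_multipart(fields, files)
    headers = {"Content-Type": content_type, "Content-Length": str(len(body))}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

The request object can then be passed to urllib.request.urlopen together with whatever default headers the client already sends.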

Xceron commented on July 28, 2024

I cannot reproduce the error with the new version, seems to be fixed. Thank you very much for your work and patience!

AshwinPathi commented on July 28, 2024

Hi @Xceron, thanks for working on this. The empty attachments list is just a placeholder, and I'd like to support attachments in the future.

tbh I haven't experimented with attachments yet, so I'll look into this further and get back to you.

AshwinPathi commented on July 28, 2024

OK @Xceron, turns out it's pretty involved.

In your current code, calling open() on the file just returns a file object, not the file's contents, and that object can't be decoded, so I would fix that first.
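
A small sketch of the difference, assuming the goal is to get the raw bytes of the file before encoding them into a request body. read_file_bytes is a hypothetical helper name, not something from the repo.

```python
def read_file_bytes(file_path: str) -> bytes:
    """Return the raw contents of a file instead of the file object itself."""
    # open() alone only gives a handle; read() is what yields the bytes.
    # The context manager closes the handle even if read() raises.
    with open(file_path, "rb") as file_handle:
        return file_handle.read()
```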

However, the bigger issue is that the /api/convert_document endpoint actually uses a FormData field, which is a little hard to do with vanilla urllib. Ex.

If you can somehow make it work with the requests library or if you find a clean way to do this with urllib, I would greatly appreciate that.

I believe once you complete this, the data that /api/convert_document returns can be passed directly into the attachments list and sent through the API.

An attachment is basically just a JSON object that looks like:

{
    "file_name": FILE_NAME,
    "file_type": FILE_TYPE,
    "file_size": FILE_SIZE,
    "extracted_content": RAW_FILE_CONTENTS,
}

ex:
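
A minimal sketch of building such a dict for a plain-text file, assuming file_type is a MIME-style string and file_size is the size in bytes. Neither assumption is confirmed in this thread, and build_text_attachment is a made-up helper name, not part of claude-api-py.

```python
import os
from typing import Dict, Union


def build_text_attachment(
    file_path: str, file_type: str = "text/plain"
) -> Dict[str, Union[str, int]]:
    """Build an attachment dict with the four keys listed above (text files only)."""
    with open(file_path, "r", encoding="utf-8") as file_handle:
        content = file_handle.read()
    return {
        "file_name": os.path.basename(file_path),
        "file_type": file_type,  # assumption: a MIME-style string is accepted
        "file_size": os.path.getsize(file_path),  # assumption: size in bytes
        "extracted_content": content,
    }
```

For binary formats such as PDF the extracted text would have to come from somewhere else (per the comment above, from /api/convert_document), so this sketch does not cover them.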

Xceron commented on July 28, 2024

Hey, thanks for getting back to me!

I am kinda stuck, let me walk you through the things I did:

Imported the Claude API request to /api/convert_document/ into Postman as is and tried uploading a file -> returns a permission error for me
Then I used https://requestcatcher.com/ to see what the requests look like in the backend (too lazy to spin up a server, this is a simple solution) and sent the Postman request, including the file, to it
Then recreated the same request/file upload in Python

The result then looks like this:

    def convert_file(
            self, organization_uuid: str, file_path: str
    ) -> Optional[JsonType]:
        """Uploads a file as multipart/form-data (debug version posting to requestcatcher)."""
        payload = {"orgUuid": organization_uuid}

        # Subdomain of the requestcatcher.com inbox used to inspect the outgoing request.
        debug_request_catcher = "test"

        header = {}
        header.update(self._get_default_header())

        # Open the file in a context manager so the handle is closed after the upload.
        with open(file_path, 'rb') as file_handle:
            files = [
                ('file', (file_path, file_handle, 'application/pdf'))  # TODO: Infer mimetype from file extension
            ]
            response = requests.request(
                "POST",
                f"https://{debug_request_catcher}.requestcatcher.com/test",
                headers=header,
                data=payload,
                files=files
            )
        if not response.ok:
            return None
        return response.json()

The file uploads successfully to requestcatcher, so the upload itself works. However, if I change the endpoint to Claude, I get permission errors (just like in Postman), so I seem to be missing something. There is also a different project which reverse engineered the API in Node.js; their upload endpoint is here. However, I don't understand Node.js well enough to spot my error.
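
The TODO in the snippet above (inferring the MIME type from the file extension) could probably be handled with the standard library's mimetypes module. A minimal sketch, where guess_mime_type is a hypothetical helper name and the application/octet-stream fallback is an assumption rather than something the endpoint is known to accept:

```python
import mimetypes


def guess_mime_type(file_path: str, default: str = "application/octet-stream") -> str:
    """Guess a MIME type from the file extension, with a generic fallback."""
    # guess_type returns (type, encoding); type is None for unknown extensions.
    mime_type, _encoding = mimetypes.guess_type(file_path)
    return mime_type or default
```

The result could then replace the hard-coded 'application/pdf' in the files tuple.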

Xceron commented on July 28, 2024

This would at least explain the issues I was facing.

My current workaround is to parse the contents manually and then add the transcribed text to the message, which seems to work well enough so far.

AshwinPathi commented on July 28, 2024

Hi @Xceron, I managed to implement sending attachments. You can take a look at the changes I made in this commit: ddb21ae.

Since the requests library still doesn't work, I made my own FormData class and manually encoded files. An example use case is as follows:

# all the boilerplate setup....
client = claude_client.ClaudeClient(SESSION_KEY)
organizations = client.get_organizations()
claude_obj = claude_wrapper.ClaudeWrapper(client, organizations[0]['uuid'])

conversation_uuid = claude_obj.get_conversations()[0]['uuid']
claude_obj.set_conversation_context(conversation_uuid)

# Actual attachment sending
attachment = claude_obj.get_attachment('/some/random/attachment.pdf')
response = claude_obj.send_message("", attachments=[attachment])
print(response)

There are some dumb parts about this implementation (especially the part that checks whether or not a file is a text-based file). If you think anything can be changed, feel free to open a PR. Also, if you are up for it, maybe you can explore why requests and urllib seem to behave differently.

Let me know if this works for you.
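
On the text-vs-binary check mentioned above: one common heuristic is to sample the start of the file and see whether it contains NUL bytes or fails to decode as UTF-8. The sketch below is only that heuristic, not the check used in the commit; looks_like_text is a hypothetical name and the 4096-byte sample size is arbitrary.

```python
import codecs


def looks_like_text(file_path: str, sample_size: int = 4096) -> bool:
    """Heuristically decide whether a file is UTF-8 text by sampling its start."""
    with open(file_path, "rb") as file_handle:
        sample = file_handle.read(sample_size)

    # NUL bytes are common in binary formats and rare in text files.
    if b"\x00" in sample:
        return False

    # An incremental decoder with final=False tolerates a multi-byte character
    # that was cut off at the end of the sample, but still rejects invalid bytes.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    try:
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True
```

Files in other text encodings (UTF-16, for example) would be classified as binary by this sketch.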

Xceron commented on July 28, 2024

Hey, thanks for getting back to me! The current method does not work for me with any binary file, e.g., PDFs. I went through the code and cannot see any inherent bugs when I traced a request. The only difference between your code and a manual request seems to be the generation of the boundary. Maybe Anthropic changed this on their end? If it works for you: Are you on Linux or macOS? I am using Windows, but did not see any errors pointing to a problem with reading the file itself.

AshwinPathi commented on July 28, 2024

@Xceron I've tested on macOS. Do you get a 403 or 500 status code? Also, I think the actual boundary text shouldn't matter as long as you place it in the right locations.

macOS, for example, might produce a boundary that looks like ----Webkitxxxxxxxxxx...., whereas the requests library's boundary looks like a hash.

I'll take a closer look.
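
To illustrate the point that the boundary text itself is arbitrary, here is a small sketch that encodes the same file part with two different boundaries and parses both back with the standard library's email package. The helper names and both boundary strings are made up for the illustration; real clients generate their own.

```python
import email


def build_body(boundary: str, filename: str, content: bytes) -> bytes:
    """Encode a single file part, delimited by the given boundary."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return head + content + tail


def parse_file_part(boundary: str, body: bytes) -> bytes:
    """Parse the body back, given the boundary announced in the Content-Type header."""
    header = f"Content-Type: multipart/form-data; boundary={boundary}\r\n\r\n"
    message = email.message_from_bytes(header.encode("ascii") + body)
    first_part = message.get_payload()[0]
    return first_part.get_payload(decode=True)


# Two differently styled (made-up) boundaries carrying the same payload.
browser_style = "----WebKitFormBoundaryabc123"
hash_style = "0123456789abcdef0123456789abcdef"
payload = b"hello world"

from_browser_style = parse_file_part(browser_style, build_body(browser_style, "a.txt", payload))
from_hash_style = parse_file_part(hash_style, build_body(hash_style, "a.txt", payload))
print(from_browser_style == from_hash_style == payload)
```

What has to match is the boundary in the header and the one used in the body; a mismatch between the two is a more likely culprit than the boundary's style.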

Xceron commented on July 28, 2024

I don't get any error at all, because I end up in this branch:

claude-api-py/claude/custom_requests.py

Line 224 in 713e045

return Response(ok=False, data=b"")

AshwinPathi commented on July 28, 2024

@Xceron I added some additional fields to the Response message, but on my Mac (and on a Linux computer I've also tested this on), it seems to work. I'm not too sure what could be going on without getting detailed logs.

Maybe as a start, you can change the response handling code to always raise the error so you can check out what's going on. I'd go from there to figure out what might be missing.

claude-api-py/claude/custom_requests.py

Line 215 in 713e045

 def _safe_request_read(request: Request, data: Optional[bytes] = None) -> Response: 

If you need any more information from me, let me know.
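
A rough sketch of that debugging step, assuming the failure surfaces as a urllib.error.HTTPError somewhere inside the request helper. It is not the actual body of _safe_request_read; describe_http_error and read_or_raise are hypothetical names.

```python
import urllib.error
import urllib.request
from typing import Optional


def describe_http_error(err: urllib.error.HTTPError, max_body: int = 500) -> str:
    """Summarize status code, reason and the start of the error body."""
    body = err.read().decode("utf-8", errors="replace")
    return f"HTTP {err.code} {err.reason}: {body[:max_body]}"


def read_or_raise(
    request: urllib.request.Request,
    data: Optional[bytes] = None,
    timeout: float = 30.0,
) -> bytes:
    """Return the response bytes, re-raising HTTP errors with their details."""
    try:
        with urllib.request.urlopen(request, data=data, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        # Surface the status and body instead of swallowing them.
        raise RuntimeError(describe_http_error(err)) from err
```

Knowing whether the server answers 403, 500 or something else, and what the body says, should narrow things down a lot.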

AshwinPathi commented on July 28, 2024

@Xceron Cool, glad to see it worked!
