
Comments (13)

AshwinPathi commented on July 28, 2024

@Xceron I think the requests library does something under the hood that makes the Claude API reject any request sent through it, or at least that's what I've experienced so far.

That's why I wrote the custom_requests.py library on top of urllib, since it seemed to bypass whatever the issue with the requests library was (in fact, I used the exact same headers and body for urllib and requests, and requests didn't work while urllib did!).

Node.js fetch likely uses a different underlying set of headers/params/etc. I looked into that a bunch when I initially started reverse engineering, and that was my conclusion.

from claude-api-py.

AshwinPathi commented on July 28, 2024

@Xceron that sounds interesting. I tried hacking at this problem as well, and I basically just made my own FormData class as an input parameter to my custom requests post method.

Form data is basically just a special body + a few extra headers on top of POST, so it shouldn't be too difficult to implement with the current framework.

I can try getting an implementation of this soon.
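
To illustrate the "special body + a few extra headers" point, here is a minimal sketch of multipart/form-data encoding with only the standard library. It is not the FormData class from this repository; `encode_multipart` is a hypothetical helper, the URL is a placeholder, and the `orgUuid` and `file` field names are borrowed from the requests snippet elsewhere in this thread.

```python
import uuid
import urllib.request


def encode_multipart(fields, files):
    """Build a multipart/form-data body.

    fields: dict of name -> str value
    files: dict of name -> (filename, content_bytes, mime_type)
    Returns (body_bytes, content_type_header_value).
    """
    # Any token works as a boundary as long as it does not occur in the payload.
    boundary = "----FormBoundary" + uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(f"--{boundary}".encode())
        parts.append(f'Content-Disposition: form-data; name="{name}"'.encode())
        parts.append(b"")
        parts.append(value.encode("utf-8"))
    for name, (filename, content, mime_type) in files.items():
        parts.append(f"--{boundary}".encode())
        parts.append(
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"'.encode()
        )
        parts.append(f"Content-Type: {mime_type}".encode())
        parts.append(b"")
        parts.append(content)
    # The closing delimiter carries two extra dashes.
    parts.append(f"--{boundary}--".encode())
    parts.append(b"")
    body = b"\r\n".join(parts)
    return body, f"multipart/form-data; boundary={boundary}"


body, content_type = encode_multipart(
    {"orgUuid": "00000000-0000-0000-0000-000000000000"},  # placeholder value
    {"file": ("example.txt", b"hello world", "text/plain")},
)

# Build (but do not send) a POST request against a placeholder URL.
request = urllib.request.Request(
    "https://example.invalid/upload",
    data=body,
    headers={"Content-Type": content_type},
    method="POST",
)
```

The same boundary string has to appear in the Content-Type header and in every delimiter line of the body; its exact text is otherwise arbitrary.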


Xceron commented on July 28, 2024

I cannot reproduce the error with the new version, seems to be fixed. Thank you very much for your work and patience!


AshwinPathi commented on July 28, 2024

Hi @Xceron, thanks for working on this. The empty attachments list is just a placeholder, and I'd like to support attachments in the future.

tbh I haven't experimented with attachments yet, so I'll look into this further and get back to you.


AshwinPathi commented on July 28, 2024

ok @Xceron turns out it's pretty involved.

In your current code, calling open() on the file just returns a file object rather than the file's contents, and that object can't be decoded, so I would fix that first.

However, the bigger issue is that the /api/convert_document endpoint actually uses a FormData field, which is a little hard to do with vanilla urllib.

If you can somehow make it work with the requests library or if you find a clean way to do this with urllib, I would greatly appreciate that.

I believe once you complete this, the data that /api/convert_document returns can be passed directly into the attachments list and sent through the API.

An attachment is basically just a JSON object that looks like:

{
    "file_name": FILE_NAME,
    "file_type": FILE_TYPE,
    "file_size": FILE_SIZE,
    "extracted_content": RAW_FILE_CONTENTS,
}
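
For a plain-text file, a dictionary of this shape could be assembled locally along these lines. This is only a sketch: `build_text_attachment` is a hypothetical helper, and the thread does not say what value the server expects in `file_type`, so a MIME string is assumed here.

```python
import os
import tempfile


def build_text_attachment(file_path, file_type="text/plain"):
    """Read a UTF-8 text file and wrap it in the attachment shape shown above."""
    with open(file_path, "r", encoding="utf-8") as handle:
        content = handle.read()
    return {
        "file_name": os.path.basename(file_path),
        "file_type": file_type,  # assumption: a MIME string is acceptable here
        "file_size": os.path.getsize(file_path),  # size on disk, in bytes
        "extracted_content": content,
    }


# Usage with a throwaway file.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("hello attachment")
attachment = build_text_attachment(tmp.name)
os.remove(tmp.name)
```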



Xceron commented on July 28, 2024

Hey, thanks for getting back to me!

I am kinda stuck, let me walk you through the things I did:

  • Imported the Claude API request to /api/convert_document/ into Postman as is and tried uploading a file -> returns a permission error for me
  • Then I used https://requestcatcher.com/ to see what the requests look like on the backend (too lazy to spin up a server, this is a simple solution) and sent the Postman request, including the file, to it
  • Then recreated the same request/file upload in Python

The result then looks like this:

    def convert_file(
            self, organization_uuid: str, file_path: str
    ) -> Optional[JsonType]:
        """Uploads a file"""
        payload = {"orgUuid": organization_uuid}

        debug_request_catcher = "test"

        header = {}
        header.update(self._get_default_header())
        # Open the file in a context manager so the handle is closed after the upload.
        with open(file_path, 'rb') as file_handle:
            files = [
                ('file', (file_path, file_handle, 'application/pdf'))  # TODO: Infer mimetype from file extension
            ]
            response = requests.request(
                "POST",
                f"https://{debug_request_catcher}.requestcatcher.com/test",
                headers=header,
                data=payload,
                files=files
            )
        if not response.ok:
            return None
        return response.json()

The file uploads successfully to requestcatcher, so the upload itself works. However, if I change the endpoint to Claude, I get permission errors (just like in Postman), so I seem to be missing something. There is also a different project which reverse engineered the API in Node.js; their upload endpoint is here. However, I don't understand Node.js well enough to spot my error.
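
The TODO in the snippet above (inferring the MIME type from the file extension) could be handled with the standard library's `mimetypes` module. A minimal sketch, where `guess_mime_type` and its fallback value are illustrative choices rather than anything from the project:

```python
import mimetypes


def guess_mime_type(file_path, default="application/octet-stream"):
    """Guess a MIME type from the file extension, with a generic fallback."""
    mime_type, _encoding = mimetypes.guess_type(file_path)
    return mime_type or default
```

The guess is based on the file name only, so a mislabeled file will still get the type its extension suggests.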


Xceron commented on July 28, 2024

This would at least explain the issues I was facing.

My current workaround is to parse the contents manually and then add the extracted text to the message, which seems to work well enough so far.


AshwinPathi commented on July 28, 2024

Hi @Xceron , I managed to implement sending attachments. You can take a look at the changes I made in this commit: ddb21ae.

Since the requests library still doesn't work, I made my own FormData class and manually encoded files. An example use case is as follows:

# all the boilerplate setup....
client = claude_client.ClaudeClient(SESSION_KEY)
organizations = client.get_organizations()
claude_obj = claude_wrapper.ClaudeWrapper(client, organizations[0]['uuid'])

conversation_uuid = claude_obj.get_conversations()[0]['uuid']
claude_obj.set_conversation_context(conversation_uuid)

# Actual attachment sending
attachment = claude_obj.get_attachment('/some/random/attachment.pdf')
response = claude_obj.send_message("", attachments=[attachment])
print(response)

There are some dumb parts about this implementation (especially the part about checking whether or not a file is a text based file). If you think anything can be changed, feel free to open a PR. Also, if you are up for it, maybe you can explore why requests vs urllib seems to have different behavior.

Let me know if this works for you.
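
On the text-file check mentioned above: one possible heuristic, not the one used in the commit, is to sample the start of the file, reject anything containing NUL bytes, and see whether the rest decodes as UTF-8. `looks_like_text` and the 4096-byte sample size are assumptions made for this sketch.

```python
import codecs


def looks_like_text(data, sample_size=4096):
    """Heuristically decide whether raw bytes are UTF-8 text."""
    sample = data[:sample_size]
    # NUL bytes are common in binary formats and very rare in text files.
    if b"\x00" in sample:
        return False
    # An incremental decoder with final=False tolerates a multi-byte
    # character that was cut off at the end of the sample.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True
```

Files in other text encodings (UTF-16, for instance) would be classified as binary by this check, so it is a starting point rather than a complete answer.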


Xceron commented on July 28, 2024

Hey, thanks for coming back to me! The current method does not work for me with any binary file, e.g. PDFs. I went through the code and traced a request, but could not see any inherent bugs. The only difference between your code and a manual request seems to be how the boundary is generated. Maybe Anthropic changed this on their end? If it works for you: are you on Linux or macOS? I am using Windows, but did not see anything indicating a problem with the file reading itself.


AshwinPathi commented on July 28, 2024

@Xceron I've tested on macOS. Do you get a 403 or 500 status code? Also, I think the actual boundary text shouldn't matter as long as you place it in the right locations.

A request sent from a browser on macOS, for example, might have a boundary that looks like ----Webkitxxxxxxxxxx.... but the requests library boundary looks like a hash.

I'll take a closer look.


Xceron commented on July 28, 2024

I don't get any error at all, because I end up hitting this line:

return Response(ok=False, data=b"")


AshwinPathi commented on July 28, 2024

@Xceron I added some additional fields to the Response message, but on my Mac (and on a Linux computer I've also tested this on), it seems to work. I'm not too sure what could be going on without getting detailed logs.

Maybe as a start, you can change the response handling code so that it always raises the error instead of swallowing it, so you can check out what's going on. I'd go from there to figure out what might be missing.

def _safe_request_read(request: Request, data: Optional[bytes] = None) -> Response:

If you need any more information from me, let me know.
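
As a rough illustration of that debugging approach, the sketch below surfaces the HTTP status and response body instead of returning an empty result. It is not the library's `_safe_request_read`; `read_or_raise` and its `opener` parameter are hypothetical names used only for this example.

```python
import urllib.error
import urllib.request


def read_or_raise(request, opener=urllib.request.urlopen):
    """Send the request and raise a descriptive error on an HTTP failure."""
    try:
        with opener(request) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        # Keep the status code and the start of the body, which usually
        # says why the server rejected the request.
        detail = err.read().decode("utf-8", errors="replace")
        raise RuntimeError(
            f"HTTP {err.code} {err.reason}: {detail[:500]}"
        ) from err
```

Connection-level failures (`urllib.error.URLError`) are deliberately left uncaught here so they propagate with their original traceback.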


AshwinPathi commented on July 28, 2024

@Xceron Cool, glad to see it worked!

