artsy / metaphysics

Artsy's GraphQL API

License: MIT License

JavaScript 15.90% Shell 0.08% TypeScript 83.98% Dockerfile 0.05% Jinja 0.01%
graphql node expressjs typescript

metaphysics's Introduction

Metaphysics

Metaphysics is a GraphQL-compliant API that wraps various Artsy APIs. You can try it here against our staging API.

It is built on express, express-graphql, and graphql, with graphiql providing a sandbox to work with.

It is currently used in production throughout Artsy.net and the Artsy iOS app.

Getting Setup

To get yourself set up with all the project's dependencies:

git clone https://github.com/artsy/metaphysics
cd metaphysics

# Run the setup script
./scripts/setup.sh

This will pull the environment variables from AWS into .env.shared. It will also overwrite .env with the values in .env.example. If you need to override any of these values or add new .env values, place them in the .env file.

Development

With your dependencies set up, you can run Metaphysics by running:

yarn start

This will start the server on http://localhost:3000.

Before starting Hokusai, be sure that memcached is no longer running. You can stop it by running:

brew services stop memcached

Recommended: You can run the commands inside the terminal in VS Code, then the debugger will be hooked up by default.

Setting up your local GraphiQL

We recommend the graphiql.app client for testing queries locally.

You will need to set up headers with both:

  • x-access-token - Open https://staging.artsy.net, sign in and evaluate sd.CURRENT_USER.accessToken in a dev console (CMD+Shift+C in Chrome).
  • x-user-id - As above, but sd.CURRENT_USER.id.

If you're new to GraphQL, you can check out Artsy's GraphQL Workshop.

For GraphQL Endpoint, set it to http://localhost:3000/v2.

Note that /v2 is the default and /v1 has been fully deprecated and removed.
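
The same header setup can be exercised from a script instead of a GraphiQL client. The following is a minimal sketch, not part of Metaphysics itself: `buildLocalRequest` is a hypothetical helper, and it assumes the local server accepts a conventional GraphQL POST with a JSON body.

```typescript
// Hypothetical helper (not part of Metaphysics): assembles a GraphQL POST
// for a local server, using the two headers described above.
interface GraphQLRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildLocalRequest(
  query: string,
  accessToken: string, // value of sd.CURRENT_USER.accessToken
  userId: string, // value of sd.CURRENT_USER.id
  endpoint: string = "http://localhost:3000/v2"
): GraphQLRequest {
  return {
    url: endpoint,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "x-access-token": accessToken,
      "x-user-id": userId,
    },
    body: JSON.stringify({ query }),
  };
}

const request = buildLocalRequest("{ me { name email } }", "<token>", "<user id>");
// To send it (fetch is global in Node 18+):
// fetch(request.url, { method: request.method, headers: request.headers, body: request.body })
```

Keeping the request construction separate from the network call makes it easy to inspect the headers when a query is rejected.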

Introspection Setup

Getting docs for the schema on MP in your playground of choice (Postman, Insomnia, Altair, etc) is called introspection.

Introspection is available by default when developing.

Introspection on staging and production is for internal use only. Artsy devs can use it to make development for MP clients (eigen, force, etc) easier, but it is not, and should not be, used by the clients themselves or by anyone else.

In order to set this up in your playground of choice (Postman, Insomnia, Altair, etc), you need to send the following header:

Authorization: Bearer <secret>

and replace <secret> with the value you get from hokusai using

hokusai staging env get INTROSPECT_TOKEN
hokusai production env get INTROSPECT_TOKEN

or the contents of Metaphysics INTROSPECT_TOKEN in 1Password.
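
For reference, this is roughly what a playground sends once the header is configured. It is a sketch only: `buildIntrospectionRequest` is a hypothetical name, the query is a deliberately tiny introspection query rather than the full one a playground issues, and the endpoint URL is left to the caller because the staging and production URLs are not given here.

```typescript
// Hypothetical helper: builds the headers and body of an introspection
// request that carries the Authorization header described above.
function buildIntrospectionRequest(secret: string): {
  headers: Record<string, string>;
  body: string;
} {
  // __schema is the standard GraphQL introspection entry point.
  const query = "{ __schema { queryType { name } } }";
  return {
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${secret}`,
    },
    body: JSON.stringify({ query }),
  };
}

const introspection = buildIntrospectionRequest("<secret>");
```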

Sample Queries

Once you have the GraphiQL client running against your local service, you can verify things are working by executing these queries:

Get your account information

{
  me {
    name
    email
  }
}

If any of these queries fail, it's probable that you misconfigured your x-access-token or x-user-id HTTP headers.

Docs

Docker and Kubernetes setup

This is deployed using Hokusai to manage Docker and Kubernetes. To replicate this:

  • Install Docker for Mac and Hokusai

    $ brew install --cask docker
    $ pip install hokusai

    If you are using your system Python distribution, you may need to run this as:

    $ sudo pip install hokusai --ignore-installed
  • Configure Hokusai

    export AWS_ACCESS_KEY_ID={{ MY_AWS_ACCESS_KEY_ID }}
    export AWS_SECRET_ACCESS_KEY={{ MY_AWS_SECRET_ACCESS_KEY }}
    hokusai configure --kubectl-version {{ kubectl_version }} --s3-bucket {{ kubectl_config_s3_bucket }} --s3-key {{ kubectl_config_s3_key }}
    hokusai check

    Artsy staff should follow the instructions in https://github.com/artsy/potential/blob/main/platform/Kubernetes.md#hokusai

  • Start the server

    hokusai dev start

Testing

  • Run tests in the Docker Compose test stack via Hokusai:

    hokusai test
  • Or, to run tests locally, use npm test to run the entire suite, or npm run watch to spin up the test watcher.

About Artsy

This project is the work of engineers at Artsy, the world's leading and largest online art marketplace and platform for discovering art. One of our core Engineering Principles is being Open Source by Default which means we strive to share as many details of our work as possible.

You can learn more about this work from our blog and by following @ArtsyOpenSource or explore our public data by checking out our API. If you're interested in a career at Artsy, read through our job postings!

metaphysics's People

Contributors

alloy, anandaroop, artsy-peril[bot], artsyit, ashfurrow, ashkan18, broskoski, craigspaeth, damassi, dblandin, ds300, dzucconi, eessex, izakp, joeyaghion, jonallured, kierangillen, mbilokonsky, mdole, mounirdhahri, mzikherman, olerichter00, orta, oxaudo, peril-staging[bot], renovate-bot, sepans, starsirius, sweir27, zephraph

metaphysics's Issues

We should paginate bidders in causality JWT

The key here was that if a user is registered for more than 10 auctions, then we didn't get the correct registration status, which put the user in a pending state of registration.

auth token should pass through

The unresolved issue I see here is authentication + caching. Ideally:

  • An X-AUTH-TOKEN is passed through the request
  • Metaphysics maintains an admin token to have the data it needs to show ui-specific computed attributes (i.e. show_carousel).

First thing that comes to mind is whitelisting attributes based on what is returned by a non-admin, and allowing those + whatever is computed to be returned in the response. We ideally don't want to deal with two sets of caches per admin and non-admin (and whatever else comes down the line). There may be some cases that I am overlooking here (can_download_image for example).
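
The whitelisting idea could look roughly like the sketch below. Every name in it is hypothetical, and it assumes the set of keys visible to a non-admin is known ahead of time, which the issue leaves open.

```typescript
// Sketch of the whitelisting idea. All names here are hypothetical.
type Attributes = Record<string, unknown>;

function whitelistAttributes(
  adminRecord: Attributes, // fetched (and cached) once with the admin token
  publicKeys: readonly string[], // keys a non-admin response is known to contain
  computed: Attributes // UI-specific values derived from the admin data
): Attributes {
  const result: Attributes = {};
  for (const key of publicKeys) {
    if (key in adminRecord) result[key] = adminRecord[key];
  }
  // Computed attributes are returned alongside the whitelisted ones.
  return { ...result, ...computed };
}

const visible = whitelistAttributes(
  { id: "some-artwork", title: "Untitled", internal_notes: "admin only" },
  ["id", "title"],
  { show_carousel: true }
);
// visible contains id, title and show_carousel, but not internal_notes.
```

With this shape only one cache (the admin one) is needed; the filtering happens per response.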

Force queries incompatible with current schema.

After my change that uses the ID! type in almost all places instead of String!, the queries that Force produces are no longer accepted. This is because Force uses a ‘query’ function that specifies the type of the argument, which is still of type String!.

The solution is to either revert the change from String! to ID! or update Force to use ID! as well.

I feel like ID! is clearer about the value of the field, but it can also be argued that the naming of the field conveys this enough as well: id, artist_id, etc.

What are your thoughts?

Specify time zone defaults.

Regarding the exhibition_period field added in #226, and dates in general, we should make it possible to return these taking the user’s time zone into account. My thoughts are:

Aggregate sale data from Gravity and Causality

As part of the end-to-end integration plan for live auctions, we want to have clients bootstrap their data by querying Metaphysics. Ideally, we can have Metaphysics interpret the query and request the definitive Sale and Artwork data from Gravity, and then join in Causality lot data. When we add auth, a JWT will need to be forwarded to Causality to authorize the query.

API inconsistency

I found this to be a bit unintuitive:

{
  artist(id: "banksy") {
    counts {
      artworks
      for_sale_artworks
    }
  }
}

Because Artist.artworks takes a filter parameter, I would have expected the counts to take the same filter parameter:

{
  artist(id: "banksy") {
    counts {
      for_sale: artworks(filter: IS_FOR_SALE)
      not_for_sale: artworks(filter: IS_NOT_FOR_SALE)
    }
  }
}

Opinions on me PRing that?

Configurable caching

Currently we refresh on every request in the background. This could be made configurable to refresh on every Nth request, so it can be tuned to ease up on Gravity if need be.

Meanwhile, the cache hangs around indefinitely. Should there be an expiration? I don't really think so, but maybe there is an argument in favor.

Also, we should ensure Redis is set up to appropriately evict keys when it's full: http://redis.io/topics/lru-cache
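
A refresh-every-Nth-request policy could be sketched as below. `RefreshPolicy` is a hypothetical name, and the sketch only decides when to refresh; wiring it into the actual cache and the Gravity fetch is left out.

```typescript
// Hypothetical policy object: decides when a cached key should be refreshed
// in the background. With n = 1 it behaves like "refresh on every request".
class RefreshPolicy {
  private readonly n: number;
  private readonly hits = new Map<string, number>();

  constructor(n: number) {
    if (!Number.isInteger(n) || n < 1) {
      throw new RangeError("n must be a positive integer");
    }
    this.n = n;
  }

  // Records a cache hit and returns true on every nth hit for that key.
  shouldRefresh(key: string): boolean {
    const count = (this.hits.get(key) ?? 0) + 1;
    this.hits.set(key, count);
    return count % this.n === 0;
  }
}

const policy = new RefreshPolicy(3);
const decisions = [1, 2, 3, 4, 5, 6].map(() => policy.shouldRefresh("artist/banksy"));
// decisions is [false, false, true, false, false, true]
```

On the eviction point, Redis controls this through its `maxmemory` and `maxmemory-policy` settings (for example `allkeys-lru`), as described in the linked page.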

Better use of available data instead of making unnecessary requests

In many cases we blindly fetch an object from Gravity instead of using an already available embedded result that might be present. Currently we sometimes "optimize" for this by passing shallow: true where we can be sure of an existing embedded result:

args: {
  shallow: {
    type: GraphQLBoolean,
    description: 'Use whatever is in the original response instead of making a request',
  },
},
resolve: ({ artist }, { shallow }) => {
  if (!artist) return null;
  if (shallow) return artist;
  return gravity(`artist/${artist.id}`)
    .catch(() => null);
},

However it's likely possible to make this optimization automatically by inspecting the query and available data.
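
One way to approach the automatic version is to compare the fields the query asks for with what the embedded object already contains. The sketch below assumes the requested field names have already been extracted (in graphql-js they can be read from the resolver's `info.fieldNodes`); `canUseEmbedded` is a hypothetical helper.

```typescript
// Hypothetical helper: reports whether an embedded object already holds every
// requested field, in which case the extra request could be skipped.
function canUseEmbedded(
  embedded: Record<string, unknown> | null | undefined,
  requestedFields: readonly string[]
): boolean {
  if (!embedded) return false;
  return requestedFields.every((field) => embedded[field] !== undefined);
}

const embeddedArtist = { id: "banksy", name: "Banksy" };
const skipFetch = canUseEmbedded(embeddedArtist, ["id", "name"]); // true
const needsFetch = !canUseEmbedded(embeddedArtist, ["id", "birthday"]); // true
```

A resolver could then fall back to the Gravity request only when this check fails, which makes the explicit shallow argument unnecessary in the common case.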

Persistent queries on production to mitigate against DOSing/crawling and for efficiency.

From a talk given by Nick Schrock (a co-creator of GraphQL):

It would seem from his responses that they are not using persistent queries as a way to mitigate DOSing, but he also mentions that they don’t use it for HTTP caching, so I’m unsure what they are using it for. (Possibly for cutting down on POST data?)

He mentions that Facebook might release their strategy/implementation for persistent queries, but that has not been done yet afaik, or at least I couldn’t find anything on it.


My naive thoughts on how/what we could do:

  • Allow any dynamic querying in staging.
  • Store IDs for queries, that our clients need, on production and only allow those to be performed. (Thus preventing crawling/DOSing that way.)
  • In addition, only POSTing an ID to the service will also cut down on data being sent over the wire. (Which can be a big win for mobile devices.)

The IDs could maybe just be a hash of the query? That would make it possible to have Relay (custom client) middleware on the Artsy clients that generates these IDs automatically based on the query, rather than having to store those IDs in the clients. Some issues with this are:

  • Parameters that may be dynamic, such as artist ID/slug, should not be made a part of the query, but be interpolated.
  • But maybe not all parameters should be interpolatable, e.g. page size. (Or maybe just set a max?)
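
The hash-as-ID idea can be sketched as follows. This is an illustration under assumptions, not Facebook's or Artsy's implementation: SHA-256 is an arbitrary choice of hash, and `PersistedQueries` is a hypothetical registry. Dynamic values travel as GraphQL variables so they stay out of the hashed text.

```typescript
import { createHash } from "node:crypto";

// Derives a stable ID from the query text, so clients and server can compute
// the same ID independently.
function queryId(query: string): string {
  return createHash("sha256").update(query).digest("hex");
}

// Hypothetical production-side registry of the queries clients may run.
class PersistedQueries {
  private readonly queries = new Map<string, string>();

  register(query: string): string {
    const id = queryId(query);
    this.queries.set(id, query);
    return id;
  }

  // Returns the stored query text, or undefined for an unknown ID, which the
  // server would reject.
  lookup(id: string): string | undefined {
    return this.queries.get(id);
  }
}

const registry = new PersistedQueries();
// The artist slug is a variable, so it is not part of the hashed text.
const artistQuery = "query Artist($id: String!) { artist(id: $id) { name } }";
const artistQueryId = registry.register(artistQuery);
// A client would then POST { id: artistQueryId, variables: { id: "banksy" } }.
```

A limit on values such as page size would be enforced separately, when the variables are validated.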

cropped image url always tries to crop from large image

https://github.com/artsy/metaphysics/blob/master/schema/image/cropped.js#L11-L14

This should be able to determine the largest available version and crop that one. Lots of profile cover images for the partner page don't have a large version, and so trying to request the cropped url from metaphysics results in a nonexistent image url, like

background-image: url(https://i.embed.ly/1/display/crop?url=https%3A%2F%2Fd32dm0rphc51dk.cloudfront.net%2F0605qKOomA-mNhx61j9vDA%2Flarge.jpg&width=400&height=300&key=a1f82558d8134f6cbebceb9e67d04980&quality=95);

(see its url param above: url=https%3A%2F%2Fd32dm0rphc51dk.cloudfront.net%2F0605qKOomA-mNhx61j9vDA%2Flarge.jpg)

and see the actual profile data:
https://api.artsy.net/api/v1/profile/kristina-parsons
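
Selecting the largest version that actually exists could look like the sketch below. The version names other than large, and their ordering, are assumptions for illustration; `largestAvailableVersion` is a hypothetical helper, not the code in cropped.js.

```typescript
// Assumed ordering from largest to smallest; only "large" appears in the
// issue above, the other names are placeholders.
const SIZE_ORDER: readonly string[] = ["large", "medium", "small"];

// Hypothetical helper: returns the largest version present in `available`,
// or undefined when none of the known versions exist.
function largestAvailableVersion(
  available: readonly string[],
  sizeOrder: readonly string[] = SIZE_ORDER
): string | undefined {
  return sizeOrder.find((version) => available.includes(version));
}

const coverVersion = largestAvailableVersion(["small", "medium"]); // "medium"
```

The crop URL would then be built from that version instead of always assuming large.jpg.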

Log/surface errors

This thread: https://github.com/artsy/force-private/pull/3774#discussion_r44986905

Basically, I think we should be logging whenever leaf fields error (for a particular query). That way, instead of waiting to notice that something is broken or missing somewhere down the line, we can stay more on top of inconsistencies in our data or API responses, etc.

Maybe those queries should actually return non-successful status codes and it can be logged in NewRelic? If we want them to still be successful in prod, maybe we can have some middleware that sends any errors and the related queries to NewRelic or wherever.
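
The middleware variant might be sketched like this. The response shape follows the GraphQL spec (an optional `errors` list whose entries have a `message` and an optional `path`); `reportErrors` and the reporter callback are hypothetical, and the reporter stands in for a logger or a NewRelic client.

```typescript
// Minimal shape of a GraphQL response as far as error reporting is concerned.
interface GraphQLErrorLike {
  message: string;
  path?: ReadonlyArray<string | number>;
}
interface GraphQLResponse {
  data?: unknown;
  errors?: ReadonlyArray<GraphQLErrorLike>;
}
interface ErrorReport {
  query: string;
  message: string;
  path: string;
}

// Hypothetical hook: forwards each error, together with the query that caused
// it, to a reporter and returns the response unchanged.
function reportErrors(
  query: string,
  response: GraphQLResponse,
  report: (entry: ErrorReport) => void
): GraphQLResponse {
  for (const error of response.errors ?? []) {
    report({ query, message: error.message, path: (error.path ?? []).join(".") });
  }
  return response;
}
```

Because the response passes through untouched, queries can keep succeeding in production while the errors are still surfaced.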

Tune memory usage

Memory usage generally tends to stay within the bounds of Heroku's Standard-1X dyno limits, but occasionally there is some kind of runaway spike. This doesn't appear to correlate at all with throughput/reqs, which were flat during that last spike.

[deploy] Send notifications to Slack

We need Semaphore to send deploy notifications to #platform-alerts #platform-machines (and possibly #web, #gmv-ios-dev) so that:

  • others can always know that a deploy occurred
  • people looking at #platform-alerts can correlate deploys to 🔥

Re-create the Semaphore integration

Since this repo went OSS, CI build results should ideally be public.

The only way to do that is to re-create the project. I don't want to do this personally, as this is the first time I've used Semaphore and I have no idea how things are set up.

[Home] Recommended can contain duplicate entry.

For the following query:

{
  home_page {
    module(key: "recommended_works") {
      results {
        id
      }
    }
  }
}

I get the following response, where the first and last results entries are the same:

{
  "data": {
    "home_page": {
      "module": {
        "results": [
          {
            "id": "herb-ritts-brigitte-nielson-standing-malibu"
          },
          {
            "id": "sandro-miller-william-klein-slash-smoke-and-veil-paris-vogue-1958"
          },
          {
            "id": "carmen-mitrotta-call-to-faith-no-3"
          },
          {
            "id": "hermann-mejia-suit-2"
          },
          {
            "id": "ulay-untitled-1"
          },
          {
            "id": "kaws-guess-ad-disruption"
          },
          {
            "id": "odile-richer-toi-ma-beaute-baroque-slash-you-are-my-baroque-beauty"
          },
          {
            "id": "mary-ellen-mark-beautiful-emine-psoing-trabzon-turkey"
          },
          {
            "id": "lillian-bassman-the-v-back-evenings-dress-by-trigere-suzy-parker-new-york-harpers-bazaar-july-1955"
          },
          {
            "id": "herb-ritts-brigitte-nielson-standing-malibu"
          }
        ]
      }
    }
  }
}

I would expect the entries to be unique. I filed a Gravity ticket for this https://github.com/artsy/gravity/issues/10210.
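
Wherever the duplicate originates, a consumer of the results could guard against it with a small filter. This is a sketch of such a guard, not the fix proposed in the Gravity ticket; `uniqueById` is a hypothetical helper.

```typescript
// Hypothetical guard: keeps the first occurrence of each id and preserves
// the original order.
function uniqueById<T extends { id: string }>(items: readonly T[]): T[] {
  const seen = new Set<string>();
  return items.filter((item) => {
    if (seen.has(item.id)) return false;
    seen.add(item.id);
    return true;
  });
}

const results = [
  { id: "herb-ritts-brigitte-nielson-standing-malibu" },
  { id: "ulay-untitled-1" },
  { id: "herb-ritts-brigitte-nielson-standing-malibu" },
];
const deduped = uniqueById(results); // two entries, the Herb Ritts work once
```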

[Home] Artist rail data missing

When querying for the homepage modules, currently there are no artist-based modules in the results, just artworks.

E.g.: there are results like "Works by Iconic Artists" containing artworks, yet no modules like "Recommended Artists to Follow" that would contain a list of artists.

Per request caching option

Would want to be able to do something like:

{ artwork(id: "robert-longo-untitled-pl-8-from-men-in-the-cities-1") {
  id
  sale_artwork(cached: false) {
    id
  }
}}

and bypass any caching completely.

Also wondering if a memoization option would also be useful, to bypass the loader
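
A loader that honours a per-call cached flag could be sketched as follows. The names are hypothetical, and the fetch function is synchronous only to keep the example short; the real loaders are not shown in this document.

```typescript
// Sketch of a loader whose cache can be bypassed per call.
function makeLoader<T>(fetchFresh: (key: string) => T) {
  const cache = new Map<string, T>();
  return function load(key: string, options: { cached?: boolean } = {}): T {
    const useCache = options.cached ?? true;
    if (useCache && cache.has(key)) return cache.get(key) as T;
    const value = fetchFresh(key);
    // An uncached call neither reads nor writes the cache.
    if (useCache) cache.set(key, value);
    return value;
  };
}

let fetchCount = 0;
const loadSaleArtwork = makeLoader((key: string) => `${key}#${++fetchCount}`);
loadSaleArtwork("some-lot"); // fetches and caches
loadSaleArtwork("some-lot", { cached: false }); // fetches again, cache untouched
```

A resolver would pass the field's cached argument straight through as the option.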

CI badge

It says "unknown" and links to a 404.

Cache key could use _id rather than slug?

I'm not sure how applicable this is for metaphysics given that in web clients you tend to come at things from the slug, but we've been bitten by slugs != actual object ids in other projects.

So I would float this here.

https://artsy-metaphysics-staging.herokuapp.com/artwork/andy-warhol-skull returns the data under one key, while https://artsy-metaphysics-staging.herokuapp.com/artwork/4d8b93ba4eb68a1b2c001c5b returns all the same data with a different key.

Could be a concern as you consider invalidation strategies down the road.

Kudos on boldly stringifying the id to get at a key :)
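
One possible shape for this, sketched under the assumption that every record carries both `_id` (the object ID) and `id` (the slug): store each record once under its `_id` and keep a slug alias that points at it. `IdKeyedCache` is a hypothetical name.

```typescript
interface Identified {
  _id: string; // object ID
  id: string; // slug
}

// Hypothetical cache: one entry per record, keyed by _id, with a slug alias
// so both kinds of lookup reach the same entry.
class IdKeyedCache<T extends Identified> {
  private readonly records = new Map<string, T>();
  private readonly slugToId = new Map<string, string>();

  set(record: T): void {
    this.records.set(record._id, record);
    this.slugToId.set(record.id, record._id);
  }

  get(idOrSlug: string): T | undefined {
    const key = this.slugToId.get(idOrSlug) ?? idOrSlug;
    return this.records.get(key);
  }

  // Invalidating by _id removes the entry for slug lookups as well.
  invalidate(objectId: string): void {
    this.records.delete(objectId);
  }
}

const artworks = new IdKeyedCache<Identified>();
artworks.set({ _id: "4d8b93ba4eb68a1b2c001c5b", id: "andy-warhol-skull" });
```

With a single entry per object, an invalidation cannot leave a stale copy behind under the other key.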

Preparations for use in Relay

I haven't researched at all what is needed here, but whatever work is needed to make metaphysics work nicely with Relay will be needed for both web and mobile.


[Updated by @alloy]

  • Add Global Object Identification.
    • Need to change the id fields to be a composite key, so that the type and specific instance can be inferred from it. E.g. Artist#banksy. The common implementation deployed in the GraphQL community also Base64 encodes that key.
    • Ensure existing clients use the appropriate ID field. Which, for those not using Relay atm, means they probably need to use _id, which maps to the Gravity ID. However from short discussion with @joeyAghion, it seems like searching by _id could be a problem, because only some models would support searching by slug on that field. Alternatively we’ll have to add e.g. gravityID, which would be the equivalent of id right now.
  • Expose associations as connections, which allow for slicing and paginating.
  • See if we need to do anything else to support deferring query fragments.
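
The composite, Base64-encoded key from the first item can be sketched as below. The ":" separator follows the convention used by the graphql-relay library, whereas the example above writes the key as Artist#banksy; either works as long as encoding and decoding agree. This is an illustration, not the implementation Metaphysics adopted.

```typescript
// Encodes a type name and a per-type ID into one opaque global ID.
function toGlobalId(type: string, id: string): string {
  return Buffer.from(`${type}:${id}`, "utf8").toString("base64");
}

// Decodes a global ID back into its type and per-type ID. Splitting at the
// first separator allows the ID part itself to contain ":".
function fromGlobalId(globalId: string): { type: string; id: string } {
  const decoded = Buffer.from(globalId, "base64").toString("utf8");
  const separator = decoded.indexOf(":");
  if (separator === -1) throw new Error(`Not a global ID: ${globalId}`);
  return { type: decoded.slice(0, separator), id: decoded.slice(separator + 1) };
}

const globalId = toGlobalId("Artist", "banksy");
const parsed = fromGlobalId(globalId); // { type: "Artist", id: "banksy" }
```

A node lookup would decode the ID, pick the loader for the type, and fetch the instance with the remaining part.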

Host the app in a different place

This repo can chew up a lot of cache space when in production and serving high traffic pages.

Heroku cache/Redis instances are pretty expensive, especially the larger ones that are needed.

I propose one of the following three choices:

  • keep everything on Heroku and accept that the Redis instance is going to be $$
  • keep the app on Heroku and use AWS ElastiCache (this may have security policy or other implications)
  • deploy the app fully to OpsWorks and use AWS ElastiCache
