cloudendpoints / esp

Extensible Service Proxy

Home Page: https://cloud.google.com/endpoints/

License: BSD 2-Clause "Simplified" License

Python 5.23% Shell 8.99% JavaScript 1.68% C++ 45.55% Makefile 0.01% C 1.02% Perl 29.99% HTML 0.02% Lua 0.08% Gnuplot 0.07% Go 3.14% Dockerfile 0.41% Starlark 3.82%

esp's Introduction

The Extensible Service Proxy

The Extensible Service Proxy (ESP) is a proxy that enables API management capabilities for JSON/REST or gRPC API services. The current implementation is based on an NGINX HTTP reverse proxy server.

ESP provides:

Features: authentication (auth0, gitkit), API key validation, JSON to gRPC transcoding, as well as API-level monitoring, tracing and logging. More features are planned for the near future: quota, billing, ACL, etc.
Easy adoption: the API service can be implemented in any programming language using any IDL.
Platform flexibility: supports deployment on any cloud or on-premises environment.
Superb performance and scalability: low latency and high throughput.

ESP can Run Anywhere

However, the initial development was done on Google App Engine Flexible Environment, GCE and GKE for API services using the OpenAPI Specification, so our instructions and samples focus on these platforms. If you make it work on other infrastructure and IDLs, please let us know and contribute instructions/code.

Prerequisites

Common prerequisites used irrespective of operating system and build tool chain are:

Git
Node.js is required for running the included example Endpoints bookstore application.

Getting ESP

To download the Extensible Service Proxy source code, clone the ESP repository:

# Clone ESP repository
git clone https://github.com/cloudendpoints/esp

# Initialize Git submodules.
git -C esp submodule update --init --recursive

Released ESP docker images

ESP docker images are released regularly. Regular images are named gcr.io/endpoints-release/endpoints-runtime:MAJOR_VERSION.MINOR_VERSION.PATCH_NUMBER. For example, gcr.io/endpoints-release/endpoints-runtime:1.30.0 has MAJOR_VERSION=1, MINOR_VERSION=30 and PATCH_NUMBER=0.

Symbolically linked images:

MAJOR_VERSION is linked to the latest image with the same MAJOR_VERSION.

For example, gcr.io/endpoints-release/endpoints-runtime:1 always points to the latest image with major version "1".
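
The naming scheme above can be illustrated with a short Python sketch. It is only an illustration of the MAJOR.MINOR.PATCH convention described here; the helper names (parse_image_tag, latest_for_major) are hypothetical and not part of ESP or its tooling.

```python
import re
from typing import Iterable, NamedTuple, Optional


class EspImageTag(NamedTuple):
    major: int
    minor: int
    patch: int


# Matches "<repository>:MAJOR.MINOR.PATCH" as described in the text above.
_TAG_RE = re.compile(
    r"^(?P<repo>[^:]+):(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)$"
)


def parse_image_tag(image: str) -> EspImageTag:
    """Split a fully versioned image name into its numeric version parts."""
    match = _TAG_RE.match(image)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH image name: {image!r}")
    return EspImageTag(
        int(match["major"]), int(match["minor"]), int(match["patch"])
    )


def latest_for_major(images: Iterable[str], major: int) -> Optional[EspImageTag]:
    """Return the newest version that a bare MAJOR tag would resolve to."""
    versions = [parse_image_tag(image) for image in images]
    same_major = [v for v in versions if v.major == major]
    # Tuples compare element by element, so max() picks the newest release.
    return max(same_major, default=None)
```

For instance, given the 1.30.0 and 1.31.0 images, latest_for_major(..., 1) would select 1.31.0, which is what the "1" tag is described as pointing to.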

Secure image:

By default the ESP container runs as root, which is not considered secure. To secure the ESP container, run it as non-root and make its root file system read-only. The regular docker images could be changed to run as non-root, but that change might break existing users. Starting with 1.31.0, a new secure image is built with the suffix "-secure" in the image name, e.g. gcr.io/endpoints-release/endpoints-runtime-secure:1.31.0. It runs as non-root.

You can switch to the secure images if the following conditions are satisfied:

Nginx is not listening on ports requiring root privilege (ports < 1024).
If a custom nginx config is used and it has the server_config path set to "/etc/nginx", the secure image will not work. The server_config is moved to the "/home/nginx" folder in the secure image. Replace "/etc/nginx" with "/home/nginx" for server_config in your custom nginx config before using the secure image.

If some folders can be mounted externally, the root file system can be made read-only. Please see this GKE deployment yaml file as an example of how to make the root file system read-only.
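
The two conditions for the secure image can be checked mechanically. The Python sketch below is a rough illustration only: the function names are hypothetical, and the sample config line used with it is an assumed shape for a custom nginx config, not copied from ESP.

```python
def port_requires_root(port: int) -> bool:
    """Ports below 1024 are privileged and need root to bind."""
    return port < 1024


def rewrite_server_config(conf_text: str) -> str:
    """Point server_config lines at /home/nginx instead of /etc/nginx.

    Only lines that mention server_config are touched, so other paths
    under /etc/nginx (for example include directives) stay as they are.
    """
    rewritten = []
    for line in conf_text.splitlines(keepends=True):
        if "server_config" in line:
            line = line.replace("/etc/nginx", "/home/nginx")
        rewritten.append(line)
    return "".join(rewritten)
```

Restricting the replacement to server_config lines is a deliberate choice here: the text only says that server_config moved, so a blanket search-and-replace across the whole file could change paths that are still valid.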

Repository Structure

doc: Documentation
docker: Scripts for packaging ESP in a Docker image.
include: Extensible Service Proxy header files.
src: Extensible Service Proxy source.
google and third_party: Git submodules containing dependencies of ESP, including NGINX.
script: Scripts used for build, test, and continuous integration.
test: Applications and client code used for end-to-end testing.
tools: Assorted tooling.
start_esp: A Python start-up script for the ESP proxy. The script includes a generic nginx configuration template and fetching logic to retrieve service configuration from Google Service Management service.

ESP Tutorial

To find out more about building, running, and testing ESP, please review

Contributing

Your contributions are welcome. Please follow the contributor guidelines.

esp's Issues

Is mTLS, or at least upstream TLS, supported for gRPC?

Hey there esp authors,

I am looking for clarification. I know it says in the docs that mTLS is supported via flags, but, upon closer inspection of the config template, it seems they are only applied for non-gRPC backends.

Are client certs (i.e. mTLS) supported when using "grpc://"?

Also, is upstream TLS (i.e. a gRPC server serving itself via TLS, even if client certs are not in use) supported?

CORS not working with gRPC endpoints

I have a gRPC service exposed to the world via Google Cloud Endpoints/esp. I am able to access this service via a grpc-web NodeJS client without any issues. However, when trying to access the service via a browser JS app (using the same grpc-web client code), calls fail because the OPTIONS calls (CORS preflight) are rejected.

Here is my endpoints.yaml configuration file:

type: google.api.Service
config_version: 3

name: redacted.cloud.goog

title: My API
apis:
- name: my.namespace.api

endpoints:
- name: redacted.cloud.goog
  allow_cors: true

usage:
  rules:
  - selector: "*"
    allow_unregistered_calls: true

http:
  rules:
    # ...

Here's the relevant part of my k8s deployment configuration:

      - name: esp
        image: gcr.io/endpoints-release/endpoints-runtime:1
        args: [
          "--http_port=8080",
          "--http2_port=9090",
          "--service=redacted.cloud.goog",
          "--version=2017-12-08r1",
          "--backend=grpc://127.0.0.1:5000"
        ]
        ports:
        - containerPort: 8080
        - containerPort: 9090

And here's what I see in the ESP logs:

 {
  error_cause:  "service_control"    
  http_method:  "OPTIONS"    
  http_response_code:  403    
  location:  "us-east4-c"    
  log_message:  "Endpoints management skipped for an unrecognized HTTP call: OPTIONS /redacted"    
  producer_project_id:  "redacted"    
  referer:  "http://localhost:4200/"    
  request_latency_in_ms:  1    
  request_size_in_bytes:  802    
  response_size_in_bytes:  401    
  timestamp:  1512750043.5942123    
  url:  "/redacted"    
 }
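
The rejected request in the log above is a CORS preflight. As a reproduction aid, the following Python sketch builds such a preflight request by hand with the standard library. The URL, origin and requested header names passed to it are placeholders chosen for illustration; nothing here is ESP-specific.

```python
import urllib.request
from typing import Sequence


def build_preflight(
    url: str,
    origin: str,
    method: str = "POST",
    request_headers: Sequence[str] = ("content-type", "x-grpc-web"),
) -> urllib.request.Request:
    """Build the OPTIONS request a browser sends before a cross-origin call."""
    req = urllib.request.Request(url, method="OPTIONS")
    req.add_header("Origin", origin)
    # The browser announces the method and headers of the real request.
    req.add_header("Access-Control-Request-Method", method)
    req.add_header("Access-Control-Request-Headers", ", ".join(request_headers))
    return req
```

Passing the result to urllib.request.urlopen sends it; a 403 like the one logged above surfaces as urllib.error.HTTPError, which makes it easy to test a deployment without a browser.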

src/api_manager:config_manager_test is flaky with -c opt

bazel test -c opt //contrib/endpoints/src/api_manager:config_manager_test --runs_per_test=100

In istio/proxy repo or after #216

bazel test -c opt //contrib/src/api_manager:config_manager_test --runs_per_test=100

results in:

INFO: Elapsed time: 67.781s, Critical Path: 9.71s
//contrib/endpoints/src/api_manager:config_manager_test                  FAILED in 20 out of 100 in 0.2s

The test is temporarily disabled in #216.

Documentation Issues?

A couple of documentation issues I ran into when setting up a new build machine (debian/jessie)

The package "build-essentials" is required for libtoolize
The package "aclocal" is required by bazel for the command by the same name
The installation requirements for bazel, clang, etc are all scripted, please consider recommending users run these scripts and include instructions for running them as an alternative.

Unable to specify JSON message print options

The following excerpt defines the print options available in grpc/protobuf:

struct JsonPrintOptions {
  bool add_whitespace;
  bool always_print_primitive_fields;
  bool always_print_enums_as_ints;
  bool preserve_proto_field_names;
}

Unfortunately, some of these values are simply never set, and others are hard-coded, either from the "JsonOptions" bitmask passed into the function or elsewhere.
(see https://github.com/cloudendpoints/esp/blob/master/src/api_manager/utils/marshalling.cc#L53)

IDEALLY:

What we would ideally like is a means of decorating the message type in the proto to provide these json transformation options:

boolean always_print_primitive_fields = false;
boolean always_print_enums_as_ints = false;
boolean preserve_proto_field_names = false;

This would allow esp or even the underlying protobuf/grpc layer to inspect the descriptor, determine the contract formatting type, and provide the appropriate output. This is a fairly trivial change either in esp's "src/api_manager/utils/marshalling.cc" or "src/google/protobuf/util/json_util.cc" in google/protobuf; however, I also realize the cost associated with a release is non-zero.

Background/Use Case:

We have an existing API, originally written in nodejs with open api and express, that we are trying to replace. We are reimplementing this service on a new backend storage and are attempting to use gRPC and ESP. The service currently exposes an API where both the empty string "" and the zero value 0 are expected by the client code. In addition, the service currently exposes fields with an underscore "_" in them.

Without being able to ensure that the fields are both written to the output and left with their original names, we are unsure how to proceed without simultaneously retooling all client uses. Although untested, I believe specifying [json_name] in the proto may solve the "_" issue, but it still leaves us shy of the mark.

Option to NOT reject unknown paths

I'm trying to replace my current NGINX proxy with ESP, but not all paths are documented in the swagger file. This causes ESP to reject those requests.

Is it possible to configure ESP to pass all requests downstream instead?

Is it possible to view the streaming request/response message counts in Stackdriver monitoring?

According to the debug logs, the following metrics are recorded:

esp	May 3, 2018, 8:19:09 PM	metrics: "serviceruntime.googleapis.com/api/consumer/response_bytes"
esp	May 3, 2018, 8:19:09 PM	metrics: "serviceruntime.googleapis.com/api/consumer/request_bytes"
esp	May 3, 2018, 8:19:09 PM	metrics: "serviceruntime.googleapis.com/api/consumer/streaming_durations"
esp	May 3, 2018, 8:19:09 PM	metrics: "serviceruntime.googleapis.com/api/consumer/streaming_response_message_counts"
esp	May 3, 2018, 8:19:09 PM	metrics: "serviceruntime.googleapis.com/api/consumer/streaming_request_message_counts"
esp	May 3, 2018, 8:19:09 PM	metrics: "serviceruntime.googleapis.com/api/consumer/top_request_count_by_referer"
esp	May 3, 2018, 8:19:09 PM	metrics: "serviceruntime.googleapis.com/api/consumer/top_request_count_by_end_user_country"
esp	May 3, 2018, 8:19:09 PM	metrics: "serviceruntime.googleapis.com/api/consumer/top_request_count_by_end_user"
esp	May 3, 2018, 8:19:09 PM	metrics: "serviceruntime.googleapis.com/api/consumer/response_sizes"
esp	May 3, 2018, 8:19:09 PM	metrics: "serviceruntime.googleapis.com/api/consumer/request_sizes"
esp	May 3, 2018, 8:19:09 PM	metrics: "serviceruntime.googleapis.com/api/consumer/backend_latencies"
esp	May 3, 2018, 8:19:09 PM	metrics: "serviceruntime.googleapis.com/api/consumer/request_overhead_latencies"
esp	May 3, 2018, 8:19:09 PM	metrics: "serviceruntime.googleapis.com/api/consumer/total_latencies"
esp	May 3, 2018, 8:19:09 PM	metrics: "serviceruntime.googleapis.com/api/consumer/quota_used_count"
esp	May 3, 2018, 8:19:09 PM	metrics: "serviceruntime.googleapis.com/api/consumer/error_count"
esp	May 3, 2018, 8:19:09 PM	metrics: "serviceruntime.googleapis.com/api/consumer/request_count"

In particular, there are serviceruntime.googleapis.com/api/consumer/streaming_response_message_counts and serviceruntime.googleapis.com/api/consumer/streaming_request_message_counts. It appears that this information is exposed on the Cloud Endpoints Console page as "Streaming request message counts" and "Streaming response message counts".

However, I could not find these metrics in Stackdriver Monitoring. Is it possible to access them somehow?

tests in src/nginx/t all failed with latest perl 5 version 24

bazel test //src/nginx/t:all

failed for all tests. The errors are:

Can't locate src/nginx/t/ApiManager.pm in @INC (you may need to install the src::nginx::t::ApiManager module) (@INC contains: /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.24.1 /usr/local/share/perl/5.24.1 /usr/lib/x86_64-linux-gnu/perl5/5.24 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.24 /usr/share/perl/5.24 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base) at /usr/local/google/home/qiwzhang/.cache/bazel/_bazel_qiwzhang/d0919d74e85a7fedd8b21f930b308d93/execroot/main/bazel-out/k8-fastbuild/bin/src/nginx/t/unspecified_service_control.runfiles/main/src/nginx/t/unspecified_service_control.t line 32.
BEGIN failed--compilation aborted at /usr/local/google/home/qiwzhang/.cache/bazel/_bazel_qiwzhang/d0919d74e85a7fedd8b21f930b308d93/execroot/main/bazel-out/k8-fastbuild/bin/src/nginx/t/unspecified_service_control.runfiles/main/src/nginx/t/unspecified_service_control.t line 32

But tests under third_party work.

bazel test //third_party/nginx-tests/....

Most likely a bug in bazel rules in https://github.com/bazelbuild/rules_perl.git

Pass api key from query parameter to header for gRPC transcoding

If ESP is deployed in a chained fashion, where the first instance does gRPC transcoding and the second provides the other ESP features, both ESP proxies need the API key.

If the API key is in the HTTP query parameter, it should be set as a header when transcoding to gRPC.
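
A minimal Python sketch of the requested behavior follows. It is not ESP's implementation. The query parameter names ("key", "api_key") and the header name ("x-api-key") are assumptions made for the example, and the function name is hypothetical.

```python
from typing import Dict, Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed names, chosen for illustration only.
API_KEY_PARAMS = ("key", "api_key")
API_KEY_HEADER = "x-api-key"


def move_api_key_to_header(
    url: str, headers: Dict[str, str]
) -> Tuple[str, Dict[str, str]]:
    """Copy an API key from the query string into a request header.

    Returns the URL without the key parameter and a new header dict.
    An API key header that is already present is never overwritten.
    """
    parts = urlsplit(url)
    new_headers = dict(headers)
    remaining = []
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        if name in API_KEY_PARAMS and API_KEY_HEADER not in new_headers:
            new_headers[API_KEY_HEADER] = value
        else:
            remaining.append((name, value))
    new_url = urlunsplit(parts._replace(query=urlencode(remaining)))
    return new_url, new_headers
```

Whether the key should also be removed from the query string, as this sketch does, is a design choice the issue leaves open.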

Allow forwarding healthz/ to a different backend port

For gRPC implementation in go, serving both HTTP and gRPC on the same port is still problematic, so a production service is unlikely to serve monitoring endpoints (healthz/, metricz/ and those *z/ ones) on the grpc port.

So I think it's a good idea to allow healthz/ to be forwarded to a different port other than the grpc port.

additional_bindings do not work if you mix methods

I have attempted to use both a get method and a post method for a service, and only the one outside the additional_bindings becomes available in the ESP.

    // This RPC streams random numbers from the server.
    rpc GenerateStream (GenerateRequest) returns (stream GenerateResponse) {
        option (google.api.http) = {
          get: "/v1/generate/stream/milliseconds/{milliseconds}"
          additional_bindings {
            post: "/v1/generate/stream"
            body: "*"
          }
        };
    }

If I try an additional_bindings with a get, this works as expected.

using version
gcr.io/endpoints-release/endpoints-runtime:1.15.0

Jenkins test run out of metadata space limit

The GCE tests use "gcloud compute ssh" to reach instances, but the SSH keys it adds are never removed. They consume a large share of the 32KB metadata limit.

We need to remove all used SSH keys from the project metadata.

Ability to pass headers from ESP to grpc backend

I don't see a way, documented or otherwise, to pass headers to gRPC similar to proxy_set_header in Nginx. I figured, perhaps, that all headers were transcoded to metadata, but that doesn't seem to be the case.

Is it possible to add a directive where we can set headers in the proxy, and see them in gRPC? For instance, if behavior in gRPC is based on source IP or some other transport value that isn't available without some headers?

I would be happy to contribute however I can. I'm running gRPC and ESP on Kubernetes in grpc-java.

After a few seconds of idle time, the next request to ESP has high latency

I am using ESP with Cloud Endpoints connected to a gRPC backend deployed on Kubernetes engine. The proxy is configured with a working SSL certificate from Let's Encrypt.

To support Stackdriver uptime monitoring via HTTPS, I have set up an endpoints forwarding rule like this (should be irrelevant, though; I can reproduce this with direct gRPC calls as well):

http:
  rules:
  - selector: my.service.Name.HealthCheck
    get: /HealthCheck

The gRPC handler for that method is literally a no-op — it simply returns an empty message.

However, in Cloud Endpoints, these uptime checks show up as having a very high backend latency:

The first part of the screenshot consists only of those uptime checks. As you can see, the requests have a median "backend latency" of ~1s and the 95th percentile is ~2s.

I can reproduce this behavior on the command line:

# For testing, this request goes directly to my backend. (Exposed for testing purposes through port 80. The binary info is because the gRPC protocol has some binary data; the important part is the low latency.)
$ date; time /usr/local/opt/curl/bin/curl --http2 --data "" "http://my.domain/my.service.Name/HealthCheck"; date
Thu Apr 26 08:15:58 CEST 2018
Warning: Binary output can mess up your terminal. Use "--output -" to tell 
Warning: curl to output it to your terminal anyway, or consider "--output 
Warning: <FILE>" to save to a file.

real	0m0.046s
user	0m0.004s
sys	0m0.003s
Thu Apr 26 08:15:58 CEST 2018
# Still low latency.
$ date; time /usr/local/opt/curl/bin/curl --http2 --data "" "http://my.domain/my.service.Name/HealthCheck"; date
Thu Apr 26 08:16:42 CEST 2018
Warning: Binary output can mess up your terminal. Use "--output -" to tell 
Warning: curl to output it to your terminal anyway, or consider "--output 
Warning: <FILE>" to save to a file.

real	0m0.063s
user	0m0.004s
sys	0m0.003s
Thu Apr 26 08:16:42 CEST 2018
# This request goes to the proxy. Notice the higher latency.
$ date; time /usr/local/opt/curl/bin/curl --http2 --data "" "https://my.domain/my.service.Name/HealthCheck"; date
Thu Apr 26 08:16:49 CEST 2018
{}
real	0m1.361s
user	0m0.012s
sys	0m0.003s
Thu Apr 26 08:16:50 CEST 2018
# Another slow request to the proxy.
$ date; time /usr/local/opt/curl/bin/curl --http2 --data "" "https://my.domain/my.service.Name/HealthCheck"; date
Thu Apr 26 08:17:04 CEST 2018
{}
real	0m1.134s
user	0m0.012s
sys	0m0.004s
Thu Apr 26 08:17:05 CEST 2018
# From now on, requests are much faster.
$ date; time /usr/local/opt/curl/bin/curl --http2 --data "" "https://my.domain/my.service.Name/HealthCheck"; date
Thu Apr 26 08:17:06 CEST 2018
{}
real	0m0.238s
user	0m0.012s
sys	0m0.003s
Thu Apr 26 08:17:07 CEST 2018
$ date; time /usr/local/opt/curl/bin/curl --http2 --data "" "https://my.domain/my.service.Name/HealthCheck"; date
Thu Apr 26 08:17:09 CEST 2018
{}
real	0m0.143s
user	0m0.012s
sys	0m0.003s
Thu Apr 26 08:17:09 CEST 2018
$ date; time /usr/local/opt/curl/bin/curl --http2 --data "" "https://my.domain/my.service.Name/HealthCheck"; date
Thu Apr 26 08:17:10 CEST 2018
{}
real	0m0.084s
user	0m0.012s
sys	0m0.004s
Thu Apr 26 08:17:10 CEST 2018
# By the way, requests to /healthz have low latency:
$ date; time /usr/local/opt/curl/bin/curl --http2 --data "" "https://my.domain/healthz"; date
Thu Apr 26 08:17:10 CEST 2018
{}
real	0m0.084s
user	0m0.012s
sys	0m0.004s
Thu Apr 26 08:17:10 CEST 2018

As you can see, the first request has high latency, with much lower latency on subsequent requests. However, after a few seconds of inactivity, latency increases back to the original (high) values again.

Did anyone encounter this before or have ideas for a possible fix/workaround? Happy to provide any other logs you might need. (The default Cloud Endpoints logs just show the request, with high latency, but no extra info.)

This is my proxy's command line:

        "--ssl_port=9000",
        "--service=myservicename.myprojectid.cloud.goog",
        "--rollout_strategy=managed",
        "--backend=grpc://127.0.0.1:5050",
        "--healthz=healthz"

And this is how the proxy and backend are exposed by Kubernetes:

  - name: grpc
    port: 443
    targetPort: 9000
    protocol: TCP
  - name: grpc-backend
    port: 80
    targetPort: 5050
    protocol: TCP
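
To compare the first request after an idle period with the requests that follow it, a small timing helper can stand in for the repeated date/time/curl invocations above. This is a sketch under stated assumptions: time_calls is a hypothetical name, and the caller supplies whatever callable actually issues the request.

```python
import time
from typing import Callable, List


def time_calls(
    call: Callable[[], object],
    count: int,
    idle_seconds: float = 0.0,
    clock: Callable[[], float] = time.perf_counter,
    sleep: Callable[[float], None] = time.sleep,
) -> List[float]:
    """Wait idle_seconds, then time `count` back-to-back calls.

    The first entry in the result is the "cold" latency after the idle
    period; later entries show whether latency drops once warmed up.
    """
    sleep(idle_seconds)
    durations = []
    for _ in range(count):
        start = clock()
        call()
        durations.append(clock() - start)
    return durations
```

Running it with increasing idle_seconds values would show how long the proxy has to sit idle before the slow first request comes back.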

Required properties not working as expected

Not sure if I'm posting in the right place, but I'm having a problem getting the validation to fail when it should on my esp proxy setup.

This is basically the setup I have:
https://cloud.google.com/endpoints/docs/openapi/get-started-compute-engine

Validation correctly fails on POST routes that contain only one parameter, but not on routes like the following, which have 2 parameters:

Part of my deployment YAML:

  /something/{uuid}/thing:
    post: 
      description: Add a new intake
      operationId: thingPost
      parameters:
        - name: uuid
          required: true
          in: path
          type: string
        - name: meal
          in: body
          description: Meal intake record, covering macro-nutrients
          required: true
          schema: 
            $ref: '#/definitions/thing'
      responses:
        "204":
          description: Meal record succesfully received.
        default:
          description: unexpected error
  thing:
    required:
      - time_stamp
      - carb
    properties:
      time_stamp: 
        type: integer
        description: Unix Timestamp
      carb:
        type: integer
        description: Amount of carbohydrates (g) in the meal.
      fat:
        type: integer
        description: Amount of fat (g) in the meal.
      protein: 
        type: integer
        description: Amount of protein (g) in the meal.
      comment:
        type: string
        description: Free text field. Could contain, e.g., meal description. Such as \"Sandwich with milk\".

I would expect to get a validation error if I miss out the time_stamp property, as that is a required property. But I don't.

Why is this happening? Am I doing something wrong? I would post the whole YAML but I'm not really allowed to.

Thanks.

Resource name containing URL escaped character results in 404

We have something like:

rpc GetFoo(GetFooRequest) returns (Foo) {
  option (google.api.http).get = "/resources/{name=foos/*}";
}

If the resource name contains the character ":", like "foos/abc:def", then the URL in the HTTP GET request becomes "/resources/foos/abc:def", and ESP returns the following error response before our gRPC server is hit.

{
 "code": 5,
 "message": "Method does not exist.",
 "details": [
  {
   "@type": "type.googleapis.com/google.rpc.DebugInfo",
   "stackEntries": [],
   "detail": "service_control"
  }
 ]
}

This makes sense since "foos/abc:def" matches the URL pattern for defining custom API method.

However, escaping colon with URL encoding does not solve the problem. For example, if we change the resource name from "foos/abc:def" to "foos/abc%3Adef", we still get the same error claiming method not found as above. Is this the intended behavior? If so, how can we have arbitrary characters in the resource name?
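
The escaping step described above can be reproduced with the Python standard library. The regular expression below is only a rough stand-in for "the path ends with a custom-method suffix"; it is an assumption for illustration and not the matching logic ESP actually uses.

```python
import re
from urllib.parse import quote, unquote

# Hypothetical approximation: a final path segment ending in ":verb".
_CUSTOM_VERB_RE = re.compile(r":[^/:]+$")


def looks_like_custom_verb(path: str) -> bool:
    """True if the path ends in something shaped like ':verb'."""
    return bool(_CUSTOM_VERB_RE.search(path))


def escape_resource_name(name: str) -> str:
    """Percent-encode a resource name, keeping '/' as a separator."""
    return quote(name, safe="/")


escaped = escape_resource_name("foos/abc:def")  # "foos/abc%3Adef"
```

On the raw strings, the escaped path no longer looks like a custom method. If a proxy percent-decodes the path before matching (an assumption, not something the report confirms), unquote(escaped) yields "foos/abc:def" again and the colon reappears, which would be consistent with the behavior reported here.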

Update NGINX

We're still based on 1.11.5 but the latest version is 1.11.12.

HTTPS client should support SNI

The HTTPS client that fetches the auth public key doesn't support SNI, so in some cases ESP won't be able to fetch keys. See #262

[gRPC] User-Agent header is overwritten by ESP

I wanted to use the user-agent header to shut out old app versions from my service. However, my backend's logs indicate a user agent string of

grpc-c++/1.4.2 grpc-c/4.0.0 (linux; chttp2; gregarious)

which is not the one provided by my client, and accessing my backend directly works.

Would it be possible for ESP to forward the original user agent header to my backend?

Transcode to HTTP/Protobuf

Is there a way to configure ESP to use binary protobuf for payloads when a gRPC service is accessed via HTTP?

In other words, instead of always transcoding protobuf to JSON when using the HTTP transcoder, stick to binary protobuf?

ESP swallows RpcException trailers metadata

Setup
I am using csharp Grpc 1.8.3 client-server communication and ESP sidecar together with gcloud endpoints for monitoring.

On the server side I raise an RpcException with Trailers metadata attached. (https://grpc.io/grpc/csharp/api/Grpc.Core.RpcException.html)

Server side grpc-json communication
When I call http endpoint on a server and an exception is raised RpcException metadata is transcoded from GRPC to JSON correctly and I can see it in HTTP response.

Client-server communication without ESP
When I call a server method from another service directly over gRPC (NOT through ESP), the client side gets an RpcException with the Trailers metadata attached, which works fine.

Client-server communication with ESP
However, when I call a server method through ESP Trailers metadata is empty.

Problem
I suspect that ESP does not proxy the trailers data correctly. Does anyone have any input on that?

JWT validation failed: Unable to fetch verification key

Hi,
I have an ESP on top of both an appengine service and a k8s cluster (elastic search)

I have upgraded the esp config to talk to our prod x-google-jwks_uri, but the esp cannot reach the url (which can be browsed).

{
    "code": 16,
    "message": "JWT validation failed: Unable to fetch verification key",
    "details": [
        {
            "@type": "type.googleapis.com/google.rpc.DebugInfo",
            "stackEntries": [],
            "detail": "auth"
        }
    ]
}

I can send the x-google-jwks_uri to whoever needs it. This url uses a certificate issued by: Entrust Certification Authority - L1M.

From the appengine and k8s nginx log i can see:

"[error] 43#43: peer closed connection in SSL handshake"

which makes me think something is missing in the configuration for both to properly either resolve the dns or validate the certificate (which again is supposed to be public).

any thought?

transcoding: support "grpc-status-details-bin"

Seems grpc-go sends "grpc-status-details-bin" in trailers to have error details (https://github.com/grpc/grpc-go/blob/v1.4.1/transport/http2_server.go#L750). We might need to support that in transcoding.

TODO:

Confirm what is the standard format for "grpc-status-details-bin", it is not documented in gRPC Wire Protocol.
Confirm C and Java stack behavior.

Export config manager status as a metrics

For managed rollouts, it's unclear which version of the config is deployed. This information is currently only available from /endpoints_status, and should be exported through the standard channels, e.g. with a custom metric for esp.

"a client request body is buffered to a temporary file"

hi, we are seeing many entries of the above type in the esp logs, and also observe unexplained latencies that could be related:

2018/05/24 06:52:13 [warn] 12#12: 1 a client request body is buffered to a temporary file /var/cache/nginx/client_temp/0000000002, client: , server: , request: "POST /************************ HTTP/2.0", host: "*******************************"

the call is a GRPC call, and less than 200 bytes in body size. /etc/nginx/endpoints/nginx.conf in the container says "client_body_buffer_size 128k;", so it's not "just" nginx config or a large request.

please let me know if there is anything I can do to shed some more light on this

Add config ID in the log-entry

When performing a managed rollout, it's absolutely necessary to distinguish the logs and metrics generated by the old config from those generated by the new one.

The config id should be presented as a label to provide better insights of what’s going on with the rollout.

JWT validation failed: KEY_RETRIEVAL_ERROR

I am using Google Cloud Endpoints with container engine. I have set up a custom security definition to verify a JWT from Azure AD. I have set the x-google-jwks_uri="https://login.microsoftonline.com/common/discovery/v2.0/keys".

I receive the following error on the esp

E0906 20:02:18.843727344 11 auth_jwt_validator.cc:570] Cannot find matching key in key set for kid=HHByKU-0DqAqMZh6ZFPd2VWaOtg and alg=RS256

This key is clearly in the keys list. Is there anything else that could be responsible for not finding it?
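
The error message speaks of matching on both kid and alg. The Python sketch below shows one way such a lookup can miss a key that is visibly present in the set: RFC 7517 makes the "alg" member of a JWK optional, so a key can carry the right kid and no alg at all. Whether ESP's validator actually behaves this way is an assumption; find_key and the sample key set used with it are hypothetical.

```python
from typing import Optional


def find_key(
    jwks: dict, kid: str, alg: str, require_alg: bool = True
) -> Optional[dict]:
    """Return the first JWK matching kid and alg, or None.

    With require_alg=True, a key that omits "alg" never matches, even
    when its kid is correct. With require_alg=False, a missing "alg"
    is treated as compatible with any requested algorithm.
    """
    for key in jwks.get("keys", []):
        if key.get("kid") != kid:
            continue
        key_alg = key.get("alg")
        if key_alg == alg or (key_alg is None and not require_alg):
            return key
    return None
```

Comparing both modes against the published key set would show quickly whether a missing alg member is what separates "key is in the list" from "cannot find matching key".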

gRPC FlowControl test client crash with gRPC 1.3.2

Request: release with 'client_body_buffer_size' change from endpoints-tools

I made a change in the endpoints-tools project: cloudendpoints/endpoints-tools#53.

I would like to start making use of it. Is it possible to create a new release of the esp Docker image (gcr.io/endpoints-release/endpoints-runtime) that contains this change?

Consider trigger intermediate report in ProxyFlow

Using a timer seems inefficient when a stream has long waiting times. Triggering the report in the read/write path with a timestamp check might be better.
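
The timestamp-check idea can be sketched as follows. This is an illustration of the proposal in Python, not code from ProxyFlow; the class and callback names are hypothetical.

```python
from typing import Callable, Optional


class IntermediateReporter:
    """Send a report from the message path when enough time has passed.

    No timer is armed: the check runs only when a stream message is
    read or written, so an idle stream costs nothing.
    """

    def __init__(
        self, interval_seconds: float, send_report: Callable[[float], None]
    ):
        self._interval = interval_seconds
        self._send_report = send_report
        self._last_report: Optional[float] = None

    def on_message(self, now: float) -> bool:
        """Call on every read/write; returns True if a report was sent."""
        if self._last_report is None:
            # First message only starts the clock.
            self._last_report = now
            return False
        if now - self._last_report >= self._interval:
            self._send_report(now)
            self._last_report = now
            return True
        return False
```

The trade-off is that a stream which stays silent for a long time sends no intermediate report until its next message, whereas a timer would fire regardless.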

Nginx doesn't re-resolve Kubernetes Service VIPs

We have a bunch of Nginx location blocks as follows:

location /Service1/MethodA {
  grpc_pass service-1:5000 override;
}

location /Service2/MethodB {
  grpc_pass service-2:5000 override;
}

location /Service3/MethodC {
  grpc_pass service-3:5000 override;
}

If, for example, I were to delete and re-create the service-3 Service in Kubernetes, it would be assigned a new VIP address, but the Pods would remain the same, resulting in "backend unavailable" errors.

Is it possible to re-resolve Service IP addresses (or obey DNS TTL) without having to restart Nginx?

Receiving 503 on first hits of newly deployed endpoints

version: gcr.io/endpoints-release/endpoints-runtime:1.12.0
example args used:
-s snapshot-api-test.vendasta-internal.com
-v 2018-03-27r1
-abgrpc://127.0.0.1:11000
-p 11003
-P 11005
-S 11006
-z healthz
-n /etc/nginx/mscli/nginx.conf

Issue:
The first time our service is hit with a request everything works correctly.
If, instead of a single first request, we run multiple requests in parallel, the response from endpoints is

{
 "code": 14,
 "message": "Failed to fetch metadata",
 "details": [
  {
   "@type": "type.googleapis.com/google.rpc.DebugInfo",
   "stackEntries": [],
   "detail": "internal"
  }
 ]
}

Then all subsequent requests work

SSL passthrough

Is there currently no option to call the backend service over https? I'm using cloud endpoints in combination with an nginx ingress controller in k8s as backend, and I only want to allow https connections on that (since we also use that same controller for websockets on a different port). Is there a way to enable passthrough when the incoming request is https?

Option to host open-api spec on a given path

We're using a grpc backend with an additional service.yaml. In order to use tools like swagger-codegen, one needs the final open-api spec that ESP realizes. Right now we're using a proto compiler tool (https://github.com/grpc-ecosystem/grpc-gateway/tree/master/protoc-gen-swagger), but that currently can't take data from service.yaml, so the resulting open-api spec is incomplete. While there is a pull request for that feature, a remaining issue would be hosting the api spec. Hence this request for esp to allow configuring a path under which the api spec would be available.

"message": "JWT validation failed: BAD_FORMAT",

I am getting the following from esp (cloud endpoint with app engine flex):

{
 "code": 16,
 "message": "JWT validation failed: BAD_FORMAT",
 "details": [
  {
   "@type": "type.googleapis.com/google.rpc.DebugInfo",
   "stackEntries": [],
   "detail": "auth"
  }
 ]
}

I am signing the token with a json key_file (see attached image about the signature being valid)

header:

{
  "typ": "JWT",
  "alg": "RS256",
  "kid": "5175a59697375791842e65b064dce1c100d0455a"
}

payload

{
  "iat": 1510841801,
  "exp": 1510845401,
  "aud": "https://airport5-185713.appspot.com",
  "iss": "[email protected]"
}

the actual token:
eyJ0eXAiOiAiSldUIiwgImFsZyI6ICJSUzI1NiIsICJraWQiOiAiNzgzOTY4YmE4Yzc1NWQzMGQ5YTAyYjQ4MmNmMGU3NDJhMzZhNTRlMSJ9.eyJpYXQiOiAxNTEwODQzNTI1LCAiZXhwIjogMTUxMDkxNTUyNSwgImF1ZCI6ICJodHRwczovL2FpcnBvcnQ1LTE4NTcxMy5hcHBzcG90LmNvbSIsICJpc3MiOiAiY2xpZW50MUBhaXJwb3J0NS0xODU3MTMuaWFtLmdzZXJ2aWNlYWNjb3VudC5jb20ifQ==.BG_4vPJBzi_T5ZkBwtgx33553L3wSy9QeDxDWaEWvJAZEg4l0XZqclBL_p7PzFAjYhkQrmAY4CCyFqQ_ZCAM78mqIzbmy4uduyuyqm12miclJV62edNPSMXUjyTKk5C4pGsJ4fr02uf6BGENrMlS7COnXXUfS-kBIjxHmErBL49ItH8gYB2xJ30iON5W_HuqfH1QR_vch9-DWMcg8E2iSCuRR5GUBrR9kww35E4nhdFcvRYIA2eX4F4zPX5DEpPWsSHfE-TvPZIAbx1r_-PPkkHfwMfjXoatEfqw8m2EFfaKMDxxKZi_26iAQ9MQ1TqeFfGfLGhI4xZ0fwx8ZPfaGA==

certificate public key (783968ba8c755d30d9a02b482cf0e742a36a54e1):
[https://www.googleapis.com/robot/v1/metadata/x509/[email protected]]

openAPI.yaml:

google_jwt_client-2:
    authorizationUrl: ""
    flow: "implicit"
    type: "oauth2"
    x-google-issuer: "[email protected]"
    x-google-jwks_uri: "https://www.googleapis.com/robot/v1/metadata/x509/[email protected]"
    x-google-audiences: "https://airport5-185713.appspot.com"

I am generating the token with:

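import logging
import time

import google.auth.crypt
import google.auth.jwt


# file_path (defined elsewhere) points to the service account JSON key file.
def generate_jwt(service_account_file=file_path):
    # Build a signer from the service account's private key.
    signer = google.auth.crypt.RSASigner.from_service_account_file(
        service_account_file)

    now = int(time.time())
    expires = now + 3600 * 20  # twenty hours

    payload = {
        'iat': now,
        'exp': expires,
        'aud': 'https://airport5-185713.appspot.com',
        'iss': '[email protected]'
    }

    # Sign the claims and return the encoded token.
    jwt = google.auth.jwt.encode(signer, payload)
    logging.debug(jwt)

    return jwt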

to reproduce:
curl -H "Authorization: Bearer ${TOKEN}" "${ENDPOINTS_HOST}/airportName"
where ENDPOINTS_HOST=https://airport5-185713.appspot.com
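
One thing that may be worth ruling out here, offered as a hypothesis rather than a confirmed diagnosis: the second and third segments of the token above end in "==". RFC 7515 defines the compact JWS form as base64url with trailing "=" padding omitted, so a strict parser could reject a padded token. The sketch below uses only the Python standard library to report which segments carry padding and to decode the header and payload for inspection. `inspect_jwt` is an illustrative helper name, not part of google-auth or ESP, and it does not verify the signature.

```python
import base64
import json


def inspect_jwt(token):
    """Report padded segments and decode the header and payload of a compact JWT."""
    segments = token.split(".")
    if len(segments) != 3:
        raise ValueError("expected 3 dot-separated segments, got %d" % len(segments))

    # Indexes (0=header, 1=payload, 2=signature) of segments ending in '='.
    padded = [i for i, seg in enumerate(segments) if seg.endswith("=")]

    def decode(seg):
        stripped = seg.rstrip("=")
        # Re-add exactly the padding the decoder needs.
        stripped += "=" * (-len(stripped) % 4)
        return json.loads(base64.urlsafe_b64decode(stripped).decode("utf-8"))

    return {
        "padded_segments": padded,
        "header": decode(segments[0]),
        "payload": decode(segments[1]),
    }
```

Calling `inspect_jwt(token)` on the token being sent also shows the `kid` actually embedded in it, which can be compared against the key IDs published at the jwks URI. Treat this as a diagnostic only: removing padding after signing may invalidate the signature if the library signed the padded segments.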

ESP to add a header to indicate ESP version

I understand ESP adds a header when sending the request to the backend if the request included a JWT and that JWT was successfully authenticated.
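
For the JWT case, a backend can read the forwarded authentication result. The sketch below assumes the header is X-Endpoint-API-UserInfo (the name used elsewhere on this page) and that its value is a base64url-encoded JSON object, as the Cloud Endpoints documentation describes. `decode_user_info` is an illustrative helper, and the field names in the demonstration are made up for the example.

```python
import base64
import json


def decode_user_info(header_value):
    """Decode a base64url-encoded JSON header value into a dict."""
    # The value may arrive without '=' padding, so restore it before decoding.
    padded = header_value + "=" * (-len(header_value) % 4)
    return json.loads(base64.urlsafe_b64decode(padded).decode("utf-8"))


if __name__ == "__main__":
    # Build a sample value the same way, purely for demonstration.
    sample = base64.urlsafe_b64encode(
        json.dumps({"issuer": "https://example.com", "id": "user-1"}).encode("utf-8")
    ).decode("ascii").rstrip("=")
    print(decode_user_info(sample))
```

The presence of this header only covers the JWT case; it says nothing about whether an API key was validated, which is the open question in this issue.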

Does ESP add any headers when API keys are successfully validated? What about sending an ESP version header? Is there any way to tell by examining the HTTP request that it was processed through ESP?

Thanks!

Use bazel binary installer in Jenkins

Reload ssl certificates

Hi 👋 ,

I'm working on a k8s cluster where the SSL certificates are managed by cert-manager via Let's Encrypt. cert-manager updates the kube secret every 90 days with the new certificate. ESP is our gateway server and mounts the secret (SSL certs) for nginx.

The problem is that nginx doesn't reload the SSL certs and continues to serve the old ones from memory. Do you think it's in this project's scope to watch the SSL directory and reload when there are any changes?
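
Until something like this exists in ESP, a sidecar-style watcher is one possible workaround. The sketch below polls a directory, fingerprints the files in it, and runs a reload command when the fingerprint changes. The function names, the polling interval, and the paths in the usage note are assumptions for illustration; whether a plain nginx reload signal works inside the ESP container has not been verified here.

```python
import hashlib
import os
import subprocess
import time


def fingerprint(cert_dir):
    """Return a digest over the names and contents of regular files in cert_dir."""
    digest = hashlib.sha256()
    for name in sorted(os.listdir(cert_dir)):
        path = os.path.join(cert_dir, name)
        if os.path.isfile(path):  # follows symlinks, as used by Secret mounts
            digest.update(name.encode("utf-8"))
            with open(path, "rb") as handle:
                digest.update(handle.read())
    return digest.hexdigest()


def watch(cert_dir, reload_cmd, interval=60.0, max_checks=None):
    """Poll cert_dir and run reload_cmd whenever its fingerprint changes."""
    last = fingerprint(cert_dir)
    checks = 0
    while max_checks is None or checks < max_checks:
        time.sleep(interval)
        checks += 1
        current = fingerprint(cert_dir)
        if current != last:
            # Raises CalledProcessError if the reload command fails.
            subprocess.run(reload_cmd, check=True)
            last = current
```

A call such as `watch("/etc/nginx/ssl", ["nginx", "-s", "reload"])` would be the idea, with the mount path assumed; the reload command may need the same configuration arguments that ESP used to start nginx.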

Allow ESP container parameters to be specified by environment variables

Reading the startup parameters from environment variables (instead of flags) could simplify configuration, especially in a container environment such as Kubernetes using ConfigMaps.

In addition, this could make it possible to reload the configuration dynamically when environment variables change, instead of starting a new container.
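
As a stopgap, a small entrypoint wrapper could translate environment variables into flags before launching ESP. The sketch below is one way to do that mapping. The `ESP_*` variable names and the `flags_from_env` helper are hypothetical; the flags on the right-hand side are the ones the ESP startup script is documented to accept.

```python
import os

# Hypothetical environment variable names mapped to ESP startup flags.
ENV_TO_FLAG = {
    "ESP_SERVICE": "--service",
    "ESP_VERSION": "--version",
    "ESP_BACKEND": "--backend",
    "ESP_HTTP_PORT": "--http_port",
}


def flags_from_env(environ=None):
    """Build a list of --flag=value arguments from the variables that are set."""
    environ = os.environ if environ is None else environ
    args = []
    for name, flag in ENV_TO_FLAG.items():
        value = environ.get(name)
        if value:  # skip unset or empty variables
            args.append("%s=%s" % (flag, value))
    return args
```

The resulting list can be appended to the container's startup command. This does not address dynamic reloading, since a running process does not see later changes to its environment.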

Support for HTTP form encoded bodies

For many client applications, such as OAuth servers, it would be nice to have support for transcoding POST bodies in form encoding (https://www.w3.org/TR/html401/interact/forms.html#h-17.13.3.4), both application/x-www-form-urlencoded and multipart/form-data, to gRPC. This would work the same way as JSON transcoding, just from a different body encoding.
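
To make the request concrete, here is a rough sketch of the urlencoded half of such a mapping: the form body is parsed into fields and re-expressed as the JSON object that the existing JSON transcoding would accept. `form_to_json` is a hypothetical helper, and treating repeated keys as lists is an assumption; a real implementation would presumably consult the proto field types instead.

```python
import json
from urllib.parse import parse_qs


def form_to_json(body):
    """Convert an application/x-www-form-urlencoded body to a JSON object string."""
    fields = parse_qs(body, keep_blank_values=True)
    # parse_qs returns a list per key; unwrap keys that appear only once.
    collapsed = {
        key: values[0] if len(values) == 1 else values
        for key, values in fields.items()
    }
    return json.dumps(collapsed, sort_keys=True)
```

For example, an OAuth token request body such as `grant_type=client_credentials&scope=a&scope=b` becomes an object with a string `grant_type` and a two-element `scope` list.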

Adjustable SSL config

I'd like to customize my ESP SSL config with hardened settings. Is there a supported/official way of doing that?

gRPC JWT Validation with Google ID

Is it impossible to authenticate a gRPC service with a Google ID?

#auth config
type: google.api.Service
config_version: 3

authentication:
  providers:
  - id: google_id_token
    issuer: https://accounts.google.com
    jwks_uri: https://www.googleapis.com/oauth2/v3/certs
  rules:
  # This auth rule will apply to all methods.
  - selector: "*"
    requirements:
      - provider_id: google_id_token

{
 "code": 16,
 "message": "JWT validation failed: KEY_RETRIEVAL_ERROR",
 "details": [
  {
   "@type": "type.googleapis.com/google.rpc.DebugInfo",
   "stackEntries": [],
   "detail": "auth"
  }
 ]
}

Valid JWT rejected with BAD_FORMAT

When sending a request to a Cloud Endpoints deployment with a JWT generated by Keycloak we struggle with the following error message:

{
  "code": 16,
  "message": "JWT validation failed: BAD_FORMAT",
  "details": [
    {
      "@type": "type.googleapis.com/google.rpc.DebugInfo",
      "stackEntries": [],
      "detail": "auth"
    }
  ]
}

We've gone through the JWT troubleshooting document and tried everything we could think of, short of recompiling the ESP with a bunch of debug log statements to figure out where in this method it fails.

The JWT header looks like this:

{
  "alg": "RS256",
  "typ": "JWT",
  "kid": "Ld2tfCjn73R6spC8A0ZEn2w5KJ9JtQti5AcZtvPMDmY"
}

and the JWT payload looks like this:

{
  "jti": "3b7a8f19-38aa-46dd-8af8-9825529f1cf7",
  "exp": 1510319182,
  "nbf": 0,
  "iat": 1510318882,
  "iss": "https://SNIP",
  "aud": "SNIP",
  "sub": "SNIP",
  "typ": "Bearer",
  "azp": "manual-test-client",
  "auth_time": 0,
  "session_state": "ef69df24-986c-4025-a60c-6460b95b3819",
  "acr": "1",
  "allowed-origins": [],
  "realm_access": {
    "roles": [
      "uma_authorization"
    ]
  },
  "resource_access": {
    "account": {
      "roles": [
        "manage-account",
        "manage-account-links",
        "view-profile"
      ]
    }
  },
  "clientId": "manual-test-client",
  "clientHost": "10.128.0.6",
  "preferred_username": "service-account-SNIP",
  "clientAddress": "10.128.0.6",
  "email": "service-account-SNIP"
}

The payload contains more fields than the ESP needs - should that cause an error?

Could the nbf being set to 0 cause an issue?
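
On the nbf question, RFC 7519 only requires that the current time be at or after nbf, so under those rules a value of 0 is always satisfied. The sketch below applies the RFC's exp and nbf checks to a payload; it is not ESP's implementation, and `check_time_claims` and its `leeway` parameter are illustrative.

```python
import time


def check_time_claims(payload, now=None, leeway=0):
    """Return a list of exp/nbf problems per RFC 7519 (empty if none)."""
    now = time.time() if now is None else now
    problems = []
    # exp: the token must be rejected on or after the expiration time.
    if "exp" in payload and now >= payload["exp"] + leeway:
        problems.append("expired")
    # nbf: the token must be rejected before the not-before time.
    if "nbf" in payload and now < payload["nbf"] - leeway:
        problems.append("not yet valid")
    return problems
```

Note that exp minus iat in the payload above is 300 seconds, so any manual test has to send the token within five minutes of issuing it.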

ESP / Service Control does not validate IAM policies

I've set up my Swagger definition to secure the entire API with Google ID JWT.

security: 
  - google_id_token: []

securityDefinitions: 
  google_id_token:
    authorizationUrl: ""
    type: "oauth2"
    flow: "implicit"
    x-google-issuer: "https://accounts.google.com"

I have also set up some policies on my Endpoint so that only one service account has roles/servicemanagement.serviceConsumer. When I make an authenticated call with any service account, even one from another GCP project or organization, it seems there is no validation beyond checking that the JWT is properly signed.

Looking at the code, it seems like this might only be enforced if I use an API Key which seems a bit redundant.

High failure rate for tests on flex

During this week's release, I noticed a high failure rate for some tests. Please see the attachment for the results.
RESULT.log

Increase the 60 second http request timeout

Hi,

Is there a way to increase the http timeout beyond 60 seconds?

GZIP Support

I can't find any docs about enabling this or about support for it. Am I missing a hidden feature, or does Endpoints just not do this? It would be a huge improvement if transcoded JSON responses were compressed. Thanks!

Upstream SSL certificate verify error on JWT validation

I am using Google Cloud Endpoints with container engine.

My securityDefinition is:

  ridero_jwt:
    authorizationUrl: ""
    flow: "implicit"
    type: "oauth2"
    x-google-issuer: "https://pg.ridero.store"
    x-google-jwks_uri: "https://pg.ridero.store/auth/certs"
    x-google-audiences: "store-admin-pg"

When I try to use a JWT token, I get the following error in ESP stdout:

2017/09/08 04:28:17 [error] 9#9: upstream SSL certificate verify error: (18:self signed certificate)
10.60.3.1 - - [08/Sep/2017:04:28:17 +0000] "POST /v1/sku HTTP/1.0" 401 212 "-" "curl/7.54.0"

In response I get:

{
 "code": 16,
 "message": "JWT validation failed: Unable to fetch verification key",
 "details": [
  {
   "@type": "type.googleapis.com/google.rpc.DebugInfo",
   "stackEntries": [],
   "detail": "auth"
  }
 ]
}

In tracer I get:

I have tried connecting to the ESP container with "kubectl run", installing curl and making the request, and there were no errors.
I have checked trusted-ca-certificates.crt and it contains my CA: DST Root CA X3.
I have tried setting x-google-jwks_uri to http://auth.default.svc.cluster.local/certs, but ESP couldn't resolve the domain name.

//src/nginx/t:config_rollouts_managed is flaky under TSAN

https://endpoints-jenkins.appspot.com/job/esp/job/presubmit/650/execution/node/124/log/

not ok 44 - no sanitizer errors
#   Failed test 'no sanitizer errors'
#   at /home/jenkins/.cache/bazel/_bazel_jenkins/752dabe8e6eb24287f1fbb3854c08f93/execroot/__main__/bazel-out/local-fastbuild/bin/src/nginx/t/config_rollouts_managed.runfiles/__main__/third_party/nginx-tests/lib/Test/Nginx.pm line 89.
#          got: 'WARNING: ThreadSanitizer: heap-use-after-free (pid=18)
# SUMMARY: ThreadSanitizer: heap-use-after-free /usr/include/c++/4.9/bits/basic_string.h:293 std::string::_M_data() const
# WARNING: ThreadSanitizer: heap-use-after-free (pid=18)
# SUMMARY: ThreadSanitizer: heap-use-after-free /usr/include/c++/4.9/bits/basic_string.h:293 std::string::_M_data() const
# WARNING: ThreadSanitizer: heap-use-after-free (pid=18)
# SUMMARY: ThreadSanitizer: heap-use-after-free src/api_manager/service_control/aggregated.cc:270 operator()
# WARNING: ThreadSanitizer: heap-use-after-free (pid=18)
# SUMMARY: ThreadSanitizer: heap-use-after-free ??:0 __tsan_atomic32_fetch_add
# WARNING: ThreadSanitizer: heap-use-after-free (pid=18)
# SUMMARY: ThreadSanitizer: heap-use-after-free ??:0 operator delete(void*)
# ThreadSanitizer: reported 5 warnings'
#     expected: ''
# Looks like you failed 1 test of 44.

Remove X-Endpoint-API-UserInfo header from clients

ESP should remove any X-Endpoint-API-UserInfo headers from clients.

This header should be generated by ESP. ESP should not forward it.
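
The requested behavior amounts to dropping any client-supplied copy of the header before the proxy adds its own. The sketch below shows that filtering step on a list of header pairs; it illustrates the idea in Python and is not ESP code, and `strip_client_user_info` is a hypothetical name.

```python
USER_INFO_HEADER = "x-endpoint-api-userinfo"


def strip_client_user_info(headers):
    """Return the header pairs without any client-supplied X-Endpoint-API-UserInfo."""
    # Header names are case-insensitive, so compare in lower case.
    return [(name, value) for name, value in headers
            if name.lower() != USER_INFO_HEADER]
```

Without this step, a backend that trusts the header could be handed forged identity data by any client that sets it directly.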