Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GRPCRoute with no matches results in invalid policy #13425

Open
zaharidichev opened this issue Dec 4, 2024 · 3 comments
Open

GRPCRoute with no matches results in invalid policy #13425

zaharidichev opened this issue Dec 4, 2024 · 3 comments
Labels

Comments

@zaharidichev
Copy link
Member

What is the issue?

Whenever we create a GRPCRoute that has no matches at all, the proxy sees that as invalid outbound policy.

How can it be reproduced?

Reproduction:

Create a GRPC server:

apiVersion: v1
kind: Service
metadata:
  name: server
spec:
  type: ClusterIP
  selector:
    app: server
  ports:
  - port: 8080
    protocol: TCP
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: server
  template:
    metadata:
      labels:
        app: server
      annotations:
        linkerd.io/inject: enabled
    spec:
      containers:
        - name: server
          image: buoyantio/bb:v0.0.5
          command: [ "sh", "-c"]
          args:
          - "/out/bb terminus --grpc-server-port 8080 --response-text hello --fire-and-forget"
          ports:
            - name: grpc-port
              containerPort: 8080

Create a GRPCRoute for this server:

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: GRPCRoute
metadata:
  name: grpc
spec:
  parentRefs:
    - name: server
      group: core
      kind: Service
      port: 8080

Make sure the status of the Route is accepted and backend refs are resolved.

Now, create a simple client (non GRPC), just to be able to fascilitate a policy resolution for this target.

apiVersion: v1
kind: Pod
metadata:
  name: client
  annotations:
    linkerd.io/inject: enabled
    config.linkerd.io/proxy-log-level: debug
spec:
  containers:
  - name: client
    image: curlimages/curl
    command:
      - "sh"
      - "-c"
      - >
        while true; do
          sleep 3600;
        done

Ssh into the client and issue a simple curl request to the service on port 8080.

kubectl  exec -it client -c client -- sh

~ $ curl -v server:8080
* Host server:8080 was resolved.
* IPv6: (none)
* IPv4: 10.43.248.236
*   Trying 10.43.248.236:8080...
* Connected to server (10.43.248.236) port 8080
* using HTTP/1.x
> GET / HTTP/1.1
> Host: server:8080
> User-Agent: curl/8.11.0
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 500 Internal Server Error
< l5d-proxy-error: unexpected error
< connection: close
< l5d-proxy-connection: close
< content-length: 0
< date: Wed, 04 Dec 2024 07:55:17 GMT
<
* shutting down connection #0
~ $

If you tail the logs of the client proxy you will see that the policy that has been delivered is invalid:

kubectl logs pod/client  -c linkerd-proxy -f | grep ClientPolicy
[    51.027132s] DEBUG ThreadId(01) linkerd_app_outbound::policy::api: policy=ClientPolicy { parent: Default { name: "invalid" }, protocol: Detect { timeout: 10s, http1: Http1 { routes: [Route { hosts: [], rules: [Rule { matches: [MatchRequest { path: None, headers: [], query_params: [], method: None }], policy: RoutePolicy { meta: Default { name: "invalid" }, filters: [InternalError("invalid client policy configuration")], distribution: Empty, params: RouteParams { timeouts: Timeouts { response: None, idle: None, request: None }, retry: None, allow_l5d_request_headers: false } } }] }], failure_accrual: None }, http2: Http2 { routes: [Route { hosts: [], rules: [Rule { matches: [MatchRequest { path: None, headers: [], query_params: [], method: None }], policy: RoutePolicy { meta: Default { name: "invalid" }, filters: [InternalError("invalid client policy configuration")], distribution: Empty, params: RouteParams { timeouts: Timeouts { response: None, idle: None, request: None }, retry: None, allow_l5d_request_headers: false } } }] }], failure_accrual: None }, opaque: Opaque { routes: Some(Route { policy: RoutePolicy { meta: Default { name: "invalid" }, filters: [InternalError("invalid client policy configuration")], distribution: Empty, params: () } }) } }, backends: [] }

You can also observe the following error:

[   111.949112s]  WARN ThreadId(01) linkerd_app_outbound::policy::api: Client policy misconfigured error=invalid gRPC route: invalid route match: missing RPC match

The reason why this happens is because the proxy tries to parse the RPC match but it fails because it is missing. Lets walk through the stack in the policy controller:

  1. It is missing because the policy controller contains the following logic for putting together the API response. This indicates that if our method match is None, then the RPC will be None as well (which is the source of the problem.)
    https://github.com/linkerd/linkerd2/blob/main/policy-controller/grpc/src/routes/grpc.rs#L23

  2. The reason why the method in this case is None can be seen in the following piece of code.This is the code that takes the deserialized K8s API representation and turns it into our internal type.

    let method = method.map(|value| match value {
    .

  3. The K8s representation in turn is deserialized using the following logic in the gateway API repo:

fn deserialize_method_match<'de, D: serde::Deserializer<'de>>(
    deserializer: D,
) -> Result<Option<GrpcMethodMatch>, D::Error> {
    <Option<GrpcMethodMatch> as serde::Deserialize>::deserialize(deserializer).map(|value| {
        match value.as_ref() {
            Some(rule) if rule.is_empty() => None,
            _ => value,
        }
    })
}

...
impl GrpcMethodMatch {
    fn is_empty(&self) -> bool {
        let (method, service) = match self {
            Self::Exact { method, service } => (method, service),
            Self::RegularExpression { method, service } => (method, service),
        };

        method.as_deref().map(str::is_empty).unwrap_or(true)
            && service.as_deref().map(str::is_empty).unwrap_or(true)
    }
}

It is clear at this point that if method is an empty string or service is an empty string then this is an empty GRPC match and None is returned, even if the rype of the match has been specified.

Now, if we look closely, we will see that the version of the GRPCRoute CRDs that we use, has a default for route matches that create a route match of type Method with nothing else specified: https://github.com/linkerd/linkerd2/blob/main/charts/linkerd-crds/templates/gateway.networking.k8s.io_grpcroutes.yaml#L254

This makes it so, whenever we create a GRPCRoute with no explicit matches, we end up interpreting that as an Invalid one.

Logs, error output, etc

already provided

output of linkerd check -o short

n/a

Environment

  • Linkerd version: edge-24.11.8
  • k3d version v5.6.3
  • k3s version v1.28.8-k3s1 (default)

Possible solution

No response

Additional context

No response

Would you like to work on fixing this bug?

None

@kflynn
Copy link
Member

kflynn commented Dec 5, 2024

For the record, that default was removed from the GRPCRoute CRD in Gateway API v0.8.0, so if you update your Gateway API CRDs to v0.8.0 or higher, everything should Just Work. (I'm pretty sure I tested this with Gateway API 1.1.1, but I'll doublecheck that.)

@kflynn
Copy link
Member

kflynn commented Dec 13, 2024

I deleted my previous comment about how Gateway API v1.1.1 makes this Just Work™.

In the very early days of Gateway API, a Route with a Service parentRef but no backendRefs would implicitly use the parentRef as a backendRef, so the GRPCRoute that @zaharidichev cites should indeed have worked.

In both current Gateway API and in Linkerd's implementation, you need to supply backendRefs: the GRPCRoute with no backendRefs at all is accepted, and if you've installed Gateway API v1.1.1 it won't have a weird matches stanza added to it. It won't actually route to anything, though, so it probably shouldn't be accepted in the first place.

You can do this

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: GRPCRoute
metadata:
  name: grpc
spec:
  parentRefs:
    - name: server
      group: core
      kind: Service
      port: 8080
  rules:
    - backendRefs:
        - name: server
          port: 8080

and it'll work fine even though it has no matches clause at all, but it seems like this is still an area where we can clean things up a bit.

@kflynn
Copy link
Member

kflynn commented Dec 13, 2024

(The implicit-backendRef thing is buried in GEP-1324, at https://gateway-api.sigs.k8s.io/geps/gep-1294/#implicit-backendref.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants