Skip to content

KEP 1645: add endpointSliceObjectsPresence field to ServiceImport#6037

Open
MrFreezeex wants to merge 1 commit intokubernetes:masterfrom
MrFreezeex:svcimport-epslice-availables
Open

KEP 1645: add endpointSliceObjectsPresence field to ServiceImport#6037
MrFreezeex wants to merge 1 commit intokubernetes:masterfrom
MrFreezeex:svcimport-epslice-availables

Conversation

@MrFreezeex
Copy link
Copy Markdown
Member

@MrFreezeex MrFreezeex commented Apr 28, 2026

  • One-line PR description: add endpointSlicesAvailable field to ServiceImport
  • Other comments:

Add endpointSliceObjectsPresence to ServiceImport status so that consumer such as Gateway API implementations can decide if they should use EndpointSlice objects directly. Since surfacing EndpointSlices from remote clusters is optional without this fields consumers that wanted to directly reached the backends by reading the EndpointSlice objects had to fail in unclear ways or needed to have more complex probing logic to determine that info.

Cilium does not always sync back the EndpointSlice objects. We do that only if the ServiceImport is headless or if a certain annotation is present on the ServiceImport object (as the ServiceImport IPs load-balancing is not required
to have the EndpointSlice objects present in this implementation). Since EndpointSlice syncing is optional this is fully spec compliant. The addition of the endpointSliceObjectsPresence on ServiceImport would greatly benefit us if adopted by consumers, because our users would probably see a clearer error that should led them to add the implementation specific annotation signaling that EndpointSlices should be synced.

Note that I decided to not propose a ServiceExport field to replace the Cilium annotation at this moment, since I am not sure that any other implementation have the same behavior (if at least one other implementation have something like that it too would probably be worth it though!).

@k8s-ci-robot k8s-ci-robot added the kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory label Apr 28, 2026
@k8s-ci-robot k8s-ci-robot added sig/multicluster Categorizes an issue or PR as relevant to SIG Multicluster. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Apr 28, 2026
@MrFreezeex MrFreezeex force-pushed the svcimport-epslice-availables branch from 012f0e9 to aff1efb Compare April 28, 2026 16:59
@MrFreezeex MrFreezeex changed the title KEP 1645: add endpointSlicesAvailable field to ServiceImport KEP 1645: add endpointSliceObjectsPresence field to ServiceImport Apr 28, 2026
@MrFreezeex MrFreezeex force-pushed the svcimport-epslice-availables branch from aff1efb to 46df9da Compare April 28, 2026 17:07
@lauralorenz
Copy link
Copy Markdown
Contributor

A few thoughts:

  • On "must" fill out and whether the field is optional: Optional in the CRD makes upgrades easier, but what does the field being null mean? Is it the same as picking one of "present" or "absent"? If such a meaning is ascribed to null, then what is the default and what does that mean for implementations as they upgrade?
  • On naming: Ideas "EndpointSliceObjectsPresence" ->
    • "EndpointSliceSynced"
    • "EndpointSliceSupplied"
    • just "EndpointSliceObjects:[Present|Absent]"
    • or just "EndpointSlices:[Present|Absent]"
      • but then it kinda feels like it should have all the endpointslices there directly
      • Would we ever actually include a ref to the EndpointSlices here directly instead? Probably not for performance reasons but bringing it up

@MrFreezeex
Copy link
Copy Markdown
Member Author

MrFreezeex commented Apr 29, 2026

  • On "must" fill out and whether the field is optional: Optional in the CRD makes upgrades easier, but what does the field being null mean? Is it the same as picking one of "present" or "absent"? If such a meaning is ascribed to null, then what is the default and what does that mean for implementations as they upgrade?

Yeah the field being optional is fully for upgrade, if the field is not there yet it should only be because the implementation does not support that CRD version yet. As a consumer you can't yet expect anything in that case (you are in the same state as without that field). It might take a bit of time so that consumers can safely rely on that info (implementation should support it, users upgrade to a version of implementations supporting it etc).

The "must" is to say that you won't be passing core/required conformance test (whenever that will be added, meaning probably 0.6.0 at the earliest).

On naming: Ideas [...]

I think I might prefer "EndpointSliceObjects:[Present|Absent]" among the one you suggested 👀. It presuppose less things and we might have an easier time extending that enum it in the future (not sure we will ever do but it still more future possibilities 🤷‍♂️).

or just "EndpointSlices:[Present|Absent]"
but then it kinda feels like it should have all the endpointslices there directly

Yep we probably want more to convey to consumers the decision of the implementation that EndpointSlice should or should not be there at some point not necessarily that it's already there indeed!

Would we ever actually include a ref to the EndpointSlices here directly instead? Probably not for performance reasons but bringing it up

This is indeed probably not necessary since there's already all the labels in the EndpointSlice objects which you can specifically watch for efficiently as a consumer/client and indeed it has some performance cost to maintain those. We probably want to do the same thing as the relation between Service and EndpointSlice and not introduce that IMO.

Add this field so that consumers such as Gateway API implementations
can decide if they should use EndpointSlice objects directly. Since
surfacing EndpointSlices from remote clusters is optional without this
fields consumers that wanted to directly reached the backends by reading
the EndpointSlice objects had to fail in unclear ways or needed to have more
complex probing logic to determine that info.

Cilium does not always sync back the EndpointSlice objects. We do that only
if the ServiceImport is headless or if a certain annotation is present on the
ServiceImport object (as the ServiceImport IPs load-balancing is not required
to have the EndpointSlice objects present in this implementation).
Since EndpointSlice syncing is optional this is fully spec compliant.
The addition of the `endpointSliceObjectsPresence` on ServiceImport would
greatly benefit us if adopted by consumers, because  our users would probably
see a clearer error that should led them to add the implementation specific
annotation signaling that EndpointSlices should be synced.

Note that I decided to not propose a ServiceExport field to replace the Cilium
annotation at this moment, since I am not sure that any other implementation
have the same behavior (if at least one other implementation have something
like that it too would probably be worth it though!).

Signed-off-by: Arthur Outhenin-Chalandre <git@mrfreezeex.fr>
@MrFreezeex MrFreezeex force-pushed the svcimport-epslice-availables branch from 46df9da to ab7b188 Compare April 29, 2026 17:27
@skitt
Copy link
Copy Markdown
Member

skitt commented May 5, 2026

/approve

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: MrFreezeex, skitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 5, 2026
@lauralorenz
Copy link
Copy Markdown
Contributor

Was thinking that we could do a best-effort reach out to the MCS implementations notifying them of this PR and availability for comment/+1s. Since this is a non breaking change I think a maximum wait time of 2 weeks would be sufficient.

@MrFreezeex
Copy link
Copy Markdown
Member Author

Was thinking that we could do a best-effort reach out to the MCS implementations notifying them of this PR and availability for comment/+1s. Since this is a non breaking change I think a maximum wait time of 2 weeks would be sufficient.

Hmm reaching out to implementation seems more something that we could do once the API is ready and released on kubernetes-sigs/mcs-api? There is no real guarantee that any implementations is actually conformant today without accounting this change (besides probably Submariner and Cilium that are known to execute the conformance suite).

Speaking of, we also would need to reach out most likely relatively with the conformance report program work from @AnupamGhosh once ready. We might be able to reach out with the announcement of the report program and encourage implementations to pick up from all the "recent" changes (including this one) at the same time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/multicluster Categorizes an issue or PR as relevant to SIG Multicluster. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants