@TingluoHuang TingluoHuang requested a review from a team as a code owner October 9, 2025 21:18
Copilot AI review requested due to automatic review settings October 9, 2025 21:18
Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds support for reporting infrastructure failures to the run service. The changes introduce a new boolean parameter hasInfrastructureFailure throughout the job completion pipeline, allowing the system to track and report when jobs fail due to infrastructure issues rather than code-related problems.

  • Added hasInfrastructureFailure parameter to job completion APIs and contracts
  • Implemented logic to detect infrastructure issues from job annotations and set the failure flag
  • Updated all call sites to pass the new parameter through the completion flow
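The detection step summarized above could look roughly like the following sketch. Everything in it is a simplified stand-in: `IssueSketch`, `GlobalContextSketch`, `ExecutionContextSketch`, and the `"is_infrastructure_issue"` data key are assumptions made for illustration, not the runner's actual types or key names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for an issue/annotation raised during a job.
public sealed class IssueSketch
{
    public string Message { get; set; } = "";

    // Free-form key/value data attached to the issue (assumed shape).
    public Dictionary<string, string> Data { get; } =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
}

// Hypothetical job-wide state shared by all steps.
public sealed class GlobalContextSketch
{
    public bool HasInfrastructureFailure { get; set; }
}

public sealed class ExecutionContextSketch
{
    // Assumed key name; the real annotation key may differ.
    public const string InfrastructureIssueKey = "is_infrastructure_issue";

    public GlobalContextSketch Global { get; } = new GlobalContextSketch();

    // Records an issue and latches the job-wide flag when the issue
    // is marked as an infrastructure issue.
    public void AddIssue(IssueSketch issue)
    {
        if (issue.Data.TryGetValue(InfrastructureIssueKey, out var raw)
            && bool.TryParse(raw, out var isInfrastructure)
            && isInfrastructure)
        {
            Global.HasInfrastructureFailure = true;
        }
    }
}

public static class InfrastructureFlagDemo
{
    public static bool Run()
    {
        var context = new ExecutionContextSketch();

        var issue = new IssueSketch { Message = "Failed to download action" };
        issue.Data[ExecutionContextSketch.InfrastructureIssueKey] = "true";
        context.AddIssue(issue);

        // The value a completion call would forward as hasInfrastructureFailure.
        return context.Global.HasInfrastructureFailure;
    }
}
```

In this sketch the flag only ever moves from false to true, so a later ordinary issue cannot clear an earlier infrastructure failure.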

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/Sdk/RSWebApi/RunServiceHttpClient.cs | Added hasInfrastructureFailure parameter to CompleteJobAsync method |
| src/Sdk/RSWebApi/Contracts/IssueExtensions.cs | Added IsInfrastructureIssue property mapping from issue annotations |
| src/Sdk/RSWebApi/Contracts/CompleteJobRequest.cs | Added HasInfrastructureFailure property to the request contract |
| src/Runner.Worker/JobRunner.cs | Updated CompleteJobAsync call to pass the infrastructure failure flag |
| src/Runner.Worker/GlobalContext.cs | Added HasInfrastructureFailure property to track infrastructure failures |
| src/Runner.Worker/ExecutionContext.cs | Added logic to detect infrastructure issues and set the global flag |
| src/Runner.Listener/JobDispatcher.cs | Updated ForceFailJob to pass hasInfrastructureFailure parameter |
| src/Runner.Common/RunServer.cs | Updated interface and implementation to include hasInfrastructureFailure parameter |

@TingluoHuang TingluoHuang force-pushed the users/tihuang/infrafailure branch from bd5fbd1 to 9421d45 Compare October 13, 2025 19:55
public string BillingOwnerId { get; set; }

[DataMember(Name = "infrastructureFailureCategory", EmitDefaultValue = false)]
public string InfrastructureFailureCategory { get; set; }
Member Author

I will make run-service accept this new field in the server-side PR.
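As a rough illustration of why the new field can ship before the server accepts it: with `EmitDefaultValue = false`, a null `InfrastructureFailureCategory` is left out of the serialized payload entirely. The sketch below uses the BCL `DataContractJsonSerializer` as a dependency-free stand-in; the runner's actual serializer may differ, and `CompleteJobRequestSketch` is a hypothetical one-property contract, not the real `CompleteJobRequest`.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;
using System.Text;

// Hypothetical, trimmed-down contract carrying only the new field.
[DataContract]
public sealed class CompleteJobRequestSketch
{
    [DataMember(Name = "infrastructureFailureCategory", EmitDefaultValue = false)]
    public string InfrastructureFailureCategory { get; set; }
}

public static class ContractSketch
{
    // Serializes the request to a JSON string using the data contract attributes.
    public static string ToJson(CompleteJobRequestSketch request)
    {
        var serializer = new DataContractJsonSerializer(typeof(CompleteJobRequestSketch));
        using var stream = new MemoryStream();
        serializer.WriteObject(stream, request);
        return Encoding.UTF8.GetString(stream.ToArray());
    }

    public static void Run()
    {
        // Category left null: the member is omitted from the JSON.
        Console.WriteLine(ToJson(new CompleteJobRequestSketch()));

        // Category set: the member appears under its contract name.
        Console.WriteLine(ToJson(new CompleteJobRequestSketch
        {
            InfrastructureFailureCategory = "resolve_action"
        }));
    }
}
```

Because the member is omitted when unset, jobs without an infrastructure failure would send the same payload shape as before the change.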

  // Log the error and fail the PrepareActionsAsync Initialization.
  Trace.Error($"Caught exception from PrepareActionsAsync Initialization: {ex}");
- executionContext.InfrastructureError(ex.Message);
+ executionContext.InfrastructureError(ex.Message, category: "resolve_action");
Member Author

This is what we missed with run-service.

  // Log the error and fail the PrepareActionsAsync Initialization.
  Trace.Error($"Caught exception from PrepareActionsAsync Initialization: {ex}");
- executionContext.InfrastructureError(ex.Message);
+ executionContext.InfrastructureError(ex.Message, category: "invalid_action_download");
Member Author

This is what we missed with run-service.
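The two call sites above pass a `category` named argument to `InfrastructureError`. A minimal sketch of how an optional category parameter might be recorded is shown below. The types are hypothetical stand-ins, and the "first category wins" rule is an assumption for illustration; the merged change may resolve multiple categories differently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical job-wide state; mirrors the InfrastructureFailureCategory contract field.
public sealed class CategorizedGlobalContext
{
    public string InfrastructureFailureCategory { get; set; }
}

public sealed class CategorizedExecutionContext
{
    public CategorizedGlobalContext Global { get; } = new CategorizedGlobalContext();

    // Messages recorded so far (stand-in for the job's error annotations).
    public List<string> Errors { get; } = new List<string>();

    // category is optional so that existing callers passing only a message still compile.
    public void InfrastructureError(string message, string category = null)
    {
        Errors.Add(message);

        // Assumption: keep the first non-empty category reported for the job.
        if (!string.IsNullOrEmpty(category)
            && string.IsNullOrEmpty(Global.InfrastructureFailureCategory))
        {
            Global.InfrastructureFailureCategory = category;
        }
    }
}

public static class CategoryDemo
{
    public static string Run()
    {
        var context = new CategorizedExecutionContext();
        context.InfrastructureError("could not resolve action", category: "resolve_action");
        context.InfrastructureError("download failed", category: "invalid_action_download");

        // Under the first-wins assumption this is the first category reported.
        return context.Global.InfrastructureFailureCategory;
    }
}
```

Making `category` optional keeps the change source-compatible: call sites that were not updated continue to report an uncategorized infrastructure error.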

@TingluoHuang TingluoHuang merged commit 1eb15f2 into main Oct 13, 2025
10 checks passed
@TingluoHuang TingluoHuang deleted the users/tihuang/infrafailure branch October 13, 2025 20:21