-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Report job has infra failure to run-service #4073
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds support for reporting infrastructure failures to the run service. The changes introduce a new boolean parameter hasInfrastructureFailure throughout the job completion pipeline, allowing the system to track and report when jobs fail due to infrastructure issues rather than code-related problems.
- Added
hasInfrastructureFailureparameter to job completion APIs and contracts - Implemented logic to detect infrastructure issues from job annotations and set the failure flag
- Updated all call sites to pass the new parameter through the completion flow
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| src/Sdk/RSWebApi/RunServiceHttpClient.cs | Added hasInfrastructureFailure parameter to CompleteJobAsync method |
| src/Sdk/RSWebApi/Contracts/IssueExtensions.cs | Added IsInfrastructureIssue property mapping from issue annotations |
| src/Sdk/RSWebApi/Contracts/CompleteJobRequest.cs | Added HasInfrastructureFailure property to the request contract |
| src/Runner.Worker/JobRunner.cs | Updated CompleteJobAsync call to pass the infrastructure failure flag |
| src/Runner.Worker/GlobalContext.cs | Added HasInfrastructureFailure property to track infrastructure failures |
| src/Runner.Worker/ExecutionContext.cs | Added logic to detect infrastructure issues and set the global flag |
| src/Runner.Listener/JobDispatcher.cs | Updated ForceFailJob to pass hasInfrastructureFailure parameter |
| src/Runner.Common/RunServer.cs | Updated interface and implementation to include hasInfrastructureFailure parameter |
bd5fbd1 to
9421d45
Compare
| public string BillingOwnerId { get; set; } | ||
|
|
||
| [DataMember(Name = "infrastructureFailureCategory", EmitDefaultValue = false)] | ||
| public string InfrastructureFailureCategory { get; set; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i will make run-service take this new field in the server side PR.
| // Log the error and fail the PrepareActionsAsync Initialization. | ||
| Trace.Error($"Caught exception from PrepareActionsAsync Initialization: {ex}"); | ||
| executionContext.InfrastructureError(ex.Message); | ||
| executionContext.InfrastructureError(ex.Message, category: "resolve_action"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is what we missed with run-service.
| // Log the error and fail the PrepareActionsAsync Initialization. | ||
| Trace.Error($"Caught exception from PrepareActionsAsync Initialization: {ex}"); | ||
| executionContext.InfrastructureError(ex.Message); | ||
| executionContext.InfrastructureError(ex.Message, category: "invalid_action_download"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is what we missed with run-service.
https://github.com/github/actions-runtime/issues/5301