FAQ or similar for troubleshooting a control plane that remains Down?

I am trying to get our (otherwise working) instance of Gatling up and running again. Sometime last week the control plane went down, and I have no idea why (we’ll come back to that).

Enterprise 2024.42.5
Running in AWS/Docker
The control plane is on a private location

Things I have tried:
Rebuild the control plane
Rebuild the entire server
The builds work as expected - no errors
Redeploy both the control plane and the entire server
The redeploys work as expected - no errors
The control plane remains down, though.

The constraint I have is that the former owner left very little documentation on building, deploying, or maintaining the service. What little documentation exists is neither correct nor complete, so I am stuck trying to debug by deployment through what is essentially a black box. We use Groovy + Jenkins to maintain this service, and there have been no recent changes to any of that. Other than my attempts at restoration, there haven’t been any runs either.

Finally, I would love to be able to log into the ECS cluster in AWS to see what is happening. At this point, though, I can’t find the SSL cert to log in (the former owner is likely the only person who knows where it is). Sigh.

So if anyone has any suggestions, please let me know. And if someone has compiled a list of “common reasons why the control plane won’t come up,” I would appreciate it.

Thanks for any tips or suggestions!

Hi there,

I’ll be reaching out by email to see if I can help you.

All the best,
Pete