Light's Demons Forum

The idea is to come up with situations you expect

nafizcristia97



Joined: 03 Mar 2024
Posts: 2

Posted: Sun Mar 03, 2024 13:19

Then intentionally cause those situations in production (during the work day, with warning) to ensure that you can handle them. We ran several exercises on our cluster, and they often revealed issues like gaps in monitoring or configuration errors. We were very happy to discover those issues early, in a controlled fashion, rather than by surprise six months later.

Here are a few of the game day exercises we ran:

- Terminate one Kubernetes API server
- Terminate all the Kubernetes API servers and bring them back up (to our surprise, this worked very well)
- Terminate an etcd node
- Cut off worker nodes in our Kubernetes cluster from the API servers (so that they can't communicate). This resulted in all pods on those nodes being moved to other nodes.
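One way to keep track of exercises like these is to write down, for each one, what you expect to happen and what actually happened, then list the gaps. The sketch below is only an illustration: the `GameDayExercise` class, its fields, and the sample values are all hypothetical and are not results from the exercises described in this post.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GameDayExercise:
    """One planned failure plus what was observed (hypothetical structure)."""
    name: str
    expected: str               # what we believe should happen
    should_page: bool = True    # do we expect on-call to be paged?
    paged: bool = False         # filled in during the exercise
    needed_human: bool = False  # did recovery require manual steps?

    def findings(self) -> List[str]:
        """Return the follow-up items this exercise uncovered."""
        issues = []
        if self.should_page and not self.paged:
            issues.append(f"{self.name}: no page fired; fix monitoring")
        if self.needed_human:
            issues.append(f"{self.name}: recovery needed human intervention")
        return issues


# Invented sample data, for illustration only.
exercises = [
    GameDayExercise("terminate one API server",
                    expected="no user-visible impact", paged=True),
    GameDayExercise("terminate an etcd node",
                    expected="cluster keeps quorum", paged=False),
]

report = [issue for ex in exercises for issue in ex.findings()]
for line in report:
    print(line)
```

Writing the expectation down before the exercise is the point: a mismatch between `should_page` and `paged` is exactly the kind of monitoring gap a game day is meant to surface.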

We were really pleased to see how well Kubernetes responded to a lot of the disruptions we threw at it. Kubernetes is designed to be resilient to errors: it has one etcd cluster storing all the state, an API server which is simply a REST interface to that database, and a collection of stateless "controllers" that coordinate all cluster management. If any of the Kubernetes core components (the API server, controller manager, or scheduler) are interrupted or restarted, once they come up they read the relevant state from etcd and continue operating seamlessly. This was one of the things we hoped would be true, and it has actually worked very well in practice.

Here are some kinds of issues that we found during these tests:

- "Weird, I didn't get paged for that, that really should have paged. Let's fix our monitoring there."
- "When we destroyed our API server instances and brought them back up, they required human intervention. We'd better fix that."
- "Sometimes when we do an etcd failover, the API server starts timing out requests until we restart it."

After running these tests, we developed remediations for the issues we found: we improved monitoring, fixed configuration issues we'd discovered, and filed bugs with Kubernetes.

Making cron jobs easy to use

Let's briefly explore how we made our Kubernetes-based system easy to use. Our original goal was to design a system for running cron jobs that our team was confident operating and maintaining. Once we had established our confidence in Kubernetes, we needed to make it easy for our fellow engineers to configure and add new cron jobs. We developed a simple YAML configuration format so that our users didn't need to understand anything about Kubernetes' internals to use the system.
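To make the idea of a simplified configuration concrete, here is a minimal sketch of how a small job description could be expanded into a Kubernetes `batch/v1` `CronJob` manifest. The post does not show the actual format, so the input keys (`name`, `image`, `schedule`, `command`, `resources`, `concurrency_policy`), the function name `to_cronjob_manifest`, and the `Forbid` default are all assumptions made for illustration. The config is shown as a Python dict standing in for the parsed YAML.

```python
import json


def to_cronjob_manifest(config: dict) -> dict:
    """Expand a simplified job config into a batch/v1 CronJob manifest.

    The input schema is hypothetical; only the output follows the
    Kubernetes CronJob resource layout.
    """
    container = {
        "name": config["name"],
        "image": config["image"],
        "command": list(config["command"]),
    }
    if "resources" in config:
        container["resources"] = config["resources"]

    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": config["name"]},
        "spec": {
            "schedule": config["schedule"],
            # Assumed default: never run two copies of a job at once.
            "concurrencyPolicy": config.get("concurrency_policy", "Forbid"),
            "jobTemplate": {
                "spec": {
                    "template": {
                        "spec": {
                            "restartPolicy": "OnFailure",
                            "containers": [container],
                        }
                    }
                }
            },
        },
    }


# Hypothetical job definition, as a user of the system might write it.
example = {
    "name": "nightly-report",
    "image": "registry.example.com/jobs:latest",
    "schedule": "15 */2 * * *",
    "command": ["ruby", "/path/to/script.rb"],
    "resources": {"requests": {"cpu": "100m", "memory": "128M"}},
}

manifest = to_cronjob_manifest(example)
print(json.dumps(manifest, indent=2))
```

The user only fills in the handful of fields they care about; everything Kubernetes-specific (API version, job template nesting, restart policy) is supplied by the translation layer.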






