Linux - Server
This forum is for the discussion of Linux software used in a server-related context.
Say I have a farm with a few hundred web servers. How does one manage that? What practices are common for starting and stopping that many servers, validating that all of them are intact and up to date, checking which are up and responding, aggregating logs, and so on?
Is there anything off the shelf to manage and monitor a large number of instances of either or both? What about standalone JBoss instances? If you want to boot up a hundred instances across fifty hosts, what are the options?
It depends on the infrastructure you're running on. For instance, if you were managing your hosts on infrastructure-as-a-service (like Apache Mesos, OpenStack, or any of the many hosted cloud providers), then I would recommend following immutable-infrastructure practices. That is, for each application stack you should have the following concepts:
Building frozen images of all required parts.
Provisioning your infrastructure using the images you've built.
At run time, linking up dependent services, including things like backends, logging, and monitoring.
Example technology stack:
Configuration management: use Ansible to configure services by dropping in or modifying configuration files and installing required software, typically via the package manager, in an idempotent way. This covers roughly 95% of the operating-system configuration you need customized.
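A minimal sketch of what driving Ansible from the shell looks like. The inventory file name, group name, package, and paths here are illustrative assumptions, not anything from the post:

Code:
# hosts.ini is a hypothetical inventory with a [webservers] group.
ansible webservers -i hosts.ini -m ping                    # reachability check
ansible webservers -i hosts.ini -b -m yum -a "name=nginx state=present"
ansible webservers -i hosts.ini -b -m copy \
    -a "src=./nginx.conf dest=/etc/nginx/nginx.conf mode=0644"
ansible webservers -i hosts.ini -b -m service -a "name=nginx state=reloaded"

Running the same commands a second time changes nothing; that is the idempotency in action.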
Use HashiCorp Packer to "bake" your operating system images. That is, it makes API calls to a cloud service to start an operating system instance, runs Ansible to configure it, then takes a snapshot, which is used in the next step as one of your building blocks.
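Roughly, a bake looks like this (a hypothetical template with a placeholder source AMI; the amazon-ebs builder and ansible provisioner are standard Packer plugins):

Code:
cat > webserver.json <<'EOF'
{
  "builders": [{
    "type": "amazon-ebs",
    "region": "us-east-1",
    "source_ami": "ami-0123456789abcdef0",
    "instance_type": "t3.micro",
    "ssh_username": "ec2-user",
    "ami_name": "webserver-{{timestamp}}"
  }],
  "provisioners": [{
    "type": "ansible",
    "playbook_file": "site.yml"
  }]
}
EOF
packer validate webserver.json
packer build webserver.json    # boots an instance, runs Ansible, snapshots it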
Use HashiCorp Terraform to describe the infrastructure layout of your service. E.g., say you used Packer to bake a proxy image and an application-server image, and your application requires persistent storage for data that must survive a system restart. You would use Terraform to declare that you have a proxy server in front of an application server, with a slice of disk mounted on the application server for persistent storage.
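A hedged sketch of that layout in Terraform, assuming AWS (the AMI ID, size, and device name are placeholders; a real setup would also declare the proxy instance and networking):

Code:
cat > main.tf <<'EOF'
provider "aws" {
  region = "us-east-1"
}

# Application server booted from the Packer-baked image.
resource "aws_instance" "app" {
  ami           = "ami-0abcdef1234567890"
  instance_type = "t3.micro"
}

# A slice of disk that survives instance replacement.
resource "aws_ebs_volume" "appdata" {
  availability_zone = aws_instance.app.availability_zone
  size              = 100
}

resource "aws_volume_attachment" "appdata" {
  device_name = "/dev/xvdf"
  volume_id   = aws_ebs_volume.appdata.id
  instance_id = aws_instance.app.id
}
EOF
terraform init
terraform apply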
Post-startup initialization can be handled by cloud-init. This is where the final configuration steps are performed: writing out configuration files with host names, formatting and mounting your persistent storage if it is a raw disk, and enabling and starting (or restarting) services once cloud-init has configured them post-boot. This is the remaining 5% of operating-system configuration you need customized but can only do here, because runtime information like IP addresses wasn't available during the image bake.
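Cloud-init will execute any user-data that begins with a shebang, so that final 5% can be a plain shell script. A sketch, with placeholder device, paths, and service name:

Code:
#!/bin/bash
# Format the raw persistent disk on first boot only, then mount it.
if ! blkid /dev/xvdf; then
    mkfs.ext4 /dev/xvdf
fi
mkdir -p /data
mount /dev/xvdf /data

# Write out configuration that needs runtime information (our IP),
# then restart the already-baked-in service to pick it up.
echo "advertise_addr = $(hostname -I | awk '{print $1}')" > /etc/myapp/runtime.conf
systemctl restart myapp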
Other considerations:
Log aggregation and searching: Elasticsearch, Logstash, and Kibana are an oft-recommended stack for centralized logging (commonly referred to as ELK).
Metrics and monitoring: I have personally enjoyed using Telegraf (shipping metrics), InfluxDB (time-series metric storage), and Grafana (a UI frontend providing dashboards and alerting based on those metrics). HashiCorp Consul also lets you define services that report their health.
Service discovery: HashiCorp Consul can be used for service discovery, simple DNS names for discovered services, and as a key-value store where you can keep things like the UUIDs of Packer-baked images, adding process around your provisioned infrastructure (e.g. separate environments like dev, qa, stage, prod). A minimal service definition is sketched after this list.
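To give the Consul items above concrete shape (a minimal sketch; the service name, port, and check URL are assumptions): registering a health-checked service is just dropping a JSON definition into Consul's configuration directory and reloading.

Code:
cat > /etc/consul.d/nginx.json <<'EOF'
{
  "service": {
    "name": "nginx",
    "port": 80,
    "check": {
      "http": "http://localhost:80/",
      "interval": "10s"
    }
  }
}
EOF
consul reload
# Only healthy instances are returned via Consul's DNS interface:
dig @127.0.0.1 -p 8600 nginx.service.consul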
There are a lot of tools mentioned here, but the cool thing is that you can scale services to hundreds of servers with little effort if you do it right. Mesos with Marathon is also pretty cool because it allows neat things like elastic scaling (i.e., automatically provisioning more application servers and proxies when your web service is under heavy load, and removing some when load is light).
In my particular setup, we're provisioned Red Hat machines that we don't get root on, and essentially have to do everything via ssh. Ansible is doable, but I don't know anything about it and my Python sucks. Working on that last bit, though.
Essentially, my problem is that I just don't know what's out there or how to look it up.
Say you walked into a shop with a hundred nginx servers all defined and set up, and were asked to implement a mechanism that would let someone start or stop an arbitrary slice of them from a central location with a limited number of commands. Assume you don't have to manage the configuration file, though being able to would be a bonus. How would you approach it?
You need to be root to start and stop most services (binding to ports below 1024 requires root, though a service listening only on higher ports can run unprivileged). There are plenty of solutions for configuration management. "Python sucks" is an opinion; Python has strengths and weaknesses, like all programming languages. I tend to define my goals for what I want to accomplish and look for the right tool for the job regardless of the technology it's built on, taking into account the strengths of the team I'm working with as well (I am not an island; I work with other people).
If you're on a Linux workstation, check out clusterssh. Or you can maintain a flat file of hostnames and shell scripts that execute across them. If I had to write such a script, I would make it run in parallel across all hosts and use flock to serialize the output to a file. Bash can handle that.
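A bare-bones sketch of that script (hosts.txt, the log path, and the example command are placeholders):

Code:
#!/bin/bash
# Run one command on every host in hosts.txt, in parallel; flock
# keeps each host's output block intact in the shared log file.
CMD=${1:?usage: $0 '<remote command>'}   # e.g. 'sudo systemctl stop nginx'
LOG=/tmp/fleet.log
: > "$LOG"

while read -r host; do
    (
        out=$(ssh -n -o ConnectTimeout=5 "$host" "$CMD" 2>&1)
        {
            flock -x 9                   # exclusive lock on fd 9
            printf '== %s ==\n%s\n' "$host" "$out" >&9
        } 9>>"$LOG"
    ) &
done < hosts.txt
wait
cat "$LOG"

Point it at a different flat file and you have your "arbitrary slice": run ./fleet.sh 'sudo systemctl stop nginx' against whichever hostnames are listed.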