Script to add servers to Storage Pool #58

Open
wants to merge 1 commit into
base: master

enggnr
Contributor

@enggnr enggnr commented Jun 29, 2023

  • What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
    Provides a script to add servers to GlusterFS Pool

  • What is the current behavior? (You can also link to an open issue here)
    Such a script does not exist

  • Other information:
    Fixes Add logic for adding targets to glusterfs #28.

@github-actions

🤖 OpenAI


Chat with 🤖 OpenAI Bot (@openai)

  • Reply on review comments left by this bot to ask follow-up questions. A review comment is a comment on a diff or a file.
  • Invite the bot into a review comment chain by tagging @openai in a reply.

Code suggestions

  • The bot may make code suggestions, but please review them carefully before committing since the line number ranges may be misaligned.
  • You can edit the comment made by the bot and manually tweak the suggestion if it is slightly off.

Ignoring further reviews

  • Type @openai: ignore anywhere in the PR description to ignore further reviews from the bot.

Files ignored due to filter (1)

Ignored files

  • software.yml
Files not summarized due to errors (2)

Failed to summarize

  • home/.chezmoiscripts/universal/run_onchange_before_11-add-glusterfs-peer.tmpl (nothing obtained from openai)
  • home/.chezmoi.yaml.tmpl (nothing obtained from openai)

In the recent run, only the files that changed from the base of the PR and between d6e9d2d434f14f429d4b15af9ce23e236f69d1e6 and bff985deb46b1581c4a93e6ec87a9c820260e2a4 commits were reviewed.

Files not reviewed due to errors in the recent run (2)

Failed to review in the last run

  • home/.chezmoiscripts/universal/run_onchange_before_11-add-glusterfs-peer.tmpl (no response)
  • home/.chezmoi.yaml.tmpl (no response)

@ProfessorManhattan
Contributor

Hey @enggnr --- a lot of these PRs require IP address information. We need to come up with a system that eliminates the need for hardcoded IP addresses. We should use the public services domain and CloudFlare tunnels / WARP to set this up.

The first time a computer gets provisioned, it should check gluster.megabyte.space, for instance, or some other endpoint to see if the initial host is set up, and then use that domain (or resolve the IP from the domain). Just wanted to let you know this so you have a better idea of what I'm trying to do. The idea is to make it as easy as possible to create your own cloud regardless of where the computers are on the internet, which CloudFlare tunnels / WARP and Tailscale pave the way for. Tailscale would be preferable, but they force the expiration of API keys, so it's not easily accomplished. Nebula might be what we want, but CloudFlare tunnels / WARP has so many features and so much functionality and extensibility that it's the route we'll take for now.
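
A rough sketch of that first-run check, assuming Python is available on the host being provisioned. The function names `system_resolver` and `resolve_bootstrap_host` and the injectable `resolver` parameter are hypothetical and not part of this PR; only the standard-library `socket.getaddrinfo` call is real.

```python
import socket
from typing import Callable, List, Optional


def system_resolver(hostname: str) -> List[str]:
    """Resolve a hostname to its IP addresses using the OS resolver."""
    infos = socket.getaddrinfo(hostname, None)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def resolve_bootstrap_host(
    hostname: str,
    resolver: Callable[[str], List[str]] = system_resolver,
) -> Optional[List[str]]:
    """Return the IPs behind the bootstrap domain, or None if it does not resolve yet."""
    try:
        addresses = resolver(hostname)
    except OSError:
        # socket.gaierror (name not found) is a subclass of OSError.
        return None
    return addresses or None
```

A caller could treat `None` from `resolve_bootstrap_host("gluster.megabyte.space")` as "no initial host yet" and anything else as the set of addresses to join, so no IP address has to be written into the config.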

@ProfessorManhattan
Contributor

Hey @enggnr -- have you had a chance to look at my comment above? We need to figure out how to set up glusterfs without hardcoding any IP addresses.

@enggnr
Contributor Author

enggnr commented Jul 11, 2023

@ProfessorManhattan, good timing on this post. I was about to start a discussion on improving the way services are discovered. Like you said, IP references need to be removed as much as possible, and DNS-based configuration needs to be used where possible.

CloudFlare tunnels / WARP should help with this. It may need some prior configuration but that can be documented, and possibly automated. If it makes sense, there can be a separate set of scripts or a repo for pre-requisite configuration before executing install-doctor. We can create a script just to populate DNS related data for other scripts to use.

To minimize changes and additions to the config file, we may need a host naming convention that includes the name of the service, something like gluster1 and gluster2 for the Gluster script, so that the scripts 'discover' the relevant DNS records automatically instead of having to take an input. Services like etcd that support discovery can also benefit from this kind of setup.
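
To make the naming idea concrete, here is a minimal sketch in Python. The function `discover_numbered_hosts`, its `exists` callback, and the `max_hosts` cap are all hypothetical; `example.com` stands in for the real domain. It walks `gluster1`, `gluster2`, ... and stops at the first name that does not exist.

```python
from typing import Callable, List


def discover_numbered_hosts(
    service: str,
    domain: str,
    exists: Callable[[str], bool],
    max_hosts: int = 16,
) -> List[str]:
    """Return service1.domain, service2.domain, ... up to the first missing name."""
    found: List[str] = []
    for index in range(1, max_hosts + 1):
        fqdn = f"{service}{index}.{domain}"
        if not exists(fqdn):
            # Assumption: hosts are numbered without gaps, so stop at the first miss.
            break
        found.append(fqdn)
    return found
```

In practice `exists` would wrap a DNS lookup. Stopping at the first gap keeps the number of queries small, at the cost of missing hosts if a number is skipped.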

@ProfessorManhattan
Contributor

Hey @enggnr --- I started automating WARP and CloudFlared. Browser Isolation works well - I encourage you to try it out, if you're interested --- it basically runs all the websites you open in the cloud and then WebRTCs it to the browser. I have not been able to figure out a way of getting various tools like the curl command to work when running bash <(curl -sSL https://install.doctor/start) because the CloudFlare CA.pem file needs to be added to the CAs that some of these tools are using. This secures outgoing communications and adds some cool integrations with CloudFlare Teams.

The CloudFlare cloudflared tunnel program opens up DNS-based addresses without having to open any incoming ports. The configuration is found in dot_local/etc/cloudflared. Both this and WARP will automatically start up after running the script. For glusterfs, for instance, we might need to add a new entry in the cloudflared config that accepts incoming connections. I have not yet experimented with it enough to know how to properly open an HTTPS-protected port.

Let me know if you can think of any ways we can use crons, services, or some other method to periodically scan the WARP network for new computers and add them to clusters etc. For example, the Gluster master node is set up and then, without interaction, previously set up client nodes run a cron and use pinging or whatever to determine the DNS to join.
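
One possible shape for the periodic job, sketched in Python under a few assumptions: the job runs on a node that is already in the pool (a new server is added with `gluster peer probe` issued from an existing member), and the list of discovered hosts and current peers is obtained elsewhere. The function `peer_probe_commands` is hypothetical; it only builds the commands and does not run them.

```python
from typing import Iterable, List


def _normalize(host: str) -> str:
    """Lower-case a hostname and drop a trailing DNS root dot."""
    return host.lower().rstrip(".")


def peer_probe_commands(
    discovered: Iterable[str],
    current_peers: Iterable[str],
    self_host: str,
) -> List[List[str]]:
    """Build one `gluster peer probe` command per discovered host not yet in the pool."""
    known = {_normalize(h) for h in current_peers}
    known.add(_normalize(self_host))  # never probe the local node
    commands: List[List[str]] = []
    for host in sorted({_normalize(h) for h in discovered}):
        if host not in known:
            commands.append(["gluster", "peer", "probe", host])
    return commands
```

A cron entry or systemd timer could call this on each run and pass every returned list to `subprocess.run`. Because hosts already in the pool are filtered out, repeated runs stay idempotent.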

@enggnr
Contributor Author

enggnr commented Jul 27, 2023

@ProfessorManhattan, DNS-based service discovery may be easier. Given that it is possible to restrict internal DNS traffic when using Cloudflare, there is no risk of exposing internal resources over the internet (other providers may also have this feature). SRV records can be used since they are designed for this purpose.

The other part of this process is to have scripts/tools that can query this info and use it for the appropriate service. For example, etcd looks for _etcd-server-ssl._tcp.example.com and _etcd-server._tcp.example.com. It is possible to add a suffix to the SRV name, which needs to be passed as an input to the script so that it can look up the SRV records that correspond to etcd. Similarly, inputs are needed to identify the SRV records for glusterfs and other services.
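
A small Python sketch of the lookup side, with some assumptions stated up front: the helper names `srv_name`, `parse_srv_answers`, and `SrvTarget` are hypothetical, the answers are assumed to arrive as text in the `priority weight port target` form that `dig +short SRV <name>` prints, and a `_glusterfs._tcp` record is a convention this project would have to define itself (GlusterFS does not look one up on its own).

```python
from typing import List, NamedTuple, Optional


class SrvTarget(NamedTuple):
    priority: int
    weight: int
    port: int
    host: str


def srv_name(service: str, domain: str, proto: str = "tcp",
             suffix: Optional[str] = None) -> str:
    """Build an SRV owner name such as _etcd-server-ssl._tcp.example.com."""
    label = f"{service}-{suffix}" if suffix else service
    return f"_{label}._{proto}.{domain}"


def parse_srv_answers(text: str) -> List[SrvTarget]:
    """Parse 'priority weight port target' lines into targets, best candidate first."""
    targets: List[SrvTarget] = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank lines and anything that is not an SRV answer
        try:
            priority, weight, port = (int(f) for f in fields[:3])
        except ValueError:
            continue
        targets.append(SrvTarget(priority, weight, port, fields[3].rstrip(".")))
    # Lower priority wins; ordering by descending weight is a simplification of
    # the weighted random selection that SRV clients are meant to perform.
    return sorted(targets, key=lambda t: (t.priority, -t.weight))
```

For example, `srv_name("etcd-server", "example.com", suffix="prod")` yields `_etcd-server-prod._tcp.example.com`, so the suffix is the only per-cluster input a script would need.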

At the start of provisioning, all this information can be queried and added to the 'data' that chezmoi uses to set up the host accordingly (with variables controlling whether this is part of a new cluster, is being added to an existing cluster, and so on).

@ProfessorManhattan
Contributor

Hey @enggnr --- please go ahead and implement this by checking the DNS zone for the PUBLIC_SERVICES_DOMAIN managed by CloudFlare for glusterfs / etcd.

Good idea on using SRV records --- that's exactly what we need.

Contributor

@ProfessorManhattan ProfessorManhattan left a comment

See comment on main thread about using SRV records.

Development

Successfully merging this pull request may close these issues.

Add logic for adding targets to glusterfs
2 participants