
bosh cck fails #411

Open
Omnipresent opened this issue Dec 4, 2016 · 1 comment

Omnipresent commented Dec 4, 2016

I am running bosh-lite on AWS using the vagrant-aws plugin. Sometimes I need to stop the instance, so I run vagrant halt; to bring it back up, I run vagrant up.

After halting and bringing the instance back up, certain VMs in various deployments need to be recreated or restarted.
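
For reference, the full recovery sequence I go through is roughly this (the director IP is whatever the instance comes back up with):

$ vagrant halt                # stop the AWS instance
$ vagrant up                  # bring it back up
$ bosh target <public ip>     # re-target the director on the instance's public IP
$ bosh cck                    # then run cloud check against each deployment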

For cf-release I run bosh cck and choose option 2 for each of the following 12 VMs. This gets CF running again:

$ bosh cck
# I press option 2 for following 12 VMs
# Recreate VM for 'postgres_z1/0 (c3a2c93a-76e6-4000-b850-ae66edb633c7)' without waiting for processes to start
# Recreate VM for 'router_z1/0 (0b9528b6-0e75-4d4e-871d-ebf3052824a7)' without waiting for processes to start
# Recreate VM for 'runner_z1/0 (8d556d91-15e1-4fbe-9d8f-bcd14178d371)' without waiting for processes to start
# Recreate VM for 'nats_z1/0 (59bb1c0d-810c-483b-8352-f39ca0e20353)' without waiting for processes to start
# Recreate VM for 'ha_proxy_z1/0 (52dea328-621a-4e52-a7e5-7113e5cc13fc)' without waiting for processes to start
# Recreate VM for 'doppler_z1/0 (3c537dfe-475b-48f6-a9d7-b118e0b61b58)' without waiting for processes to start
# Recreate VM for 'uaa_z1/0 (4d04496e-1287-4460-bd53-8ad88b7ec1ee)' without waiting for processes to start
# Recreate VM for 'etcd_z1/0 (69f46039-c522-4d58-9d87-be39e9399530)' without waiting for processes to start
# Recreate VM for 'blobstore_z1/0 (382285d4-b1b8-4e53-8fdb-a8ba43a6f8e6)' without waiting for processes to start
# Recreate VM for 'api_z1/0 (09c07831-d6e4-4cf1-8cca-76d05c499471)' without waiting for processes to start
# Recreate VM for 'loggregator_trafficcontroller_z1/0 (be357991-a8ca-439a-9bf8-d4aa63f1861f)' without waiting for processes to start
# Recreate VM for 'hm9000_z1/0 (7d340e16-c4b5-4f95-8ace-fc0dbe78077b)' without waiting for processes to start
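
If answering the prompt for each of the 12 VMs gets tedious, I believe the Ruby CLI also accepts an --auto flag that applies the default resolution to every detected problem non-interactively; I'm assuming the default resolution is acceptable here, since it may not match option 2:

$ bosh cck --auto    # apply default resolutions without prompting (assumed; default may differ from option 2)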

I also have cf-mysql-release deployed on this bosh-lite, but I can't seem to run bosh cck on it:

$ cd ~/workspace/cf-mysql-release
$ ./scripts/generate-bosh-lite-manifest
Deployment set to '/home/omni/workspace/cf-mysql-release/cf-mysql.yml'
CF-MySQL Manifest was generated at /home/omni/workspace/cf-mysql-release/cf-mysql.yml
$ bosh status
Config
             /home/omni/.bosh_config

Director
  Name       Bosh Lite Director
  URL        https://54.144.35.228:25555
  Version    1.3262.3.0 (00000000)
  User       admin
  UUID       c9ff6012-3899-4540-bca1-77e05f5a32d0
  CPI        warden_cpi
  dns        disabled
  compiled_package_cache disabled
  snapshots  disabled

Deployment
  Manifest   /home/omni/workspace/cf-mysql-release/cf-mysql.yml

$ bosh cck
Acting as user 'admin' on deployment 'cf-warden-mysql' on 'Bosh Lite Director'
Performing cloud check...

Director task 149
Error 100: Unable to get deployment lock, maybe a deployment is in progress. Try again later.

Task 149 error

For a more detailed error report, run: bosh task 149 --debug

bosh vms shows the following VMs in an unresponsive state. How can I get these VMs recreated/restarted again after doing vagrant halt followed by vagrant up (followed by bosh target <public ip>)?

$ bosh vms
Deployment 'cf-warden-mysql'

Director task 152

Task 152 done

+-------------------------------------------------------------+--------------------+-----+--------------------+------------+
| VM                                                          | State              | AZ  | VM Type            | IPs        |
+-------------------------------------------------------------+--------------------+-----+--------------------+------------+
| arbitrator_z3/0 (39066f55-3515-4098-adc5-7bfc7ca6b9d2)      | unresponsive agent | n/a | arbitrator_z3      |            |
| cf-mysql-broker_z1/0 (5632d437-d0b2-4688-93a1-a3a2b4126390) | unresponsive agent | n/a | cf-mysql-broker_z1 |            |
| cf-mysql-broker_z2/0 (0241e416-3bde-48ed-b51e-4d032b14640a) | unresponsive agent | n/a | cf-mysql-broker_z2 |            |
| mysql_z1/0 (a8e4916e-0258-446e-9404-330684e72fab)           | unresponsive agent | n/a | mysql_z1           |            |
| mysql_z2/0 (5bcd3148-0144-4cfe-b7e8-b0ee90abab2e)           | running            | n/a | mysql_z2           | 10.244.8.2 |
| proxy_z1/0 (ea646869-b8e4-4a44-92c2-1cfb25a18b4b)           | unresponsive agent | n/a | proxy_z1           |            |
| proxy_z2/0 (51ebc12c-2b6d-44c3-9f3d-09c049a4e032)           | unresponsive agent | n/a | proxy_z2           |            |
+-------------------------------------------------------------+--------------------+-----+--------------------+------------+

dpb587-pivotal (Contributor) commented

Sorry for the lack of response. In this case, "Unable to get deployment lock" typically means something else is working on the deployment. I suspect the health monitor was jumping in to bring things back, since I notice mysql_z2/0 is already running.

The bosh cck approach is the correct way to get things back up and running. If you run into that lock error, you can run bosh tasks to see what other tasks are running (like the health monitor's resurrection task) and reattach to that task to watch its progress (e.g. bosh task 12345).
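
For example, with the Ruby CLI (the task ID 12345 is just a placeholder):

$ bosh tasks         # list tasks currently running on the director
$ bosh task 12345    # reattach to the running resurrection task and follow its output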
