Troubleshooting
Investigate API object health
Use sclctl to inspect the state of API objects and get a good understanding
of what is going on.
Check infrastructure information
- sclctl node list: Are Nodes registered as expected?
- sclctl controller list: Are all Controllers registered and reporting heartbeats regularly?
- Are VLAN tags depleted? Is it possible to create new SCs (each SC has a unique VLAN tag)?
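The two listing commands above can be run back to back as a quick first check. The following is a minimal sketch: the function name check_infrastructure is hypothetical, and only the sclctl subcommands named above are used. The output format of sclctl is not described here, so the sketch only prints the output and assumes that sclctl exits with a non-zero status when a request fails.

```shell
#!/usr/bin/env bash
# Sketch: run the infrastructure listings named above and flag failing calls.
# check_infrastructure is a hypothetical helper name, not part of sclctl.
check_infrastructure() {
    local failed=0
    local sub
    for sub in node controller; do
        echo "== sclctl ${sub} list =="
        # Assumption: sclctl exits non-zero when the request fails.
        if ! sclctl "${sub}" list; then
            echo "sclctl ${sub} list failed" >&2
            failed=1
        fi
    done
    return "${failed}"
}
```

Calling check_infrastructure prints both listings so that missing Nodes or Controllers can be spotted by eye; the return status is 1 if either call failed.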
Check object status information
Most SCL Objects that users interact with expose status information
written by controllers. The OpenAPI reference
provides detailed descriptions of fields such as ControllerStatus, NodeStatus,
VolumeStatus, RouterStatus, and VmStatus. Status fields can be useful to
answer questions such as:
- Does the SCL Object indicate any unrecoverable error status?
- Does the SCL Object appear to be "stuck", that is, not making any progress?
Investigate health of systemd services
Use regular systemd tools (systemctl status, journalctl -b -u) to
gain deeper insight into each component:
| Component | Systemd Service Name |
|---|---|
| etcd | etcd |
| SCL API | scl-api |
| L2 network controller | scl-local-l2-net-ctrl |
| L3 network controller | scl-local-l3-net-ctrl |
| Image registry | vm-image-registry |
| VM scheduler | scl-scheduler-ctrl |
| VM controller | scl-vm-ctrl |
| Volume controller | scl-local-vol-ctrl.apiHost |
Useful questions are:
- Is the service running?
- Are there any log entries with the Error or Warning level?
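Both questions can be answered for several services at once with a small loop around systemctl and journalctl. This is a sketch, not a definitive tool: check_services is a hypothetical helper name, and it assumes that the services log to the journal and that their Error and Warning levels map to the journald priorities err and warning.

```shell
#!/usr/bin/env bash
# Sketch: check the run state and recent warnings of the given services.
# check_services is a hypothetical helper name used for illustration only.
check_services() {
    local svc count
    for svc in "$@"; do
        # Is the service running?
        if systemctl is-active --quiet "${svc}"; then
            echo "${svc}: active"
        else
            echo "${svc}: NOT active"
        fi
        # Count journal lines since the current boot (-b) for this unit (-u)
        # at priority warning or more severe (-p warning).
        # Assumption: the service's log levels map to journald priorities.
        count=$(journalctl -b -u "${svc}" -p warning --no-pager -q | wc -l | tr -d ' ')
        echo "${svc}: ${count} log line(s) at warning level or above"
    done
}
```

For example, check_services etcd scl-api scl-vm-ctrl reports on three of the services from the table. For any service with a non-zero count, journalctl -b -u followed by the service name shows the full log.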
If necessary, increase the log level of the services, as documented in the references, to get more detailed output. The services are designed to be stateless, so restarting them should not cause major problems such as data loss.