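Background
I hadn't checked on my homelab for a while. When I logged in, the proxy was down. After fixing the proxy, I found that GitLab had failed to start and was reporting that storage was full. After I expanded the PVC, it still failed on startup with the following error: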
Mixlib::ShellOut::ShellCommandFailed: runit_service[prometheus] (monitoring::prometheus line 87) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received ‘1’
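This looks like a Prometheus problem, and the web UI was reachable before Prometheus was started, so I disabled Prometheus for now.
In /etc/gitlab/gitlab.rb, add or modify the following line to disable Prometheus: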
prometheus['enable'] = false
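The "add or modify" step can also be scripted. The following is only a minimal sketch: the helper name set_prometheus_disabled is hypothetical (not part of GitLab), and it assumes the setting sits on a single line, possibly commented out, as in the stock gitlab.rb template. The change still has to be applied afterwards with gitlab-ctl reconfigure.

```shell
# Hypothetical helper: set prometheus['enable'] = false in a gitlab.rb-style file.
# Usage (as root): set_prometheus_disabled /etc/gitlab/gitlab.rb
set_prometheus_disabled() {
  conf="$1"
  setting="prometheus['enable'] = false"
  # Matches an existing prometheus['enable'] line, commented out or not.
  pattern="^[[:space:]]*#*[[:space:]]*prometheus\['enable'\][[:space:]]*="
  if grep -Eq "$pattern" "$conf"; then
    # Rewrite the existing line through a temp file (avoids non-portable sed -i).
    tmp="$(mktemp)"
    sed -E "s|${pattern}.*|${setting}|" "$conf" > "$tmp" && cat "$tmp" > "$conf"
    rm -f "$tmp"
  else
    # No such line yet: append the setting at the end of the file.
    printf '%s\n' "$setting" >> "$conf"
  fi
}
```

Making a copy of gitlab.rb before running something like this is a sensible precaution.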
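It seems to be running now. Time for bed.
Root cause investigation
After getting up, I wanted to look into the storage space problem first, starting with the directory /var/opt/gitlab: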
# du -d 1 -h
49G ./prometheus
291M ./postgresql
20K ./gitlab-kas
4.0K ./backups
16K ./lost+found
12K ./alertmanager
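When a directory has many children, sorting the sizes makes the culprit easier to spot. The following is a small sketch, not what was run here: largest_subdirs is a hypothetical helper name, sizes are reported in KiB so a plain numeric sort works, and hidden directories are not covered by the glob.

```shell
# Hypothetical helper: list immediate subdirectories of a directory, largest first.
# Usage: largest_subdirs /var/opt/gitlab 5
largest_subdirs() {
  dir="$1"
  count="${2:-5}"
  # -s: one total per argument, -k: sizes in 1 KiB units
  du -sk "$dir"/*/ 2>/dev/null | sort -rn | head -n "$count"
}
```

Run against /var/opt/gitlab, the prometheus directory would be expected at the top of the list.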
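Sure enough. After deleting the files under its data directory, I adjusted how long Prometheus retains its data. In /etc/gitlab/gitlab.rb, uncomment the following section and change the retention time: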
prometheus['flags'] = {
'storage.tsdb.path' => "/var/opt/gitlab/prometheus/data",
'storage.tsdb.retention.time' => "7d",
'config.file' => "/var/opt/gitlab/prometheus/prometheus.yml"
}
Restart the components:
gitlab-ctl reconfigure
gitlab-ctl restart
The problem appears to be solved:
# gitlab-ctl restart
ok: run: alertmanager: (pid 3666) 0s
ok: run: gitaly: (pid 3693) 1s
ok: run: gitlab-exporter: (pid 3737) 0s
ok: run: gitlab-kas: (pid 3820) 0s
ok: run: gitlab-workhorse: (pid 3840) 1s
ok: run: logrotate: (pid 3866) 0s
ok: run: nginx: (pid 3872) 0s
ok: run: postgres-exporter: (pid 3908) 1s
ok: run: postgresql: (pid 3938) 0s
ok: run: prometheus: (pid 3950) 1s
ok: run: puma: (pid 3988) 0s
ok: run: redis: (pid 3993) 0s
ok: run: redis-exporter: (pid 3999) 1s
ok: run: sidekiq: (pid 4006) 0s
ok: run: sshd: (pid 4012) 0s
Restarting the container also worked, so this is fixed.
To be safe, I'll also do an upgrade, following the upgrade path tool:
https://gitlab-com.gitlab.io/support/toolbox/upgrade-path/