现象描述:
vcenter重启后,不能进入,访问页面提示如下信息 Failed to connect to endpoint: ********Allow_pipeName=/var/run/vmware/vpxd-webserver-pipe
查看vpxd log信息如下:
2020-11-10T09:15:11.868Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc
2020-11-10T09:15:11.869Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc
2020-11-10T09:15:11.870Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc
2020-11-10T09:15:11.871Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc
2020-11-10T09:15:11.872Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc
2020-11-10T09:15:11.873Z error vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Failed to connect to database: ODBC error: (08001) - [unixODBC]could not connect to server: Connection refused
--> Is the server running on host "localhost" (127.0.0.1) and accepting
--> TCP/IP connections on port 5432?
--> . Retry attempt: 943617 ...
2020-11-10T09:15:11.883Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc
2020-11-10T09:15:11.884Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc
2020-11-10T09:15:11.885Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc
2020-11-10T09:15:11.886Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc
2020-11-10T09:15:11.887Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc
2020-11-10T09:15:11.888Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc
2020-11-10T09:15:11.889Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc
2020-11-10T09:15:11.890Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc
2020-11-10T09:15:11.891Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc
2020-11-10T09:15:11.892Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc
2020-11-10T09:15:11.893Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc
2020-11-10T09:15:11.894Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc
2020-11-10T09:15:11.895Z error vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Aborting after 943629 retries.
2020-11-10T09:15:11.895Z error vpxd[13543] [Originator@6876 sub=Default] Error getting configuration info from the database
2020-11-10T09:15:11.895Z error vpxd[13543] [Originator@6876 sub=Main] Init failed. SystemError: N5Vmomi5Fault11SystemError9ExceptionE(Fault cause: vmodl.fault.SystemError
--> )
--> [context]zKq7AVECAAAAAAnxsgAKdnB4ZAAAiLEqbGlidm1hY29yZS5zbwAAQDcbAI67GAFeRlN2cHhkAAFZSFMBFHVTAXa8UwGJGlICcAUCbGliYy5zby42AAEVE1I=[/context]
2020-11-10T09:15:11.896Z warning vpxd[13543] [Originator@6876 sub=VpxProfiler] ServerApp::Init [TotalTime] took 1799617 ms
2020-11-10T09:15:11.897Z error vpxd[13543] [Originator@6876 sub=Default] Failed to intialize VMware VirtualCenter. Shutting down
2020-11-10T09:15:11.897Z info vpxd[13543] [Originator@6876 sub=SupportMgr] Wrote uptime information
2020-11-10T09:15:11.897Z info vpxd[13543] [Originator@6876 sub=Default] Forcing shutdown of VMware VirtualCenter now
启动vmware-vpostgres服务
如下图,并且没有日志信息
通过vcenter命令行,多次尝试有时不能启动,并且多个服务不能启动包括vpxd和vpostgre
service-control --start --all
有时可以启动并正常访问
处理过程如下
先后经历了3个dell工程师调试,前后历时2个礼拜
第一个工程师简单处理了一阵,要了vcenter系统日志,通过5480端口导出(dell给出的ftp服务器都是国外的,上传很慢,之后用的百度网盘,你可以直接跳过这个坑)
第二个工程师分析日志之后怀疑可能是回环地址丢失,他根据的是https://kb.vmware.com/s/article/59476?lang=zh_cn 这个kb得出的结论,但实际我的有回环地址,并根据kb的解决方案调整了我的host文件,并且怀疑是否是我的fqdn的问题,可是我的fqdn是ip地址,不会存在问题
第三个工程师远程连接并调试,并根据这个kb https://kb.vmware.com/s/article/59555 解决了这个问题,但是这里的错误日志和我的这个并不相同,dell工程师给出的回答是经验判断
因为证书吊销列表里的废弃证书太多,导致服务控制脚本运行的时候访问这些内容花费了一些时间,然后数据库被连带着启动失败了。至于为什么会产生这么多废弃证书,dell工程师并没有具体指出原因,然后我又问了为什么数据库看不到错误日志的原因:说是失败不是因为数据库自己的问题导致的,所以数据库方面看不到错误 。