记一次vcenter不能访问故障记录

Posted by ZMY Blog on November 24, 2020

现象描述:

vcenter重启后,不能进入,访问页面提示如下信息 Failed to connect to endpoint: ********Allow_pipeName=/var/run/vmware/vpxd-webserver-pipe

查看vpxd log信息如下:

2020-11-10T09:15:11.868Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc

2020-11-10T09:15:11.869Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc

2020-11-10T09:15:11.870Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc

2020-11-10T09:15:11.871Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc

2020-11-10T09:15:11.872Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc

2020-11-10T09:15:11.873Z error vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Failed to connect to database: ODBC error: (08001) - [unixODBC]could not connect to server: Connection refused

-->   Is the server running on host "localhost" (127.0.0.1) and accepting

-->   TCP/IP connections on port 5432?

--> . Retry attempt: 943617 ...

2020-11-10T09:15:11.883Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc

2020-11-10T09:15:11.884Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc

2020-11-10T09:15:11.885Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc

2020-11-10T09:15:11.886Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc

2020-11-10T09:15:11.887Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc

2020-11-10T09:15:11.888Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc

2020-11-10T09:15:11.889Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc

2020-11-10T09:15:11.890Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc

2020-11-10T09:15:11.891Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc

2020-11-10T09:15:11.892Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc

2020-11-10T09:15:11.893Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc

2020-11-10T09:15:11.894Z info vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Logging in to DSN: VMware VirtualCenter with username vc

2020-11-10T09:15:11.895Z error vpxd[13543] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Aborting after 943629 retries.

2020-11-10T09:15:11.895Z error vpxd[13543] [Originator@6876 sub=Default] Error getting configuration info from the database

2020-11-10T09:15:11.895Z error vpxd[13543] [Originator@6876 sub=Main] Init failed. SystemError: N5Vmomi5Fault11SystemError9ExceptionE(Fault cause: vmodl.fault.SystemError

--> )

--> [context]zKq7AVECAAAAAAnxsgAKdnB4ZAAAiLEqbGlidm1hY29yZS5zbwAAQDcbAI67GAFeRlN2cHhkAAFZSFMBFHVTAXa8UwGJGlICcAUCbGliYy5zby42AAEVE1I=[/context]

2020-11-10T09:15:11.896Z warning vpxd[13543] [Originator@6876 sub=VpxProfiler] ServerApp::Init [TotalTime] took 1799617 ms

2020-11-10T09:15:11.897Z error vpxd[13543] [Originator@6876 sub=Default] Failed to intialize VMware VirtualCenter. Shutting down

2020-11-10T09:15:11.897Z info vpxd[13543] [Originator@6876 sub=SupportMgr] Wrote uptime information

2020-11-10T09:15:11.897Z info vpxd[13543] [Originator@6876 sub=Default] Forcing shutdown of VMware VirtualCenter now


启动vmware-vpostgres服务

如下图,并且没有日志信息

通过vcenter命令行,多次尝试有时不能启动,并且多个服务不能启动包括vpxd和vpostgre

service-control --start --all

有时可以启动并正常访问

处理过程如下

先后经历了3个dell工程师调试,前后历时2个礼拜

第一个工程师简单处理了一阵,要了vcenter系统日志,通过5480端口导出(dell给出的ftp服务器都是国外的,上传很慢,之后用的百度网盘,你可以直接跳过这个坑)

第二个工程师分析日志之后怀疑可能是回环地址丢失,他根据的是https://kb.vmware.com/s/article/59476?lang=zh_cn 这个kb得出的结论,但实际我的有回环地址,并根据kb的解决方案调整了我的host文件,并且怀疑是否是我的fqdn的问题,可是我的fqdn是ip地址,不会存在问题

第三个工程师远程连接并调试,并根据这个kb https://kb.vmware.com/s/article/59555 解决了这个问题,但是这里的错误日志和我的这个并不相同,dell工程师给出的回答是经验判断

因为证书吊销列表里的废弃证书太多,导致服务控制脚本运行的时候访问这些内容花费了一些时间,然后数据库被连带着启动失败了。至于为什么会产生这么多废弃证书,dell工程师并没有具体指出原因,然后我又问了为什么数据库看不到错误日志的原因:说是失败不是因为数据库自己的问题导致的,所以数据库方面看不到错误 。