Nova compute and network services cannot contact the nova services after the management services restart
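I set up two nodes for OpenStack.
The first node hosts the management services, such as nova-api, nova-scheduler, glance, and so on. The second node hosts the network and compute services.
When I check with nova-manage service list, all services are shown.
When I restart the management node (node 1), compute is disconnected.
When compute then tries to connect to the management node, it shows this error in the compute log: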
2013-01-21 20:49:28 TRACE nova.manager Traceback (most recent call last):
2013-01-21 20:49:28 TRACE nova.manager File "/usr/lib/python2.6/site-packages/nova/manager.py", line 155, in periodic_tasks
2013-01-21 20:49:28 TRACE nova.manager task(self, context)
2013-01-21 20:49:28 TRACE nova.manager File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 2244, in _heal_instance_info_cache
2013-01-21 20:49:28 TRACE nova.manager context, self.host)
2013-01-21 20:49:28 TRACE nova.manager File "/usr/lib/python2.6/site-packages/nova/db/api.py", line 594, in instance_get_all_by_host
2013-01-21 20:49:28 TRACE nova.manager return IMPL.instance_get_all_by_host(context, host)
2013-01-21 20:49:28 TRACE nova.manager File "/usr/lib/python2.6/site-packages/nova/db/sqlalchemy/api.py", line 103, in wrapper
2013-01-21 20:49:28 TRACE nova.manager return f(*args, **kwargs)
2013-01-21 20:49:28 TRACE nova.manager File "/usr/lib/python2.6/site-packages/nova/db/sqlalchemy/api.py", line 1582, in instance_get_all_by_host
2013-01-21 20:49:28 TRACE nova.manager return _instance_get_all_query(context).filter_by(host=host).all()
2013-01-21 20:49:28 TRACE nova.manager File "/usr/lib64/python2.6/site-packages/SQLAlchemy-0.7.3-py2.6-linux-x86_64.egg/sqlalchemy/orm/query.py", line 1922, in all
2013-01-21 20:49:28 TRACE nova.manager return list(self)
2013-01-21 20:49:28 TRACE nova.manager File "/usr/lib64/python2.6/site-packages/SQLAlchemy-0.7.3-py2.6-linux-x86_64.egg/sqlalchemy/orm/query.py", line 2032, in __iter__
2013-01-21 20:49:28 TRACE nova.manager return self._execute_and_instances(context)
2013-01-21 20:49:28 TRACE nova.manager File "/usr/lib64/python2.6/site-packages/SQLAlchemy-0.7.3-py2.6-linux-x86_64.egg/sqlalchemy/orm/query.py", line 2047, in _execute_and_instances
2013-01-21 20:49:28 TRACE nova.manager result = conn.execute(querycontext.statement, self._params)
2013-01-21 20:49:28 TRACE nova.manager File "/usr/lib64/python2.6/site-packages/SQLAlchemy-0.7.3-py2.6-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1399, in execute
2013-01-21 20:49:28 TRACE nova.manager params)
2013-01-21 20:49:28 TRACE nova.manager File "/usr/lib64/python2.6/site-packages/SQLAlchemy-0.7.3-py2.6-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1532, in _execute_clauseelement
2013-01-21 20:49:28 TRACE nova.manager compiled_sql, distilled_params
2013-01-21 20:49:28 TRACE nova.manager File "/usr/lib64/python2.6/site-packages/SQLAlchemy-0.7.3-py2.6-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1640, in _execute_context
2013-01-21 20:49:28 TRACE nova.manager context)
2013-01-21 20:49:28 TRACE nova.manager File "/usr/lib64/python2.6/site-packages/SQLAlchemy-0.7.3-py2.6-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1633, in _execute_context
2013-01-21 20:49:28 TRACE nova.manager context)
2013-01-21 20:49:28 TRACE nova.manager File "/usr/lib64/python2.6/site-packages/SQLAlchemy-0.7.3-py2.6-linux-x86_64.egg/sqlalchemy/engine/default.py", line 330, in do_execute
2013-01-21 20:49:28 TRACE nova.manager cursor.execute(statement, parameters)
2013-01-21 20:49:28 TRACE nova.manager OperationalError: (OperationalError) socket not open
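Restarting the compute and network services fixes the problem, but the error keeps appearing until I restart compute or network.
This is what I see when I check, on the compute node, the sockets open to the controller: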
[root@compute ~]# ps -ef | grep compute
nova 30859 1 27 18:51 ? 00:00:03 /usr/bin/python /usr/bin/nova-compute --config-file /etc/nova/nova.conf --logfile /var/log/nova/compute.log
root 30996 30807 0 18:51 pts/0 00:00:00 grep compute
[root@compute ~]# netstat -p | grep 30859
tcp 0 0 compute:56988 controller:postgres ESTABLISHED 30859/python
tcp 0 0 compute:37869 controller:amqps ESTABLISHED 30859/python
tcp 0 0 compute:37871 controller:amqps ESTABLISHED 30859/python
unix 3 [ ] STREAM CONNECTED 3588759 30859/python
Sockets to two controller services are open: postgres and amqps. I then run reboot now on the controller and check again which sockets are open to the controller:
[root@compute ~]# netstat -p | grep 30859
tcp 208 0 compute:56988 controller:postgres CLOSE_WAIT 30859/python
unix 3 [ ] STREAM CONNECTED 3590103 30859/python
unix 3 [ ] STREAM CONNECTED 3588759 30859/python
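Here the postgres socket is closed (it is stuck in CLOSE_WAIT).
Once all the services were back up on the controller, I ran the same command to check the sockets connected to the controller, and I got the same result.
Why doesn't compute create a new socket to postgres?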
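For anyone who wants to repeat this check without reading the netstat output by eye, here is a minimal sketch in Python. It assumes the column layout shown in the output above (protocol, Recv-Q, Send-Q, local address, remote address, state, PID/program). The function names tcp_states_for_pid and stale_connections are made up for this illustration and are not part of nova or netstat.

```python
def tcp_states_for_pid(netstat_output, pid):
    """Return (remote_endpoint, state) pairs for TCP sockets owned by pid.

    Assumes `netstat -p` rows shaped like the ones pasted above:
    proto recv-q send-q local remote state pid/program
    """
    results = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip unix-domain sockets, headers, and anything too short to parse.
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        owner_pid = fields[6].split("/")[0]
        if owner_pid == str(pid):
            results.append((fields[4], fields[5]))
    return results


def stale_connections(netstat_output, pid):
    """Return remote endpoints whose socket is stuck in CLOSE_WAIT."""
    return [remote
            for remote, state in tcp_states_for_pid(netstat_output, pid)
            if state == "CLOSE_WAIT"]


# The output captured after the controller reboot, copied from above.
AFTER_REBOOT = """\
tcp 208 0 compute:56988 controller:postgres CLOSE_WAIT 30859/python
unix 3 [ ] STREAM CONNECTED 3590103 30859/python
unix 3 [ ] STREAM CONNECTED 3588759 30859/python
"""

print(stale_connections(AFTER_REBOOT, 30859))
```

With the output pasted above this reports controller:postgres as the only stale connection, which matches what I see: the peer has closed the postgres connection, but the nova-compute process has neither closed its side nor opened a new one.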
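The socket error you are getting comes from nova-compute trying to contact the database you configured in nova.conf, as Matt Joyce pointed out above. Earlier in the log you can see all the values the service was configured with. Look for the string "Full set of FLAGS". The log output hides the actual value of "sql_connection" (since it usually has a password embedded), but the listing will at least hint at what is configured and may help explain what is happening there.
From what I understand of your question, the nova-compute log file shows this error until you restart the service. Am I reading that correctly?
Assuming that is correct, is there something that configures nova after the base packages are installed? Something like Chef or Puppet that adds the configuration details after the service has already started with the wrong configuration?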