Service Fabric cluster does not start

We used to have Service Fabric working properly, but now we cannot start the cluster: we get an error immediately, and the cluster creation itself reported errors.

(image: no description provided)

When I check the Service Fabric logs in C:\SFDevCluster, I see:

Host Application: PowerShell.exe -WindowStyle Hidden -NonInteractive -ExecutionPolicy RemoteSigned -Command & 'C:\Program Files\Microsoft SDKs\Service Fabric\ClusterSetup\DevClusterSetup.ps1' -Auto -PathToClusterLogRoot C:\SFDevCluster\Log -SetupLogFileName DevClusterSetup.log -CreateOneNodeCluster
Transcript started, output file is C:\SFDevCluster\Log\DevClusterSetup.log
Performing Stop-Service on: FabricHostSvc . This may take a few minutes...
Create node configuration succeeded
Performing Start-Service on: FabricHostSvc . This may take a few minutes...

When I check the Service Fabric traces, I see FabricDeployer-XXXXXX(longnumber).trace, which has the following content:

2019/09/09-09:06:06.239,Info,10844,FabricDeployer.FabricDeployer,Running deployer with Configure /fabricBinRoot:C:\Program Files\Microsoft Service Fabric\bin /fabricDataRoot:C:\SfDevCluster\Data /fabricLogRoot:C:\SFDevCluster\Log /cm:C:\Users\100659\AppData\Local\Temp\SEPC0T2R18-Server-ScaleMin.xml /oldClusterManifestString: /im: /instanceId: /targetVersion: /nodeName: /nodeTypeName: /runAsType: /runAsAccountName: /runAsPassword: /serviceStartupType:Manual /output: /currentVersion: /error: /bootstrapMSIPath: /machineName: /fabricPackageRoot: /jsonClusterConfigLocation: /enableCircularTraceSession:True /continueIfContainersFeatureNotInstalled: /skipDeleteData:
2019/09/09-09:06:06.241,Info,10844,ImageStoreClient.ManagedFileLock,Obtained writer lock for C:\SfDevCluster\Data\lock
2019/09/09-09:06:06.241,Info,10844,FabricDeployer.FabricDeployer,Executing Configure /fabricBinRoot:C:\Program Files\Microsoft Service Fabric\bin /fabricDataRoot:C:\SfDevCluster\Data /fabricLogRoot:C:\SFDevCluster\Log /cm:C:\Users\100659\AppData\Local\Temp\SEPC0T2R18-Server-ScaleMin.xml /oldClusterManifestString: /im: /instanceId: /targetVersion: /nodeName: /nodeTypeName: /runAsType: /runAsAccountName: /runAsPassword: /serviceStartupType:Manual /output: /currentVersion: /error: /bootstrapMSIPath: /machineName: /fabricPackageRoot: /jsonClusterConfigLocation: /enableCircularTraceSession:True /continueIfContainersFeatureNotInstalled: /skipDeleteData:
2019/09/09-09:06:06.249,Info,10844,FabricDeployer.FabricDeployer,Running operation System.Fabric.FabricDeployer.ConfigureOperation
2019/09/09-09:06:06.253,Info,10844,FabricDeployer.FabricDeployer,Creating FabricDataRoot C:\SfDevCluster\Data, if it doesn't exist on machine 
2019/09/09-09:06:06.254,Info,10844,FabricDeployer.FabricDeployer,Creating FabricLogRoot C:\SFDevCluster\Log, if it doesn't exist on machine 
2019/09/09-09:06:06.287,Info,10844,ImageBuilder.FabricDeployer,DnsService feature enabled : True.
2019/09/09-09:06:06.287,Info,10844,ImageBuilder.FabricDeployer,PartitionPrefix setting overriden in DnsService section, Overriden Value: --.
2019/09/09-09:06:06.287,Info,10844,ImageBuilder.FabricDeployer,PartitionSuffix setting overriden in DnsService section, Overriden Value: .
2019/09/09-09:06:06.287,Warning,10844,ImageBuilder.FabricDeployer,Current profile will be disabled by default for firewall rule
2019/09/09-09:06:06.297,Info,10844,FabricDeployer.FabricDeployer,Setting FabricDataRoot to C:\SfDevCluster\Data on machine 
2019/09/09-09:06:06.297,Info,10844,FabricDeployer.FabricDeployer,Setting FabricLogRoot to C:\SFDevCluster\Log on machine 
2019/09/09-09:06:06.297,Info,10844,FabricDeployer.FabricDeployer,Setting EnableCircularTraceSession to True on machine 
2019/09/09-09:06:06.297,Info,10844,FabricDeployer.FabricDeployer,Setting EnableUnsupportedPreviewFeatures to False on machine 
2019/09/09-09:06:06.297,Info,10844,FabricDeployer.FabricDeployer,Setting IsSFVolumeDiskServiceEnabled to False on machine 
2019/09/09-09:06:06.298,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter FabricDataRoot, has value C:\SfDevCluster\Data
2019/09/09-09:06:06.298,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter FabricLogRoot, has value C:\SFDevCluster\Log
2019/09/09-09:06:06.298,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter ServiceRunAsAccountName, has value 
2019/09/09-09:06:06.298,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter ServiceRunAsPassword, has value 
2019/09/09-09:06:06.298,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter SkipFirewallConfiguration, has value true
2019/09/09-09:06:06.298,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter ServiceStartupType, has value 
2019/09/09-09:06:06.298,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter ContainerNetworkName, has value 
2019/09/09-09:06:06.298,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter ContainerNetworkSetup, has value 
2019/09/09-09:06:06.298,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter SkipContainerNetworkResetOnReboot, has value 
2019/09/09-09:06:06.298,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter SkipIsolatedNetworkResetOnReboot, has value 
2019/09/09-09:06:06.298,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter IsolatedNetworkName, has value 
2019/09/09-09:06:06.298,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter IsolatedNetworkSetup, has value 
2019/09/09-09:06:06.298,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter IsolatedNetworkInterfaceName, has value 
2019/09/09-09:06:06.299,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter EnableCircularTraceSession, has value true
2019/09/09-09:06:06.299,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter ContainerDnsSetup, has value 
2019/09/09-09:06:06.299,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter: ContainerDnsSetup, value: <null>, interpreted value: Allow
2019/09/09-09:06:06.299,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter EnableUnsupportedPreviewFeatures, has value 
2019/09/09-09:06:06.299,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter IsSFVolumeDiskServiceEnabled, has value 
2019/09/09-09:06:06.299,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter SfCnsNetworkPluginCnsUrlPort, has value 
2019/09/09-09:06:06.299,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter SfCnsNetworkPluginCnmUrlPort, has value 
2019/09/09-09:06:06.299,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter IsolatedNetworkPluginParams, has value 
2019/09/09-09:06:06.299,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter UseContainerServiceArguments, has value 
2019/09/09-09:06:06.299,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter ContainerServiceArguments, has value 
2019/09/09-09:06:06.299,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter EnableContainerServiceDebugMode, has value 
2019/09/09-09:06:06.299,Info,10844,FabricDeployer.FabricDeployer,Setup section, parameter DisableContainers, has value 
2019/09/09-09:06:06.299,Info,10844,FabricDeployer.FabricDeployer,Copying ClusterManifest to C:\SfDevCluster\Data\clusterManifest.xml
2019/09/09-09:06:06.308,Info,10844,FabricDeployer.FabricDeployer,Set Service Fabric Host Service to start up type to Manual
2019/09/09-09:06:06.310,Info,10844,FabricDeployer.FabricDeployer,TargetInformationFileName is C:\SfDevCluster\Data\TargetInformation.xml
2019/09/09-09:06:06.317,Info,10844,FabricDeployer.FabricDeployer,Target information file C:\SfDevCluster\Data\TargetInformation.xml written on machine: 
2019/09/09-09:06:06.323,Info,10844,FabricDeployer.FabricDeployer,Host Settings file generated at C:\SfDevCluster\Data\FabricHostSettings.xml
2019/09/09-09:06:06.327,Info,10844,ImageStoreClient.ManagedFileLock,Released writer lock on C:\SfDevCluster\Data\lock

One interesting line from the trace above is:

2019/09/09-09:06:06.287,Warning,10844,ImageBuilder.FabricDeployer,Current profile will be disabled by default for firewall rule

This made me suspect that some firewall rules could be blocking me, but I could not work out exactly what is going on.
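To spot lines like this one without reading the whole trace, here is a minimal sketch that keeps only the non-Info entries. It assumes the comma-separated layout seen in the trace above (timestamp, level, process id, source, message); the function name `non_info_entries` is made up for illustration and is not part of any Service Fabric tooling.

```python
from typing import Iterable, List, Tuple


def non_info_entries(lines: Iterable[str]) -> List[Tuple[str, str, str, str]]:
    """Return (timestamp, level, source, message) for entries whose level is not Info."""
    found = []
    for line in lines:
        # The message itself may contain commas, so split on the first four only.
        parts = line.rstrip("\n").split(",", 4)
        if len(parts) < 5:
            continue  # blank line or a line that does not follow the assumed layout
        timestamp, level, _pid, source, message = parts
        if level != "Info":
            found.append((timestamp, level, source, message))
    return found


# Two lines copied from the trace above.
sample = [
    "2019/09/09-09:06:06.298,Info,10844,FabricDeployer.FabricDeployer,"
    "Setup section, parameter SkipFirewallConfiguration, has value true",
    "2019/09/09-09:06:06.287,Warning,10844,ImageBuilder.FabricDeployer,"
    "Current profile will be disabled by default for firewall rule",
]

for entry in non_info_entries(sample):
    print(entry)
```

To run it against a real trace, pass an open file object instead of `sample`, for example `non_info_entries(open(path, encoding="utf-8", errors="replace"))`; the encoding is a guess and may need adjusting.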

Looking in Windows Event Viewer, I see the following Service Fabric related events from different areas:

(two images: no descriptions provided)

Also when I look under (Applications & Services Log ==> Microsoft-Service Fabric ==> Admin) I see the following:

Error FileChangeMonitor failed with E_ACCESSDENIED

Warning FileChangeMonitor failed file C:\SfDevCluster\Data\FabricHostSettings.xml with ErrorCode E_ACCESSDENIED.

Error GetFileAttributesEx failed with the following error 5

Error Unable to stop FabricHostSvc service because System.InvalidOperationException: Cannot stop FabricHostSvc service on computer '.'. ---> System.ComponentModel.Win32Exception: The service has not been started --- End of inner exception stack trace --- at System.ServiceProcess.ServiceController.Stop() at System.Fabric.FabricDeployer.FabricDeployerServiceController.Stop(String serviceName, String machineName)

Error Unable to start fabric host service because System.InvalidOperationException: Cannot start service FabricHostSvc on computer '.'. ---> System.ComponentModel.Win32Exception: The service did not respond to the start or control request in a timely fashion --- End of inner exception stack trace --- at System.ServiceProcess.ServiceController.Start(String[] args) at System.Fabric.FabricDeployer.FabricDeployerServiceController.StartHostSvc(String machineName)

Error Error occurred while cleaning up isolated network setup exception System.ArgumentNullException: Value cannot be null. Parameter name: format at System.String.FormatHelper(IFormatProvider provider, String format, ParamsArray args) at System.Fabric.FabricDeployer.RemoveOperation.RemoveNetworks(DeploymentParameters parameters)

Warning ParseConfigSettings: ErrorCode=E_FAIL, FileName=C:\SfDevCluster\Data\FabricHostSettings.xml

Warning CreateFileW failed: file=\\?\C:\SfDevCluster\Data\FabricHostSettings.xml error=32
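For reading the numeric codes in these events: Win32 error 5 is ERROR_ACCESS_DENIED and error 32 is ERROR_SHARING_VIOLATION (the file is in use by another process), and E_ACCESSDENIED is the HRESULT form of error 5. The sketch below shows that relationship; the helper names are made up for illustration, and the lookup table covers only the codes that appear in the events above.

```python
from typing import Optional

# Only the Win32 codes that show up in the events above.
WIN32_ERRORS = {
    5: "ERROR_ACCESS_DENIED (access is denied)",
    32: "ERROR_SHARING_VIOLATION (file is being used by another process)",
}

FACILITY_WIN32 = 7
E_ACCESSDENIED = 0x80070005
E_FAIL = 0x80004005


def win32_code_from_hresult(hresult: int) -> Optional[int]:
    """Return the wrapped Win32 error code, or None if the HRESULT does not wrap one."""
    facility = (hresult >> 16) & 0x1FFF  # bits 16-28 hold the facility
    if facility != FACILITY_WIN32:
        return None
    return hresult & 0xFFFF  # low 16 bits hold the error code


def describe(code: int) -> str:
    """Look up a Win32 error code in the small table above."""
    return WIN32_ERRORS.get(code, "unknown to this sketch")


print(describe(32))
print(describe(win32_code_from_hresult(E_ACCESSDENIED)))
print(win32_code_from_hresult(E_FAIL))  # E_FAIL is generic and wraps no Win32 code
```

Read this way, the events point at FabricHostSettings.xml being both access-restricted (error 5) and held open by another process (error 32), which is a reading of the codes rather than a confirmed diagnosis.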

We have tried all of the following solutions, but none worked:

  • Made sure the Windows Firewall service is up and running
  • Ran from an elevated PowerShell session: Unregister-ScheduledTask FabricCounters (interestingly enough, we do not even have the counters!)
  • Added { "name": "FabricContainerAppsEnabled", "value": "false"} to the cluster configuration
  • Granted Network Service access to C:\ProgramData\Microsoft\Crypto\RSA\MachineKeys
  • Removed the cluster, deleted C:\SfDevCluster, and tried to deploy again (still has errors)
  • Changed the IP address from MACHINENAME to 127.0.0.1 and made sure that IPOrFQDN is the same as my machine name

Most of the above attempts come from this issue on GitHub: https://github.com/Azure/service-fabric-issues/issues/1056

This can happen if your ports are blocked. I recommend reviewing which ports are available and using those. Have you changed the default port?
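One way to review port availability is to try binding each port locally, as in the rough sketch below. The port numbers 19000 and 19080 are an assumption (the endpoints a local development cluster commonly uses); check your own cluster manifest for the real values. The helper name `port_is_free` is made up for illustration. A successful bind only shows that no other process holds the port; it says nothing about firewall rules.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False  # already in use, or binding is not permitted
        return True


# Assumed ports; replace with the endpoints from your cluster manifest.
for port in (19000, 19080):
    state = "free" if port_is_free(port) else "in use or not bindable"
    print(f"port {port}: {state}")
```

Run it while the cluster is stopped; a port reported as in use at that point is held by some other process.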

I got it to work. I opened a command line in the directory %programfiles%\Microsoft Service Fabric\bin and tried to run the Service Fabric executable directly with FabricHost.exe -c. It failed because of a corrupted DLL in the OS (most likely due to group policy updates from my company). After dealing with that corrupted DLL, I also used the TestConfiguration script, and it works now.

To read more about the TestConfiguration script, see the section "Validate environment using TestConfiguration script" in https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-cluster-standalone-deployment-preparation

Another reference to the issue where I got the hint: https://github.com/microsoft/service-fabric/issues/382
