One of my colleagues came to me because her idle workflow instance, which was waiting for a Delay activity to expire before continuing its work, never resumed. The instance was not reactivated, so the process did not continue automatically. I am sharing my troubleshooting experience here.
The workflow service is hosted in IIS 7.5 and uses AppFabric 1.1. Here is the story. The following is a sample of what the workflow design looks like:
The workflow lets the user supply a future date and time to the service. The workflow instance then sleeps until that time arrives, wakes up, and continues the process. In this design, the Receive activity accepts the StartDateTime parameter, and the Assign activity calculates the time span for which the instance needs to delay. After the Delay activity, a WriteLine activity runs.
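The calculation done by the Assign activity can be sketched outside the workflow. This is only an illustration in Python, not the actual workflow expression; the function name `compute_delay` and the sample times are my own assumptions.

```python
from datetime import datetime, timedelta, timezone

def compute_delay(start_date_time: datetime, now: datetime) -> timedelta:
    """Return how long to wait until start_date_time, never a negative span."""
    remaining = start_date_time - now
    # A start time in the past would give a negative span, so clamp it to zero.
    return max(remaining, timedelta(0))

# Sample values for illustration only.
now = datetime(2014, 4, 10, 21, 0, 0, tzinfo=timezone.utc)
start = datetime(2014, 4, 10, 21, 30, 0, tzinfo=timezone.utc)
print(compute_delay(start, now))  # 0:30:00
```

Both values are kept in UTC here so that the subtraction is not affected by the local time zone.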
The problem is that the WriteLine activity never fires; the instance stays stuck at the Delay activity. If you run into a similar problem, first check whether the net.pipe protocol is enabled at the IIS application level. The reason is that a workflow instance normally goes into the Idle state after some time, and the AppFabric Workflow Management Service (WMS) relies on net.pipe to activate workflow instances when their time comes.
Check at Web Site level:
Check at Web Application level:
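If you prefer to check this outside the IIS Manager UI, the same two settings can be read from the IIS configuration. The sketch below parses a hypothetical fragment modeled on the `<sites>` section of applicationHost.config; the site name, the application path and the helper function names are assumptions for illustration, so compare against your own configuration file.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment modeled on the <sites> section of applicationHost.config.
SAMPLE_CONFIG = """
<sites>
  <site name="Default Web Site" id="1">
    <application path="/MyWorkflowApp" enabledProtocols="http,net.pipe" />
    <bindings>
      <binding protocol="http" bindingInformation="*:80:" />
      <binding protocol="net.pipe" bindingInformation="*" />
    </bindings>
  </site>
</sites>
"""

def site_has_net_pipe_binding(site: ET.Element) -> bool:
    """Web Site level: is there a binding whose protocol is net.pipe?"""
    return any(b.get("protocol") == "net.pipe" for b in site.iter("binding"))

def app_enables_net_pipe(app: ET.Element) -> bool:
    """Web Application level: is net.pipe listed in enabledProtocols?"""
    protocols = app.get("enabledProtocols", "")
    return "net.pipe" in [p.strip() for p in protocols.split(",")]

root = ET.fromstring(SAMPLE_CONFIG.strip())
site = root.find("site")
app = site.find("application")
print(site_has_net_pipe_binding(site), app_enables_net_pipe(app))  # True True
```

Both checks need to pass: the binding makes net.pipe available on the site, and the enabled protocols list turns it on for the application.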
Next, check your AppFabric persistence database. Is it reachable from the web server? Make sure you can ping the database server's IP address or hostname, and confirm that the firewall allows the connection. Note: the default SQL Server TCP port is 1433.
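A successful ping does not prove that the SQL Server port is open, so it can help to test the TCP port directly from the web server. Here is a minimal Python sketch; the function name `can_connect` and the 3-second timeout are my own choices, and the port should be changed if your SQL Server instance does not use the default.

```python
import socket

def can_connect(host: str, port: int = 1433, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the hostname and opens the TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, `can_connect("your-db-server")` (with your own database hostname) returning False while ping works would point to a firewall or port problem rather than a network outage.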
Once everything above is confirmed, compare the system time on the web server and the database server. The actual root cause in my case was that the clocks on the two servers did not match: the database server was 3 minutes behind the web server.
If you run the following query against the AppFabric persistence database:
SELECT * FROM [System.Activities.DurableInstancing].[InstancesTable]
You will notice that the PendingTimer value of your workflow instance is in the past compared to the web server's time. Note: PendingTimer is stored in UTC, so you need to apply your local time offset manually. It is the time at which the workflow instance should be reactivated by WMS. Given the time discrepancy between the web and database servers, my guess is that WMS skipped the activation because the timer looked like a past date.
Web Server time: 2014-04-10 21:00:00
Database Server time: 2014-04-10 20:57:00
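The two manual steps here, measuring the gap between the server clocks and shifting a UTC PendingTimer value to local time, can be sketched in Python as below. The server times are the ones quoted above; the PendingTimer value and the UTC+8 offset are hypothetical, and the function names are mine.

```python
from datetime import datetime, timedelta, timezone

def utc_to_local(utc_value: datetime, utc_offset_hours: float) -> datetime:
    """Treat a naive datetime as UTC and shift it to a fixed local offset."""
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return utc_value.replace(tzinfo=timezone.utc).astimezone(local_zone)

def clock_skew(web_time: datetime, db_time: datetime) -> timedelta:
    """A positive result means the database clock is behind the web server."""
    return web_time - db_time

# Server times quoted above.
web_server_time = datetime(2014, 4, 10, 21, 0, 0)
database_server_time = datetime(2014, 4, 10, 20, 57, 0)
print(clock_skew(web_server_time, database_server_time))  # 0:03:00

# Hypothetical PendingTimer value and UTC+8 offset, for illustration only.
pending_timer_utc = datetime(2014, 4, 10, 12, 58, 0)
print(utc_to_local(pending_timer_utc, 8))  # 2014-04-10 20:58:00+08:00
```

Reading both clocks at the same moment matters; otherwise the time spent switching between servers is counted as skew.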
The problem was solved after synchronizing the clocks on both servers.
What about the existing idle instances; how can you "wake" them up? The workaround is to suspend those idle instances and then resume them using the AppFabric Dashboard in IIS.