Monday, October 14, 2013

Macs Unable to Connect to Secure Sites: OCSPD File Deleted

We received a report that 400+ Macs across two countries and a dozen locations were suddenly unable to log in. A quick fix was to unplug the Ethernet cable, log in using the locally cached credentials, and then plug the cable back in. However, the users were then unable to connect to any web page over HTTPS, to Outlook mail (via OWA), or to any other service that required secure communication.

All the Macs were bound to AD and being managed by Casper.

We soon found that by removing the JAMF binary we were able to log in, but we still could not access any secure resources. This made sense: at login/start-up the computer attempts to talk to the JSS, and if secure communication is not possible the computer will hang.

Working with Apple Alliance Support (excellent as always), we determined that the root of the problem was that all the computers were missing the ocspd file from /usr/sbin/. The ocspd file is used during certificate validation, and if it is missing or corrupt a secure connection cannot be established.
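A quick way to test a Mac for this condition is to check whether the file exists and is executable. The following is only a minimal sketch: the function name check_ocspd and its return codes are made up for illustration, and the path argument defaults to the /usr/sbin/ocspd location mentioned above.

```shell
#!/bin/sh
# Hypothetical helper (not part of any Apple or JAMF tooling).
# Reports whether the ocspd binary is present and executable.
# Usage: check_ocspd [path]   (path defaults to /usr/sbin/ocspd)
check_ocspd() {
    target="${1:-/usr/sbin/ocspd}"
    if [ ! -e "$target" ]; then
        echo "MISSING: $target"
        return 1
    fi
    if [ ! -x "$target" ]; then
        echo "NOT EXECUTABLE: $target"
        return 2
    fi
    echo "OK: $target"
    return 0
}
```

Calling check_ocspd with no arguments inspects the default location. A non-zero return code suggests the file needs to be replaced, although a file that is present could still be corrupt.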

Using Composer we created packages to deploy new ocspd files. Note: you must install like-for-like, i.e. a good ocspd file from a 10.7.5 Mac must be deployed to another 10.7.5 Mac.

Unfortunately, each new ocspd file was deleted within a few minutes of being deployed. After more digging through logs we found that a JAMF process was causing the deletion, so we removed the JAMF binary from all the Macs and pushed the good ocspd packages using ARD, which resolved the issue.

After the good ocspd packages are deployed, remove all the old computers from the JSS, ensure that a valid push-notification certificate is installed and that "enable certificate-based communication" is ticked in the framework settings of the JSS.  You should then be able to re-Recon all your Macs and the ocspd file will not be removed.  TEST FIRST on a few Macs!

If you suspect this issue, the first thing to do is go to /usr/sbin and see if the ocspd file is missing. If it is, you must replace it with a known good ocspd file. As described above, the easiest way to do this is with Casper Composer, but if you do not have a copy of it you can follow these steps and deploy using ARD.
  • Copy a known good ocspd file to /usr/sbin/ on the non-working system
  • Set ownership and permissions:
sudo chown root:wheel /usr/sbin/ocspd
sudo chmod 755 /usr/sbin/ocspd
  • Once the file has been copied and the permissions applied, the issue should be resolved; no reboot is required
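
The manual steps above could be wrapped in a small script for pushing via ARD. This is a sketch under assumptions rather than the exact procedure we used: the function name restore_ocspd and its optional arguments are hypothetical, the source path is wherever you have staged a known good, like-for-like ocspd file, and the script must run as root to write to /usr/sbin/.

```shell
#!/bin/sh
# Hypothetical helper wrapping the copy/chown/chmod steps.
# Usage: restore_ocspd SRC [DEST] [OWNER]
#   SRC   - known good ocspd file from a Mac on the same OS version
#   DEST  - defaults to /usr/sbin/ocspd
#   OWNER - defaults to root:wheel
restore_ocspd() {
    src="$1"
    dest="${2:-/usr/sbin/ocspd}"
    owner="${3:-root:wheel}"

    # Refuse to continue if the staged copy is not a regular file.
    if [ ! -f "$src" ]; then
        echo "source not found: $src" >&2
        return 1
    fi

    # Copy the file, then apply the ownership and permissions.
    cp "$src" "$dest" || return 1
    chown "$owner" "$dest" || return 1
    chmod 755 "$dest" || return 1
    echo "restored: $dest"
}
```

Run as root, restore_ocspd followed by the path to the staged file copies it to /usr/sbin/ocspd with the defaults shown. The optional DEST and OWNER arguments are there only so the function can be tried against a scratch directory first.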

2 comments:

Anonymous said...

Well documented!

Do you have any idea what casper policy might have been terminating this daemon?

Anonymous said...

It looked like it had something to do with a Quick Add package. The strange thing is that the sites were not pushing Quick Add. One of the problems we had troubleshooting it is that all the Casper logs got deleted when the office was trying to fix the issue.