Today I had to go back to a client site to troubleshoot an issue that reared its ugly head at the most inappropriate time: about an hour before the traditional Christmas lunch on the last day of work. I’ve just had a lovely 2 weeks off work and was not looking forward to returning this morning. Besides, let’s be honest, it usually takes a couple of days to hit my stride after a decent break.
We had developed a simple list with a few custom forms and a workflow, which should have been pretty routine. When it was promoted from the Dev VPC to the central Dev server, everything worked fine. We then tried to promote it to the Test environment, where we ran into an issue with activating the workflow. We’d built an MSI to install the files to the server, along with a batch file that simply ran the stsadm commands to install and activate the feature.
STSADM.EXE -o installfeature -filename Workflows\feature.xml -force
STSADM.EXE -o activatefeature -filename Workflows\feature.xml -url http://moss-server
At this point we got a really weird error. The install went fine, but the activation then failed with the following message:
Feature 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx' is not installed in this farm,
and can not be added to this scope.
This was really weird, as we’d gotten the message that it had installed correctly. We originally started looking at issues with the web farm, as this was a dual front-end farm built to mimic the production environment. By this point it was turning out to be too hard, and it was time for the lunch I mentioned earlier.
So this afternoon we did some more detailed log reading, where we found this little message:
"The transaction log for database 'SharePoint_Config' is full"
We assumed this was a completely unrelated error, but took a look at the database server anyway, where we found the ‘C:’ drive completely full. The ‘SharePoint_Config’ database had a log file in excess of 7 GB. Presumably the feature installation could not be recorded in the configuration database, which would explain why activation reported the feature as not installed. We quickly moved all the databases to the empty ‘D:’ drive and reattached them, and all was fixed. Everything now installs properly and, even better, works.
There are some good lessons to be taken from this exercise:
- The servers in question were all built to ridiculous time frames. There were 2 farms (prod & test) and a dev server, built and commissioned in under a week by a single person. Obviously more testing was called for here, but I think that was an impossible ask in the time available, which is why the problems have surfaced now.
- This is probably a no-brainer, but never put data files on the same drive as the operating system. I always change SQL Server’s default data location. It’s also probably worth investing in tools like MOM so that there is sufficient warning to deal with these problems before they become critical.
- Cycle the logs in SQL Server so they can’t grow unchecked. I admit I haven’t done it yet on this server, but I’ve left instructions for this to be done ASAP.
- Don’t release on days when festivities are going to happen. Enough said on this.
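
The disk space lesson can be partly covered even without a full monitoring product. Here is a minimal sketch in Python, assuming something like a scheduled task runs it on the database server. The function names and the 15% threshold are my own illustrative choices, not anything from the environment described above.

```python
import shutil


def free_fraction(total_bytes, free_bytes):
    """Return the fraction of the volume that is still free (0.0 to 1.0)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes


def is_low_on_space(total_bytes, free_bytes, min_free=0.15):
    """True when the free fraction has dropped below the min_free threshold."""
    return free_fraction(total_bytes, free_bytes) < min_free


def check_drives(paths, min_free=0.15):
    """Return one warning string for each path whose volume is low on space."""
    warnings = []
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free
        if is_low_on_space(usage.total, usage.free, min_free):
            pct = 100 * free_fraction(usage.total, usage.free)
            warnings.append(f"{path}: only {pct:.1f}% free")
    return warnings
```

Calling `check_drives` with the drives that hold the operating system and the database files (for example `["C:\\", "D:\\"]`, which are assumed paths) returns an empty list when all is well, and a list of warnings that could be emailed or logged when a volume is filling up. It is no substitute for proper monitoring, but it would have flagged the full ‘C:’ drive well before the release.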