Can You Be Too Thorough?
Can you take thoroughness to a fault? Some things cannot be taken too far, like intelligence. In a business setting, have you ever met someone who was just too smart (competitors do not count)? How about too honest? Too skilled? Too good with customers? I do not think thoroughness falls into this category, though I only recently came to that opinion. I used to think that, as a DBA, I could not be thorough enough. In some scenarios this is true, such as backups and recovery testing. In other areas it just does not pay. Take performance tuning: that last 10% might not be worth the 6 months that go into it.
Let’s look at an extreme example of cycling the errorlog nightly. This could be as simple as setting the retention policy to greater than the default of 6 and configuring a job to run sp_cycle_errorlog. However, let’s get thorough.
- Let’s write our own proc to do this. It calls sp_cycle_errorlog, but we want to add our own logic.
- Let’s start off by writing to the Windows application log: “Beginning Errorlog Cycling Process.”
- Let’s dump the errorlog to a table in msdb in case we ever want to query it.
- Dump xp_read_errorlog to a #table and compare its binary_checksum() against the new rows in the errorlog table so we are sure that SQL Server inserted them correctly.
- Errorlog table cleanup process
- SSIS package to push errorlog entries of the past 24 hours to a centralized enterprise-wide errorlog repository.
- Checksum across linked server
- Extract to a text file in case we ever want to view it like a real errorlog file.
- Create a file-name function that creates a “unique” name, and a table that maps the “unique” name to something a human can understand.
- Prechecks for sp_cycle_errorlog
- Check disk space
- Run chkdsk
- Verify the previously cycled errorlog has a date of getdate()-1
- If the previous errorlog is less than 24 hours old, check to see how long SQL has been up.
- If the previous errorlog is older than 24 hours, halt processing. Page the DBA team with a critical alert because this should never happen EXCEPT once a year during the time change.
- Run a custom process that steps through the previous steps to verify they completed successfully.
- Run sp_cycle_errorlog
- Set the step to retry 10 times.
- Log the output to a text file.
- Write a custom Windows email service to send the results of the job step to the DBA team, encrypted with PGP.
- Post sp_cycle_errorlog steps
- Zip errorlog
- Verify the archive with the compression software
- Run a CRC on the zipped file
- Copy the archive to the file server along with a text file containing the CRC.
- Run a CRC on the file server and compare with CRC in the text file
- Write “Cycle Errorlog Process Complete” to the Windows application log.
- Email the entire DBA team that this process was successful.
- Move copies to the cloud at Amazon’s S3, Windows Live Drive, Drop.io.
- Pass the entire content of the errorlog to Twitter, CHAR(140) at a time, so the internet can crowdsource your errors.
Again, this is an extreme example, and it is overly thorough, to put it mildly. Where is the balance? I have come to realize that a trait of a great DBA is balancing thoroughness with discretion. Some tasks, processes, and queries need the utmost level of thoroughness, yet some just need to work. Sometimes you have something that needs to be thorough, but it is more important for the query, task, or process to be in place yesterday. In that case, you have to sacrifice thoroughness for speed.
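For contrast, the "just needs to work" version of the errorlog task might look something like the sketch below. It raises the number of retained errorlogs above the default of 6 and creates a SQL Server Agent job that runs sp_cycle_errorlog nightly. The retention value of 30, the job name, the schedule name, and the midnight start time are all assumptions made for illustration, and xp_instance_regwrite is an undocumented extended procedure, so treat this as a starting point to test on a non-production instance.

```sql
-- Sketch only: keep more errorlogs than the default of 6 and cycle nightly.
-- The value 30, the job name, and the start time are illustrative assumptions.

-- Raise the number of retained errorlog files (assumed value: 30).
-- xp_instance_regwrite is undocumented; verify on a test instance first.
EXEC master.dbo.xp_instance_regwrite
    N'HKEY_LOCAL_MACHINE',
    N'Software\Microsoft\MSSQLServer\MSSQLServer',
    N'NumErrorLogs',
    REG_DWORD,
    30;
GO

USE msdb;
GO

-- Create the job (hypothetical name).
EXEC dbo.sp_add_job
    @job_name = N'Cycle Errorlog Nightly';

-- Single step: cycle the errorlog.
EXEC dbo.sp_add_jobstep
    @job_name      = N'Cycle Errorlog Nightly',
    @step_name     = N'Run sp_cycle_errorlog',
    @subsystem     = N'TSQL',
    @database_name = N'master',
    @command       = N'EXEC sp_cycle_errorlog;';

-- Run once a day at midnight.
EXEC dbo.sp_add_jobschedule
    @job_name          = N'Cycle Errorlog Nightly',
    @name              = N'Nightly at midnight',
    @freq_type         = 4,   -- daily
    @freq_interval     = 1,   -- every 1 day
    @active_start_time = 0;   -- 00:00:00, encoded as HHMMSS

-- Target the local server so SQL Server Agent will actually run the job.
EXEC dbo.sp_add_jobserver
    @job_name    = N'Cycle Errorlog Nightly',
    @server_name = N'(local)';
GO
```

That is the whole thing: one setting and one job, with none of the checksums, prechecks, or paging from the list above.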
- I am glad I was able to step back and learn from the real-world example that got me thinking about this.
- What personality traits do you think tend to be common among the top 10% of DBAs, or geeks in general?
This content is published under the Attribution-Share Alike 3.0 Unported license.
