Imagine a 3500 Page Book…

November 12, 2009 · Posted in SQL Server · Comments 

having 400-500 level SQL Server internals content WITH (COMPRESSION=PAGE) that gets it down to char(140) per page.

The changing face of community

brownnose I am talking about Paul Randal’s twitter feed. Tweet after tweet of internals and storage engine goodness. I hope to take a little credit for getting Paul on twitter. Peer pressure is tough. Although, who is the first person he follows? A marketing dude! @BrentO WTF?

Then he turned into a monster. 50-100 tweets a day! Noisy, yes, but how do you unfollow Paul Randal?!?! That said, IMO, he is now on the cutting edge of the SQL community. Both technically and logistically. That combination is as rare a SQL Server(past or present) dev team member who also has great presentation skills. Err, wait… That would be Paul too.

Ok, I am about to go get some sandpaper, a blow torch, hydrochloric acid and my wife’s tooth brush to get that sh!t off of my nose. Shhh, don’t tell wifey.

http://twitter.com/paulrandal

This is just today!

  1. #sqlconn 2008 extended events allow session-level wait state tracking. Neat demo from Gert.about 3 hours ago from TweetDeck
  2. #sqlconn Don’t worry about high preemptive waits, they can be expected for some operationsabout 3 hours ago from TweetDeck

  3. #sqlconn Running xp_cmdshell will show up as a PREEMPTIVE_OS_PIPEOPS wait in sys.dm_os_wait_stats.about 3 hours ago from TweetDeck

  4. @datachick Truffles work well – they’re Kimberly-crack.about 3 hours ago from TweetDeck

  5. @datachick I don’t think I’d get the same response from @KimberlyLTrippabout 3 hours ago from TweetDeck

  6. #sqlconn Use sys.dm_os_threads to get the OS-level threads running. Then look in Process Explorer to get call stack + see why it’s waiting.about 4 hours ago from TweetDeck

  7. @mrdenny That is true. I guess in the city one would pose, and on the freeway, one would zooom.about 4 hours ago from TweetDeck

  8. #sqlconn Which are run by threads, running on multiple schedulers within an SMP/NUMA node 2/2 Try saying that ten times after a few beers!about 4 hours ago from TweetDeck

  9. #sqlconn A session can issue multiple requests. Which can run multiple tasks. Which are run by workers 1/2about 4 hours ago from TweetDeck

  10. #sqlconn 2005 RTM assigned tasks to least loaded scheduler – that code is commented out know and it just does round-robin now. Too complex.about 4 hours ago from TweetDeck

  11. @mrdenny I don’t know about that – whenever I’m driving from SFO to Mountain View, the 101 seems like a race track!about 4 hours ago from TweetDeck

  12. #sqlconn Assuming affinity mask is 0, you’ll have at least 1 scheduler per CPU, plus some hidden schedulers (e.g. DAC)about 4 hours ago from TweetDeck

  13. #sqlconn Forget about fibers. Benchmark special, lot’s of things stop working. I totally agree.about 4 hours ago from TweetDeck

  14. #sqlconn SQL does not distinguish between logical procs and physical procs – watch out for hyperthreadingabout 4 hours ago from TweetDeck

  15. #sqlconn Gert’s explaining about preemptive scheduling models, makes me want to go back and read Tanenbaum’s classic OS books from college.about 4 hours ago from TweetDeck

  16. @kbriankelley That’s like Wired. I don’t read every issue, but $12/year, how can you go wrong?about 4 hours ago from TweetDeck

  17. Waiting for Gert Drapers to start the last session of the #sqlconnference, on What are You Waiting On?about 4 hours ago from TweetDeck

  1. @AdamMachanic Scared to yell ‘Buck’: might be a thunderclap + @BuckWoody appears out of thin air. Bad enough when I know he’s coming! ;-Dabout 5 hours ago from TweetDeck
  2. #sqlconn Walked into the 500-person room again and this time let loose with a thundering ‘echo’. Results were childishly impressive :-) about 5 hours ago from TweetDeck

  3. RT @KimberlyLTripp: #sqlconn #sqlpass Just blogged SQL Server MVP Deep Dives bk (w/ pic from launch @ PASS):http://bit.ly/42YNDj. Enjoy!about 6 hours ago from TweetDeck

  4. #sqlconn RML Utilities looks to me like performance data collection in 2008about 6 hours ago from TweetDeck

  5. @MladenPrajdic You beat me to the tweetabout 6 hours ago from TweetDeck

  6. @BradDBA @heigesr2 I wondered if anyone would get the joke :-) about 6 hours ago from TweetDeck

  7. #sqlconn Use fn_trace_gettable() to treat a trace file as if it were a trace tableabout 7 hours ago from TweetDeck

  8. #sqlconn Q: if you stop/start a trace, does it append to existing file or create new one? A: it appendsabout 7 hours ago from TweetDeck

  9. RT @SQLChicken: @PaulRandal Here’s an article I wrote on how to create server-side trace with profiler http://bit.ly/eZQYV #sqlpass#sqlconnabout 7 hours ago from TweetDeck

  10. #sqlconn But it’s far easier to get a scripted out trace definition from Profilerabout 7 hours ago from TweetDeck

  11. #sqlconn Looks like setting up SQL Trace manually is pretty quirky compared to using the GUIabout 7 hours ago from TweetDeck

  12. @MladenPrajdic Answer is no then I guess.about 7 hours ago from TweetDeck

  13. #sqlconn Profile when a batch completes, not starts. Common mistake. You can’t get duration, cpu, etc except when it completes.about 7 hours ago from TweetDeck

  14. #sqlconn When profiling, make sure you filter, but don’t over-filter.about 7 hours ago from TweetDeck

  15. @MladenPrajdic What do you mean by type? XML vs varchar (max) vs text, for example?about 7 hours ago from TweetDeck

  16. When I don’t know the answer to an internals question, I ask Bob Ward @bobwardms. Follow him.about 7 hours ago from TweetDeck

  17. @StrateSQL Common problem. SharePoint is an example.about 7 hours ago from TweetDeck

  18. Someone next to me just had a phone call and it was the muppets as a ring tone – MANAMANA!!! Doo dooo de doo doo. Excellent!about 7 hours ago from TweetDeck

  19. This is the first #sqlconn ference in 3 years where I have a spare day where I can actually watch sessions and learn stuff – very cool!about 7 hours ago from TweetDeck

This content is published under the Attribution-Share Alike 3.0 Unported license.

What is Professional Development?

October 13, 2009 · Posted in SQL Server · Comments 

I am writing this post because the SQL PASS Professional Development(PD) Virtual Chapter is going to be launching soon. I wanted to define what this means to me, not PASS, and put out a desperate cry call for help. Arnie Rowland and Mark Caldwell are my partners in crime with supporting roles from Blythe Morrow and Thomas LaRock. The virtual chapter is launching at the Summit. More to come on this. This is what we are in need of.

  • Short article\post authors on Professional Development topics as they relate to the SQL Server Professional
  • SQL Bloggers who would like their existing professional development content syndicated to the PD feed like general topic on SQLServerpedia.
  • Volunteers for monthly live meetings
  • Help us spread the word through your networks and blogs
  • Your ideas on how the Professional Dev PASS virtual chapter can serve the community.

This chapter can only be as good as its members and by helping the chapter grow, you are really helping yourself grow.

So what do I think consists of professional development? Well, you could start of with this tag but that might give you the wrong impression. ;) Let’s see what the Internet says:

From Wikipedia:

Professional development refers to skills and knowledge attained for both personal development and career advancement. Professional development encompasses all types of facilitated learning opportunities, ranging from college degrees to formal coursework, conferences and informal learning opportunities situated in practice. It has been described as intensive and collaborative, ideally incorporating an evaluative stage [1] There are a variety of approaches to professional development, including consultation, coaching, communities of practice, lesson study, mentoring, reflective supervision and technical assistance.[2]

http://en.wikipedia.org/wiki/Professional_development

That is pretty close to what I was thinking especially skills\knowledge obtained for personal development and career advancement. Here are some PD topics IMO:

  • Social networking\Blogging
  • Handing tough situations\office politics\obnoxious coworkers in the work place
  • Negotiating offers and raises
  • Career path decision trees
  • Job seeking skills
  • Presentation delivery techniques
  • Management(Time, People, Project etc.)
  • Development of organizational skills
  • Volunteering in the SQL Server community

And the list could go on for a long time. Hit Arnie, Mark or myself up if this is something you would like to help out with.

What do you think should be covered in a Professional Development virtual chapter?

This content is published under the Attribution-Share Alike 3.0 Unported license.

Filtered Stats to Counter Data Skew Issues

October 13, 2009 · Posted in SQL Server · Comments 

There are a handful of things that can make SQL Server choose a “bad plan”. They usually revolve around variable sized result sets. Some problematic scenarios are parameter sniffing, data skew, local variables, table variables, UDF’s or complex calculations in the where clause. All of these and more are detailed in my favorite and recently updated white paper.

We are going to be talking about data skew in this post. Picture a table where you have 1 billion  rows where IsShipped = 1 but 50k rows where IsShipped=NULL. Now, let’s say you have a query that returns orders that are not shipped(NULL). During compilation, the optimizer asks the stats how many rows are IsShipped=NULL. If the stats sampling touched, few, if any of the pages, where IsShipped = NULL, then SQL Server will guess that only 1 row will be returned. This may surpass the key lookup threshold(Part 1, Part 2) and a scan would not be optimal but better than a key lookup. Get it? We will look at an example soon if not.

In the past, your options have been limited. Update stats with fullscan was usually the best option unless the table was just too large and volatile. If it was, you probably had to use index hints or OPTIMIZE FOR the lesser of the two evils. You have covering indexes and is probably a good solution for the example above unless the result set has to be very wide.

Filtered statistics are a significant improvement and can be used nicely when dealing with this problem. They are smaller so cheaper to maintain. Their scope can be very narrow so they are more accurate.

There are a couple of caveats. I wrote about the need to use full scan on very selective stats here and Kimberly Tripp wrote about the fact that they will not auto update until the entire table crosses the threshold.

Let’s look at an example.

CREATE TABLE filtered_stats
(
c1 INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED,
c2 datetime DEFAULT GETDATE(),
c3 CHAR(500),
c4 INT
)

--lets add 1 million rows with c4=1
--lets add 1 row with c4 = 2
--Lets add a million more with c4 =3
--Finally, lets add 50 more with c4 =2 so the estimates are skewed.
DECLARE @ctr INT = 1000000
WHILE @ctr > 0
BEGIN
   INSERT INTO
filtered_stats(c3, c4)
  
VALUES
  
('blah', 1)
  
SELECT @ctr = @ctr-1
END
GO
INSERT INTO filtered_stats(c3, c4)
  
VALUES
  
('blah', 2)
DECLARE @ctr INT = 1000000
WHILE @ctr > 0
BEGIN
   INSERT INTO
filtered_stats(c3, c4)
  
VALUES
  
('blah', 3)
  
SELECT @ctr = @ctr-1
END
GO
DECLARE @ctr INT = 50
WHILE @ctr > 0
BEGIN
   INSERT INTO
filtered_stats(c3, c4)
  
VALUES
  
('blah', 2)
  
SELECT @ctr = @ctr-1
END

--Let's add a clustered index
CREATE CLUSTERED INDEX ix1 ON filtered_stats(c2)

--This index will be used for the key lookups
CREATE INDEX ix2 ON filtered_stats(c4

--The estimated rows returned are 1 but the actual rows are 51.

--This is a bad estimate due to uneven data distribution.
SELECT c1, c2
FROM filtered_stats
WHERE c4=2

 

underestimate

--Lets create narrow and more accurate filtered stats
--You still have to do full scan but you are touching just a fraction of the data – NOTE: Full scan is needed based on my observations here.
CREATE STATISTICS fstats ON filtered_stats(c4) WHERE c4 = 2 WITH fullscan

--This should estimate 51 no matter how many times you update stats since they are more accurate
SELECT c1, c2
FROM filtered_stats
WHERE c4=2

 

image

This is not bullet proof but I say it is an improvement. The downside include the fact that filtered stats are manually created. Manual stats updates will need to be done with a full scan. You may also say that a covering index(or filtered covering index) would kill this. That is true but there may be an actual need to return all columns of a table.

A quick wrap up: Put it in the tool box. It will be like that $50 wrench that you use once a year but it saves you 4 hours every time you touch it.

This content is published under the Attribution-Share Alike 3.0 Unported license.

Sampling Problem with Filtered Statistics

October 12, 2009 · Posted in SQL Server · Comments 

I was writing a blog post to show how to solve the problem of bad estimations due to uneven data distribution with filtered statistics. After an hour or so of banging my head against the wall, I discovered that I ran into a bug feature. It appears that if the predicate on the filtered stats is very selective then a non-null histogram will not be generated unless they are created\updated with full scan. This is true for automatically generated stats for filtered indexes as well but they are created with full scan when the index is created.

Let’s look at demo data created with this code:


CREATE TABLE filtered_stats
(
c1 INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED,
c2 DATETIME DEFAULT GETDATE(),
c3 CHAR(500),
c4 INT
)

--lets at 1 million rows with c4=1
--lets add 1 row with c4 = 2
--Lets add a million more with c4 =3
--Finally, lets add 50 more with c4 =3 so the estimates are skewed.
DECLARE @ctr INT = 1000000
WHILE @ctr > 0
BEGIN
   INSERT INTO
filtered_stats(c3, c4)
  
VALUES
  
('blah', 1)
  
SELECT @ctr = @ctr-1
END
GO
INSERT INTO filtered_stats(c3, c4)
  
VALUES
  
('blah', 2)
DECLARE @ctr INT = 1000000
WHILE @ctr > 0
BEGIN
   INSERT INTO
filtered_stats(c3, c4)
  
VALUES
  
('blah', 3)
  
SELECT @ctr = @ctr-1
END
GO
DECLARE @ctr INT = 50
WHILE @ctr > 0
BEGIN
   INSERT INTO
filtered_stats(c3, c4)
  
VALUES
  
('blah', 2)
  
SELECT @ctr = @ctr-1
END

So now we have 1 million rows with c4 = 1 and 3. We have 51 rows with c4=2.

If we create this filtered stats, we have a null histogram: CREATE statistics fstats2 on filtered_stats(c4) where c4 =2

image

Updating or creating with fullscan give use the following data from dbcc show_statistics.

image

Bug or Feature? This is what BOL says:

“When only a few values are returned, however, sampling might not provide accurate statistics. For highly selective predicates, which return relatively few values, using the WITH FULLSCAN option will improve accuracy because it will scan all of the values instead of sampling.”

So they “recommend” that you use full scan on highly selective stats. However, I think that is a bug since we do not even get unfiltered rows or rows sampled back. Not to mention that inaccurate stats means to me that numbers are returned but they are most likely less than what is actually there. I will submit it as a bug unless one of you have thoughts otherwise.

So in a nut shell, ALWAYS update filtered stats with full scan or they may not actually be doing you any good. There are also some other considerations in regards to filtered stats that Kimberly Tripp has recently written about. There is some good conversion about it as well. Conor also posted about it here.

I will have more on filtered stats coming up in the next post.

This content is published under the Attribution-Share Alike 3.0 Unported license.

Life

October 6, 2009 · Posted in Other · Comments 

Life happens and the next thing you know a month is gone. I know you have been wondering where I have been. Srsly, my ego allows me to think that the Internet takes notice when I am not in the usual hangouts. Anyway since I am that cool(in my head), I figure I would share the mundane details of my life. Besides, I figure if I leave anything to imagination I am bound to attract a couple dozen stalkers. I am pretty sure I have some web stalkers now. They are just very discreet.

Well to start things off, my kids spilled soymilk on my laptop. I have accidental damage coverage through best buy so I have a get out of jail free card but I am going to wait on the public Win7 launch for the new models. I have been eyeing the quad core i7 Dells.

The other time leech has been Halo OSDT. It was a great game. I spent most of my time working my way through the solo campaign. The story is pretty good but the ending had me saying that is it? Fire fight is fun. Multiplayer is ok except people cheat I suck at it.

I had a major project I accidentally fell into. I took a customer to Geek Mecca when he was in town from Denmark. I decided that I was going to come back and get a home theater with blu-ray. I did and the wife made me take it back after a day or two. It was too late at that point. The electronaholic was loose and prowling electronics stores at lunch. So I researched and hunted for a deal and got an Onkyo HT-S6200. I spent a week wiring, rewiring, reading manuals, and tweaking settings in a hedonistic geekfest of primal proportions. I think I excreted nerd musk at one point.

There was just one problem. I HAD TO HAVE Dolby tru-hd or DTS-HD master audio to get the true experience out of a 7.1 sound system. Another trip to Frys and the problem was solved with a blu-ray player.

At least, I thought the problem was solved. It turns out that the blu-ray could not go through the Onkyo to my TV because it was connected HMDI to DVI and it was not HDCP compliant. About this time, I also discovered that my HD cable channels were not working either. You know where this is going…

I was planning on a new TV for the family for Christmas anyway. What was I supposed to do? Go to component and optical cables which would quadruple the amount of cables behind my TV? I had no choice. Right?

Here is the setup and I really did shop around. I got some good deals. I even checked Craig’s list for the blu-ray player and the same model was going for more used than a new one because the seller bought it 6-9 months ago when it was twice as much.

Here is the setup:

It is not video\audiophile worthy but it is an upgrade for me. Not to mention that with a wife and three kids, I can’t have an orgy of one in a pile of new electronics and bubble wrap very often.

The only thing that I am worried about is the monkey on my back whispering about the 73” of goodness in the WD73737. It is just a whisper but it is getting louder. Must. Resist. Impure. Urges.

Here are before and after pics.

Before:

IMG00081-20091004-1209

After:

IMG00090-20091006-2037

IMG00091-20091006-2038

This content is published under the Attribution-Share Alike 3.0 Unported license.

Professional Development: Internet Image

September 17, 2009 · Posted in SQL Server · Comments 

normal_Internet-SeriousBusiness Writing from the plane. I just wanted to let you know this because it sounds like I am cool. ;) On the way to Sacramento with a short stop in Vegas. Hopefully, I do not miss my plane and end up in a poker tournament for more $ than I can afford against people I have to get lucky to beat. *wink* Nah.. That is the first of many jokes that are not funny in this post. It is Grandpop’s funeral that I am headed to so I can’t miss my flight and if I do, I have to go on stand by. Before you bring all the sappy condolences etc, I am not down. He lived a LONG, prosperous, and happy life. He was a violin professor a university in Serbia before WWII. He escaped from a concentration camp in WWII. He came to America and gave up the violin to provide for his family which grew to three daughters, five grandchildren, and three great grandchildren. Oh yah, he outlived two wives and was steady pimpin’ into his 80’s.

Relevance, please?

So, you ask, what does all that have to do with Professional Development? I just advertised some things about the kind of person I am. It is now part of my Internet Image. That first paragraph is like a tattoo. It will never go away. This is what some say should be avoided when speaking of internet image in the SQL Server(and beyond) community. They say maintain the most upmost professionalism at all times. That is one strategy and I agree it is hard to get into trouble being conservative. However, you can also be yourself if assume the following:

  • (Today’s + affirmation) By acting myself, I can interact with people with the same interests as me, I am likable for who I am, and I want to associate with people like me even in a professional environment.
  • I would prefer to work in an environment where I fit the general culture.
  • By fitting in with the culture, I am a foot a head of an equal job candidate for same position that does not fit in the with culture.

Talking about myself again!

To use me as an example, I display my xbox gamer tag proudly on this site. I would not want to work for a company where most of the people do not own Xboxes. At the same time, I would love to work at a company where there are games in the break room.

Of course, you can cross the line and use poop or toilet analogies. At the same time, wouldn’t you prefer to work for a boss who appreciated crude internets geek humor if that is what you are into? Same goes for bowling, knitting, d&d, and wiccan magic. He77, if it was not for HR, my first question to potential job candidates would be “What is Goatse?”. I kid but as a hiring manager, I do look for technical ability first but cultural fit is a not too far behind. Look at it like this. I do not care if I walk by and one of my DBAs is reading the Onion because I know when a server explodes and the ʪĦÍŦ hits the fan, they will be there be it for three hours or thirty; what ever it takes to recover, boot, or function. That intangible quality is a job skill!

Where is the line?

I suggest that anytime you ever post anything online, you follow this decision tree.

  • Picture your dream job that you want to be in DBA century(10 human years)  from now.
  • Ask yourself if this will help or hurt your chances for that job?
  • If the answer is yes, then ask yourself if you would be happy working for such a company\manager?
  • If both answers are yes, then do not post!
  • If either are no, post away.
  • If drinking tasty cold alcoholic beverages, immediately post something “insightful” on your boss’s boss’s facebook wall.

So can this backfire?

There are also other circumstances that I should warn you against being yourself like if you are the only one who laughs at your jokes or is boring, obnoxious, always right, a trekkie, or a LOTR fanatic(7 hours of walking! Even the trees!).  These mostly hold true for the unforeseeable future too so if you are not confident in your future, you may want to exercise caution as well and remember that life happens. Here are some things that would make me be more conservative.

  • Your job market is small. Examples include a small town or a very small niche specialty like a Sybase to SQL Server conversion specialist in Bee-eF-eeVille, Texas.
  • Your experience lends to junior level jobs for now. We all started off some where. Until your technical experience outpaces any possible negative perceptions, then a conservative route made be prudent. As I said earlier, it is hard to get in trouble being conservative so when in doubt, lean tight right like a southern Baptist GOP senator in the closet.
  • Religion and politics(previous bullet aside) – This is more of my personal preference because a valid argument could be made for putting this out there. I, on the other hand, think that my dream job could be working for someone that I don’t agree with on Religion\Politics and I think the hiring manager may choose an equal but “image conservative” candidate over me because of it. However, if this is your cup of tea, +10 street cred homeboy!
  • You are conservative by nature. I am only making a case for being who you are. Do not portray anything you are not.
  • You think you may change careers over the next 10 years. Again, life happens. Lean conservatively unless you are confident in your future.

What am I trying to say...

I have not spoken much about the benefits of this openness on “The CPU” as my wife calls it. Let me wrap this up with that. Basically, it is more fun and more natural to be ourselves. I believe you are more interesting. Why hold back personality for dry dull dialog? That said, being anything but being the utmost professional on the internet should be done after careful thought so don’t log into alt.binaries.erotica.asian.furry.midgets.multimedia with your real name! Besides, I have worked for the same company for the last 10 years and I do not think I will be looking over the next 5-10+ years so take this with a grain of salt. Lastly, the golden rule is don’t do anything you would not be proud to show yo mamma.

Speaking of…. Yo mamma so nasty we tried to flush her dirty pages with DBCC DROPCLEANBUFFERS  but it kept failing because of the stench from that puss bubble excreting from her VLF. BOOM.

This content is published under the Attribution-Share Alike 3.0 Unported license.

OT: Music – What the Fork?

September 10, 2009 · Posted in Other · Comments 

A bit house music that I recorded this weekend. It is all 2009 stuff except for the first song. All music is available at http://juno.co.uk and other fine digital distribution sites.

This mix is bass heavy and upbeat. If you are not into electronic music or did not like my previous mixes this may be one to check out.

http://statisticsio.com/jmass%20-%20WhatTheFork.mp3

Lucien-N-Luciano – Future Pressure
TRADEMARK – Tropic Of Capricorn
CLARKE, Tuppy – Chicago (Mitch Windy City remix)
DANKE, Sean – Guiding Lights
EKKOHAUS – Thursday Morning OK
ESPOSITO, Emanuele/LUCA GIORDANO – Deepingo (Emanuele Esposito mix)
FERREIRA, Gabriel – Salmonella (Stefan Tretau remix)
Igor Krsmanovic & Andreja Z – Demolition
Pablo Bolivar – Paseo Polar
AK, Yavuz – Hoppa
REIG, Eddu – Attempt Of Cream
COENEN, Ludwig – Failover
COENEN, Ludwig – Clickoff
CERRONE, Rino – Exnovo (Exnovo Mazel Source remix)
CHOI, Jin – Everything Is Borrowed (original mix)
WIRETAPPEUR – 8 Days A Week
EKKOHAUS – Brian Said
NEUROXYDE – Yebo Yebo (Neuroxyde From Berlin rework)
DAMOLH33 – Untitled States
ELLIS, Tom – Anyhoooo (feat Suz)
007 AGENT aka THE BIG MOUNTAIN feat GONZALO HURTADO/PAULA – Our House (original mix)
Igor Krsmanovic – Missiles
WIRETAPPEUR – I Want You
CASCABEL GENTZ – Dirty Lips (Afrilounge remix)
Johnson & Haske – Breeze (Alex Young remix)
La Gosse We Don’y Know Yet
RAGANOVA – Antifonar
CARUSO, Samuele aka MANIMALISCH – Fortwegung
TRADEMARK – Elevator
DAAMEN, Camiel – Deep Love
SUPER FLU – Lyla Sucks Chestnuts
CRISS, Jay – Dark Triangles (Distaff remix)
IRON CURTIS – Peoples (Ekkohaus’ Down With The Sax mix)
JONAS & FABIAN – Beatcontrol
EKKOHAUS – Old Friends
ESSER, Darko Relieve (original)
Kruno Kereta – Portmanteau
RAGANOVA – Paplan
AK, Yavuz  – Change

This content is published under the Attribution-Share Alike 3.0 Unported license.

The DR Cloud: An Example

September 10, 2009 · Posted in SQL Server · Comments 

Disclaimer: The enterprise Infrastructure as a Service(IaaS) that I will be using is for offered by the company I work for. I have been there 10 years and I am proud of our product set so I may be a little bias. However, these steps should be similar for most IaaS offerings like that of Amazon, GoGrid and RackSpace. To learn more about the product that is demo’d from visit http://theenterprisecloud.com We also have a similar hour based service at http://vcloudexpress.terremark.com. All that said, pay more attention the concepts not so much the platform. This is not meant as a commercial.

In the first part, I detailed the benefits and reasons for DR in the cloud. I ended with a high level sample architecture.

In this post, we will look at an example of doing DR to the cloud with SQL Server. We will be using Database Mirroring but you could use home grown log shipping, replication, incremental nightly loads, or third party products that do block level replication to an offline instance. So lets get started.

Part one: Create the DR SQL Server

1. Hit the create server button in the web interface.

step1

2. Select the OS and template. Note: If you want to use SQL Server Enterprise Edition, just use a Windows template and install SQL from an ISO. You will need to do this if you want to use asynchronous mirroring and compression.

step2

3. Name and size the Server. You can skimp on the resources and add more RAM and VPU’s in the event of failover.

step3

4. Set the network. It will need to be DMZ but we will lock it down at the firewall.

step4

5. Organize as you wish.

step5

 

6. Add disk space and power on.

step7

Part two: Configure network access.

1. Set up an internet accessible port to the server. Do not fret. We are going to use a certificate and data will encrypted on the wire.

image

2. Create a node for the Mirror.

image

 

Part three: Configuring mirroring.

1. Backup your database and get it to your mirror. (I am just using RDP but if you DB is large, setup a more robust copy method)

2. Restore the database with norecovery at the mirror.

3. Send diff(if needed) and at least one log.

4. Restore diff(if needed) and the log with norecovery

5. Set up database mirroring on port 1521 using a certificate.

Please refer to this article for directions on how to set up mirroring with a cert.

This content is published under the Attribution-Share Alike 3.0 Unported license.

Can you be too Thorough?

September 9, 2009 · Posted in SQL Server · Comments 

Can you take thoroughness to a fault? There are some things that cannot be taken too far like intelligence. In a business setting, have you ever met someone who was just too smart(competitors do not count)? How about too honest? Too skilled? Too good with customers? I do not think thoroughness fails into this category though my opinion has recently changed. I used to think that, as a DBA, I could not be through enough. In some scenarios this is true like backups and recovery testing. There are other areas where it just does not pay. For example, performance tuning. That last 10% might not be worth the 6 months that goes into it.

Let’s look at an extreme example of cycling the errorlog nightly. This could be as simple as setting the retention policy to greater than the default of 6 and configuring a job to run sp_cycle_errorlog. However, let’s get thorough.

  1. Lets write our own proc to  do this. It calls sp_cycle_errorlog but we want to add our own logic.
  2. Lets start off by writing to the windows application log: “Beginning Errorlog Cycling Process.”
  3. Lets dump the errorlog to a table in msdb in case we ever want to query it.
    1. Dump xp_read_errorlog to #table and do a binary_checksum() with the new rows in the errorlog table so we are sure that SQL Server inserted them correctly.
    2. Errorlog table cleanup process
    3. SSIS package to push errorlog entries of the past 24 hours to a centralized enterprise-wide errorlog repository.
      1. Checksum across linked server
      2. Extract to text file increase we ever want to view it like a real errorlog file.
        1. Create a file name creation function that created a “unique” name and a table that maps the “unique” name with something that a human can understand.
  4. Prechecks for sp_cycle_errorlog
    • Check disk space
    • Run checkdisk
    • Verify the previously cycled errorlog has a date of getdate()-1
      • If the previous errorlog is less than 24 hours old, check to see how long SQL has been up.
      • If the previous errorlog is older than 24 hours, halt processing. Page the DBA team with a critical alert because this should never happen EXCEPT once a year during the time change.
    • Run a custom process that steps through the previous steps to verify the completed successful.
    1. Run sp_cycle_errorlog
      • Set the step to retry 10 times.
      • Log the Output to a text file.
      • Write a custom windows email service to send the results of the job step to the DBA team encrypted with PGP.
    2. Post sp_cycle_errorlog steps
      • Zip errorlog
      • verify the archive with the compression software
      • Run a CRC on the zipped file
      • Copy the archive to the file server along with a text file containing the CRC.
        • Run a CRC on the file server and compare with CRC in the text file
    3. Write “Cycle Errorlog Process Complete” to the windows application log.
    4. Email the entire DBA that this process was successful.
    5. Move copies to the cloud at Amazon’s S3, Windows Live Drive, Drop.io.
    6. Pass the entire content of the errorlog to twitter CHAR(140) as a time for the internet can crowdsource your errors.

    Again, extreme example but that is being overly thorough to put it mildly. Where is the balance? I think I have come to realize that a trait of a great DBA is balancing thoroughness with discretion. Some tasks\process\queries need the utmost level of thoroughness yet some just need to work. Sometimes, you have something that needs to be thorough but it is more important for the query\task\process to be in place yesterday. In that case, you have to sacrifice thoroughness for speed.

    I am glad I was able to step back and learn from the real world example that got me thinking about this.
    What personality traits do you think tend to be common among the top 10% DBAs or geeks in general?

This content is published under the Attribution-Share Alike 3.0 Unported license.

Disaster Recovery in the Cloud

September 9, 2009 · Posted in SQL Server · Comments 

Disclaimer: The enterprise Infrastructure as a Service(IaaS) that I will be using is for offered by the company I work for. I have been there 10 years and I am proud of our product set so I may be a little bias. However, these steps should be similar for most IaaS offerings like that of Amazon, GoGrid and RackSpace. To learn more about the product that is demo’d from visit http://theenterprisecloud.com We also have a similar hour based service at http://vcloudexpress.terremark.com. All that said, pay more attention the concepts not so much the platform. This is not meant as a commercial.

DR is important. However, it is worth a presence in a separate data center(DC)? That is a business decision that will not be covered in this article. Let’s say that the business cannot justify the cost based on the likelihood of the risk but they want to do something. In the past, your options short of a full DR presence ranged from offsite tapes to logshipping within the same data center to DR services like Sun Guard. All of these have shortcomings but if the business is ok with the risks and you set proper expectations on the level of protection and time to recover then that is all you can do.

Enter Cloud Computing(Specifically IaaS)

With IaaS, you can keep an online copy of your data in the cloud with minimal latency. The main advantage is cost and the savings can be huge. You do not have to invest in the infrastructure that goes along with a DR site like cabinets, network gear, and support services like AD, DNS, backups, etc. There is also the CapEX that goes with the actual server hardware.

The speed of implementation can also allow you to have many less virtual machines than you would have to have if they were physical machines. You can just keep a master copy of a web and each kind of app server. If you fail over to the DR cloud, you can spin up 10 or 20 of these really fast. Of course to have low latency with your data, your DB tier will have to be fully implemented.

DR Cloud

This just a quick diagram of what you could do. In the event of catastrophe, you can flip DNS to the DR cloud in be up in much less time that not having a DR presence. In the second and final part of the series, we will look at the actual steps in setting up DR in the Cloud with SQL Server Database Mirroring.

This content is published under the Attribution-Share Alike 3.0 Unported license.

Next Page »