Thursday, November 5, 2009

Azure CDN vs. Amazon CloudFront/S3

Earlier this summer we took a deep look at the CDN market, taking into account both established players (Limelight, Akamai, Level-3) and the emerging pay-as-you-go contenders (Amazon’s CloudFront, Rackspace’s Cloud Files, SimpleCDN and the like). Today, ~1 week after Amazon’s Relational Database Service was announced, Microsoft responded with its own introduction of a new CDN that will offer 18 edge locations throughout the world [1]:

“Windows Azure CDN has 18 locations globally (United States, Europe, Asia, Australia and South America) and continues to expand. Windows Azure CDN caches your Windows Azure blobs at strategically placed locations to provide maximum bandwidth for delivering your content to users. You can enable CDN delivery for any storage account via the Windows Azure Developer Portal. The CDN provides edge delivery only to blobs that are in public blob containers, which are available for anonymous access.”

Note that the latest addition to the Azure family is in CTP release only. All we know is that, for the Windows Azure platform itself, PDC ‘09 (to be held later this month) is expected to announce new features and will be followed by an official launch in January and first billing cycle in February; it’s very likely that the CDN will be available along similar timelines as well.

Like CloudFront, Microsoft’s CDN doesn’t solve the HTTPS issue in its first release. In terms of pricing, if Windows Azure Platform Pricing is any indication, you can expect to pay ~$0.17/GB for each targeted zone.

As a refresher, Amazon uses 14 edge locations in major markets worldwide: 8 in the United States (Ashburn, VA; Dallas/Fort Worth, TX; Los Angeles, CA; Miami, FL; Newark, NJ; Palo Alto, CA; Seattle, WA; St. Louis, MO), 4 in Europe (Amsterdam; Dublin; Frankfurt; London) and 2 in Asia (Hong Kong; Tokyo).

The North American edge locations are mapped below:

image

On the issue of TTLs, it’s still unclear whether the Azure CDN will support shorter TTLs required for niche applications:

“The TTL specifies that the blob should be cached for that amount of time in the CDN until it is refreshed by the Blob service. The CDN attempts to refresh the blob from Windows Azure Blob service only once the TTL has elapsed. The default TTL is 72 hours. At PDC 2009, we will allow you to specify the standard HTTP Cache-Control header for your Windows Azure blobs. If this value is specified for a blob, then the TTL period will be set to the value specified in Cache-Control header.”
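To make the quoted rule concrete, here is a minimal Python sketch of how an effective TTL could be derived: use the value from Cache-Control when one is present, otherwise fall back to the 72-hour default. It assumes that max-age is the directive being honoured, which the announcement does not spell out, and cdn_ttl_seconds is a hypothetical helper for illustration, not part of any Azure API.

```python
import re

# Default TTL quoted in the announcement: 72 hours.
DEFAULT_TTL_SECONDS = 72 * 60 * 60


def cdn_ttl_seconds(cache_control=None):
    """Return the TTL implied by a Cache-Control value, else the default.

    Assumption: only the max-age directive is considered.
    """
    if cache_control:
        for directive in cache_control.split(","):
            match = re.fullmatch(
                r"\s*max-age\s*=\s*\"?(\d+)\"?\s*", directive, re.IGNORECASE
            )
            if match:
                return int(match.group(1))
    return DEFAULT_TTL_SECONDS


if __name__ == "__main__":
    print(cdn_ttl_seconds())                       # no header set on the blob
    print(cdn_ttl_seconds("public, max-age=300"))  # a 5-minute TTL
```

Whether values this short are actually respected at the edge is exactly the open question raised above.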

And lastly, the tools you need to get started with either service are the Azure Storage Explorer, the Amazon S3 Firefox Organizer (0.4.8) or CloudBerry Explorer (for S3 and Azure Blob Storage) – just note that, as Microsoft plays catch-up here, the Azure tools don’t yet expose the same richness of features around CDN integration.

References:

[1] – Introducing the Windows Azure Content Delivery Network http://blogs.msdn.com/windowsazure/archive/2009/11/05/introducing-the-windows-azure-content-delivery-network.aspx

[2] – Using the New Windows Azure CDN with a Custom Domain http://blog.smarx.com/posts/using-the-new-windows-azure-cdn-with-a-custom-domain

Tuesday, November 3, 2009

LinkedIn vs. StackOverflow & The Shifting Landscape for Programming Careers

I’ve been curiously watching LinkedIn try to transform itself into an awkward Q&A-style forum. Subscribing to the Senior .NET Developers group recently, I was a bit surprised to see the type of tactical and specific implementation questions being posted.

image

This is a terrible pairing of technology, IMO. Sure, LinkedIn is great at building and exposing connections, but their Q&A offering is almost an afterthought with a feature set that’s at least 5 years dated.

Even worse, none of this dialogue is publicly searchable!

We’ve written about the value of StackOverflow reputation rankings as an emerging complement to traditional resumes and awards. Yes, the current scoring system leaves much to be desired: users are more incentivized to create ‘popular’ content that drives participation than accurate, factual content that doesn’t.

The score itself is less telling than a closer examination of an individual’s responses. Nonetheless, it is best-of-breed and it’s no surprise that they announced a new career site today specifically intended to link hands-on Q&A activity with public CVs:

Stack Overflow Careers

What is careers.stackoverflow.com? It's a few things: 

  • a completely free, public CV hosting service for programmers, to share the cool stuff you've coded and created with the world.
  • a way to explicitly link your Stack Overflow profile with your CV, to provide concrete examples of your communication skills and individual expertise to anyone who is interested.
  • a better way to connect great programmers with the best programming jobs, for those who opt into the small annual listing fee.

Prior to the announcement, bloggers were already starting to sport “My answers @StackOverflow” feeds alongside their content, courtesy of the user-specific RSS feed exposed by StackOverflow and tools like Yahoo Pipes – the new careers site formalizes this intention in a much cleaner way.

Joel had a great post this week on finding out what your company is all about and the theme of helping your users become awesome that really resonated with me – they’re certainly living this out loud with this launch:

If you love to code, too, I encourage you to create your own Stack Overflow CV. Keep it private, or make it public via the URL of your choice -- it's completely free either way. If you think you might be actively looking for a job in the next 3 years, take advantage of our outrageously low promotional pricing of $29 for a 3 year filing. That way, at any point in those 3 years, you can flip a switch and become visible to hiring managers. Or not. It's totally up to you.

It’s refreshing to watch these guys in action and I can’t wait to see how it evolves!

Sunday, November 1, 2009

CompilationMode=Never: What does it actually do?

I wanted to take CompilationMode=Never for a test-drive on an existing application that’s plagued by too many control/page assemblies – the goal was to see if this would help improve scalability (by reducing unrecoverable memory consumed by assemblies) and to observe its impact on a high and fluctuating ‘% time in JIT’. It turns out that it’s not as easy as throwing it into web.config.
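For reference, the site-wide switch is the compilationMode attribute on the pages element. A minimal fragment, assuming nothing else is configured on pages in your web.config:

```xml
<configuration>
  <system.web>
    <!-- Applies to every page and user control in the site.
         Valid values are Always, Auto and Never. -->
    <pages compilationMode="Never" />
  </system.web>
</configuration>
```

The same setting can also be applied to a single page through the CompilationMode attribute of its @ Page directive, which is a gentler way to trial it.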

Yes, you can apply this en masse to an existing website but doing so on pre-existing sites could generate:

  1. The attribute 'codefile' is not allowed in this page.
  2. The attribute 'autoeventwireup' is not allowed in this page.  

Why would CompilationMode interfere with AutoEventWireup? In automatically wiring up events, it looks like the framework looks for suitable signatures at runtime using reflection [1] – crazy, I know; and not having assemblies would presumably make this impossible to do. Scott Allen has a couple of good write-ups on this and you can have a look for yourself in the auto-generated assemblies.

Can you think of any reasons why the events wouldn’t be wired up at compile time?

I tried to debug the framework to see this in action but threw in the towel after a few hours of fighting the symbol server – hours on this is pretty ridiculous, I know, and in complete defiance of Oscar’s #1 rule: stops are in, emotions are out! It was just a bit shocking to see how brittle the entire setup around source server is and how many continue to struggle with it [2] – anyone out there actually using it successfully for mscorlib and System.*.dlls? Have you applied any service packs? Would love to hear from you. With VS2008 SP1 (9.0.307279.1), Vista SP2, the latest source code component (Dotnetfx_4016_VistaSP2), and following every instruction to a tee, I can step into most assemblies (e.g. System.Web, 2.0.50727.4016) but can’t step into mscorlib (2.0.50727.4200) which contains the reflection calls in question – for reference, the symbols for mscorlib are downloaded from /download/symbols/mscorlib.pdb/4D0B2695F5144B4D8F24004284FE26191/mscorlib.pd_.

So once you’ve manually resolved #2, code-behind files must also be pushed out to a fully-qualified class that can be referenced by the Inherits attribute alone.

After that, it works as expected and you see the runtime behaviour of CompilationMode=Never:

    // IWebObjectFactory.CreateInstance
    public virtual object CreateInstance() {

        // Create the object that the aspx/ascx 'inherits' from
        TemplateControl templateControl = (TemplateControl) HttpRuntime.FastCreatePublicInstance(_baseType);

        // Set the virtual path and TemplateSourceDirectory in the control
        templateControl.TemplateControlVirtualPath = VirtualPath;
        templateControl.TemplateControlVirtualDirectory = VirtualPath.Parent;

        // Give the TemplateControl a pointer to us, so it can call us back during FrameworkInitialize
        templateControl.SetNoCompileBuildResult(this);

        return templateControl;
    }

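What’s a little surprising, if I’m reading this right, is that it seems to create a new instance on demand every time, with no caching of the instance for re-use across different requests; only the IL-generated factory is cached per type, and only for types whose assembly is in the GAC: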

    /*
     * Faster implementation of CreatePublicInstance.  It generates bits of IL
     * on the fly to achieve the improve performance.  this should only be used
     * in cases where the number of different types to be created is well bounded.
     * Otherwise, we would create too much IL, which can bloat the process.
     */
    internal static Object FastCreatePublicInstance(Type type) {

        // Only use the factory logic if the assembly is in the GAC, to avoid getting
        // assembly conflicts (VSWhidbey 405086)
        if (!type.Assembly.GlobalAssemblyCache) {
            return CreatePublicInstance(type);
        }

        // Create the factory generator on demand
        if (!s_initializedFactory) {

            // Devdiv 90810 - Synchronize to avoid race condition
            lock (s_factoryLock) {
                if (!s_initializedFactory) {
                    s_factoryGenerator = new FactoryGenerator();

                    // Create the factory cache
                    s_factoryCache = Hashtable.Synchronized(new Hashtable());

                    s_initializedFactory = true;
                }
            }
        }

        // First, check if it's cached
        IWebObjectFactory factory = (IWebObjectFactory)s_factoryCache[type];

        if (factory == null) {

            Debug.Trace("FastCreatePublicInstance", "Creating generator for type " + type.FullName);

            // Create the object factory
            factory = s_factoryGenerator.CreateFactory(type);

            // Cache the factory
            s_factoryCache[type] = factory;
        }

        return factory.CreateInstance();
    }
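The distinction in the listing above (factory cached once per type, instance created fresh on every call) is easy to lose in the noise, so here is a stripped-down sketch of the same shape in Python. Everything in it is hypothetical and for illustration only: InstanceFactoryCache and Page are made-up names, the lambda stands in for the IL that FactoryGenerator emits, and the lock around the cache fill is a simplification rather than a copy of the framework's locking.

```python
import threading


class InstanceFactoryCache:
    """Caches one factory per type, but builds a new instance on every call."""

    def __init__(self):
        self._lock = threading.Lock()
        self._factories = {}
        self.factories_built = 0  # counter kept only to make the caching visible

    def _build_factory(self, cls):
        # Stand-in for FactoryGenerator.CreateFactory in the listing above.
        self.factories_built += 1
        return lambda: cls()

    def create_instance(self, cls):
        factory = self._factories.get(cls)
        if factory is None:
            with self._lock:
                # Re-check under the lock so only one factory is built per type.
                factory = self._factories.get(cls)
                if factory is None:
                    factory = self._build_factory(cls)
                    self._factories[cls] = factory
        # The factory is reused; the instance never is.
        return factory()


class Page:
    """Hypothetical stand-in for the class an aspx/ascx inherits from."""


cache = InstanceFactoryCache()
first = cache.create_instance(Page)
second = cache.create_instance(Page)
```

Here first and second are distinct objects while factories_built stays at 1, which mirrors what the framework code appears to do: the expensive part is amortized per type, and each request still gets its own control instance.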

Anyway, if SharePoint uses it successfully, I’m sure it can work for your CMS/cutting-edge-ASP.NET-stuff too.

Performance benchmarks though will have to wait another day.

References:

[1] – AutoEventWireup & Reflection
http://stackoverflow.com/questions/275965/asp-net-mvc-autoeventwireup-required/276385#276385
http://odetocode.com/articles/406.aspx

[2] – Troubleshooting Framework Debugging
http://stackoverflow.com/questions/1095202/why-cant-i-step-into-this-line
http://social.msdn.microsoft.com/Forums/en-US/refsourceserver/thread/1b74f60c-e961-425c-a38e-362406dd4cfe
http://social.msdn.microsoft.com/Forums/en-US/vsdebug/thread/a8441b7b-017b-4094-8788-6005aa8e69a3
http://social.msdn.microsoft.com/Forums/en/refsourceserver/thread/b22e044c-be8f-4650-98d6-b426193b7b2c
http://social.msdn.microsoft.com/Forums/en-US/refsourceserver/thread/e20ad5f3-3071-4ff6-9f2b-6f4ec22661b8
http://social.msdn.microsoft.com/Forums/en-US/refsourceserver/thread/ceb17913-a983-47a8-b15e-655a65c5f001/

[3] – Microsoft Reference Source Server
http://referencesource.microsoft.com/netframework.aspx
http://referencesource.microsoft.com/serversetup.aspx

[4] – KB944899 (This hotfix addresses a performance problem when stepping through Source Code downloaded via a Microsoft Reference Source Server)
https://connect.microsoft.com/VisualStudio/Downloads/DownloadDetails.aspx?DownloadID=10443&wa=wsignin1.0

Tuesday, October 27, 2009

The Silo Effect and Other Productivity Killers – A Practical Guide for IT

If we recognize that the greatest innovation comes from working across silos, then how do we reward and incentivize this pattern, and are companies actively doing this? Some thoughts this weekend triggered readings into the “Silo Effect” [1] and its impact on the pace of progress. In the SaaS world, managing close collaboration between Development, QA and Operations is no easy feat, especially for a growing platform. I have to remind myself regularly that a handful of people really can’t run it all. Intuitively, you look for the simplicity of a well-integrated team with common skills and experiences; in practise, teams are highly specialized with varied focuses.

The first stop was an excellent read courtesy of John Simpson and Eric Winquist (of Jama Software [4], Portland, Oregon) in their paper, entitled “Eliminate the Top 3 Productivity Killers and Build Great Products in Half the Time” [5] – some takeaways are highlighted below:

Productivity Killer #1 – The Great Scavenger Hunt is Costly
Information is growing over 66% every year and is constantly changing. The good news is we now have the opportunity to know more about our customers like never before. The bad news is we’re inundated with information – some of it valuable, much of it noise. Where do you store and organize the relevant product information? Do you have the right intelligence captured to make the right decisions and take the right actions?

It’s estimated that employees at U.S. companies waste over 5 billion unproductive hours annually just looking for information. At $35 per hour for an average knowledge worker, that’s a $175 billion problem in the U.S. alone. As an executive friend used to say, “That’s no pocket change, that’s adult money.”

Did You Know?

  • Employees spend 25% of their time just searching for information
  • Employees spend 20 minutes per day recreating information that already exists
  • 42% of employees accidentally use the wrong information at least once per week
  • The average ramp up time is 45 days for a new employee, as high as 9 months for highly skilled jobs
  • The productivity loss of IT employee turnover can last from 3-12 months

There’s Much Hype but the Social Trend is Real
If you’re like us, you might be tired of reading about “the revolution of social networking” – as if life or business wasn’t social before Twitter, Facebook or other Web applications existed. However you feel about this trend, there’s evidence to show the collaborative movement is real and is gaining momentum in the workplace. IBM estimates that within six years, workers will collaborate 80% of their time. You can’t open a prominent business or technology magazine without reading about open innovation and the impact that collaboration is having on business processes. With nearly $1 trillion being invested in R&D worldwide annually, you can understand why it’s a popular issue.

It’s a great read with emphasis on: a) how to better classify information to facilitate cross-silo collaboration, and b) how to cater to more natural forms of day-to-day dialogue. I must say, the 25% of time spent on search really resonated with me – what a powerful stat! I’m relentlessly “negotiating” with reluctant administrators on dated mailbox restrictions. I live in email and maliciously hold on to conversations, as I’m sure others do; the ability to instantly query across the last decade’s worth of conversations is priceless. Trying to stitch together threads in separate mailboxes with crucial messages lost in some error-prone archiving scheme is, on the other hand, quite expensive. It never ceases to amaze me how these limitations can still be defended. Unlimited email (10-20-30GB, practically speaking), with foolproof-web-based-search-that-works, should be a minimum bar in any organization – what exactly are we waiting for??

(On the paper itself, yes, many of these data points are used in all sorts of collaboration solutions, which ultimately weakens the comparative evaluation for the sake of establishing a business case. I see this all the time and think: yes, it’s a problem, without question, but why is your solution better than X, Y, Z? The StrangeLoop team was a perfect example: yes, we know latency is an issue, but why should we address it with this appliance? I would imagine that would-be customers have identified and understand the problem and that this emphasis is lost on them. Well, here’s a scenario where justifying the business case got my attention. Without question, these are the 3 productivity killers – but from a tool perspective, whether Contour is the answer or whether a TFS/SharePoint pairing is an equally viable approach is another debate altogether.)

Tearing Down Business Silos[2] was the next stop:

The foundation of a successful organization is an entire team focused on common goals. Silos erode this foundation. Being aware of the fundamental human behaviors that lead to silos and taking steps to overcome them offers fantastic benefits - including more relevant products and services, higher productivity, better use of resources, and more effective and engaged personnel.

So what can we do? Carol has some fantastic suggestions:

  • #1 – Reward Collaboration: define collaborative performance objectives, make it part of the review, promote based on this and spread the story throughout the organization!
  • #2 – Focus on Innovation: triggered by a cross-pollination of ideas; bring together people with diverse perspectives and expertise when setting the agenda.
  • #3 – Communicate Transparently: enough said; it’s dead-easy and so often missed.
  • #4 – Encourage Networks: use social networking tools and create visual models – I haven’t seen this formalized well.
  • #5 – Mix it Up: rotate people, invite other managers to your meetings, make them a member of the group – also very rare.
  • #6 – Focus on the Customers: stay close to the end-user! Share marketplace information, customer-feedback, the good and the bad. In practise, customer opinions pass through layer upon layer before reaching the Development team.
  • #7 – Get Personal: collaborative relationships thrive in an environment of personal trust. Most companies do a decent job of this, as they should.

Do these things well and you will avoid scenarios like the one faced by the CIO of a mid-size company (125 reports) who wrote into a recent issue of InfoWorld [7]:

“From the way everyone behaves you'd think my head of application development was Bill Gates and the head of Operations was Larry Ellison. There is no trust at all, and they appear to be out to undermine each other more than they're out to be successful at their own responsibilities.

Worse, the attitude is contagious. The people reporting to them act just like they do. There's no trust, no ability to collaborate ... nothing except a lot of blaming, which seems to have become our core competency.

I've tried lecturing, coaching, berating and arguing, and nothing seems to work.
They really are two talented people and I don't want to get rid of either of them. On the other hand, the situation isn't sustainable. Any thoughts about what else I might try before firing one or both of them?”


InfoWorld’s Bob Lewis responds:

“Before you do anything else, look at the situation you've put them in. Usually, when organizations turn into silos, it's because goals, and any compensation tied to those goals, reinforces silo-oriented behavior.

My guess is that you'll find a lot of win/lose trade-offs. For example, if the budgeting process starts with a fixed pie and when one of them gets something it comes out of the other's budget, it's win/lose, which reinforces organizational rivalry. Result: Silos.

Or, maybe Applications is counted as successful when projects finish on time no matter what, while Operations is counted as successful based on system stability and performance. Which means Applications has an incentive to skimp on software quality assurance, while Operations has an incentive to keep all new code out of production as long as possible. Result: Silos.

While you're looking at structural sources of moat-building, you should also figure out if any of their responsibilities require collaboration. Very likely, they don't. Usually, organizational design starts by trying to draw clear boundaries between departments, to clarify who is responsible for what. The unintended consequence is that because managers have no reason to collaborate, they neither build the habit of doing so, nor see any reason to start.”

“People do what they do for reasons. If you want them to do something different, take away the reasons they're doing what they're doing, and give them reasons to do something different.”

References:

[1] – The Silo Effect (Seven Key Obstacles to Change):
http://www.edwardmorler.com/seven-key-obstacles-to-change.html

[2] – Tearing Down Business Silos
http://www.sideroad.com/Management/business-silos.html

[3] – Breaking Organizational Silos using Enterprise Architecture
http://it.toolbox.com/blogs/ea-for-cio/breaking-organizational-silos-using-enterprise-architecture-30764

[4] – Jama Software (Contour)
http://www.jamasoftware.com/contour/

[5] – Eliminate the Top 3 Productivity Killers and Build Great Products in Half the Time
http://www.jamasoftware.com/media/documents/Central_Hub_Product_Intelligence_Jama.pdf

[6] – Requirements and Bridging the Silos
http://www.requirementsnetwork.com/system/files/Requirements%20and%20Bridging%20the%20Silos%20(Part%202%20of%203).pdf

[7] – Stuck in Silos
http://www.infoworld.com/t/career-advice/stuck-in-silos-707

Thursday, October 15, 2009

High CPU in ASP.NET - Finding Large DataTables with the Public SOS

This is yet another one strictly for the dump analysis crowd – what can I say, it’s been a long week! Last episode, we highlighted the lack of support for many useful windbg commands. Well, chief among these was the DumpDataTables command that provides a nice summary of the largest tables and their column counts, typically a nice starting point for high-memory/high-CPU issues in traditional applications. So how do we fend for ourselves and reproduce this with the SOS that we have?