Accessing Volatile Resources

Posted by Joel Varty on March 7, 2013

There are a few things we can do as developers to alleviate stress on our servers when we access volatile resources. A volatile resource can be a REST service, a database call, or anything else that might become a bottleneck.

I would say the only thing that doesn't fall into this category is file system access, but even there you might want to hold the object in an in-memory cache if it isn't too large.

The Problem

When you access a slow resource, it causes a bottleneck: everything in your web request has to wait for that resource to respond before it can continue. If multiple people are accessing that particular web page, your server could see a CPU spike, or even crash, as it juggles all those queued threads.

Most often, the web service or database server you are accessing gets progressively slower as it tries to deal with the extra load. With some services, like the Twitter or Facebook APIs, you may even get denied service.

[Figure: Can you spot the bottleneck?]

Solution 1 - Cache and a Single Thread

One way around this issue is to ensure only a single thread accesses the resource at a time, using an exclusive lock. You could also use a semaphore to limit access to a few threads, but I have found that a single thread tends to work best here.
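The semaphore variant mentioned above could look roughly like the sketch below. It is written in Python rather than .NET purely for illustration, and every name in it (`MAX_CONCURRENT`, `call_resource`, `run_demo`) is hypothetical. The sleep stands in for a slow REST or database call.

```python
import threading
import time

MAX_CONCURRENT = 3  # hypothetical limit; a value of 1 behaves like an exclusive lock
_gate = threading.BoundedSemaphore(MAX_CONCURRENT)

# Bookkeeping used only to observe how many callers are inside at once.
_stats_lock = threading.Lock()
_active = 0
peak_active = 0


def call_resource(request_id):
    """Stand-in for a slow external call, limited to MAX_CONCURRENT callers."""
    global _active, peak_active
    with _gate:  # blocks while MAX_CONCURRENT callers are already inside
        with _stats_lock:
            _active += 1
            peak_active = max(peak_active, _active)
        time.sleep(0.01)  # simulated latency of the external resource
        with _stats_lock:
            _active -= 1
    return f"result-{request_id}"


def run_demo(n_threads=10):
    """Fire n_threads concurrent requests and report the peak concurrency seen."""
    threads = [
        threading.Thread(target=call_resource, args=(i,)) for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak_active
```

However many threads are started, `run_demo` should never report more than `MAX_CONCURRENT` callers inside the gated section at once.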

Pseudo-code

  • Resource is requested
  • Determine a cache key that identifies the parameters used to access the resource
  • Check for the object in cache
    • If found, return it
  • Start a critical section (lock())
    • Check cache again
      • If found return it
    • Access the resource
    • Put the result in cache for a specific amount of time
    • Return
  • End the critical section

 

What happens here is that only a single thread will get into the critical section at a time, meaning that the external resource won't have too much load on it. 

The trick here is to check if the object is in cache as the first operation inside the locked critical section, since it may have been put into cache by another thread while the current thread was waiting to get into the section.
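Here is a minimal sketch of the pseudo-code above, in Python rather than ASP.NET. The in-process dictionary stands in for the web server's cache, and `get_resource`, `fetch` and `ttl_seconds` are hypothetical names, not part of any framework.

```python
import threading
import time

_cache = {}               # cache key -> (expires_at, value)
_lock = threading.Lock()  # the single critical section


def _cache_get(key):
    """Return the cached value, or None if it is missing or expired."""
    entry = _cache.get(key)
    if entry is not None and entry[0] > time.monotonic():
        return entry[1]
    return None


def get_resource(key, fetch, ttl_seconds=60.0):
    """Return the value for key, calling fetch() from at most one thread at a time.

    Assumes fetch() never returns None, since None is used to mean "not cached".
    """
    value = _cache_get(key)
    if value is not None:
        return value
    with _lock:
        # Check again: another thread may have filled the cache while we waited.
        value = _cache_get(key)
        if value is not None:
            return value
        value = fetch()
        _cache[key] = (time.monotonic() + ttl_seconds, value)
        return value
```

If eight requests arrive together for the same key, the first one calls `fetch()` and the other seven pick the result up from cache once they enter the lock. Note that this sketch uses one lock for every key, so a slow fetch for one resource also holds up requests for others. A lock per key is a possible refinement.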

The problem with this approach is that it doesn't help when the resource goes down entirely.

Solution 2 - Cache and a Single Thread plus File System

We can extend Solution 1 by also storing the object on the file system. If the object is not in cache, we can use the copy on disk until it expires, and we can fall back to it when the external resource is not available.

The downside here is that the resource's timeout needs to be short enough to ensure we don't kill our own web server with a huge request queue waiting for it to respond.

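The changes from Solution 1 are the filename, the file system check before the lock, the timeout and error fallback when accessing the resource, and the write to the file system.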

Pseudo-code

  • Resource is requested
  • Determine a cache key and filename that identify the parameters used to access the resource
  • Check for the object in cache
    • If found, return it
  • Check for the object in the file system
    • If found, check the last write time - make sure it isn't too old
      • If not too old, return it
      • Else save it to temp variable
  • Start a critical section (lock())
    • Check cache again
      • If found return it
    • Access the resource with a reasonable timeout
      • *If it errors out, use the temp variable from above if it's set*
    • Write the object to the file system.
    • Put the result in cache for a specific amount of time
    • Return it
  • End the critical section

You can see that we don't have to change much from Solution 1 to add file system support. Note the italicized line, though: the resource access it guards could become a bottleneck for this logic if the timeout is too long. On the flip side, if you pick too short a timeout, you may never get a response from the resource at all.

If you are returning binary or string data, you can save it straight to file. If you are working with objects, you probably want to serialize them, and I recommend binary serialization, as it is typically faster than XML or JSON. However, if you already have a good serialization mechanism in place for JSON or XML from a web service, I would stay with that and just save the text data in the file. That avoids introducing a second serialization scheme, which can be a pain.
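As a rough illustration of the two routes, the sketch below saves a web service's JSON text untouched, and separately saves an arbitrary object with Python's `pickle` module, used here as a stand-in for binary serialization. The function names are hypothetical. Only unpickle files your own process wrote, since loading a pickle can execute code.

```python
import json
import pickle


def save_text(path, text):
    """Route 1: the service already returned JSON/XML text, so store it as-is."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def load_json(path):
    """Parse the stored JSON text with the same scheme the service uses."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def save_object(path, obj):
    """Route 2: binary-serialize an in-memory object."""
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_object(path):
    """Only call this on files written by save_object in your own process."""
    with open(path, "rb") as f:
        return pickle.load(f)
```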

Solution 3 - Offline Resource Access

If you have a finite and well-defined set of resources that your site needs to access regularly, there is no point in using Solution 1 or 2 and having web requests tied up waiting for those resources to respond.

You may as well access those resources offline in another process or in a dedicated worker thread, saving the results to the file system. Even better, if you have a web farm, you can propagate the files to each server on the farm from the single offline process. This is usually best done in a Windows service. You could also spawn a thread from global.asax inside your website code, but run some tests first to make sure the app pool isn't shut down for inactivity, which would kill your thread.

Pseudo-code - Offline Process

  • Start timer to kick off process on a fixed time span.
  • On timer interval:
    • Determine list of resources from config file (or possibly hard coded)
    • Loop over each resource item:
      • Determine file path for the resource on this machine
      • Access resource
        • On success, save the result to file system
        • On error, write to error log
    • When loop is complete
      • If any errors occurred, send an email reporting them
      • If in a web farm, copy the file to the other servers
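The offline process could be sketched along these lines. This is Python rather than a Windows service, and `refresh_all`, `run_forever`, `fetch` and `on_errors` are hypothetical names. Sending the email and copying files across a web farm are left to the caller through the `on_errors` callback and are not implemented here.

```python
import logging
import os
import threading

log = logging.getLogger("offline_refresh")


def refresh_all(resources, out_dir, fetch):
    """Fetch every resource once and save it to out_dir.

    resources maps a name to whatever fetch() needs (for example a URL).
    Returns a list of (name, error message) for the resources that failed.
    """
    errors = []
    for name, source in resources.items():
        path = os.path.join(out_dir, name + ".txt")
        try:
            text = fetch(source)
        except Exception as exc:
            log.error("refresh of %s failed: %s", name, exc)
            errors.append((name, str(exc)))
            continue  # keep the previous file in place
        # Write to a temp file and swap it in, so the website never reads half a file.
        tmp = path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            f.write(text)
        os.replace(tmp, path)
    return errors


def run_forever(resources, out_dir, fetch, interval_seconds, stop_event, on_errors=None):
    """Refresh on a fixed interval until stop_event is set."""
    while not stop_event.is_set():
        errors = refresh_all(resources, out_dir, fetch)
        if errors and on_errors is not None:
            on_errors(errors)  # e.g. send one email listing all failures
        stop_event.wait(interval_seconds)  # sleeps, but wakes early on stop
```

A failed refresh leaves the previous file untouched, so the website keeps serving the last good copy.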

Pseudo-code - Website

  • Resource is requested
  • Determine a cache key and filename that identify the parameters used to access the resource
  • Check for the object in cache
    • If found, return it
  • Start a critical section (lock())
    • Check cache again
      • If found return it
    • Access the file that contains the resource
    • Put the result in cache
      • Use a sliding interval, so it stays in cache as long as it's being regularly requested
      • Use a CacheDependency on the filepath so the cache is cleared when the file is overwritten
    • Return the result
  • End the critical section

Conclusion

With a combination of the three techniques I've described above, you should be able to get your web server responding more reliably and with better performance when you are dealing with volatile resources such as web services, databases and the like.

 
