PSR-14: Example - layered caching

in #php, 5 years ago (edited)

So far we've looked at a number of complete, practical examples of using PSR-14 Events in various ways, both conventional and unconventional. In our final (probably) installment, I want to offer a highly unconventional but still practical use of PSR-14 that really shows off just how flexible Events can be: Layered caching.

"But wait, isn't caching the realm of PSR-6 and PSR-16?" Yes. Yes it is. But neither of those offer a built-in way to compose multiple cache backends together. It's certainly possible, but doing so is left as an exercise for the implementer. Let's use PSR-14 to get some exercise.

Define the problem

Suppose we have some data that comes from a remote URL endpoint. We don't always need it immediately up to date and the web service is kinda slow (because it's a lot of data, or just a distant server), so we want to cache it. A cache server like Redis or Memcached is the common tool here, but we also want very fast access to it. APCu is much faster and in-memory so we can use that. But if we have a multi-head configuration for our application (for redundancy or performance or both), we can't share APCu between webheads. How can we have our cake and eat it too?

That particular kind of cake is called "layered". (Mmm... layer cake.) That is, we have a series of cache layers with the "higher" layers being faster but smaller/shorter-lived and the "lower" layers slower but higher capacity and more stable. When looking up a cache item we check each layer in turn from high to low, and once we find the value we're looking for we update each of the layers above it.
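Before bringing Events into it, the lookup rule itself can be sketched in a few lines of plain PHP. This is only an illustration of the "check high to low, then back-fill the layers above" idea; the layeredLookup() function, the array-backed layers, and the origin callback are all made up for this sketch and are not part of the design that follows.

```php
<?php
// Hypothetical helper: $layers is ordered fastest-first, and each layer is a plain array.
function layeredLookup(array &$layers, string $key, callable $origin)
{
    $missed = [];   // indexes of the layers that did not have the key
    $found = false;
    $value = null;

    foreach ($layers as $index => $layer) {
        if (array_key_exists($key, $layer)) {
            $value = $layer[$key];
            $found = true;
            break;
        }
        $missed[] = $index;
    }

    // No layer had it, so fall through to the slow original source.
    if (!$found) {
        $value = $origin($key);
    }

    // Back-fill every layer that was checked and missed.
    foreach ($missed as $index) {
        $layers[$index][$key] = $value;
    }

    return $value;
}

// Only the lowest layer knows the value to begin with.
$layers = [[], [], ['greeting' => 'hello']];
$origin = function (string $key) {
    return 'from origin';
};

$value = layeredLookup($layers, 'greeting', $origin);
```

After that call the two upper layers hold the value as well, so the next lookup stops at the first layer.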

That's the exact model used by CPUs, which have an L1 cache, L2 cache, and L3 cache of decreasing speed and increasing capacity, with RAM a far-away original source, and the file system the final layer. It's also how web caches work; your browser cache is local/fast, and then there's often a transparent reverse proxy between you and the origin server (such as a CDN, or just a single server like Varnish or Nginx).

An important aspect of such layered caches is that each layer is transparent. One layer doesn't inherently know how many layers are below it nor above it, and each can be removed without impacting the functionality of the program, just its performance.

How can we build such a layered cache system using events? It's a little tricky at first, but once the different parts come together I think it's actually pretty neat.

The tools

For simplicity we'll assume our remote data is a single large object; it's just a very slow server or data that doesn't need to be all that up to date. If the remote data were a large import of some kind, the overall process would be the same, but since data imports are not what we're showing off, let's keep it simple. We'll also assume it's coming from a single known URL. That means we can access it using any reasonable HTTP client that implements PSR-18. That in turn will require PSR-17 to create a PSR-7 HTTP request object for us.

For the caching, we'll implement three layers:

  • A fast in-process cache (that doesn't survive past the current request, but that's fine)
  • An APCu cache that is host-specific but still very fast
  • A Redis cache server that is shared between all of the various web hosts

I'm going to use PSR-6 cache pools for APCu and Redis. If PSR-16 is more your jam that would work equally well.

Because there's a decent number of classes involved, assume the following use header applies to all classes below:

use Psr\Cache\CacheItemPoolInterface;
use Psr\EventDispatcher\ListenerProviderInterface;
use Psr\EventDispatcher\StoppableEventInterface;
use Psr\Http\Client\ClientInterface;
use Psr\Http\Message\RequestFactoryInterface;
use Fig\EventDispatcher\ParameterDeriverTrait;

The Event

The Event class we need looks like this:

class CacheLookup implements StoppableEventInterface
{
   /** @var string */
   protected $cacheKey = '';

   /** @var mixed */
   protected $value;

   /** @var callable[] */
   protected $callbacks = [];

   public function __construct(string $cacheKey)
   {
       $this->cacheKey = $cacheKey;
   }

   public function key() : string
   {
       return $this->cacheKey;
   }

   public function setValue($value) : void
   {
       $this->value = $value;
       foreach ($this->callbacks as $callback) {
           $callback($value);
       }
   }

   public function addCacheCallback(callable $callback) : void
   {
       $this->callbacks[] = $callback;
   }

   public function getValue()
   {
       return $this->value;
   }

   public function isPropagationStopped() : bool
   {
       return !is_null($this->value);
   }
}

Most of that should be fairly self-explanatory by now. The Event is created with a cache key that it carries along, and it will continue getting passed to Listeners until one of them calls setValue() with the found value; after that it stops and the Dispatcher will return it.

The interesting part is this cache callback stuff. Recall that after each cache layer is checked, if the value isn't found it moves onto the next. When the item is found it needs to then populate all of the cache layers above it, much like a Varnish server will add a page to its own cache after the first time a request passes through it to reach the origin server.

In this design, each time a Listener says "nope, haven't found the cache item" it passes the Event a callback function, which the Event will then call later with the value. That allows each Listener to indicate "but when you do find it, here's how to update this layer in the cache." Thus when setValue() is called the Event first updates its own copy of the value, then calls whatever update callbacks it was given.
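To see the callback mechanism in isolation, here is a small sketch. CallbackSketch is a trimmed, hypothetical stand-in for CacheLookup (it drops the key and the PSR-14 interface so it runs without Composer), and the $written array simply pretends to be two cache layers.

```php
<?php
// Trimmed stand-in for the CacheLookup Event: only the value and callback handling.
class CallbackSketch
{
    private $value;
    private $callbacks = [];

    public function addCacheCallback(callable $callback): void
    {
        $this->callbacks[] = $callback;
    }

    public function setValue($value): void
    {
        $this->value = $value;
        // Tell every layer that missed earlier how the story ended.
        foreach ($this->callbacks as $callback) {
            $callback($value);
        }
    }

    public function isPropagationStopped(): bool
    {
        return $this->value !== null;
    }
}

$written = [];
$event = new CallbackSketch();

// Two layers miss, and each leaves behind instructions for updating itself.
$event->addCacheCallback(function ($value) use (&$written): void {
    $written['memory'] = $value;
});
$event->addCacheCallback(function ($value) use (&$written): void {
    $written['apcu'] = $value;
});

// A lower layer finds the value; both callbacks fire with it.
$event->setValue('payload');
```

Once setValue() returns, $written holds 'payload' under both keys and the Event reports that propagation has stopped.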

Listen carefully

There are technically 4 Listeners in play: The in-process cache, APCu, Redis, and the Listener that fetches from the remote URL. However, as long as APCu and Redis are both using the same cache API (PSR-6) they're functionally identical. That means we have 3 different Listeners to write. Let's start at the bottom with the remote fetcher. For simplicity we'll make it a class that implements __invoke():

class RemoteFetchListener
{
   /** @var \Psr\Http\Client\ClientInterface */
   protected $client;

   /** @var RequestFactoryInterface */
   protected $requestFactory;

   protected const URL = 'http://api.example.com/some-expensive-data';

   public function __construct(ClientInterface $client, RequestFactoryInterface $requestFactory)
   {
       $this->client = $client;
       $this->requestFactory = $requestFactory;
   }

   public function __invoke(CacheLookup $event) : void
   {
       $key = $event->key();
       $request = $this->requestFactory->createRequest('GET', static::URL . '/' . $key);
       $response = $this->client->sendRequest($request);

       $value = DomainObject::fromHttp($response->getBody());
      
       $event->setValue($value);
   }
}

Nothing especially exciting. The Listener composes a PSR-18 HTTP Client and a PSR-17 Request Factory, which it uses to just send off a request to a known URL. A real-world implementation would not want to put the URL in a constant but it serves our purposes for now. When it gets the result it decodes it from the HTTP body blob into a domain object of some kind relevant to our application and sets that value on the Event.

Moving up the stack we have two cache layers. As mentioned both are functionally identical so we need only one Listener class, which we can instantiate and configure twice.

class CacheListener
{
   /** @var CacheItemPoolInterface */
   protected $pool;

   /** @var \DateInterval */
   protected $ttl;

   public function __construct(CacheItemPoolInterface $pool, \DateInterval $ttl)
   {
       $this->pool = $pool;
       $this->ttl = $ttl;
   }

   public function __invoke(CacheLookup $event) : void
   {
       $key = $event->key();

       $item = $this->pool->getItem($key);
       if ($item->isHit()) {
           $event->setValue($item->get());
       }
       else {
           $event->addCacheCallback(function($value) use ($item) {
               $item->set($value)
                   ->expiresAfter($this->ttl)
                   ;
               $this->pool->save($item);
           });
       }
   }
}

The CacheListener class again uses an __invoke() method as the actual listener to keep things simple. It composes a PSR-6 cache pool object and a TTL (Time-to-Live) for its own cache items. Then when the Listener is called it first looks the key up in its cache. If it's found it just sets that value on the Event and everything's done. If not, it adds a callback function to the Event. That callback function expects the value that was eventually found (an instance of DomainObject in this case) and will save it to the cache pool with the TTL it was configured with.

Finally let's look at the top-most cache layer, which just stores cached values in process memory. Its logic is essentially the same as CacheListener but using a simple array rather than a cache pool.

class MemoryCacheListener
{
   protected $values = [];

   public function __invoke(CacheLookup $event) : void
   {
       $key = $event->key();
       if (isset($this->values[$key])) {
           $event->setValue($this->values[$key]);
       }
       else {
           $event->addCacheCallback(function($value) use ($key) {
               $this->values[$key] = $value;
           });
       }
   }
}

As an aside, one could certainly implement an entirely in-memory cache pool and then use the exact same CacheListener class. Either way works, but I decided to show off this method to give an alternate cache callback implementation.

Putting it all together

With all of our parts now ready let's put them to use. For this first example I'm going to assume Tukio's OrderedListenerProvider, but any order-capable Provider would work. (It does have to be an ordered Provider, as the order in which Listeners are called is crucial in this case.)

First off, assume we have two PSR-6 cache implementations available to us:

class ApcCache implements CacheItemPoolInterface { /* ... */ }
class RedisCache  implements CacheItemPoolInterface { /* ... */ }

Implementing those is left as an exercise for the reader. (But really, don't implement your own; just download your favorite existing one using Composer.)

Next, create the 4 Listeners we're going to need:

$memoryListener = new MemoryCacheListener();
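// ISO 8601 durations: the "T" introduces time units, so PT5M is 5 minutes (P5M would be 5 months).
$apcListener = new CacheListener(new ApcCache(), new \DateInterval('PT5M'));
$redisListener = new CacheListener(new RedisCache(), new \DateInterval('PT24H'));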
$fetchListener = new RemoteFetchListener($httpClient, $requestFactory);

As mentioned, both APCu and Redis use the same CacheListener class, wrapped around 2 different cache backends. They also are configured with different TTLs. Substitute your own favorite PSR-18 HTTP client and PSR-17 Request Factory for $httpClient and $requestFactory as desired.

Next, assemble those Listeners into the right order in a Provider:

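// Assumes Tukio's classes are imported, e.g. use Crell\Tukio\OrderedListenerProvider;
$provider = new OrderedListenerProvider();
$provider->addListener($memoryListener, 3);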
$provider->addListener($apcListener, 2);
$provider->addListener($redisListener, 1);
$provider->addListener($fetchListener);

$dispatcher = new Dispatcher($provider);

Recall that OrderedListenerProvider uses higher-number-wins priority so we assign the faster/more-volatile Listeners higher priority. Then we put that Provider into a Dispatcher. Finally, we can call it with this delightful little one-liner:

$domainObject = $dispatcher->dispatch(new CacheLookup('something'))->getValue();

Boom! That simple one line command will now create a new CacheLookup Event and trigger it. Whichever cache layer Listener first finds the something value will return it, and populate all of the cache layers above it. Subsequent requests will not go as deep, just hitting the higher layer caches. And should the item not be found at all (or be expired) then the final Listener will reach out over HTTP to retrieve the original value, then cache it at all layers. The application code can just carry on doing its thing, and that single line will ensure that all cache layers do their thing at the appropriate time.
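If you want to watch that behavior without installing anything, the following sketch reproduces it with stand-ins. Everything here is hypothetical scaffolding: SketchLookup mirrors the public methods of CacheLookup, arrayLayer() plays the role of a cache Listener backed by a plain array, and dispatchSketch() is the stop-checking loop a PSR-14 Dispatcher runs for a stoppable Event. The "fetcher" just counts how often it is reached.

```php
<?php
class SketchLookup
{
    private $key;
    private $value;
    private $callbacks = [];

    public function __construct(string $key) { $this->key = $key; }
    public function key(): string { return $this->key; }
    public function getValue() { return $this->value; }
    public function addCacheCallback(callable $cb): void { $this->callbacks[] = $cb; }
    public function isPropagationStopped(): bool { return $this->value !== null; }

    public function setValue($value): void
    {
        $this->value = $value;
        foreach ($this->callbacks as $cb) {
            $cb($value);
        }
    }
}

// Builds a cache-layer Listener on top of a caller-owned array.
function arrayLayer(array &$store): callable
{
    return function (SketchLookup $event) use (&$store): void {
        $key = $event->key();
        if (isset($store[$key])) {
            $event->setValue($store[$key]);
        } else {
            $event->addCacheCallback(function ($value) use (&$store, $key): void {
                $store[$key] = $value;
            });
        }
    };
}

// Check for a stop before each Listener, as PSR-14 requires of a Dispatcher.
function dispatchSketch(SketchLookup $event, iterable $listeners): SketchLookup
{
    foreach ($listeners as $listener) {
        if ($event->isPropagationStopped()) {
            break;
        }
        $listener($event);
    }
    return $event;
}

$fast = [];
$slow = [];
$fetches = 0;
$listeners = [
    arrayLayer($fast),
    arrayLayer($slow),
    function (SketchLookup $event) use (&$fetches): void {
        $fetches++;   // pretend this is the slow HTTP call
        $event->setValue('fresh:' . $event->key());
    },
];

$first = dispatchSketch(new SketchLookup('something'), $listeners)->getValue();
$second = dispatchSketch(new SketchLookup('something'), $listeners)->getValue();
```

Both lookups return the same value, but the fetcher only runs for the first one; the second is answered by the top layer, which the first lookup populated through its callback.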

A Dedicated Provider?

We can go one better, though. Using OrderedListenerProvider works fine, but it's not particularly domain-aware. It's still using generic terminology. Fortunately, because Listener Providers are so light-weight we can purpose-build our own.

Let's build a caching-domain-aware object that we can configure and then use as a Provider. We'll want it to support multiple cache layers but also optimize the experience of using it so that it makes sense in context. Here's what I came up with:

class CacheProvider implements ListenerProviderInterface
{
   use ParameterDeriverTrait;

   /** @var callable[] */
   protected $fetchers = [];

   /**
    * @var array
    *
    * Array of cache definitions.
    */
   protected $cachers = [];

   /** @var MemoryCacheListener */
   protected $memoryListener;

   public function __construct()
   {
       $this->memoryListener = new MemoryCacheListener();
   }

   public function addFetcher(string $key, callable $fetcher) : void
   {
       $this->fetchers[$key] = $fetcher;
   }

   public function addCacheLayer(CacheItemPoolInterface $pool, \DateInterval $ttl, array $keys = []) : void
   {
       $cache = new CacheListener($pool, $ttl);
       $this->cachers[] = [
           'cache' => $cache,
           'keys' => $keys,
           'type' => $this->getParameterType($cache),
       ];
   }

   public function getListenersForEvent(object $event): iterable
   {
       if (! $event instanceof CacheLookup) {
           return [];
       }

       yield $this->memoryListener;

       $key = $event->key();
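       foreach ($this->cachers as $cacher) {
           // An empty key list means the layer applies to every key.
           if ($event instanceof $cacher['type']
               && ($cacher['keys'] === [] || in_array($key, $cacher['keys'], true))) {
               yield $cacher['cache'];
           }
       }

       // Only offer a fetcher if one was registered for this key.
       if (isset($this->fetchers[$key])) {
           yield $this->fetchers[$key];
       }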
   }
}

This class maintains 2 internal arrays: A collection of fetchers (the base Listeners that will generate the data to be cached) and a collection of cache pools. Both cache pools and fetchers can be associated with specific keys; that allows us to use the same Provider instance to handle many different cache key lookups, each with its own fetcher. Some fetchers will, as the name implies, make an HTTP request to get some data. However, they could also just do some expensive local calculation, say off of an SQL database or collecting data from modules using its own Event, as a Plugin Registration implementation.

Because the actual Listener for the caches is identical for any pool, the Provider internalizes wrapping the pool up into a Listener. That also lets it take the TTL as a parameter itself.

It also creates a MemoryCacheListener instance on its own. There's no need for us to bother configuring it; it will just happen automatically.

Now let's have a look at getListenersForEvent(), as I think it shows off just how powerful and flexible the Provider design really is. While it enforces that the Event passed be an instance of CacheLookup, it doesn't really do anything with sub-types. You could subclass CacheLookup if you wanted to but since CacheListener is pre-coded it wouldn't really have any impact. (You could make a more robust implementation that did care if you wanted to, but that's optional.)

The first Listener it returns is always the memory cache Listener. That has first crack, always, so requests never leave process if they don't have to. Then it loops over the CacheListener instances it has in the order they were added and returns those, but only if the cache key applies. In this example I'm just using exact string matching with an optional "all" case, but a more robust implementation could do some sort of key-regex matching or whatnot. A real implementation would probably also want to allow explicit ordering of cache Listeners.
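As a rough idea of what the regex variant could look like, here is a sketch of the matching rule on its own. The layerAppliesTo() helper and the key names are invented for illustration; it keeps the "empty list means all keys" convention and treats each entry as a PCRE pattern for preg_match().

```php
<?php
// Hypothetical helper: decides whether a cache layer should see a given key.
function layerAppliesTo(array $patterns, string $key): bool
{
    // No patterns configured: the layer handles every key.
    if ($patterns === []) {
        return true;
    }

    foreach ($patterns as $pattern) {
        if (preg_match($pattern, $key) === 1) {
            return true;
        }
    }

    return false;
}

$applies = layerAppliesTo(['/^user\./', '/^session\./'], 'user.42');
$skipped = layerAppliesTo(['/^user\./'], 'invoice.7');
$catchAll = layerAppliesTo([], 'anything');
```

Inside getListenersForEvent() a check like this would take the place of the exact-match test on the layer's key list.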

Recall that if any of those Listeners finds the corresponding value then the process is complete; the yield keyword means that this function stops each time a Listener is returned and no further work is done until the next Listener is requested.
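The laziness is easy to observe with a generator that records how far it got. In this sketch lazyListeners() is a hypothetical stand-in for getListenersForEvent(), a bare stdClass plays the Event, and the "apcu" Listener is the one that finds the value. The loop checks for a stop before each call, which is what PSR-14 asks of a Dispatcher.

```php
<?php
// Hypothetical provider: records each Listener it produces and each one that runs.
function lazyListeners(array &$produced, array &$called): iterable
{
    foreach (['memory', 'apcu', 'redis', 'fetcher'] as $name) {
        $produced[] = $name;
        yield function (\stdClass $event) use ($name, &$called): void {
            $called[] = $name;
            if ($name === 'apcu') {
                $event->value = 'hit';   // this layer has the value
            }
        };
    }
}

$produced = [];
$called = [];
$event = new \stdClass();
$event->value = null;

foreach (lazyListeners($produced, $called) as $listener) {
    if ($event->value !== null) {
        break;
    }
    $listener($event);
}

// $called:   memory, apcu
// $produced: memory, apcu, redis
```

One subtlety worth knowing: foreach resumes the generator before the loop body runs, so with a check-before-call loop the Listener right after the hit is produced but never invoked. Everything past it, including the fetcher, is never even produced.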

Finally, if none of the cache layers found the requested value then the key-appropriate fetcher is returned, which will do, well, whatever it's going to do to produce the actual value.

Nothing else we've written so far needs to change! Configuring that Provider is then straightforward, and easy to do from your Dependency Injection Container:

$provider = new CacheProvider();
$provider->addFetcher('something', new RemoteFetchListener($httpClient, $requestFactory));
$provider->addFetcher('other', new DatabaseFetcher($pdoInstance));
$provider->addCacheLayer(new ApcCache(), new \DateInterval('PT10M'));
$provider->addCacheLayer(new RedisCache(), new \DateInterval('PT24H'));

Here, we create a new instance of our Provider and give it two fetchers: One is the RemoteFetchListener we discussed before, and the other is a new, hypothetical fetcher that builds some complex value out of a PDO-accessible database. (Maybe it's a very slow query?) Then it configures two cache layers as we saw before, one for APCu, the other for Redis. Note that CacheListener is nowhere to be found; it's been entirely subsumed by the CacheProvider.

And... we're done. That Provider can now be passed into any Dispatcher, or more likely incorporated into a DelegatingProvider to handle just CacheLookup events.

A caller can now run the exact same command as before:

$domainObject = $dispatcher->dispatch(new CacheLookup('something'))->getValue();

To retrieve the something value from the remote URL and cache it, or can call:

$other = $dispatcher->dispatch(new CacheLookup('other'))->getValue();

To retrieve the other value as produced from the database. All of that is fully abstracted away from the caller.

The upshot

While not the immediately obvious solution, this design really stretches the capabilities of PSR-14. What I really love about it is how decoupled it is. Consider:

  • The caller, which is asking for the item to be found, has zero concrete dependencies: It depends only on a PSR-14 Dispatcher; you can use any implementation you like.
  • Each Listener layer has zero dependencies, period. It's necessarily coupled only to its Event, which it would be anyway by design.
  • The CacheListener special case is coupled only to PSR-6. Any PSR-6 compatible cache backend will work fine. If you don't like PSR-6 then a PSR-16 implementation is simple enough to write, or use your own cache implementation.
  • There is zero coupling to the source of the remote data; it can be anything, from any source.
  • If you want to use the remote provider style here, it's coupled to no concretion at all, just to PSR-17/18. Include your own favorite HTTP client implementation.
  • You could easily add Events to the pipeline that are... not cache lookups. They're just for instrumentation, say logging (via PSR-3) to determine how often a cache miss goes all the way to the fetcher. (That would require a small modification to the CacheProvider, but is already a feature of the version that just uses Tukio directly.)
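One way to read that last bullet is as an extra Listener that sits just ahead of the fetcher and never sets a value. The sketch below is an assumption about how that might look, not something from the series: MissLogger is a made-up class, and it takes a plain callable where a real version would more likely accept a PSR-3 logger.

```php
<?php
// Hypothetical instrumentation Listener; a callable stands in for a PSR-3 logger.
class MissLogger
{
    /** @var callable */
    private $log;

    public function __construct(callable $log)
    {
        $this->log = $log;
    }

    public function __invoke(object $event): void
    {
        // Being called at all means no earlier layer stopped the Event.
        ($this->log)('Full cache miss for key: ' . $event->key());
    }
}

$messages = [];
$listener = new MissLogger(function (string $message) use (&$messages): void {
    $messages[] = $message;
});

// Anonymous stand-in for a CacheLookup Event that no layer has answered yet.
$event = new class {
    public function key(): string
    {
        return 'something';
    }
};

$listener($event);
```

Because it leaves the Event untouched, propagation continues and the fetcher still runs right after it.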

Have only one web head, so the extra Redis layer is unnecessary? Comment out one line. Want to use Memcached instead of Redis? Change that one line to pass in a different PSR-6 pool implementation. Want both for some reason? Add another cache layer. Need different TTLs? Pass in different values.

Want to wire everything up at once and not have to handle it manually? Pull your favorite PSR-11 Dependency Injection Container off the shelf and do so.

I love this design, because it really shows off both the flexibility of PSR-14 itself and the benefits of the Framework Interoperability Group's efforts in recent years. Not one piece of the above system is coupled to any concretion of anything. Anyone wanting to use it can use a BYO-cache, BYO-HTTP client, BYO-HTTP Factory, even BYO-Event Dispatcher.

I think that's cool. :-)

Conclusion

That just about wraps up this series on PSR-14, the PHP FIG Event Dispatcher specification. Over the course of this series we've looked in depth at its design and the reasoning behind it. We've also shown several examples of putting PSR-14 into practice, both the conventional and the far-out-but-still-totes-legit.

I want to again thank the rest of the PSR-14 Working Group for all their hard work, and to the many other people who will be implementing PSR-14 in their libraries and frameworks. Work on that has already begun, which is great to see. As with any other PSR it will take a bit of time for it to fully roll out, but I expect that within the next year or so PSR-14 will be the standard way to make libraries extensible through any arbitrary framework or application.

And that's great for the whole PHP ecosystem.
