Diving into enrichment services

Written by Erik Mogensen. Updated: Monday, August 15, 2016, 07:37

Enrichment services are a new breed of extension capabilities that are coming to CUE, allowing CUE to be extended and modified in a myriad of ways. Enrichment services are simple HTTP services that follow a particular flow, that CUE discovers along the way. These enrichment services will then guide CUE, telling it what to do next.

In this post we’ll introduce enrichment services, by showing how they could be used to solve a particular business requirement, including the complete source code and the technical details behind how it all works. In part 2 of this series on enrichment services, we explore the possibilities for enrichment services making changes on behalf of the user.

The business requirement: Verification before publishing

The story goes as follows:

It should not be possible to publish a story with less than three tags. An error message should be shown if the user tries to publish such a story.

This might seem complicated to implement, requiring lots of client-side development and knowledge of many custom JavaScript APIs, but in fact CUE already does the heavy lifting. All we need to do is tell CUE what to show and when, by telling CUE about an enrichment service we'll create.

What is an enrichment service?

An enrichment service is an HTTP endpoint that accepts Atom entries using POST. When CUE is told about an enrichment service, CUE "saves" the story being edited, but to the enrichment service URL instead. When CUE gets the response, it does whatever the enrichment service tells it to do. To make this more concrete: if you're editing a story about the World Cup and you trigger the enrichment service, CUE will send the entire story to the enrichment service using HTTP POST. The enrichment service's response could then tell CUE to open an error dialog box or show an informational message, or even to modify the story in the editor.

Here's an example of such a POST:

POST /my/enrichment-service HTTP/1.1
Content-Type: application/atom+xml

<entry xmlns="....">
  <title>My story</title>
  ...

This Atom entry would be more or less identical to the one you see when you access the web service directly. The content type, all of the metadata, all of the fields, related items, tags and so on will be there, giving the enrichment service abundant information about the story being edited.

The enrichment service can operate in a few modes:

This list isn't exhaustive: The enrichment service can also respond in ways that tell CUE to open up a browser tab with a specific URL, or to open up another CUE editor, all using a declarative syntax. In addition, the enrichment service may respond with 204 NO CONTENT, indicating that nothing further should happen, or 400 BAD REQUEST telling CUE to stop doing what it was doing, along with a message.
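To make the two simplest outcomes concrete, here is a minimal sketch of how a service might turn a validation result into either a 204 or a 400 response. The function names enrichmentResponse and sendResponse are made up for this illustration and are not part of CUE or of any library; the more declarative responses mentioned above are not covered.

```php
<?php
// Hypothetical helper: turn a validation result into one of the two
// simplest responses described above. $error is null when nothing is wrong.
function enrichmentResponse(?string $error): array
{
    if ($error === null) {
        // 204: nothing further should happen.
        return ['status' => 204, 'headers' => [], 'body' => ''];
    }
    // 400: CUE stops what it was doing and is given the message.
    return [
        'status'  => 400,
        'headers' => ['Content-Type: text/plain'],
        'body'    => $error,
    ];
}

// Hypothetical helper: emit a response built by enrichmentResponse().
function sendResponse(array $response): void
{
    http_response_code($response['status']);
    foreach ($response['headers'] as $header) {
        header($header);
    }
    echo $response['body'];
}

// Show what the 400 case looks like as plain data.
print_r(enrichmentResponse('Please add more tags.'));
```

Keeping the decision (enrichmentResponse) separate from the output (sendResponse) means the decision can be checked without a web server.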

How is an enrichment service triggered?

Enrichment services are, at the moment, configured in CUE using the top level enrichmentServices property. An enrichment service is configured using a trigger, and CUE will have a documented list of possible triggers. Triggers range from completely manual triggers, for example a button in the UI somewhere, to plain background jobs that trigger "every now and then" or maybe when certain fields change. It's also possible to get CUE to trigger enrichment services just before saving in particular states. This is the type of trigger we'll be using in our example.

Verification before publishing

Let's circle back to our business requirement:

It should not be possible to publish a story with less than three tags. An error message should be shown if the user tries to publish such a story.

We now know that

All we need to do now is make a little enrichment service that checks whether the story has the required number of tags. The response can be either 204 or 400, the latter including a plain-text error message to show the user:

HTTP/1.1 400 Bad Request
Content-Type: text/plain

Your story doesn't include enough tags, please add some tags and try again!

Granted, this isn't the best way to handle this particular problem, but it shows how enrichment services can extend CUE's functionality without having to deal with the nitty gritty of GUI programming.

Let's build this enrichment service. I'll be using PHP for this example, as it is easy to follow and self-contained. To start off, it's a good idea to add some rudimentary CORS pre-flight request handling.

<?php

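if ($_SERVER['REQUEST_METHOD'] == "OPTIONS") {
  // Answer the CORS pre-flight request; no body is needed.
  http_response_code(200);
  header("Access-Control-Allow-Origin: *");
  header("Access-Control-Allow-Methods: POST, OPTIONS");
  // Needed because the POST carries Content-Type: application/atom+xml.
  header("Access-Control-Allow-Headers: Content-Type");
  return;
}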

With that out of the way, we now need to check that the request is indeed a POST and that we're getting an Atom entry:

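if ($_SERVER['REQUEST_METHOD'] != "POST") {
  // Set the status before producing any output.
  http_response_code(405);
  echo "Go away";
  return;
}

// Most servers expose the header as CONTENT_TYPE; it may also carry
// parameters such as "; charset=utf-8", so only compare the media type.
$contentType = $_SERVER['CONTENT_TYPE'] ?? '';
if (stripos($contentType, "application/atom+xml") !== 0) {
  http_response_code(415);
  echo "I only speak atom";
  return;
}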

Next we need to grab the actual POST data and parse it using an XML parser.

$xmlData = simplexml_load_string(file_get_contents('php://input'));

Now we need to check if the story has three tags. The tags are in the com.escenic.tags field of the story, nested in a <vdf:list> as shown below:

<vdf:field name="com.escenic.tags">
  <vdf:list>
    <vdf:payload>
      <vdf:field name="tag">
        <vdf:origin href="https://server/webservice/escenic/classification/tag/tag:topics.example.com,2011:My-tag"/>
        <vdf:value>My tag</vdf:value>

It is enough to count the number of <vdf:payload> elements in the list. First off we need the XML namespace prefixes for atom and vdf:

$xmlData->registerXPathNamespace("atom", "http://www.w3.org/2005/Atom");
$xmlData->registerXPathNamespace("vdf", "http://www.vizrt.com/types");

We need to count the number of <vdf:payload> elements in the com.escenic.tags field. The XPath expression for this is /atom:entry/atom:content/vdf:payload/vdf:field[@name="com.escenic.tags"]/vdf:list/vdf:payload, so we just need to count the matches. If there are fewer than three tags, we return a 400 error (which will stop CUE from saving); otherwise we return a 204 response, which tells CUE that this enrichment service is done.

if (count($xmlData->xpath('/atom:entry/atom:content/vdf:payload/vdf:field[@name="com.escenic.tags"]/vdf:list/vdf:payload')) < 3) {
  http_response_code(400);
  header("Content-Type: text/plain");
  header("Access-Control-Allow-Origin: *");
  echo "Stories should have three or more tags.  Please add some tags and try again!";
}
else {
  http_response_code(204);
  header("Access-Control-Allow-Origin: *");
}

Both branches also send a CORS header, so that the browser lets CUE read the response.
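For experimenting outside a web server, the counting step can be pulled into a function of its own. The following is a minimal sketch, not part of the service itself: the name countTags and the sample entry are made up for illustration, and the entry is cut down to just the elements the XPath expression touches. Unlike the service code, the sketch also treats input that isn't well-formed XML as having zero tags.

```php
<?php
// Hypothetical helper: count the tags in an Atom entry as posted by CUE.
// Returns 0 when the input is not well-formed XML or has no tags field.
function countTags(string $atomXml): int
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $entry = simplexml_load_string($atomXml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if ($entry === false) {
        return 0;
    }
    $entry->registerXPathNamespace('atom', 'http://www.w3.org/2005/Atom');
    $entry->registerXPathNamespace('vdf', 'http://www.vizrt.com/types');
    $payloads = $entry->xpath(
        '/atom:entry/atom:content/vdf:payload'
        . '/vdf:field[@name="com.escenic.tags"]/vdf:list/vdf:payload'
    );
    return is_array($payloads) ? count($payloads) : 0;
}

// Made-up sample with two tags; a real entry from CUE is much larger.
$sample = <<<XML
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:vdf="http://www.vizrt.com/types">
  <content>
    <vdf:payload>
      <vdf:field name="com.escenic.tags">
        <vdf:list>
          <vdf:payload><vdf:field name="tag"><vdf:value>Football</vdf:value></vdf:field></vdf:payload>
          <vdf:payload><vdf:field name="tag"><vdf:value>World Cup</vdf:value></vdf:field></vdf:payload>
        </vdf:list>
      </vdf:field>
    </vdf:payload>
  </content>
</entry>
XML;

echo countTags($sample), "\n"; // prints 2
```

With a function like this, the service only has to compare the result against 3 and pick the 400 or 204 response.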

The full PHP file is shown below:

<?php

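if ($_SERVER['REQUEST_METHOD'] == "OPTIONS") {
  // Answer the CORS pre-flight request; no body is needed.
  http_response_code(200);
  header("Access-Control-Allow-Origin: *");
  header("Access-Control-Allow-Methods: POST, OPTIONS");
  // Needed because the POST carries Content-Type: application/atom+xml.
  header("Access-Control-Allow-Headers: Content-Type");
  return;
}

if ($_SERVER['REQUEST_METHOD'] != "POST") {
  // Set the status before producing any output.
  http_response_code(405);
  echo "Go away";
  return;
}

// Most servers expose the header as CONTENT_TYPE; it may also carry
// parameters such as "; charset=utf-8", so only compare the media type.
$contentType = $_SERVER['CONTENT_TYPE'] ?? '';
if (stripos($contentType, "application/atom+xml") !== 0) {
  http_response_code(415);
  echo "I only speak atom";
  return;
}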

$xmlData = simplexml_load_string(file_get_contents('php://input'));

$xmlData->registerXPathNamespace("atom", "http://www.w3.org/2005/Atom");
$xmlData->registerXPathNamespace("vdf", "http://www.vizrt.com/types");


if (count($xmlData->xpath('/atom:entry/atom:content/vdf:payload/vdf:field[@name="com.escenic.tags"]/vdf:list/vdf:payload')) < 3) {
  http_response_code(400);
  header("Content-Type: text/plain");
  header("Access-Control-Allow-Origin: *");
  echo "Stories should have three or more tags.  Please add some tags and try again!";
}
else {
  http_response_code(204);
  header("Access-Control-Allow-Origin: *");
}
?>

Test run

To try this out on your own machine, you can run the following command from the directory that contains your tags.php file:

php -S localhost:8080

Then to try it out:

$ curl http://localhost:8080/tags.php
Go away
$ curl -X POST -d 'foo' http://localhost:8080/tags.php
I only speak atom
$ curl -X POST -d '<fake/>' -H "Content-Type: application/atom+xml" http://localhost:8080/tags.php
Stories should have three or more tags.  Please add some tags and try again!

The PHP process prints a simple access log:

[Wed Aug 17 11:23:22 2016] 127.0.0.1:58931 [405]: /tags.php
[Wed Aug 17 11:23:26 2016] 127.0.0.1:58932 [415]: /tags.php
[Wed Aug 17 11:23:33 2016] 127.0.0.1:58934 [400]: /tags.php
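The requests above only exercise the failure paths. To see the 204 response as well, you need an entry with at least three tags. The sketch below generates one; the function name sampleEntry, the tag names and the title are all made up, and the entry contains only the elements the service's XPath expression inspects, so it is far smaller than what CUE actually sends.

```php
<?php
// Hypothetical generator: build a minimal Atom entry carrying the given
// tags, with only the elements the service's XPath expression looks at.
function sampleEntry(array $tags): string
{
    $items = '';
    foreach ($tags as $tag) {
        // Escape the tag text so characters like & and < stay valid XML.
        $value = htmlspecialchars($tag, ENT_XML1 | ENT_QUOTES, 'UTF-8');
        $items .= '<vdf:payload><vdf:field name="tag">'
            . "<vdf:value>$value</vdf:value></vdf:field></vdf:payload>";
    }
    return '<entry xmlns="http://www.w3.org/2005/Atom"'
        . ' xmlns:vdf="http://www.vizrt.com/types">'
        . '<title>Sample story</title><content><vdf:payload>'
        . '<vdf:field name="com.escenic.tags"><vdf:list>' . $items
        . '</vdf:list></vdf:field></vdf:payload></content></entry>';
}

// Print an entry with three tags, ready to be saved to a file.
echo sampleEntry(['Football', 'World Cup', 'Brazil']), "\n";
```

Assuming the script is saved as make-entry.php (a name chosen here), running php make-entry.php > entry.xml and then curl -i --data-binary @entry.xml -H "Content-Type: application/atom+xml" http://localhost:8080/tags.php should produce a 204 response with an empty body.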

Configuring in CUE

To configure this in CUE, you need to tell it that the enrichment service should trigger before a story is published. To do so, open your CUE configuration YAML file (often called config.yml, located in /etc/escenic/cue-web-2.0/) and add the following:

enrichmentServices:
  - name: Three tags
    href: http://server:8080/tags.php
    title: Check minimum tags
    triggers:
     - name: before-save-state-published
       properties: {}

If you now regenerate the CUE configuration (sudo dpkg-reconfigure cue-web-2.0) and log in, CUE should no longer allow you to publish stories with fewer than three tags. This is what should appear instead:

Screenshot of a dialog box with the text "Stories should have three or more tags".

A note on CORS

Since CUE is a browser-based app, it must follow the rules of the same-origin policy. This means that, by default, CUE is only allowed to make requests to the origin it was loaded from. An origin is the combination of scheme, host name and (optional) port number: for example, if CUE is loaded from https://example.com/cue-web/, then https://example.com is its origin. Requests from CUE to other origins are called cross-origin requests.

The request to the enrichment service is a cross-origin request if the origin of CUE differs from the origin of the enrichment service. Because CUE POSTs the story with the content type application/atom+xml, which is not one of the few content types a browser may send cross-origin without asking first, the browser is required to make a pre-flight request to ensure that the resource allows CUE to make the POST request in the first place. This mechanism is called cross-origin resource sharing, or CORS for short. For the request to succeed, you need to ensure either that the origins are the same (i.e. that the browser sees both CUE and the configured enrichment services on the same origin), or that the enrichment service responses include sufficient CORS headers (such as Access-Control-Allow-Origin, Access-Control-Allow-Methods and Access-Control-Allow-Headers) to instruct the browser to allow the request to proceed.
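As a rough sketch of the second option, the helper below builds the CORS headers for a response, allowing only a known list of origins rather than the wildcard used in the example service. The function name corsHeaders and the allowed origin are assumptions made for this illustration; whether your CUE installation sends further request headers that would also need to be listed is something to check in the browser's network tab.

```php
<?php
// Hypothetical helper: CORS response headers for an enrichment service.
// $allowedOrigins lists the origins CUE is served from (assumed values).
function corsHeaders(?string $origin, array $allowedOrigins): array
{
    if ($origin === null || !in_array($origin, $allowedOrigins, true)) {
        // Unknown origin: send no CORS headers, so the browser blocks it.
        return [];
    }
    return [
        "Access-Control-Allow-Origin: $origin",
        // The response depends on the Origin header, so tell caches.
        'Vary: Origin',
        'Access-Control-Allow-Methods: POST, OPTIONS',
        // Content-Type must be listed because application/atom+xml is not
        // a content type a browser may send without a pre-flight request.
        'Access-Control-Allow-Headers: Content-Type',
    ];
}

// Example with an assumed CUE origin.
print_r(corsHeaders('https://example.com', ['https://example.com']));
```

In a service, each returned line would be passed to header(), with $_SERVER['HTTP_ORIGIN'] ?? null as the first argument, for both the OPTIONS pre-flight and the POST response.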

Next steps

As an exercise for the reader, I challenge you to extend the enrichment service so that it considers:

Conclusion

This has been a gentle introduction to enrichment services. This post only scratches the surface of what you can do with them. Enrichment services are a simple way to extend CUE in ways you can only start to imagine. They provide a generic interface to talk to CUE, and through it, your editorial staff.

Using the building blocks provided by enrichment services, you can integrate CUE with anything from editorial planning tools and proofreading systems to social media platforms and text analysis engines. Meanwhile, your editors can drive editorial workflow processes from within CUE, not knowing or caring how many systems are in play.

Enrichment services can be written in any language (as long as it speaks HTTP), and they do not require a large investment in understanding a large API. Being stateless, the enrichment services are easy to test, by simply making HTTP requests and expecting responses. Finally, they can be rolled out, scaled, operated and upgraded independently of each other, and of CUE.