6 minutes reading time
With apologies to Mary Shelley...
Adrian Odds, Marketing & Innovation Director
Dom Baker, Head of Innovation
Rich Iles, Front-End Tech Lead
What have we built, what can it do, and how can I get one?
October 2023 saw the launch of our first AI-enabled tool: TalkCorp, a Retrieval Augmented Generation (RAG) tool written in Node. It feels like a chatbot but is powered by a Generative Pre-trained Transformer (GPT) model locked to a pre-determined corpus of information (currently via the OpenAI API) and, if I haven’t lost you already, uses Pinecone for vector retrieval and Azure for blob storage. TalkCorp, while not sentient of course, returns human-sounding, sensible, well-formed responses to questions about the work of CDS, our clients and our partners. You can find it at www.cds.co.uk.
We started with a simple question. Could we use a GPT as a ‘sales enablement or navigation’ tool to help our website users and sales prospects better understand their business problems, understand the services CDS offers, and then connect those users with the service guides, case studies and technology know-how they need to begin solving those problems?
The answer, of course, was yes, but it’s well known that the power of GPTs comes with several challenges, particularly when working in sensitive or regulated environments, where security is of particular concern.
We started with a list of requirements and a list of what we knew were the potential challenges of Generative Transformer technology.
Some of the fundamental features and challenges as we saw them:
1. ‘SuperSearch’ - We immediately saw the benefit of using an LLM’s ability to understand unstructured content and generate human-sounding, nuanced copy that is pleasant to read, particularly around complex technical documentation or software user guides. Basically, thinking of it as ‘SuperSearch’: an access point to existing documents, web pages or other data which may not be typically searchable or obviously identifiable in file systems, website heuristics or architecture.
2. LLMs want to ‘generate’… they basically want to be helpful, but this desire to create can be a problem if the tool doesn’t have the exact answer in its 'corpus' (information set), leading to what is known as ‘hallucination’. Along with setting the ‘temperature’, which controls how creative the tool should be, careful limitation of the corpus of interest would be critical, as would our approach to prompt engineering, which would ensure that the tool only answers questions contained within the context provided (see the sketch after this list).
3. Where is this data going? And who owns the answers? For regulated and sensitive markets, this would be a critical consideration. We needed an approach that would ‘ring-fence’ our, or our clients’, information from OpenAI’s (or anybody else’s) models.
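To make the second point concrete, here is a minimal sketch of the kind of prompt-and-temperature set-up we mean, using the OpenAI Node SDK. The model name, the wording of the system prompt and the function name are illustrative assumptions, not TalkCorp’s actual implementation:

```typescript
import OpenAI from "openai";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Answer a question using only the retrieved corpus passages as context.
// A low temperature keeps the model factual rather than creative, and the
// system prompt tells it to admit when the answer is not in the context.
export async function answerFromCorpus(
  question: string,
  contextPassages: string[]
): Promise<string> {
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini", // illustrative model name, not necessarily the one TalkCorp uses
    temperature: 0, // minimise creativity and, with it, the risk of hallucination
    messages: [
      {
        role: "system",
        content:
          "You answer questions about CDS using ONLY the context below. " +
          "If the answer is not contained in the context, reply exactly: " +
          '"I\'m sorry, I don\'t have that information."\n\n' +
          "Context:\n" + contextPassages.join("\n---\n"),
      },
      { role: "user", content: question },
    ],
  });
  return completion.choices[0].message.content ?? "";
}
```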
How does it work?
In simple terms, our RAG tool vectorises the corpus of information of interest (in this case the CDS website, all our case studies and pillar pages, partner guides and blogs) to provide context for the LLM. This vectorisation turns the words and phrases in the corpus into long lists of numbers (embedding vectors) that capture their meaning. We then vectorise each question as it is asked and match this new vector against those already in the database; the best matches become the context for, and source of, the LLM’s answer.
The superpower of vectorisation is its ability to match questions and answers by meaning and sentiment rather than simple string matching, opening up a world of possibilities.
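As an illustration of that retrieval step, the sketch below embeds an incoming question and looks up the nearest corpus chunks in Pinecone. The index name, embedding model and metadata field are assumptions made for the example, not a description of TalkCorp’s actual code:

```typescript
import OpenAI from "openai";
import { Pinecone } from "@pinecone-database/pinecone";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const pinecone = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
const index = pinecone.index("talkcorp-corpus"); // hypothetical index name

// Embed the incoming question and find the closest corpus chunks by meaning.
export async function retrieveContext(question: string, topK = 5): Promise<string[]> {
  // 1. Turn the question into an embedding vector (a long list of numbers).
  const embedding = await openai.embeddings.create({
    model: "text-embedding-3-small", // illustrative embedding model
    input: question,
  });

  // 2. Ask Pinecone for the corpus chunks whose vectors sit closest to it.
  const results = await index.query({
    vector: embedding.data[0].embedding,
    topK,
    includeMetadata: true,
  });

  // 3. Return the stored text of each matching chunk as context for the LLM.
  return results.matches.map((m) => String(m.metadata?.text ?? ""));
}
```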
TalkCorp doesn’t require exact keyword or string matches, so the tool is able to pick up sentiment and nuance and, using the vectorised corpus, will return a sensible, pithy and useful answer to the question asked. If the answer doesn’t exist in the corpus of information, it will return a null rather than hallucinate a ‘helpful, but more often than not wrong’ answer.
Prior to vectorisation being commonly available, keyword matching would be used to identify the correct source content. However, this is a binary activity: if the keyword doesn’t exist, the return will be null. As our experiment has progressed, we have discovered that vectorisation is not a magic bullet and has its own limitations, so the ideal approach is probably a combination of the two.
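As a hedged illustration of what ‘a combination of the two’ might look like, the snippet below blends a semantic similarity score with a simple keyword-overlap score, so an exact term match can rescue a result the embedding alone would miss (and vice versa). The scoring and weighting are purely illustrative:

```typescript
// Cosine similarity between two embedding vectors (semantic signal).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB) || 1);
}

// Fraction of query terms that appear in the chunk text (keyword signal).
function keywordOverlap(query: string, text: string): number {
  const queryTerms = new Set(query.toLowerCase().split(/\W+/).filter(Boolean));
  const textTerms = new Set(text.toLowerCase().split(/\W+/).filter(Boolean));
  if (queryTerms.size === 0) return 0;
  let hits = 0;
  for (const term of queryTerms) if (textTerms.has(term)) hits++;
  return hits / queryTerms.size;
}

// Blend the two signals; alpha controls how much weight the vector score gets.
function hybridScore(
  queryText: string,
  queryVector: number[],
  chunkText: string,
  chunkVector: number[],
  alpha = 0.7
): number {
  return (
    alpha * cosineSimilarity(queryVector, chunkVector) +
    (1 - alpha) * keywordOverlap(queryText, chunkText)
  );
}
```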
Effectively, using a combination of content context, prompt engineering, and temperature we have told TalkCorp what it needs to know, to best answer the question asked. Think of it as giving a student a workbook and some guide-rails.
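Putting those pieces together, an end-to-end query might look something like the sketch below, reusing the hypothetical retrieveContext and answerFromCorpus helpers from the earlier examples:

```typescript
// End-to-end flow: retrieve the most relevant corpus chunks, then ask the model to
// answer strictly from them. The helper names are the hypothetical ones sketched above.
export async function askTalkCorp(question: string): Promise<string> {
  const context = await retrieveContext(question); // vector search over the corpus
  if (context.length === 0) {
    return "I'm sorry, I don't have that information."; // nothing relevant: don't guess
  }
  return answerFromCorpus(question, context); // grounded, low-temperature answer
}
```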
We learnt early on that having the right content in the corpus is critical, and extensive testing quickly reveals any content gaps. Where such gaps exist (What does CDS stand for? Who sits on the leadership team of CDS?), pieces of dedicated or specific content can be created and added back into the corpus – classic training by exception, which we’re familiar with from the application of machine learning techniques.
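As a sketch of that ‘training by exception’ loop, a new piece of gap-filling content could be embedded and pushed into the same vector index, along these lines (the index name, embedding model and helper are illustrative assumptions):

```typescript
import OpenAI from "openai";
import { Pinecone } from "@pinecone-database/pinecone";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const index = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! }).index("talkcorp-corpus");

// When testing reveals a gap ("What does CDS stand for?"), write a short piece of
// dedicated content and upsert it into the vector index alongside the rest of the corpus.
export async function addToCorpus(id: string, text: string): Promise<void> {
  const embedding = await openai.embeddings.create({
    model: "text-embedding-3-small", // illustrative embedding model
    input: text,
  });

  await index.upsert([
    {
      id,
      values: embedding.data[0].embedding,
      metadata: { text }, // keep the original text so it can be returned as context
    },
  ]);
}
```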
As a reminder, you can find the tool at https://www.cds.co.uk/ – please do give it a test. Ask it anything about CDS and what we do, and it should return something worthwhile. Ask it anything existential and you’ll probably be a bit disappointed – we’ll leave those clever, slightly mawkish answers to ChatGPT…
Taking this rather ‘locked down’ approach suits our clients in regulated and sensitive markets. It gives us, and them, the control we need to ensure answers are clear and true and that data governance is maintained. Longer term, as LLMs improve and new ways of engineering them emerge, we may find limitations to this corpus approach. TalkCorp is already slightly ‘taciturn’ in some of its edge-case responses.
Questions like ‘What do you do?’ can cause it a bit of a problem because of their rather existential nature. As our experiment has matured, we’ve learnt that these questions require an additional step: rephrasing them into several similar but slightly different questions that give the GPT model more to work with and therefore a better chance of returning a correct (and not made up) answer.
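A hedged sketch of that rephrasing step is shown below: the model is asked to produce a handful of more specific variants of the question, each of which can then be embedded and retrieved against as before. The model name and prompt wording are illustrative:

```typescript
import OpenAI from "openai";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Rephrase a broad or "existential" question into several more specific variants,
// giving the retrieval step more than one chance to land on relevant corpus chunks.
export async function rephraseQuestion(question: string, n = 3): Promise<string[]> {
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini", // illustrative model name
    temperature: 0.3,
    messages: [
      {
        role: "system",
        content:
          `Rewrite the user's question as ${n} different, more specific questions ` +
          "about CDS and its services. Return one question per line, nothing else.",
      },
      { role: "user", content: question },
    ],
  });
  const text = completion.choices[0].message.content ?? "";
  return text.split("\n").map((line) => line.trim()).filter(Boolean);
}

// Each variant is then embedded and queried as before, and the retrieved chunks are
// merged (and de-duplicated) before being passed to the model as context.
```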
“Nothing is so painful to the human mind as a great and sudden change.”
Frankenstein - Mary Shelley
What else have we learned and how can we apply it in support of our clients?
Like many organisations, CDS has been on a journey of discovery over the last few months. A journey not just to figure out what these new models are and how they work, but how they could and should be used. And, importantly, what role CDS should play in how they are used.
Are we going to build applications ‘over the top’, like many of the new ‘AI companies’ seem to have done? (Some of which, I note, are already being disrupted themselves…) Are we going to re-engineer the raw models out there to tighten them up, lock them down and bend them to our will? Or are we going to apply the ‘genuinely useful, voice of reason’ paradigm that CDS seeks to bring to every one of our client engagements – as we do with all our technology partners – to de-risk software investment, improve technology acceptance and adoption, and drive user value, whether through improved efficiency and cost saving or through a better digital experience?
With a technology still so new, many of these questions remain open. However, if you have these same questions, here are four ways that CDS can help you grasp the nettle and figure out rapidly what this change means for you:
1. AI assessment – Looking at your current operating model, we can help you identify your candidate list for optimisation. The answer isn’t always going to be AI; often, efficiency improvements can be found in optimisation, UX and journey planning, and re-platforming – but it’s good to know empirically what should go on the list.
2. The answer is AI – what’s the question? The answer isn’t always going to be AI, but sometimes it is. So, if that’s the case, where are you going to apply it first for the biggest impact at the smallest cost and risk? We can help you build that roadmap – and let you know when AI may not be the answer!
3. The answer is AI – but who’s going to build it? Well, we will. One thing we have learnt ‘doing innovation’ is that there are a lot more people ‘talking’ about innovation than there are ‘doing’ it. Let us know what you are trying to build, and we can probably help – we’ll get you from green-field to proof of concept quick sharp.
4. It Lives! So, your AI implementation is live and running, but how are you going to maintain it, update it and keep it in tip-top condition? Hand the managed service to CDS, of course! But in all seriousness, it’s all very well having something shiny and new on day one; if we know anything, we know this technology is not going to stand still, so let us take care of your day two onwards...
Reach out to Dom (dom.baker@cds.co.uk) or me (adrian.odds@cds.co.uk) if you’d like to learn more about what we’ve built or have a particular innovation challenge for us to get our teeth into. We’re open for business.