Experimenting with Chatbots

Oliver Jack Dean

Much like the rest of the world, I have been spending a significant amount of time exploring the applications of GPT-4 (Generative Pre-trained Transformer).

Naturally, I have fallen into a few "rabbit holes" exploring its capabilities and reading about others' experiences online.

After recently spending some time with a small team exploring and experimenting with vector-based search and GPT for various service-level use cases, here are some personal takeaways:

Economic Opportunities

LLM-based technology provides significant economic opportunities and will continue to grow exponentially over the next 5-10 years in ways we can't yet predict. Will it make economies more productive and efficient? Will it improve overall GDP? Too difficult to tell.

The Social Equaliser

AI is, in many ways, a great social equaliser. Previously, running large models would require ten containers, ten data scientists, extensive domain knowledge, and high salaries. Now, so much can be accomplished by somebody with an internet connection and some motivation. GPT is a huge technological breakthrough. All that heavy machinery has been abstracted away.

Not Production Ready

But let's be clear: LLM-based tech is excellent (right now) for experimentation but not for live production environments. Technology often starts as a toy, but it can quickly become the next Twitter if deployed at insane scale with limited guardrails in place. Best to experiment first, using a "toy-to-deploy" cycle approach.

Narrow LLMs

There are "very, very large" LLMs that are good at a broad range of tasks, and smaller models that are narrower and more specialised.

Prompt Engineering Oversight

Because natural language is inherently ambiguous, and partially because prompt engineering is still an immature practice, LLM performance limitations are real and should not be overlooked.

Rapid Ecosystem Progress

Progress in AI is continual, and there is no one moment where LLMs "progress from not happening to happening". Everything right now is in flux, and the speed of maturity is incredible to watch.

Armchair Experts

Approach "experts" with caution. Unless you are in the boiler room with core product teams and experienced data scientists, the so-called "experts" across Twitter and various markets may have limited visibility into best practices and ecosystem readiness. The knowledge that needs to be established and standardised to implement AI effectively is still being learned.

Bad vs Good Data

LLMs are only as good as the data or information they ingest. Finding high-quality data or information takes a lot of work. Be prepared to discover problems in your data and to build mechanisms that convert bad data into good.

Level Playing Ground

Luckily, as is often the case with new technology, there is little organisational advantage because everyone starts from the same level. Sure, the big three or four tech companies are the gatekeepers right now, but the rest of us are all still working out strategy and positioning. As mentioned above, it is best to experiment and approach the topic carefully.

Synthetic Data

What do you do if you run out of good data, or have no data at all, to run an LLM or embed context into one? It's difficult to say right now. Without direct experience, the best approach is to follow the examples out there whereby LLMs utilise "synthetic" data.
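To make the idea concrete, here is a minimal sketch of one common form of synthetic data: generating question-answer pairs from a handful of structured facts via paraphrase templates. In real pipelines an LLM itself often produces the variations; the fact table and templates below are purely hypothetical examples.

```python
import itertools

# Hypothetical structured facts to turn into training/eval examples.
FACTS = {
    "Paris": "France",
    "Tokyo": "Japan",
    "Ottawa": "Canada",
}

# Simple paraphrase templates; a real pipeline might ask an LLM to
# produce many more variations per fact.
TEMPLATES = [
    "What is the capital of {country}?",
    "Which city is the capital of {country}?",
]

def synthesize(facts, templates):
    """Yield (prompt, completion) pairs from every fact/template combo."""
    for (city, country), tmpl in itertools.product(facts.items(), templates):
        yield tmpl.format(country=country), city

pairs = list(synthesize(FACTS, TEMPLATES))
print(len(pairs))  # 3 facts x 2 templates = 6 pairs
```

The same pattern scales up: swap the template step for an LLM call that paraphrases each fact, and you have a crude synthetic-data generator.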

Benchmarking Standards

It is difficult to honestly evaluate or quality-check LLMs. OpenAI is working to improve its own understanding of how an LLM works. Fingers crossed, a benchmarking framework and approach will arrive soon.

Monopoly Risk

Huge potential for prominent big tech players to hoard GPT/LLM tech and future services. It is essential to support and facilitate startups and innovators in the space, such as Mosaic, who are building tools that give a wide range of users access without any strings attached. Suppose LLMs and the surrounding ecosystem were to become centralised under one, two, or three players? That would create a massive power dynamic and political economy over AI.

Emerging Tech Stack

The most popular use cases for LLMs seem to be Q&A-style discussions over embedded data such as PDFs, Markdown files, PowerPoints, and other corpus data. The approach uses different hybrid search tactics, like vector-based search and generative search. The emerging tech stack (as of right now) for Q&A-style applications looks something like this:

1. Split the knowledge base(s) into text chunks (i.e. PDF > text) and transform each chunk into an "embedding".

2. Host the embeddings on a Vector/Graph/SQL DB service (i.e. Pinecone DB).

3. At query time, embed the query and search the embeddings across the Vector DB via a Backend/API.

4. Summarise the results recursively with the LLM until query + summary fit into the LLM's token length.

5. Use the LLM to "get" an answer and return it in the Frontend.

Whatever is embedded dictates the quality of the returned answer.
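The pipeline above can be sketched end-to-end in a few lines. This is a toy stand-in, not a production stack: the bag-of-words "embedding" substitutes for a real embedding model, and the in-memory `ToyVectorDB` class stands in for a hosted service like Pinecone. Only the shape of the flow (chunk, embed, store, retrieve, prompt) is the point.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: a term-count vector.
    return Counter(re.findall(r"[a-z0-9-]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ToyVectorDB:
    # Hypothetical in-memory stand-in for a hosted vector DB.
    def __init__(self):
        self.records = []  # (embedding, chunk) pairs

    def upsert(self, chunk: str):
        self.records.append((embed(chunk), chunk))

    def query(self, question: str, top_k: int = 2):
        q = embed(question)
        ranked = sorted(self.records, key=lambda r: cosine(q, r[0]),
                        reverse=True)
        return [chunk for _, chunk in ranked[:top_k]]

# Steps 1-2: chunk the knowledge base and store the embeddings.
db = ToyVectorDB()
for chunk in ["GPT-4 is a large language model.",
              "Pinecone hosts vector embeddings.",
              "IoT devices collect sensor data."]:
    db.upsert(chunk)

# Step 3: retrieve the chunks most relevant to the query.
context = db.query("What is GPT-4?")

# Steps 4-5: in a real stack, context + question go to the LLM
# (recursively summarised if too long); here we just build the prompt.
prompt = "Context:\n" + "\n".join(context) + "\n\nQ: What is GPT-4?\nA:"
print(context[0])
```

Note how the retrieval step illustrates the point above: the answer quality is bounded by whichever chunks made it into the store.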

Of course, there are many other use cases right now, like AI assistants, programming, scripting, audio-to-text, text-to-audio, text-to-video, content summarisation, transcription, content writers, content composers, etc.

Performance vs Cost

Customers have different priorities regarding performance and cost, depending on application and approach. Large enterprises with significant user traffic must carefully consider the tradeoffs. No doubt, LLM providers like OpenAI and Mosaic will need to create new tools to help enterprises analyse such tradeoffs accurately.

Theory of Mind

GPT and other similar tech have reopened age-old debates and discussions around consciousness and the "theory of mind". When you talk to a human being, they have a theory of physical space, a theory of New York City, a theory of the particular conversation, a theory of the interactions and context you are sharing. Interestingly, GPT-4 and other similar models also demonstrate such capabilities. Where will this lead us? No idea.

Internet of Things

IoT has been a stop/start area of interest for many, and GPT and other LLMs will unlock new investment across IoT. In particular, chatbots will be active in future IoT projects. It comes with risk, and it will take some time before LLMs are leveraged in highly sensitive environments or platforms, but IoT will benefit tremendously from LLMs and GPT.

Productivity Booster

LLMs and GPT are significantly impacting productivity, and enterprises and companies alike need to be aware of this potential power if they want to take advantage of it. It's a big bet, but I suspect most employees have already been using GPT for specific tasks in secret. Sure, the technology is familiar and has been circulating around academic institutions for a while, but the way it is being used is new, and all of us need to be prepared to adapt. How do companies and enterprises do this? It's up to them to start consolidating resources, generating knowledge, and experimenting.

Security Threats

Attackers can easily tamper with the training data or tinker with the prompts fed to LLMs; in other words, they can affect a chatbot LLM's decision-making.

Moreover, if training data is not correctly tracked or observed, LLMs can quickly become "poisoned" without anyone noticing. Suppose an attacker could gain control over a web resource indexed into a particular data set. In that case, they could poison the collected data at the source, making it inaccurate and potentially degrading the whole LLM.

Considering actual GPT and chatbot use cases have flourished in a short time, far more than crypto and Web 3.0 ever managed, I am worried about the potential security threats of deploying such LLMs "in the wild".

DISCLAIMER: No doubt my opinions will evolve over the coming weeks, so take these thoughts with a grain of salt. See this as part one.