Beyond Chatbots: The Potential of Generative AI Solutions

January 23, 2024

Christina Cardoza

AI has been making headlines for quite some time already. But as we head into 2024, generative AI emerges as the latest breakthrough. Much of the attention has revolved around ChatGPT, and with it have come many misunderstandings and misconceptions about exactly what generative AI is.

To break it down, we spoke to Waleed Kadous, Chief Scientist at Anyscale, an AI application platform provider; Teresa Tung, Cloud First Chief Technologist at global management consulting and technology company Accenture; and Ramtin Davanlou, AI and Analytics Principal Director as well as CTO in the Accenture and Intel partnership. They discuss the business opportunities around generative AI solutions, the challenges involved, and what comes next (Video 1). Generative AI is here to stay, and there’s a lot to look forward to.

Video 1. Industry experts from Anyscale and Accenture discuss the implications and opportunities of generative AI solutions. (Source: insight.tech)

Please explain generative AI, the business opportunities, and challenges.

Ramtin Davanlou: In summary, companies like OpenAI, Google, and AWS use their massive compute resources and massive data sets to train AI models (or LLMs, large language models) to generate new content. This content comes in different forms: text, images, video, voice, or even computer code. But text is especially important since it is the main means of communication for most businesses.

Many of these AI models are able to generate responses that are really good on any given topic—better than an average person or even an average expert on that topic. Companies can then take these models and fine-tune them a little bit so that the model behaves in certain ways and gains more knowledge about a specific context. That creates a lot of opportunities.

Companies can use generative AI to do things like send emails or create slides—all of this content we’re creating to communicate with each other—or to enhance those things. This has huge implications for service industries, and also for manufacturing when combined with robotics.

What LLMs cannot do yet, but may soon be able to do, is create net new knowledge from scratch.

What considerations should businesses think about when developing GenAI solutions?

Waleed Kadous: One consideration is the quality of the output from these models. There’s a problem with LLMs called hallucination, where they confidently assert things that are completely untrue. So how do you evaluate the system to make sure it is actually producing high-quality results? What base data are you using? Over the last six months we’ve seen developments in an area called retrieval-augmented generation, which helps to minimize the hallucination problem.

A second consideration is data cleanliness, which concerns the information these LLMs have access to. What are they disclosing? What do they have the power to inform people of? Is there leakage between different users? Can someone reverse-engineer the data that was used to train the models? It’s still a new frontier, so we’re still seeing issues crop up on that front.

And then the final one is that LLMs are expensive. I mean, really expensive. You can easily blow a hundred thousand dollars in a month on GPT-4.

How can businesses get started and take GenAI solutions to the next level?

Teresa Tung: Most companies have their proofs of concept, and many are starting with managed models like those from OpenAI. These amazing general-purpose models address many use cases and can offer a really great way to begin. But, as Waleed mentioned, cost in the long term is a factor; it could be an order of magnitude bigger than many companies are willing to pay. So companies now need to look at rightsizing that cost relative to the performance they need.

When AI models become more critical to a business, we’re also seeing companies want to take ownership of them. Rather than using a managed model, they might want to create their own task-specific, enterprise-specific model. There are sub-10-billion-parameter models that can be customized for different needs. General-purpose models will still be available, but so will fit-for-purpose models.

Waleed Kadous: One of the experiments we did at Anyscale was in translating natural language to SQL queries. The general-purpose model, GPT-4, was able to produce an accuracy of around 80%. But by training an SSM—a small specific model—that was only 7 billion parameters, which was about one one-hundredth of the cost, we were able to achieve 86% accuracy in conversion. So small specific models versus large language models is an evolving discussion that’s happening in the industry right now.
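
As a rough illustration of how an accuracy figure like this might be measured, the sketch below scores natural-language-to-SQL output by execution match: a predicted query counts as correct when it returns the same rows as a reference query. This is an assumption for illustration only, not the method Anyscale used; the `orders` table, the sample query pairs, and the helper names `run_query` and `execution_accuracy` are all hypothetical.

```python
import sqlite3

def run_query(conn, sql):
    """Return the query's rows in sorted order, or None if the SQL fails."""
    try:
        return sorted(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None

def execution_accuracy(conn, pairs):
    """Fraction of (predicted_sql, reference_sql) pairs that return the same rows."""
    if not pairs:
        return 0.0
    correct = 0
    for predicted, reference in pairs:
        expected = run_query(conn, reference)
        actual = run_query(conn, predicted)
        if expected is not None and actual == expected:
            correct += 1
    return correct / len(pairs)

# Hypothetical toy database standing in for a real evaluation set.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "west", 20.0), (2, "east", 35.5), (3, "west", 12.5)],
)

# Each pair is (SQL produced by a model, hand-written reference SQL).
pairs = [
    ("SELECT COUNT(*) FROM orders WHERE region = 'west'",
     "SELECT COUNT(id) FROM orders WHERE region = 'west'"),   # same rows
    ("SELECT SUM(total) FROM orders",
     "SELECT SUM(total) FROM orders WHERE region = 'east'"),  # different rows
    ("SELEC * FROM orders",
     "SELECT * FROM orders"),                                 # invalid SQL
]

accuracy = execution_accuracy(conn, pairs)
print(f"Execution accuracy: {accuracy:.0%}")
```

Comparing result rows rather than query text means two differently worded but equivalent queries both count as correct, which is one reasonable way to compare a large general-purpose model against a small specific one on the same test set.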

Where are the biggest generative AI opportunities for your customers right now?

Waleed Kadous: The first kind of use case opportunity is summarization. Are there areas where you have a lot of information that you can condense and where condensing it is useful?

The second is the retrieval-augmented generation family, which I mentioned before. That’s where you don’t just ask the LLM questions naively; you actually provide it with context, drawn from an existing knowledge base of answers, that helps answer those questions.
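
The idea of supplying context along with the question can be sketched in a few lines. The example below is a minimal, assumption-laden illustration rather than a production design: it ranks documents by simple word overlap (real systems typically use embedding-based search), and the knowledge-base entries and the function names `tokenize`, `retrieve`, and `build_prompt` are hypothetical. The resulting prompt would then be sent to whichever LLM is in use.

```python
import re

# Hypothetical knowledge base; in practice this would be a company's own documents.
KNOWLEDGE_BASE = [
    "Routers in store 12 were replaced in March.",
    "Refunds are processed within five business days.",
    "Support hours are 9am to 5pm on weekdays.",
]

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, k=2):
    """Return the k documents sharing the most words with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(question, documents, k=2):
    """Combine the retrieved context and the question into a single prompt."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, k))
    return (
        "Answer using only the context below. "
        "If the answer is not there, say so.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_prompt("How long do refunds take?", KNOWLEDGE_BASE, k=1)
print(prompt)
```

Because the model is asked to answer from the retrieved text rather than from memory alone, this pattern is one way to reduce the hallucination problem described earlier.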

Another interesting application is what you might call a “talk to the system” application. Imagine it as a dashboard you can talk to, a dashboard on steroids. This is especially interesting in IoT. I’ve seen a company that does this expertly; it does Wi-Fi installations for retailers. You can ask this dashboard questions like, “Where are we seeing the routers working too hard?” And it will query that information in real time and give you an update.

The final one is in-context application development. Perhaps the best-known example is Copilot: when you’re writing code, it gives you suggestions about how to write even better, higher-quality code. In-context applications are the most difficult, but they also have the highest potential.

Teresa Tung: Waleed gave a great overview, so I’m going to bring a different perspective—in terms of things you can buy, things you can boost, and things you can build. “Buying” is being able to buy generative AI-powered applications for software development, for marketing, for enterprise applications. They use a model trained on third-party data and enable anyone to capture these efficiencies. This is quickly becoming the new normal.

“Boosting” is applying a company’s first-party data—data about your products, your customers, your processes. To do that you’re going to need to get your data foundation in order, and retrieval-augmented generation is a great way to start with that.

“Building” is companies maintaining their own custom models. This would likely be starting with a pre-trained, open model and adding your own data to it. It gives you a lot more control and a lot more customization within the model.

Where do partnerships like the one Accenture has with Intel come in?

Ramtin Davanlou: Partnerships are very important in this area, because companies that are trying to build an end-to-end GenAI application typically have to solve for things including infrastructure and compute resources. For example, you need a very efficient ML Ops tool to help you handle everything you do—from development to managing, monitoring, and deploying the models in production.

We’ve used some of the Intel software, like cnvrg.io, an ML Ops tool that allows data scientists and engineers to collaborate in the same environment. It also allows you to use different compute resources across different cloud platforms, like in your on-prem environment, on Intel® Developer Cloud, and on AWS.

Partnerships are also an effort to reduce the total cost of ownership, especially the cost when you scale. Instead of building new platforms for every new use case, why not build a platform that you can reuse? For example, with Intel we have built a generative AI playground using Intel Developer Cloud along with Gaudi, an AI accelerator specifically built for deep-learning applications, to fine-tune the models. And then for deploying those models at scale, you can use AWS.

Another example is the need for a tool to help distribute the workloads. A library called TGI (Text Generation Inference) from Hugging Face is very helpful here. So you can see that there are a lot of different components and pieces that need to come together so that you can have an end-to-end GenAI application.

Waleed Kadous: Another thing that has come up is the idea of open source—both open-source models and, of course, open-source software. One example that Meta has released is a model called Llama 2 that we’ve seen very, very good results with. It’s maybe not quite on par with GPT-4, but it’s definitely close to GPT-3.5, the model one notch down. There is vLLM out of Berkeley, a really high-performance deployment system; and also Ray LLM. vLLM manages a single machine; Ray LLM gives you that kind of scalability across multiple machines, to deal with spikes and auto-scaling and so on.

We’re seeing a flourishing of open source because not everybody likes entrusting all their data to one or two large companies, and vendor lock-in is a real concern. Also for flexibility: I can deploy something in my data center or I can deploy it in my own AWS Cloud, and nobody has access to it except me.

And for cost reasons—open-source solutions are cheaper. We did a profile of what it would take to build an email-summarization engine, where if you used something like GPT-4, it would cost $36,000, and if you used open-source technologies, it would be closer to $1,000.
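
To put those two figures side by side, the short calculation below works out the ratio and the percentage saved. It uses only the $36,000 and $1,000 estimates quoted above; the function name `cost_comparison` is hypothetical, and the underlying token volumes and pricing are not given in the interview, so this is arithmetic on the stated totals only.

```python
def cost_comparison(managed_cost, open_source_cost):
    """Return (cost ratio, percent saved) when moving to the cheaper option."""
    ratio = managed_cost / open_source_cost
    savings_pct = 100 * (managed_cost - open_source_cost) / managed_cost
    return ratio, savings_pct

# Figures quoted in the interview for the email-summarization example.
ratio, savings_pct = cost_comparison(36_000, 1_000)
print(f"About {ratio:.0f}x cheaper, a saving of roughly {savings_pct:.1f}%")
```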

We’ve seen a lot of interest in open-source models—from startups that tend to be more cost-focused to enterprises that tend to be more privacy- and data-control focused. It’s not that open-source models and technologies are perfect, it’s just that they’re flexible and less expensive. And there is availability of models at every size—from 180 billion down to 7 billion and below. It’s just a really dynamic space right now.

What needs to happen to make generative AI more mainstream?

Waleed Kadous: One of the increasing trends is an effort to make LLMs easier to use. But another is that we haven’t completely worked out yet how to make them better over time. If an LLM makes a mistake, how do you correct it? That sounds like such a simple question, but the answer is actually nuanced. So we’re seeing a massive improvement in the evaluation and monitoring stages.

And then, so far the focus has been on large language models—text in, text out—because every business in the world uses language. But we’re starting to see the evolution of models that can process or output images as well. Just as there is Llama for text, there’s now LLaVA for video and vision-based processing, even though not every business in the world needs to process images.

What should business leaders be conscious of on the topic of generative AI?

Teresa Tung: Hopefully the takeaway is realizing how easy it is to begin owning your own AI model. But it does start with that investment of getting your data foundation ready—remember that AI is about the data. The good news is that you can also use generative AI to help get that data supply chain going. So it is a win-win.

Ramtin Davanlou: I think regulatory and ethical compliance, as well as dealing with hallucinations and other topics under what we call responsible AI, are the biggest challenges for companies to overcome. And navigating the cultural change that’s needed to use GenAI at scale is really key to its success.

Waleed Kadous: It’s important to get started now, and it doesn’t have to be complicated. Think about it as a staged process. Build a prototype and make sure that users like it. Then address cost and, to some extent, quality as secondary issues.

And you can give people tools to optimize their own workflows and to improve the LLMs themselves. I think that’s really one of the most exciting trends—rather than seeing GenAI as a substitute technology, seeing it as an augmentative technology that helps people do their jobs better. Help people to use LLMs in a way that makes them feel empowered rather than eliminated.