
ARTIFICIAL INTELLIGENCE

Generative AI Solutions: From Hype to Reality


Last year was a breakout year for generative AI (GenAI), with businesses across industries drawn to the technology and its advanced capabilities. As we head deeper into 2024, that momentum is only expected to grow.

Organizations have learned over the past year that while general-purpose models are a good place to start, to really capitalize on GenAI solutions, they need more task-specific, custom AI models. And as GenAI solutions mature into production, these custom models will be key to hitting performance, accuracy, and cost targets for a particular deployment.

Healthcare offers a great example of the benefits of this approach: A general-purpose model might struggle with medical terminology, and feeding it sensitive records could put patient data at risk. Instead, businesses can create a custom GenAI model trained on data from a specific healthcare discipline, with that data sufficiently anonymized and the model optimized for its deployment environment.
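
To make that anonymization step concrete, here is a minimal Python sketch (not from the article) that scrubs a few obvious identifiers from clinical notes before they are used to build a custom model. The regex patterns and the anonymize() helper are hypothetical placeholders; a production system would rely on a vetted de-identification pipeline.

```python
# Illustrative only: strip a handful of obvious identifiers from clinical text
# before it is used for model customization. Real deployments need a vetted
# de-identification pipeline; these patterns are placeholders.
import re

PATTERNS = {
    "mrn": re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE),  # medical record numbers
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),    # US-style phone numbers
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
}

def anonymize(note: str) -> str:
    """Replace recognizable identifiers with typed placeholders."""
    for label, pattern in PATTERNS.items():
        note = pattern.sub(f"[{label.upper()}]", note)
    return note

print(anonymize("Patient MRN: 483920 seen on 03/14/2023, call 555-123-4567."))
```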

There are several ways to create these custom AI models to solve all kinds of problems, such as fine-tuning a large language model (LLM) or creating a small specific model (SSM). Whatever the approach, there is a growing ecosystem of tools, platforms, and services to streamline the effort.
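To show what the fine-tuning route can look like in practice, the sketch below attaches LoRA adapters to a small open model using the Hugging Face transformers and peft libraries. The base model (distilgpt2) and the training file (domain_corpus.txt) are placeholders chosen for illustration, not anything referenced above.

```python
# A hedged sketch (not the article's workflow): parameter-efficient fine-tuning
# with LoRA adapters on a small base model. "distilgpt2" and "domain_corpus.txt"
# are placeholders for whatever model and domain data you actually use.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "distilgpt2"  # placeholder small model
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2-style models ship without a pad token
model = AutoModelForCausalLM.from_pretrained(base)

# Wrap the base model so only the small LoRA adapter weights are trained.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))

# Tokenize a plain-text domain corpus (placeholder file name).
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-out", per_device_train_batch_size=4,
                           num_train_epochs=1),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("custom-domain-adapter")  # only the adapter weights are saved
```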

“When these models become more critical to the business, we’re also seeing companies want to take ownership and to control their own destiny,” says Teresa Tung, Cloud First Chief Technologist at Accenture, a global management consulting and technology company.

Tools for Taking GenAI to Market

To help businesses get a jump-start on AI model development, there are solutions like the AI Playground—which came out of a collaboration between meldCX, the University of Southern Australia, and Intel—that gamifies the AI developer learning experience and simplifies getting started with model creation.

On the other end of the spectrum is the Anyscale Platform, which adds enterprise-ready management to the open-source Ray project. Ray is a framework for scaling and productionizing AI workloads, providing a robust environment for training, fine-tuning, and inference in which impressive efficiencies can be achieved. In one recent example, Anyscale experimented with translating natural language into SQL queries and matched GPT-4-level performance with a model of only 7 billion parameters, at roughly 1/100th of the cost.
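For readers new to Ray, the sketch below shows the basic pattern of fanning inference requests out across a pool of Ray actors. The Generator class, its generate() method, and the "sql-7b" name are hypothetical stand-ins, not Anyscale's actual implementation.

```python
# A hedged sketch of scaling inference with Ray: fan prompts out across a small
# pool of actors. The Generator class, its generate() method, and "sql-7b" are
# hypothetical stand-ins, not Anyscale's actual implementation.
import ray

ray.init()  # connects to an existing Ray cluster or starts a local one


@ray.remote
class Generator:
    def __init__(self, model_name: str):
        # In a real system the model would be loaded here, once per actor.
        self.model_name = model_name

    def generate(self, prompt: str) -> str:
        # Placeholder for an actual LLM call (e.g., a text-to-SQL model).
        return f"[{self.model_name}] response to: {prompt}"


# Create a pool of actors and distribute prompts round-robin across them.
workers = [Generator.remote("sql-7b") for _ in range(4)]
prompts = ["List customers by region", "Total sales last quarter", "Top 10 products"]
futures = [workers[i % len(workers)].generate.remote(p) for i, p in enumerate(prompts)]
print(ray.get(futures))  # gather all results once the actors finish
```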

The Ray project is just one example of a trend toward open-source GenAI tooling that will accelerate in 2024. Others include Llama 2 from Meta (delivering promising results) and LAVA Realtime AI for video and vision-based processing.

One reason these models get so much attention is that many companies want the flexibility to deploy models in private data centers or in cloud environments such as Amazon Web Services (AWS). Companies are also thinking more about who owns their models and how to build platforms that can be adapted to multiple industry-specific applications.

“By creating smaller, custom models that are cost- and energy-efficient, these models can help aid domain experts and other stakeholders in completing complex tasks that require retrieving/summarizing information from multiple sources, generating new customer-facing content, brainstorming new ideas, and more,” says Ria Cheruvu, AI SW Architect and Evangelist at Intel.


This desire for flexibility highlights the importance of collaboration. As GenAI advances at speeds impossible for any organization to match, collaborative partnerships will be important to addressing infrastructure and compute resource requirements and managing the development, monitoring, and deployment of models through machine learning operations (MLOps). For example, Accenture uses cnvrg.io, an MLOps tool, to facilitate collaboration among data scientists and engineers.

Accenture is also an interesting example because it leverages industry collaborations to help deliver on the promise of cutting-edge technology. For example, Intel and Accenture have come together to create a set of AI reference kits designed to accelerate digital transformation journeys.

Libraries and Tools Optimize GenAI

The platforms and reference kits we’ve looked at so far are just a small sample of a larger trend that will undoubtedly accelerate throughout this year and beyond. The spread of libraries and optimizers is also part of this trend. For example, the Optimum Habana Library and the Optimum Intel Library help make Intel’s deep learning accelerators easily accessible to the Hugging Face open-source community.
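As a rough illustration of how these libraries slot into a Hugging Face workflow, the sketch below loads a small causal language model through Optimum Intel, which exports it to OpenVINO format for accelerated inference. It assumes optimum[openvino] is installed; distilgpt2 and the prompt are placeholders rather than anything specified in the article.

```python
# A hedged sketch of running a Hugging Face model through the Optimum Intel
# library, which exports it to OpenVINO format for accelerated inference.
# Assumes `pip install optimum[openvino]`; "distilgpt2" and the prompt are placeholders.
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

model_id = "distilgpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = OVModelForCausalLM.from_pretrained(model_id, export=True)  # convert on load

inputs = tokenizer("Edge AI makes it possible to", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```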

In terms of optimization, two noteworthy examples come to mind. On the model creation side, the computer vision AI platform Intel® Geti is designed to create highly accurate vision models with limited input data and computational resources. On the deployment side, the Intel® Distribution of OpenVINO Toolkit compresses AI models to a size that’s suitable for edge computing.
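To give a feel for that deployment step, here is a minimal sketch that reads an exported model, applies 8-bit weight compression with NNCF, and compiles it for a target device. The model.onnx path is a placeholder, and this is just one of several compression paths the toolkit supports, not a prescribed workflow.

```python
# A hedged sketch of shrinking a model for edge deployment with OpenVINO and NNCF.
# "model.onnx" is a placeholder path; 8-bit weight compression is one of several
# size-reduction options the toolkit offers.
import nncf
import openvino as ov

core = ov.Core()
model = core.read_model("model.onnx")        # load an exported model (placeholder path)
model = nncf.compress_weights(model)         # compress weights to 8-bit for a smaller footprint
ov.save_model(model, "model_int8.xml")       # write OpenVINO IR ready for the edge device

compiled = core.compile_model(model, "CPU")  # compile for the target device (CPU here)
```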

As AI development increasingly focuses on cost reduction, tools like these will see wider use within collaborative supplier ecosystems that enable comprehensive GenAI applications.

The Big GenAI Challenges for 2024

Although there are many reasons to be excited about the future of GenAI, there are also major challenges ahead. First and foremost, consumers and customers alike can be skeptical of AI, so it is critical to avoid delivering disappointing solutions. One of the biggest problems is the difficulty of eliminating “hallucinations,” in which models generate false or irrelevant responses.

The issue is particularly important for applications with regulatory or ethical implications. Developers continue to face difficulties aligning their models with regulatory standards and ethical norms. And it’s worth pointing out that this is not just a moral issue: GenAI systems have the potential to violate laws and cause real harm.

There isn’t a complete solution for these challenges today, but developers should look to responsible AI practices that are beginning to emerge. Among other things, they should strive to develop and use AI in a manner that is ethical, transparent, and accountable. As a practical example, an AI can be built to explain how it makes decisions.
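As a small illustration of that kind of explainability, the sketch below uses SHAP values to show which input features drove a simple classifier's decisions. The model and data are synthetic and invented purely for illustration; they are not drawn from the article.

```python
# A hedged sketch of one explainability technique: SHAP values showing which input
# features drove each prediction. The model and data here are synthetic, invented
# purely for illustration.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))            # three synthetic input features
y = (X[:, 0] - X[:, 1] > 0).astype(int)  # synthetic binary decision
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)    # per-feature contribution scores
shap_values = explainer.shap_values(X[:5])
print(shap_values)                       # how much each feature pushed each decision
```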

Of course, responsible AI is not just a technical matter; it also implies cultural and societal change. At a company introducing AI into its workflows, for example, this could involve directly addressing concerns about AI’s impact on jobs, training employees to work with AI, or adjusting business processes, among other steps.

Exciting Year Ahead

In 2024, the GenAI landscape will be reshaped by increased collaboration and innovation, with a focus on custom model development for specific industry needs. The trend toward open-source projects, the continued growth of tools and platforms, and the growing awareness of ethical concerns will all come together in ways that will continue to surprise us.

“It’s exciting to see the breadth of technologies in this space making it convenient and fast to create optimized AI models that can best fit a business’s needs and values,” says Cheruvu.
 

This article was edited by Christina Cardoza, Editorial Director for insight.tech.

About the Author

Brandon is a long-time contributor to insight.tech going back to its days as Embedded Innovator, with more than a decade of high-tech journalism and media experience in previous roles as Editor-in-Chief of electronics engineering publication Embedded Computing Design, co-host of the Embedded Insiders podcast, and co-chair of live and virtual events such as Industrial IoT University at Sensors Expo and the IoT Device Security Conference. Brandon currently serves as marketing officer for electronic hardware standards organization, PICMG, where he helps evangelize the use of open standards-based technology. Brandon’s coverage focuses on artificial intelligence and machine learning, the Internet of Things, cybersecurity, embedded processors, edge computing, prototyping kits, and safety-critical systems, but extends to any topic of interest to the electronic design community. Drop him a line at techielew@gmail.com, DM him on Twitter @techielew, or connect with him on LinkedIn.
