
IBM TechXchange

watsonx Community Webinar


Scaling generative AI with watsonx: The power of choice and flexibility

May 2, 2024 (11AM EST)

Steven Sawyer, Senior Product Manager, watsonx.ai
Luv Aggarwal, Global Sales Leader, watsonx.ai
Angela Jamerson, Program Director, Product Mgmt, watsonx.ai

Join the watsonx community


Agenda
watsonx.ai Model POV

• Enterprise client needs

• IBM Generative AI Technology stack incl. watsonx

• IBM's differentiated approach to delivering enterprise-grade foundation models

• watsonx model library: What's new?

• Granite models

watsonx.ai Client Stories

• Use Cases

• Win Stories

watsonx.ai Platform Updates

• New tooling releases

• A look ahead

watsonx.ai Model PoV

2024, the breakout year for enterprise Generative AI adoption

Surpassed the experimental phase
55% of organizations are already in piloting or production mode with generative AI, reveals recent Gartner poll [1]

Driver of growth and competitive differentiation
75% of CEOs believe generative AI is a source of competitive advantage and 50% [2] are now integrating the tech into their products and services.

Governance, Risk and Compliance are key concerns
58% of business executives think major ethical risks abound with generative AI [3]
The Challenge

Scale and operationalize generative AI faster…
CEOs say they feel over 6x more pressure from their boards and investors to accelerate generative AI adoption rather than to slow it down.

…while mitigating foundation model related risk and governance issues
79% of business executives say AI ethics is important to their enterprise-wide AI approach.
Client Needs
As clients move from exploration to investigation and production with generative AI, they are looking for the right model choices, robust platform to infuse AI into applications, and a reliable partner who can help scale and operationalize AI with minimal risks.

Enterprise-grade foundation models
• ‘Trust’ in AI models: Hinges on transparency in data management, methodical training procedures and rigorous evaluation standards.
• ‘Performance’: Deliver optimal performance measures for accuracy and latency with targeted enterprise business domains and use cases.
• ‘Cost-Effective’: Achieve lower inferencing costs and total cost of ownership while meeting performance requirements.

Trusted platform to scale AI with confidence
• Model customization: Tailor models with proprietary data and expertise to unique use cases, company and industry domain, with easy integrations to ready-to-use AI apps.
• Robust governance: Make AI safe and secure at scale with AI guardrails, continuous risk monitoring and integrated governance.
• Flexible deployment: Work with the infrastructure of choice with hybrid multi-cloud and on-prem options to avoid vendor lock-in and reduce total cost of ownership.

Reliable partner with deep AI expertise
• Enterprise AI leader: A partner who has a successful track-record of bringing AI technologies and solutions to Fortune 500 clients and partners.
• Champion of Responsible AI: Thoughtfully applies AI Ethics, privacy and regulatory preparedness across the generative AI lifecycle.
IBM's differentiated approach to delivering enterprise-grade foundation models

Open
→ Bring best-in-class IBM and popular open-source models to our watsonx foundation models library.
→ IBM is committed to open innovation, supporting and contributing to open communities.

Trusted
→ Train models on trusted and governed data for confident applications downstream with minimal risks.
→ IBM Granite models are trained in accordance with IBM AI Ethics code and integrated governance approach.
→ Clients benefit from IP indemnification for IBM-developed models.

Targeted
→ Designed for the enterprise and optimized for targeted business domains and use cases.
→ IBM Granite models that are trained on domain-specific, enterprise relevant data perform on par with 3-5x larger models in accuracy measures at lower latencies.

Empowering
→ Empower clients with competitively priced model choices to build AI that best suits their unique business needs and risk profiles.
→ IBM empowers AI builders to scale generative AI faster with tools for training, validating, tuning, and bring your own models (BYOM).
IBM's differentiated approach to delivering enterprise-grade foundation models

OPEN MODEL INNOVATION
Models developed by IBM Research, open source, 3rd party through open collaborations and partnerships. IBM Granite models are trained on enterprise relevant content with data transparency per IBM AI Ethics code.

TRUSTED MODEL ALIGNMENT
IBM Research LAB Alignment technique to instruct tune models with new skills and knowledge.

TARGETED PERFORMANCE BENCHMARKING
Models are tested and benchmarked with IBM FM_EVAL datasets that simulate real-world enterprise Gen AI applications for specific domains/use cases.

EMPOWERING ENTERPRISE-GRADE MODELS
Empower clients to put Gen AI to work with watsonx. Access to 'Trusted, Performant and Cost-effective' models purpose built for their enterprise.

ENTERPRISE PLATFORM
Scale generative AI for targeted business domains and use cases on watsonx with trust and confidence.

Feedback from clients and AI communities rolls up
IBM watsonx.ai Foundation Models Library – available today
[Figure: model library grid. Legend: IBM Granite; Llama 3 models; LAB Aligned models; Model in deprecation]
What is IBM Granite? Trusted, Performant, Cost-effective AI foundation models purpose built for enterprises.

➢ Granite is IBM's flagship series of LLM foundation models based on decoder-only transformer architecture.
➢ Granite language models are trained on trusted enterprise data spanning internet, academic, code, legal and finance. (v1 breakdown)
➢ Chat derivative model is optimized for dialogue use cases and works well with virtual agent and chat applications.
➢ Instruct derivative model was designed to perform well on natural language tasks and can be customized for specific industries and domains via prompt-tuning.

granite-13b-v2 (English LLM): -chat-v2.1, -instruct-v2
13B parameters in size
2.5T tokens of data

granite-20b-multilingual
20B parameters in size
2.6T tokens of data

granite-8b-japanese
8B parameters in size
1.6T tokens of data
Trusted
IBM Granite models were trained to adhere to AI Ethics code and governance approach and optimized with alignment techniques.

Alignment
granite-13b-v2.1 outperformed much larger models in MT Bench Measure, a set of challenging multi-turn questions that evaluate model alignment with human judgement [2]

Bias
granite-13b-v2.1 demonstrates some of the least biased behavior of studied models, according to the Bias in Open-Ended Language Generation Dataset (BOLD) evaluation [1]

Harmlessness
granite-13b-v2.1 consistently scores high on harmlessness across attack domains, according to AttaQ dataset evaluation [1]

[1] Granite Report
[2] Based on IBM Internal Evaluation
Performant
IBM Research developed LLM alignment technique
→ Large-scale Alignment for chatBots (LAB) is a new training paradigm for LLMs in which IBM does very large-scale targeted alignment on Granite LLMs

Impact:
→ Dramatic improvement in Granite RAG performance

According to IBM's internal benchmark, Granite-13b-chat has consistently improved in performance over the past few months.

[1] Internal benchmarks


Performant
IBM Granite models are highly performant for targeted industry domains like finance and legal

ConvFinQA [1] is a standard benchmark measuring accuracy of multi-turn numeric reasoning in financial services Q&A.

[Chart: ConvFinQA Evaluation; y-axis: Accuracy, 0.1 to 0.4]

[1] ConvFinQA Benchmark, Granite Report


Cost Effective
Our competitive prices help clients keep inferencing costs under control.

Base FM - Inference

watsonx Model Offerings     Estimated # of Parameters   Inference Price/1K tokens
granite-13B                 13B                         $0.0006
granite-20b-multilingual    20B                         $0.0006
granite-8b-japanese         8B                          $0.0006
llama 2-13B                 13B                         $0.0006
llama 2-70B                 70B                         $0.0018
flan-ul2-20b                20B                         $0.0050
mt0-xxl-13b                 13B                         $0.0018
mixtral 8X7b                46.7B                       $0.0006
flan-t5-xxl-11b             11B                         $0.0018

Pricing as of March 2, 2024
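
To make the per-1K-token prices concrete, here is a minimal sketch that turns a token count into an estimated inference cost. The prices are copied from the table above (as of March 2, 2024). The function name `estimate_inference_cost` is hypothetical, and the sketch assumes `tokens` is the total number of billed tokens, since the slide does not say how input and output tokens are counted.

```python
from decimal import Decimal

# Price per 1K tokens in USD, copied from the slide's pricing table.
PRICE_PER_1K_TOKENS = {
    "granite-13B": Decimal("0.0006"),
    "granite-20b-multilingual": Decimal("0.0006"),
    "granite-8b-japanese": Decimal("0.0006"),
    "llama 2-13B": Decimal("0.0006"),
    "llama 2-70B": Decimal("0.0018"),
    "flan-ul2-20b": Decimal("0.0050"),
    "mt0-xxl-13b": Decimal("0.0018"),
    "mixtral 8X7b": Decimal("0.0006"),
    "flan-t5-xxl-11b": Decimal("0.0018"),
}


def estimate_inference_cost(model: str, tokens: int) -> Decimal:
    """Estimated cost in USD for `tokens` billed tokens (hypothetical helper)."""
    # Decimal avoids binary floating-point rounding on small currency values.
    return PRICE_PER_1K_TOKENS[model] * Decimal(tokens) / Decimal(1000)


cost = estimate_inference_cost("granite-13B", 1_000_000)
print(f"${cost:.2f}")  # prints $0.60
```

At these list prices, one million tokens on flan-ul2-20b would come to $5.00, roughly eight times the granite-13B figure, which is the kind of comparison the slide is inviting.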


watsonx.ai Use Cases + Client Stories

watsonx Use Cases
IBM is actively engaging with enterprise clients across a broad set of business domains (non-exhaustive)

Business domains: Customer Service; HR, Finance, and Supply Chain; IT Operations

• Customer service: Empower customers to find solutions with easy, compelling experiences. Automate answers with 95% accuracy.
• HR automation: Reduce manual work and automate recruiting, sourcing and nurturing job candidates. Reduce employee mobility processing time by 50%.
• App modernization, migration: Generate code, tune code generation response in real time. Deliver faster development output.
• Threat management: Reduce incident response times from hours to minutes or seconds. Contain potential threats 8x faster.
• Marketing: Increase personalization, improve efficiency across the content supply chain. Reduce content creation costs by up to 40%.
• Supply chain: Automate source to pay processes, reduce resource needs and improve cycle times. Reduce cost per invoice by up to 50%.
• IT automation: Identify deployment issues, avoiding incidents, optimize application demand to supply. Reduce mean time to repair (MTTR) by 50%+.
• Asset management: Optimize critical asset performance and operations while delivering sustainable outcomes. Reduce unplanned downtime by 43%.
• Content creation: Ex. Enhance digital sports viewing with auto-generated spoken AI commentary. Scale live viewing experiences cost effectively.
• Planning and analysis: Make smarter decisions, focus on higher value tasks with automated workflows and A. Process planning data up to 80% faster.
• AIOps: Assure continuous, cost-effective performance and connectivity across applications. Reduce application support tickets by 70%.
• Product development: Ex. Expedite drug discovery by inferring structure with AI from simple molecular representations. Faster and less expensive drug discovery.
• Knowledge worker: Enable higher value work, improve decision making, and increase productivity. Reduce 90% of text reading and analysis work.
• Regulatory compliance: Support compliance based on requirements / risks, proactively respond to regulatory changes. Reduce time spent responding to issues.
• Data platform engineering: Redesign the approach for data integration using generative AI. Reduce data integration time by 30%+.
• Environmental intelligence: Provide intelligence to proactively plan and manage impact of severe weather and climate. Increase manufacturing output by 25%.

Find more Client Stories & Use-Cases on Seismic!

Source: IBM internal data
Use case: App Modernization

Edger Finance: Accelerating the collection and analysis of investment information with generative AI

Challenge
A fintech startup and IBM Business Partner headquartered in Sweden, Edger Finance aims to be the go-to solution that investors can use to navigate the stock market and make better investment decisions.

In 2023, Edger joined the IBM® Fintechx program and began collaborating with IBM Client Engineering and the IBM Innovation Studio. The goal for the engagement was to strengthen the firm’s processes and platform by piloting generative AI (gen AI).

Solution
The collaboration resulted in the creation of three AI-assisted processes that are offered in Swedish and English and were explored during a four-week minimum viable product (MVP) pilot:

– The first accelerates and simplifies the creation of a CEO summary from corporations’ quarterly reports.
– The second automates the extraction of data points that are within each report.
– The third allows investors to interact with the data in the report through a question-answer chat flow.

Each assistant relies on IBM watsonx.ai, an integrated suite of AI tools designed for security-rich, collaborative data management and process automation. The third assistant also utilizes IBM watsonx Assistant, a conversational AI platform that delivers automated self-service support.

90% improvement in the turnaround time for quarterly report data extracts
~96% improvement in time to summarize quarterly reports

Case study
Use case: App Modernization

Dun & Bradstreet and IBM collaborate to bring trustworthy business insights to fuel responsible generative AI solutions powered by watsonx

Challenge
Dun & Bradstreet, a leading global provider of business decisioning data and analytics, seeks to build AI use cases, implement watsonx, and develop applications that help address employee productivity, enhance customer experiences, mitigate business-to-business risks, automate workflows, and optimize efficiency.

Solution
Dun & Bradstreet and IBM have announced a strategic collaboration that will bring together Dun & Bradstreet’s Data Cloud and IBM’s watsonx to help organizations responsibly expand their use of generative AI. Dun & Bradstreet also intends to leverage watsonx for its workflows and solutions, supported by IBM Consulting.

Minutes instead of days: procurement process using Ask Procurement

Press release
Video

Above copy is an excerpt from the press release and video.
Use case: Customer Service

Sicredi: Using generative AI to support improved customer service and employee satisfaction

Challenge
Each support representative at Sicredi is responsible for answering questions on a wide range of products. When a member reaches out in person or over the phone, the support rep is accountable for promptly and thoroughly resolving their query. Given the wide range of products they support, these representatives rely on a digital assistant to compile information to answer each member query. Given the previous configuration of the assistant, support representatives often needed to escalate a query to a product specialist in order to get it fully resolved. This contributed to longer wait times for members and a frustrating experience for support reps.

Solution
Sicredi chose to partner with IBM® Client Engineering to augment its support representatives’ efforts using generative AI. Sicredi spent three weeks co-creating the new assistant with IBM, and then spent 20 days testing it. Because the new assistant is enabled by the IBM watsonx.ai, IBM Watson® Discovery and IBM watsonx Assistant solutions, Sicredi’s team can submit a wide range of questions (varying in complexity) in natural language. Then the assistant will query Sicredi’s support documentation and generate an answer within a matter of seconds.

10%–12% improvement in query resolution without escalation
Seconds for new assistant to generate an answer
8% decrease in abandoned support calls

Case study
Client testimonial
LinkedIn with video

Above copy is an excerpt from the case study.
watsonx.ai Platform Updates

watsonx.ai
Train, validate, tune, and deploy AI models

A next generation enterprise studio for AI builders to train, validate, tune, and deploy generative AI, foundation models, and machine learning capabilities.

The watsonx.ai components include:

• Foundation Model Library with IBM and open-source models

• Prompt Lab to experiment with foundation models and build prompts for various use cases and tasks

• Tuning Studio to tune your foundation models with labeled data

• Data Science and MLOps to build machine learning models automatically with model training, development, and visual modeling

Learn What’s New with watsonx.ai


Bring your own custom foundation model
Provides greater flexibility in how generative AI solutions are created. Thousands of potential models.

Common use cases that will benefit:
• Additional language support
• Leverage a fine-tuned model that caters to industry or business domain

Note: Available in software as of 4/28, targeted for SaaS in June

Model Criteria
• Must be TGI compatible and tested by IBM
• Must have a config.json file

Simple process

Learning resources
• Technical blog
• Tutorial video
• Product documentation

Usage limits
• will not appear in the Tuning Studio (target Q3)
• prompt templates will not be evaluated/tracked by watsonx.governance (target Q3)

watsonx.ai Supported Model Architecture Types

Model architecture type   Supported quantization method   Supports parallel tensors (multiGpu)
bloom                     N/A                             Yes
codegen                   N/A                             No
falcon                    N/A                             Yes
gpt_bigcode               gptq                            Yes
gpt_neox                  N/A                             Yes
gptj                      N/A                             No
llama2                    gptq                            Yes
mixtral                   gptq                            No
mistral                   N/A                             No
mt5                       N/A                             No
mpt                       N/A                             No
t5                        N/A                             Yes
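
The two model criteria that can be checked locally are the presence of a config.json file and an architecture that appears in the supported-types table. The sketch below is an illustrative pre-flight check, not IBM tooling. The function name `check_custom_model` is hypothetical, and it assumes the architecture is recorded under a `model_type` key in config.json (the Hugging Face convention). The names in the slide's table may not match the values real checkpoints carry (for example, the slide lists `llama2`), so treat the lookup as an approximation. TGI compatibility and IBM testing cannot be verified this way.

```python
import json
from pathlib import Path

# Copied from the slide's table:
# architecture -> (quantization method or None, supports parallel tensors)
SUPPORTED_ARCHITECTURES = {
    "bloom": (None, True),
    "codegen": (None, False),
    "falcon": (None, True),
    "gpt_bigcode": ("gptq", True),
    "gpt_neox": (None, True),
    "gptj": (None, False),
    "llama2": ("gptq", True),
    "mixtral": ("gptq", False),
    "mistral": (None, False),
    "mt5": (None, False),
    "mpt": (None, False),
    "t5": (None, True),
}


def check_custom_model(model_dir):
    """Return (ok, message) for a local model directory (hypothetical helper)."""
    config_path = Path(model_dir) / "config.json"
    if not config_path.is_file():
        return False, "missing config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    # Assumption: the architecture name is stored under "model_type".
    arch = config.get("model_type")
    if arch not in SUPPORTED_ARCHITECTURES:
        return False, f"architecture {arch!r} is not in the supported table"
    quantization, multi_gpu = SUPPORTED_ARCHITECTURES[arch]
    return True, (
        f"{arch}: quantization={quantization or 'N/A'}, "
        f"parallel tensors={'yes' if multi_gpu else 'no'}"
    )
```

Calling `check_custom_model("path/to/model")` on a directory without config.json returns `(False, "missing config.json")`, which mirrors the second criterion on the slide.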
watsonx.ai Embeddings API

• Generate embeddings endpoint to produce vector embeddings based on input strings (learn more)
• REST API and Python SDK accessible
• Initial release on SaaS (4/18/2024), Software in June release

IBM Slate models available today (knowledge distilled):

Model                      Origin   Context length   Dim   Price per 1k tokens
slate.125m.english.rtrvr   IBM      512              768   $0.0001
slate.30m.english.rtrvr    IBM      512              384   $0.0001

Coming soon
• Additional API endpoints for similarity search and reranking
• LangChain and additional orchestration framework integration support
• Multilingual Slate embeddings models (Q3), and other model additions
• Fine-tuning and BYOM support for embedding models

Model                   Origin        Context length   Dim    Price per 1k tokens
bge-large-en-v1.5       Open Source   512              1024   $0.0001
multilingual-e5-large   Open Source   512              1024   $0.0001
all-MiniLM-L12-v2       Open Source   512              384    $0.0001

API spec preview for generating embeddings
1. IAM authorization token for IBM Cloud
2. String inputs within the request body
3. Embeddings model ID specification
4. watsonx.ai Project/Spaces ID for resource association
5. Response:
   a) Embeddings for each input string
   b) Token count for consumption tracking
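
The five elements of the API spec preview (token, inputs, model ID, project ID, response) can be sketched as plain Python data, without making a network call. This is an illustration only: the slide does not show the endpoint URL or the field names, so `inputs`, `model_id`, `project_id`, `results`, `embedding` and `input_token_count` are assumptions to be checked against the product documentation, and both function names are hypothetical.

```python
def build_embeddings_request(iam_token, inputs, model_id, project_id):
    """Assemble headers and JSON body for an embeddings call (field names assumed)."""
    headers = {
        "Authorization": f"Bearer {iam_token}",  # 1. IAM authorization token
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    body = {
        "inputs": list(inputs),    # 2. string inputs
        "model_id": model_id,      # 3. embeddings model ID
        "project_id": project_id,  # 4. project (or space) ID
    }
    return headers, body


def parse_embeddings_response(response):
    """Extract one vector per input plus the token count (5a and 5b)."""
    vectors = [item["embedding"] for item in response["results"]]
    return vectors, response["input_token_count"]


headers, body = build_embeddings_request(
    iam_token="<IAM token>",              # placeholder, not a real credential
    inputs=["What is IBM Granite?"],
    model_id="slate.30m.english.rtrvr",   # name as shown on the slide
    project_id="<project id>",            # placeholder
)
```

The token count in the response is what consumption is tracked against, so at the listed $0.0001 per 1k tokens it can be fed directly into a cost estimate.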
watsonx.ai Chat Mode

• New tab in Prompt Lab

• A chat application to demo and explore the art of the possible

• Experiment with different models to help narrow down selection for your use case

• Save a notebook to jump start your application

• Also available as a standalone trial from the watsonx.ai product page

watsonx.ai Taxonomy Explorer

• Displays skills and knowledge taxonomy in a hierarchical format

• Easy navigation: click on any skill to see details of what was used to train the base model.

• Provides a degree of transparency to the model’s training and potential behavior

• Available on the model card for IBM granite 13b chat and other LAB trained models

Looking ahead…

• IBM open-source models (more on the way!)

• Tuning enhancements

• Additional RAG support (e.g. chat with documents)

• Enhanced AI-builder / developer experience (e.g. Node.js support, additional SDK integrations)

• Exciting partnerships for multi-modal and additional multi-lingual models

• Exciting new LAB features from IBM Research – Announcements coming on May 6th and at THINK

• Data center expansions

IBM Software / © 2023 IBM Corporation / IBM Confidential
watsonx.ai Demo
Try the chat demo, explore foundation models

https://www.ibm.com/events/think

https://www.ibm.com/community/ibm-techxchange-conference/
