Welcome back to Snowbits AI Kickstart, your go-to source for leveraging generative AI on Snowflake. Today, we'll explore the best roadmap for building a complete AI application using Snowflake Cortex AI, progressing from basic integration to advanced, scalable solutions. Each level builds on the previous one, culminating in a robust AI-driven application. Let’s get started!
Level 1: Basic AI Scoring
TL;DR: Learn how to integrate simple AI functions like sentiment analysis into your data workflows using Snowflake Cortex AI. This is foundational to understanding how AI can add value to your data processing.
Key Takeaways:
- Familiarize Yourself with Cortex AI Functions: Learn the basics of Snowflake Cortex AI and its pre-built functions.
- Implement Basic AI Use Cases: Use simple AI functions for sentiment analysis and anomaly detection.
Example: Sentiment Analysis
------------------------------------------------------------------------
-- Create a table with sample customer feedback:
------------------------------------------------------------------------
CREATE OR REPLACE TEMPORARY TABLE customer_feedback (
feedback_id INT,
feedback_text STRING
);
------------------------------------------------------------------------
-- Insert sample data
------------------------------------------------------------------------
INSERT INTO customer_feedback VALUES
(1, 'The product is amazing!'),
(2, 'I am very disappointed with the service.'),
(3, 'Excellent quality and fast delivery.');
------------------------------------------------------------------------
-- Perform sentiment analysis using Cortex AI function:
------------------------------------------------------------------------
SELECT feedback_id,
CASE WHEN SNOWFLAKE.CORTEX.SENTIMENT(feedback_text) >= 0
THEN 'Positive'
ELSE 'Negative'
END AS sentiment
FROM customer_feedback;
------------------------------------------------------------------------
-- Example Output:
------------------------------------------------------------------------
| feedback_id | sentiment |
|-------------|-----------|
| 1 | Positive |
| 2 | Negative |
| 3 | Positive |
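One note on the CASE expression above: SENTIMENT returns a numeric score between -1 and 1, so a two-way split hides the middle ground. Here is a small variation that keeps the raw score and adds a neutral band (the 0.3 thresholds are illustrative; tune them against your own data):
------------------------------------------------------------------------
-- Keep the raw score and bucket it into three classes (thresholds are illustrative):
------------------------------------------------------------------------
SELECT feedback_id,
       score,
       CASE WHEN score >  0.3 THEN 'Positive'
            WHEN score < -0.3 THEN 'Negative'
            ELSE 'Neutral'
       END AS sentiment
FROM (
    SELECT feedback_id,
           SNOWFLAKE.CORTEX.SENTIMENT(feedback_text) AS score
    FROM customer_feedback
) AS scored;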
Level 2: Natural Language Processing
TL;DR: Leverage Snowflake’s GenAI functions to integrate large language models (LLMs) for tasks like text summarization and generation. This enhances your ability to process and understand natural language data.
Key Takeaways:
- Integrate LLMs in Snowflake: Use Snowflake's GenAI functions to process natural language data.
- Perform Generative Tasks: Generate text and perform summarization.
Example: Text Summarization
------------------------------------------------------------------------
-- Create a table with sample documents:
------------------------------------------------------------------------
CREATE OR REPLACE TEMPORARY TABLE documents (
doc_id INT,
doc_text STRING
);
------------------------------------------------------------------------
-- Insert sample data:
------------------------------------------------------------------------
INSERT INTO documents VALUES
(1, 'Snowflake is a cloud-based data warehousing company...'),
(2, 'Artificial Intelligence is transforming the technology landscape...');
------------------------------------------------------------------------
-- Summarize the documents using Snowflake's GenAI LLM:
------------------------------------------------------------------------
SELECT doc_id, SNOWFLAKE.CORTEX.SUMMARIZE(doc_text) AS summary
FROM documents;
------------------------------------------------------------------------
-- Example Output:
------------------------------------------------------------------------
| doc_id | summary |
|--------|--------------------------------------------|
| 1 | Snowflake is a cloud data warehouse... |
| 2 | AI is transforming technology... |
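Summarization is only one generative task. The same GenAI family also exposes SNOWFLAKE.CORTEX.COMPLETE for free-form text generation. Here is a quick sketch against the same table (the model name is an example; check which models are available in your region):
------------------------------------------------------------------------
-- Generate new text from each document with COMPLETE
-- (the model name is an example; check availability in your region):
------------------------------------------------------------------------
SELECT doc_id,
       SNOWFLAKE.CORTEX.COMPLETE(
           'llama3-8b',
           CONCAT('Write a one-sentence tagline for this text: ', doc_text)
       ) AS tagline
FROM documents;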
Level 3: Intermediate AI Building
Level 3.1: Prompt Engineering
TL;DR: Refine AI responses by adding contextual information to your prompts. This helps you get more relevant and accurate outputs from your models and reduces hallucinations (when the LLM responds with a made-up or incorrect answer).
Key Takeaways:
- Add Context to Prompts: Prefix context to user queries to improve AI responses.
Example: AI-driven FAQ System Using Prompt Engineering
------------------------------------------------------------------------
-- Create a table for FAQ prompts:
------------------------------------------------------------------------
CREATE OR REPLACE TEMPORARY TABLE faq_prompts (
prompt_id INT,
user_query STRING
);
------------------------------------------------------------------------
-- Insert sample data:
------------------------------------------------------------------------
INSERT INTO faq_prompts VALUES
(1, 'How do I reset my password?'),
(2, 'What is the return policy?');
------------------------------------------------------------------------
-- Define the context:
------------------------------------------------------------------------
SET context = 'You are a customer support assistant. Provide helpful and concise responses: ';
------------------------------------------------------------------------
-- Generate responses with context:
------------------------------------------------------------------------
SELECT prompt_id, user_query,
CONCAT($context, user_query) AS full_prompt
FROM faq_prompts;
------------------------------------------------------------------------
-- Example Output:
------------------------------------------------------------------------
| prompt_id | user_query | full_prompt |
|-----------|----------------------------|-----------------------------|
| 1 | How do I reset my password?| You are a customer support assistant. Provide helpful and concise responses: How do I reset my password? |
| 2 | What is the return policy? | You are a customer support assistant. Provide helpful and concise responses: What is the return policy? |
Bakeoff Between Naive and Prompt-engineered Version
Now, let’s try both the naive and the prompt-engineered versions. For the naive version, pass the question straight to the COMPLETE function, and the model will answer from whatever information it was trained on. (NOTE: I am showing this example using ‘llama3-8b’; check which models are available in your region.)
SELECT SNOWFLAKE.CORTEX.COMPLETE('llama3-8b', 'How do I reset my password?');
------------------------------------------------------------------------
-- Example Output:
------------------------------------------------------------------------
"""I'd be happy to help you reset your password! However, I need a bit more information from you. Could you please provide me with the following details:
1. What is the account you're trying to reset the password for (e.g. email, social media, bank, etc.)?
2. Do you have access to the account's recovery email or phone number?
3. Do you have any security questions or answers set up for the account?
Once I have this information, I can guide you through the password reset process.
If you're trying to reset a password for a specific service or platform, you can usually find the password reset option on the login page. Here are the general steps:
1. Go to the login page of the service or platform.
2. Click on the "Forgot password" or "Reset password" link.
3. Enter your username or email address associated with the account.
4. Follow the prompts to reset your password. This may involve answering security questions, providing additional information, or verifying your identity.
If you're still having trouble, feel free to provide more details and I'll do my best to assist you!
"""
Now let’s look at the improvement from using simple prompt engineering:
SELECT
SNOWFLAKE.CORTEX.COMPLETE('llama3-8b', CONCAT($context, user_query))
FROM faq_prompts
WHERE prompt_id = 1;
------------------------------------------------------------------------
-- Example Output:
------------------------------------------------------------------------
"""
To reset your password, please follow these steps:
1. Go to our website and click on the "Forgot Password" link at the login page.
2. Enter your email address associated with your account and click "Submit".
3. You will receive an email with a password reset link. Click on the link to proceed.
4. Create a new password and confirm it. Make sure it meets our password requirements (at least 8 characters, including uppercase and lowercase letters, numbers, and special characters).
5. Click "Reset Password" to save your new password.
If you're still having trouble, feel free to reach out to me directly and I'll be happy to assist you further!
"""
This answer is so much better…and it is also still wrong. We haven’t told the LLM how to reset passwords, so it fell back on the most common instructions from its training data. The answer itself isn’t very helpful, but seeing the pieces work together is powerful.
Next, we will see a great way to control hallucinations by setting the context from which the model can pull information to answer more accurately.
Level 3.2: Retrieval-Augmented Generation (RAG)
TL;DR: Enhance AI accuracy by combining embeddings with context retrieval. This method, known as Retrieval-Augmented Generation (RAG), improves the relevance of AI-generated responses.
Key Takeaways:
- Create Simple Embeddings: Generate embeddings for your data using Snowflake's embedding model.
- Build RAG Context: Retrieve relevant context for a query.
- Integrate LLM with RAG Pipeline: Use Snowflake Cortex for a complete RAG system.
Example: Creating Embeddings and Building RAG Context
Try this to see how RAG can enhance your AI's accuracy and relevance! This is pulled from the Snowflake Quickstart: Build a Retrieval Augmented Generation (RAG) based LLM assistant using Streamlit and Snowflake Cortex.
------------------------------------------------------------------------
-- Create a database, schema, and stage for the quickstart
------------------------------------------------------------------------
CREATE DATABASE CC_QUICKSTART_CORTEX_DOCS;
CREATE SCHEMA DATA;
create or replace stage docs ENCRYPTION = (TYPE = 'SNOWFLAKE_SSE') DIRECTORY = ( ENABLE = true );
------------------------------------------------------------------------
-- Create a Python UDF that reads and chunks PDF files
------------------------------------------------------------------------
create or replace function pdf_text_chunker(file_url string)
returns table (chunk varchar)
language python
runtime_version = '3.9'
handler = 'pdf_text_chunker'
packages = ('snowflake-snowpark-python','PyPDF2', 'langchain')
as
$$
from snowflake.snowpark.types import StringType, StructField, StructType
from langchain.text_splitter import RecursiveCharacterTextSplitter
from snowflake.snowpark.files import SnowflakeFile
import PyPDF2, io
import logging
import pandas as pd

class pdf_text_chunker:

    def read_pdf(self, file_url: str) -> str:
        logger = logging.getLogger("udf_logger")
        logger.info(f"Opening file {file_url}")

        with SnowflakeFile.open(file_url, 'rb') as f:
            buffer = io.BytesIO(f.readall())

        reader = PyPDF2.PdfReader(buffer)
        text = ""
        for page in reader.pages:
            try:
                text += page.extract_text().replace('\n', ' ').replace('\0', ' ')
            except Exception:
                text = "Unable to Extract"
                logger.warning(f"Unable to extract from file {file_url}, page {page}")

        return text

    def process(self, file_url: str):
        text = self.read_pdf(file_url)

        text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=4000,      # Adjust this as you see fit
            chunk_overlap=400,    # This lets chunks overlap, which helps keep them contextual
            length_function=len
        )

        chunks = text_splitter.split_text(text)
        df = pd.DataFrame(chunks, columns=['chunks'])

        yield from df.itertuples(index=False, name=None)
$$;
------------------------------------------------------------------------
-- Build a Vector Store
------------------------------------------------------------------------
create or replace TABLE DOCS_CHUNKS_TABLE (
RELATIVE_PATH VARCHAR(16777216), -- Relative path to the PDF file
SIZE NUMBER(38,0), -- Size of the PDF
FILE_URL VARCHAR(16777216), -- URL for the PDF
SCOPED_FILE_URL VARCHAR(16777216), -- Scoped url
CHUNK VARCHAR(16777216), -- Piece of text
CHUNK_VEC VECTOR(FLOAT, 768) ); -- Embedding Vectors
insert into docs_chunks_table (relative_path, size, file_url,
scoped_file_url, chunk, chunk_vec)
select relative_path,
size,
file_url,
build_scoped_file_url(@docs, relative_path) as scoped_file_url,
func.chunk as chunk,
SNOWFLAKE.CORTEX.EMBED_TEXT_768('e5-base-v2',chunk) as chunk_vec
from
directory(@docs),
TABLE(pdf_text_chunker(build_scoped_file_url(@docs, relative_path))) as func;
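Before wrapping this in an app, it is worth sanity-checking retrieval directly in SQL: embed a question with the same model and pull back the nearest chunks by cosine similarity (the question text here is just an example):
------------------------------------------------------------------------
-- Quick retrieval check: pull the chunks most similar to a question
------------------------------------------------------------------------
SELECT relative_path,
       chunk,
       VECTOR_COSINE_SIMILARITY(
           chunk_vec,
           SNOWFLAKE.CORTEX.EMBED_TEXT_768('e5-base-v2', 'How do I reset my password?')
       ) AS similarity
FROM docs_chunks_table
ORDER BY similarity DESC
LIMIT 3;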
------------------------------------------------------------------------
-- Create a Streamlit app for your RAG pipeline
------------------------------------------------------------------------
import streamlit as st  # Import python packages
from snowflake.snowpark.context import get_active_session
session = get_active_session()  # Get the current credentials

import pandas as pd
pd.set_option("max_colwidth", None)

num_chunks = 3  # Number of chunks provided as context. Play with this to check how it affects your accuracy

def create_prompt(myquestion, rag):
    if rag == 1:
        cmd = """
        with results as
        (SELECT RELATIVE_PATH,
            VECTOR_COSINE_SIMILARITY(docs_chunks_table.chunk_vec,
                SNOWFLAKE.CORTEX.EMBED_TEXT_768('e5-base-v2', ?)) as similarity,
            chunk
        from docs_chunks_table
        order by similarity desc
        limit ?)
        select chunk, relative_path from results
        """
        df_context = session.sql(cmd, params=[myquestion, num_chunks]).to_pandas()

        context_length = len(df_context)
        prompt_context = ""
        for i in range(0, context_length):
            prompt_context += df_context._get_value(i, 'CHUNK')
        prompt_context = prompt_context.replace("'", "")
        relative_path = df_context._get_value(0, 'RELATIVE_PATH')

        prompt = f"""
        'You are an expert assistant extracting information from the context provided.
        Answer the question based on the context. Be concise and do not hallucinate.
        If you don't have the information just say so.
        Context: {prompt_context}
        Question:
        {myquestion}
        Answer: '
        """
        cmd2 = f"select GET_PRESIGNED_URL(@docs, '{relative_path}', 360) as URL_LINK from directory(@docs)"
        df_url_link = session.sql(cmd2).to_pandas()
        url_link = df_url_link._get_value(0, 'URL_LINK')
    else:
        prompt = f"""
        'Question:
        {myquestion}
        Answer: '
        """
        url_link = "None"
        relative_path = "None"

    return prompt, url_link, relative_path

def complete(myquestion, model_name, rag=1):
    prompt, url_link, relative_path = create_prompt(myquestion, rag)
    cmd = """
        select SNOWFLAKE.CORTEX.COMPLETE(?, ?) as response
    """
    df_response = session.sql(cmd, params=[model_name, prompt]).collect()
    return df_response, url_link, relative_path

def display_response(question, model, rag=0):
    response, url_link, relative_path = complete(question, model, rag)
    res_text = response[0].RESPONSE
    st.markdown(res_text)
    if rag == 1:
        display_url = f"Link to [{relative_path}]({url_link}) that may be useful"
        st.markdown(display_url)

# Main code
st.title("Asking Questions to Your Own Documents with Snowflake Cortex:")
st.write("""You can ask questions and decide if you want to use your documents for context or allow the model to create its own response.""")
st.write("This is the list of documents you already have:")
docs_available = session.sql("ls @docs").collect()
list_docs = []
for doc in docs_available:
    list_docs.append(doc["name"])
st.dataframe(list_docs)

# Here you can choose which LLM to use. Please note that they will have different cost & performance
model = st.sidebar.selectbox('Select your model:', (
    'mixtral-8x7b',
    'snowflake-arctic',
    'mistral-large',
    'llama3-8b',
    'llama3-70b',
    'reka-flash',
    'mistral-7b',
    'llama2-70b-chat',
    'gemma-7b'))

question = st.text_input("Enter question", placeholder="Is there any special lubricant to be used with the premium bike?", label_visibility="collapsed")

rag = st.sidebar.checkbox('Use your own documents as context?')
use_rag = 1 if rag else 0

if question:
    display_response(question, model, use_rag)
Congrats, you built a complete RAG Snowflake Streamlit app!
Level 4: Advanced AI Integration
TL;DR: Develop advanced AI applications by chaining LLMs, fine-tuning custom models, and deploying AI agents. These techniques allow for more sophisticated and autonomous AI solutions.
Key Takeaways:
- Chain LLMs: Use multiple LLMs to process complex queries and generate more refined responses.
- Fine-Tune Custom Models: Train and fine-tune models on domain-specific data.
- Deploy AI Agents: Create AI agents that can handle specific tasks autonomously.
Use Case: Chaining LLMs
Chaining LLMs involves using multiple LLMs sequentially to process complex queries and generate more accurate and refined responses. This approach can improve the quality of outputs, especially for nuanced tasks.
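Cortex does not ship a dedicated chaining function; a chain is simply one COMPLETE call feeding the next. Here is a minimal SQL sketch under that assumption, using a hypothetical support_tickets table and example model names (tune both prompts for your workload):
------------------------------------------------------------------------
-- Sketch: chain two COMPLETE calls -- the first condenses each ticket,
-- the second drafts a reply from that condensed version
-- (support_tickets and the model names are illustrative):
------------------------------------------------------------------------
WITH condensed AS (
    SELECT ticket_id,
           SNOWFLAKE.CORTEX.COMPLETE(
               'llama3-8b',
               CONCAT('Summarize this support ticket in two sentences: ', ticket_text)
           ) AS summary
    FROM support_tickets
)
SELECT ticket_id,
       SNOWFLAKE.CORTEX.COMPLETE(
           'mistral-large',
           CONCAT('Draft a short, polite reply to a customer whose issue is: ', summary)
       ) AS draft_reply
FROM condensed;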
Use Case: Fine-Tuning Custom Models
Fine-tuning involves training AI models on your domain-specific data to enhance their accuracy and relevance. This is crucial for industries with specialized vocabularies or unique data structures.
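Snowflake exposes fine-tuning through the Cortex FINETUNE function (availability varies by region, and the exact signature may evolve, so verify it in the documentation). A hedged sketch, assuming a hypothetical support_training_data table with prompt and completion columns:
------------------------------------------------------------------------
-- Sketch: start a fine-tuning job on prompt/completion pairs, check its
-- status, then call the tuned model like any other
-- (table name, model names, and exact signature are illustrative):
------------------------------------------------------------------------
SELECT SNOWFLAKE.CORTEX.FINETUNE(
    'CREATE',
    'my_support_model',                                       -- name for the tuned model
    'mistral-7b',                                             -- base model to tune
    'SELECT prompt, completion FROM support_training_data'    -- training data query
);

SELECT SNOWFLAKE.CORTEX.FINETUNE('SHOW');                     -- list fine-tuning jobs

SELECT SNOWFLAKE.CORTEX.COMPLETE('my_support_model', 'How do I reset my password?');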
Use Case: Deploying AI Agents
Deploying AI agents allows you to automate tasks like customer support or data analysis, creating autonomous solutions that enhance efficiency and user experience.
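There is no single “agent” object in Cortex today; an agent is a small orchestration loop you write around the building blocks above. Here is a hedged Python sketch, reusing the docs_chunks_table vector store from Level 3.2: the model first decides whether a question needs document context, then the code routes it to RAG retrieval or to a plain COMPLETE call. Treat it as a pattern, not a finished implementation:
------------------------------------------------------------------------
-- Sketch: a minimal "agent" loop in Snowpark Python
------------------------------------------------------------------------
from snowflake.snowpark.context import get_active_session

session = get_active_session()

def answer_with_context(question: str) -> str:
    # Retrieve the closest chunks from the vector store built in Level 3.2
    rows = session.sql("""
        SELECT chunk
        FROM docs_chunks_table
        ORDER BY VECTOR_COSINE_SIMILARITY(
            chunk_vec,
            SNOWFLAKE.CORTEX.EMBED_TEXT_768('e5-base-v2', ?)) DESC
        LIMIT 3
    """, params=[question]).collect()
    context = " ".join(row["CHUNK"] for row in rows)
    prompt = f"Answer using only this context: {context}\nQuestion: {question}"
    return session.sql(
        "SELECT SNOWFLAKE.CORTEX.COMPLETE('llama3-8b', ?) AS answer",
        params=[prompt],
    ).collect()[0]["ANSWER"]

def agent(question: str) -> str:
    # Step 1: ask the model which "tool" the question needs (a tiny routing decision)
    decision = session.sql(
        "SELECT SNOWFLAKE.CORTEX.COMPLETE('llama3-8b', ?) AS answer",
        params=[
            "Reply with the single word DOCS if this question should be answered "
            f"from our internal documents, otherwise reply GENERAL. Question: {question}"
        ],
    ).collect()[0]["ANSWER"]
    # Step 2: route to the chosen tool
    if "DOCS" in decision.upper():
        return answer_with_context(question)
    return session.sql(
        "SELECT SNOWFLAKE.CORTEX.COMPLETE('llama3-8b', ?) AS answer",
        params=[question],
    ).collect()[0]["ANSWER"]

print(agent("How do I reset my password?"))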
Level 5: [Coming Soon] Snowflake Cortex Hybrid Search
Snowflake is building a hybrid search capability (coming soon) that will simplify the setup of these advanced use cases. This is a game changer. I put it at Level 5 not because it’s more advanced (it’s actually simpler), but because it represents a quick, accessible way to build production-grade RAG pipelines in minutes.
Overview — Cortex Search Overview Page
Cortex Search enables low-latency, high-quality search over your Snowflake data. Cortex Search powers a broad array of search experiences for Snowflake users, including Retrieval Augmented Generation (RAG) applications leveraging Large Language Models (LLMs).
Cortex Search gets you up and running with a vector and keyword-based search engine on your text data in minutes, without worrying about embedding, infrastructure maintenance, search quality parameter tuning, or ongoing index refreshes. This means you can spend less time on infrastructure and search quality tuning and more time developing high-quality chat and search experiences using your data.
Example
------------------------------------------------------------------------
-- Create a Cortex Search Service on a sample customer support
-- transcript dataset:
------------------------------------------------------------------------
CREATE OR REPLACE TABLE support_transcripts (
transcript_text VARCHAR,
region VARCHAR,
agent_id VARCHAR
);
------------------------------------------------------------------------
-- Insert your example transcripts
------------------------------------------------------------------------
INSERT INTO support_transcripts VALUES
('My internet has been down since yesterday, can you help?', 'North America', 'AG1001'),
('I was overcharged for my last bill, need an explanation.', 'Europe', 'AG1002'),
('How do I reset my password? The email link is not working.', 'Asia', 'AG1003'),
('I received a faulty router, can I get it replaced?', 'North America', 'AG1004');
------------------------------------------------------------------------
-- Create your search service in one command!
------------------------------------------------------------------------
CREATE OR REPLACE CORTEX SEARCH SERVICE transcript_search_service
ON transcript_text
ATTRIBUTES region
WAREHOUSE = mywh
TARGET_LAG = '1 day'
AS (
SELECT
transcript_text,
region,
agent_id
FROM support_transcripts
);
This single command is all you need to trigger the build of the search service on your data.
In this example:
- Queries to the service will search for matches in the transcript_text column.
- The TARGET_LAG parameter dictates that the Cortex Search Service will check for updates to the base table support_transcripts approximately once per day.
- The columns region and agent_id will be indexed so that they can be returned along with the results of queries on the transcript_text column.
- The column region will be available as a filter column when querying the transcript_text column.
- The warehouse `mywh` will initially be used to materialize the results of the specified query, and again each time the base table changes.
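Once the service is created, you can confirm it exists and inspect its definition with standard metadata commands (syntax may evolve while the feature is in preview):
------------------------------------------------------------------------
-- Confirm the service exists and inspect its definition:
------------------------------------------------------------------------
SHOW CORTEX SEARCH SERVICES;
DESCRIBE CORTEX SEARCH SERVICE transcript_search_service;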
Grant Usage Permissions
GRANT USAGE ON DATABASE support_db TO ROLE customer_support;
GRANT USAGE ON SCHEMA services TO ROLE customer_support;
GRANT USAGE ON CORTEX SEARCH SERVICE transcript_search_service TO ROLE customer_support;
Query the Service
Query the service via either the Python API or the REST API (a hedged REST sketch follows the Python example below).
------------------------------------------------------------------------
-- Here is an example using the Python API:
------------------------------------------------------------------------
import os
import json

from snowflake.core import Root
from snowflake.connector import connect

# replace with hardcoded values if you wish; otherwise, ensure all values are in your environment.
CONNECTION_PARAMETERS = {
    "account": os.environ["SNOWFLAKE_ACCOUNT"],
    "user": os.environ["SNOWFLAKE_USER"],
    "password": os.environ["SNOWFLAKE_PASSWORD"],
    "role": os.environ["SNOWFLAKE_ROLE"],
    "database": os.environ["SNOWFLAKE_DATABASE"],
    "schema": os.environ["SNOWFLAKE_SCHEMA"],
}
svc = os.environ["SNOWFLAKE_CORTEX_SEARCH_SERVICE"]

# create a SnowflakeConnection instance
connection = connect(**CONNECTION_PARAMETERS)

# Replace with your search parameters
query = "internet issues"
columns = ["transcript_text", "region"]
limit = 1

try:
    # create a root as the entry point for all objects
    root = Root(connection)

    response = (
        root.databases[CONNECTION_PARAMETERS["database"]]
        .schemas[CONNECTION_PARAMETERS["schema"]]
        .cortex_search_services[svc]
        .search(
            query,
            columns,
            limit=limit
        )
    )

    print(f"Received response with `request_id`: {response.request_id}")
    print(json.dumps(response.results, indent=4))
finally:
    connection.close()
------------------------------------------------------------------------
-- Sample Successful Query Response:
------------------------------------------------------------------------
{
"results": [
{
"transcript_text": "My internet has been down since yesterday, can you help?",
"region": "North America"
}
],
"request_id": "5d8eaa5a-800c-493c-a561-134c712945ba"
}
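The REST API mirrors the same path used by the Python client. Here is a hedged Python sketch of the equivalent HTTP call; the endpoint path, the SNOWFLAKE_TOKEN environment variable, and the use of the requests library are assumptions based on Snowflake’s REST conventions at the time of writing, so verify them against the Cortex Search documentation:
------------------------------------------------------------------------
-- Sketch: the equivalent REST call (endpoint and auth are assumptions):
------------------------------------------------------------------------
import os
import requests

account_url = f"https://{os.environ['SNOWFLAKE_ACCOUNT']}.snowflakecomputing.com"
database = os.environ["SNOWFLAKE_DATABASE"]
schema = os.environ["SNOWFLAKE_SCHEMA"]
service = os.environ["SNOWFLAKE_CORTEX_SEARCH_SERVICE"]

response = requests.post(
    f"{account_url}/api/v2/databases/{database}/schemas/{schema}"
    f"/cortex-search-services/{service}:query",
    headers={
        # Assumes you already hold a valid OAuth or key-pair JWT token
        "Authorization": f"Bearer {os.environ['SNOWFLAKE_TOKEN']}",
        "Content-Type": "application/json",
        "Accept": "application/json",
    },
    json={
        "query": "internet issues",
        "columns": ["transcript_text", "region"],
        "limit": 1,
    },
)
print(response.json())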
The Cortex Search service returns all columns specified in your query's columns field. I hope you enjoyed this sneak peek at Cortex Search. Since details can change between the current limited-access preview and general availability, please check back soon for more examples of how to use this promising new feature.
Conclusion
Deploying AI and LLMs using Snowflake Cortex AI can transform your data workflows, making them more intelligent and efficient. Following this roadmap, you can incrementally enhance your AI capabilities from basic integration to advanced, scalable AI-driven applications. Stay tuned for more tips and tutorials to keep your AI journey on track!
Join our community of data professionals and founders building AI-powered data products. Share your experiences, challenges, and successes as we innovate with AI.
Don’t miss out on future posts—subscribe to Snowbits AI Kickstart for your weekly dose of AI knowledge and insights!