Create external LLMs with code¶
The following workflow, designed for use with DataRobot Notebooks, outlines how to build and validate an external LLM using the DataRobot Python client. DataRobot recommends downloading this notebook and uploading it for use in the platform.
Note: For self-managed users, code samples that reference app.datarobot.com must be changed to the appropriate URL for your instance.
Setup¶
The following steps outline the configuration necessary for integrating an external LLM with the DataRobot platform.
Verify that the following feature flags are enabled. Contact your DataRobot representative or administrator for information on enabling these features.
- Enable Notebooks Filesystem Management
- Enable Proxy models
- Enable Public Network Access for all Custom Models
- Enable the Injection of Runtime Parameters for Custom Models
- Enable Monitoring Support for Generative Models
- Enable Custom Inference Models
Create a new credential in the DataRobot Credentials Management tool:
- Set it as an "API Token" type credential.
- Set the display name to OPENAI_API_KEY.
- Place your OpenAI API key in the Token field.
Add the notebook environment variables OPENAI_API_BASE, OPENAI_API_KEY, OPENAI_API_VERSION, and OPENAI_DEPLOYMENT_NAME; set the values with your Azure OpenAI credentials.
Set the notebook session timeout to 180 minutes.
Install libraries¶
Install the following libraries:
!pip install "langchain==0.0.244" \
"openai==0.27.8" \
"datarobotx==0.1.25"
import datarobot as dr
import datarobotx as drx
from datarobot.models.genai.custom_model_llm_validation import CustomModelLLMValidation
Connect to DataRobot¶
Read more about different options for connecting to DataRobot from the Python client.
endpoint = "https://app.datarobot.com/api/v2"
token = "<ADD_VALUE_HERE>"
dr.Client(endpoint=endpoint, token=token)
drx.Context(token=token, endpoint=endpoint)
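Rather than hard-coding the token in the notebook, you can read the connection details from the environment. The sketch below assumes the variable names DATAROBOT_ENDPOINT and DATAROBOT_API_TOKEN, which the Python client also recognizes when dr.Client() is called with no arguments:

```python
import os

# Assumed variable names; set these in your notebook or shell environment.
# Falls back to the placeholder values used elsewhere in this walkthrough.
endpoint = os.environ.get("DATAROBOT_ENDPOINT", "https://app.datarobot.com/api/v2")
token = os.environ.get("DATAROBOT_API_TOKEN", "<ADD_VALUE_HERE>")
```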
Define hooks for deploying a text generation custom model¶
The following cell defines the methods used to deploy a text generation custom model. These include loading the custom model and using the model for scoring.
import os
import pandas as pd
OPENAI_API_BASE = os.environ.get("OPENAI_API_BASE", "<ADD_VALUE_HERE>")
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", "<ADD_VALUE_HERE>")
OPENAI_API_TYPE = os.environ.get("OPENAI_API_TYPE", "azure")
OPENAI_API_VERSION = os.environ.get("OPENAI_API_VERSION", "<ADD_VALUE_HERE>")
OPENAI_DEPLOYMENT_NAME = os.environ.get("OPENAI_DEPLOYMENT_NAME", "<ADD_VALUE_HERE>")

PROMPT_COLUMN_NAME = "prompt"
COMPLETION_COLUMN_NAME = "completion"
ERROR_COLUMN_NAME = "error"
def load_model(*args, **kwargs):
    """Custom model hook for loading our LLM."""
    import os

    from langchain.chat_models import AzureChatOpenAI

    try:
        import datarobot_drum as drum

        api_key = drum.RuntimeParameters.get("OPENAI_API_KEY")["apiToken"]
    except Exception:
        api_key = os.environ.get("OPENAI_API_KEY", "<ADD_VALUE_HERE>")

    llm = AzureChatOpenAI(
        deployment_name=OPENAI_DEPLOYMENT_NAME,
        openai_api_type=OPENAI_API_TYPE,
        openai_api_base=OPENAI_API_BASE,
        openai_api_version=OPENAI_API_VERSION,
        openai_api_key=api_key,
        model_name=OPENAI_DEPLOYMENT_NAME,
        temperature=0.4,
        verbose=True,
        max_retries=0,
        request_timeout=20,
    )
    return llm
def score(data: pd.DataFrame, model, **kwargs):
    """Custom model hook for making predictions with our LLM.

    When requesting predictions from the deployment, pass a pandas
    DataFrame with the PROMPT_COLUMN_NAME column.
    datarobot-user-models (DRUM) handles loading the model and calling
    this function with the appropriate parameters.
    """
    import pandas as pd

    llm = model
    completions = []
    errors = []
    prompts = data[PROMPT_COLUMN_NAME].tolist()
    for prompt in prompts:
        completion = None
        error = None
        try:
            completion = llm.predict(prompt)
        except Exception as e:
            error = f"{e.__class__.__name__}: {str(e)}"
        completions.append(completion)
        errors.append(error)
    return pd.DataFrame(
        {
            PROMPT_COLUMN_NAME: prompts,
            COMPLETION_COLUMN_NAME: completions,
            ERROR_COLUMN_NAME: errors,
        }
    )
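Because score catches exceptions per prompt, one failing request does not abort the batch; the failure is recorded in the error column instead. The standalone sketch below, with a stub model in place of the Azure client, mirrors that per-row try/except pattern:

```python
import pandas as pd


class StubLLM:
    """Stand-in for the Azure LLM; raises on prompts containing 'fail'."""

    def predict(self, prompt):
        if "fail" in prompt:
            raise ValueError("simulated API error")
        return f"echo: {prompt}"


def score_rows(data, model):
    # Same per-row error handling as the score hook above
    completions, errors = [], []
    for prompt in data["prompt"]:
        completion = None
        error = None
        try:
            completion = model.predict(prompt)
        except Exception as e:
            error = f"{e.__class__.__name__}: {e}"
        completions.append(completion)
        errors.append(error)
    return pd.DataFrame(
        {"prompt": data["prompt"], "completion": completions, "error": errors}
    )


result = score_rows(pd.DataFrame({"prompt": ["hello", "please fail"]}), StubLLM())
```

The second row ends up with a populated error column while the first completes normally, so a partially failed batch still returns one row per prompt.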
Test hooks locally¶
Before proceeding with the deployment, use the cells below to test that the custom model hooks function correctly.
import pandas as pd
# Test the hooks locally
score(
    pd.DataFrame(
        {
            PROMPT_COLUMN_NAME: ["What is a large language model (LLM)?"],
        }
    ),
    load_model(),
)
Deploy the LLM¶
The cell below uses a convenience method that does the following:
- Deploys a text generation external model (LLM) to DataRobot.
- Returns an object that can be used to make predictions.
This example uses a pre-built environment. For shorter iteration cycles on the custom model hooks, you can instead provide an environment_id that references an existing custom model environment. You can view your account's pre-built environments in the model workshop.
deployment = drx.deploy(
    model=None,
    name="External Azure OpenAI LLM",
    hooks={
        "score": score,
        "load_model": load_model,
    },
    runtime_parameters=["OPENAI_API_KEY"],
    extra_requirements=["langchain==0.0.244", "openai==0.27.8"],
    environment_id=dr.ExecutionEnvironment.list("Python 3.9 GenAI")[0].id,
    target_type="TextGeneration",
    target=COMPLETION_COLUMN_NAME,
)
Test the deployment¶
Test that the deployment can successfully provide responses to prompts.
deployment.predict(
    pd.DataFrame(
        {
            PROMPT_COLUMN_NAME: [
                "Give me some context on large language models and their applications.",
                "What is AutoML?",
            ],
        }
    )
)
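Because the score hook returns an error column alongside each completion, failed prompts can be surfaced after a batch prediction. A minimal sketch, using mock response data in the same shape as the hook's output:

```python
import pandas as pd

# Hypothetical response frame matching the shape the score hook returns;
# the completion and error values here are mock data for illustration
response = pd.DataFrame(
    {
        "prompt": ["What is AutoML?", "What is an LLM?"],
        "completion": ["Automated machine learning.", None],
        "error": [None, "Timeout: Request timed out"],
    }
)

# Keep only the rows where the LLM request failed
failed = response[response["error"].notna()]
print(f"{len(failed)} of {len(response)} prompts failed")
```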
Validate the external LLM¶
The following methods validate the external LLM.
This example associates a Use Case with the validation and creates the vector database within that Use Case. Set use_case_id to specify an existing Use Case, or create a new one.
use_case_id = "<ADD_VALUE_HERE>"
use_case = dr.UseCase.get(use_case_id)

# Uncomment to create a new Use Case instead
# use_case = dr.UseCase.create()
CustomModelLLMValidation.create executes the validation of the external LLM. Be sure to provide the deployment ID.
external_llm_validation = CustomModelLLMValidation.create(
    prompt_column_name=PROMPT_COLUMN_NAME,
    target_column_name=COMPLETION_COLUMN_NAME,
    deployment_id=deployment.dr_deployment.id,
    name="My External LLM",
    use_case=use_case,
    wait_for_completion=True,
)

assert external_llm_validation.validation_status == "PASSED"
print(f"External LLM Validation ID: {external_llm_validation.id}")
This external LLM can now be used in the GenAI E2E walkthrough, for example, to create the LLM blueprint.