In the field of artificial intelligence, the success of a model is measured not only by its ability to solve a specific problem, but also by its reliability and the consistency of its performance over time. As AI is increasingly integrated into critical applications, continuous model monitoring has become essential. LangSmith is a platform that provides developers with the tools they need for troubleshooting and performance analysis of applications built on NLP models.
In this article, we will see how to integrate LangSmith into any LangChain application and we will explore what the tool can do.
How does LangSmith work?
LangSmith is based on two basic concepts: logging and tracing. Combined, these two elements give you deep insight into the behavior of the model, letting you monitor every decision it makes and quickly identify anomalies or unexpected behavior.
The term “traces” refers to detailed records of the operations performed internally by the system. These traces describe how the program behaved given its context (the data provided as input), allowing developers to analyze its internal functioning, identify anomalies, and evaluate its performance.
In the case of NLP models, traces can record the decisions made during processing: not only the final output, but also all the intermediate stages of the decision-making process.
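To make this more concrete, a single trace can be pictured as a record like the one below. This is only a conceptual sketch in Python, not the actual LangSmith schema; the field names simply mirror the kind of information the console displays:
# Conceptual sketch of the information a single trace carries.
# This is NOT the real LangSmith schema, only an illustration.
trace = {
    "name": "ChatOpenAI",  # the traced operation
    "inputs": {"prompt": "Write a post about a beach holiday"},
    "outputs": {"tags": "#beach #holiday", "text": "Sun, sand and sea..."},
    "latency_ms": 1240,    # execution time
    "total_tokens": 187,   # tokens consumed
}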
Prerequisites
For this tutorial we will start from the code implemented in our previous article on LangChain, in which we saw how to generate structured outputs. This very simple program uses a GPT model to generate posts to be published on social media.
For convenience, we report the code below:
from langchain_openai import ChatOpenAI
from langchain_core.pydantic_v1 import BaseModel, Field

class SocialPost(BaseModel):
    """Posts for social media"""
    tags: str = Field(description="Post tags")
    text: str = Field(description="Plain text of the post")

llm = ChatOpenAI(model="gpt-3.5-turbo")
structured_llm = llm.with_structured_output(SocialPost)

response = structured_llm.invoke("Can you write a post about a beach holiday?")
print(response)
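Running the program prints the structured object returned by the model. The exact content changes from run to run, but the output will look roughly like this (purely illustrative):
tags='#BeachHoliday #SummerVibes' text='There is nothing like the sound of the waves and the warmth of the sun...'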
Before continuing with the tutorial, you need to register on the LangSmith console: https://smith.langchain.com. The tool can be used for free as long as you do not exceed the maximum limit of 5k traces per month. For more information on costs, you can consult the following page: https://docs.smith.langchain.com/pricing.
Configuring LangSmith
Fortunately, LangChain is natively integrated with LangSmith, so to start collecting metrics you just need to configure the following environment variables:
export LANGCHAIN_API_KEY='...'
export LANGCHAIN_TRACING_V2='true'
export LANGCHAIN_PROJECT='social-post'
The LANGCHAIN_API_KEY variable must contain the key used to authenticate with LangSmith, which can be generated from the web console by clicking on the “Settings” menu item and then accessing the “API Keys” section:
API keys can only be read and saved when they are created. Later, it will be possible to delete and recreate them, but it will no longer be possible to view the key itself.
The LANGCHAIN_TRACING_V2 variable enables and disables the tracing functionality. To start using the tool, it must therefore be set to “true”.
Finally, the LANGCHAIN_PROJECT variable is optional and specifies the name of the project under which traces are collected. If not set, it defaults to “default”.
Once these three variables have been set, you can already start collecting data by running the Python program.
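If you prefer not to depend on shell exports, the same variables can also be set directly from Python, as long as this happens before any LangChain call is made. A minimal sketch (replace the placeholder with your real key, and avoid hard-coding secrets in production code):
import os

# Configure LangSmith tracing before any LangChain code runs.
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = "social-post"
os.environ["LANGCHAIN_API_KEY"] = "<your-api-key>"  # placeholder, not a real key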
Trace analysis
By accessing the LangSmith console, you can consult the traces collected within the previously defined project.
By clicking on the “Projects” tab and then selecting the project to analyze, you can view and inspect the individual traces:
In this screen, some very useful information is available for each trace:
- Input/Output
- Execution time (Latency)
- Tokens consumed
- Cost estimate
It is also possible to view aggregate information for the whole project in the left-hand bar.
If a trace is composed of multiple operations performed in sequence, you can view the details of each individual step by navigating through them with the menu on the right.
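The collected traces can also be read programmatically through the LangSmith SDK. A minimal sketch using its Client class; the exact attributes available on each run may differ between SDK versions:
from langsmith import Client

client = Client()  # picks up the API key from the environment

# Iterate over the runs collected under the project configured earlier.
for run in client.list_runs(project_name="social-post"):
    # end_time is None while a run is still in progress.
    latency = run.end_time - run.start_time if run.end_time else None
    print(run.name, run.run_type, latency, run.total_tokens)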
The @traceable decorator
For more precise monitoring, it is possible to isolate portions of code using the @traceable decorator to obtain a detailed trace of their executions. For example:
from langchain_openai import ChatOpenAI
from langchain_core.pydantic_v1 import BaseModel, Field
from langsmith import traceable

class SocialPost(BaseModel):
    """Posts for social media"""
    tags: str = Field(description="Post tags")
    text: str = Field(description="Plain text of the post")

llm = ChatOpenAI(model="gpt-3.5-turbo")
structured_llm = llm.with_structured_output(SocialPost)

@traceable
def invoke_llm():
    response = structured_llm.invoke("Can you write a post about a beach holiday?")
    return response

@traceable
def print_output(output):
    print(output)

@traceable
def execute():
    response = invoke_llm()
    print_output(response)

execute()
The result will be as follows:
As you can see from the image, every time the decorator is used, a dedicated trace is recorded inside the stack. In this specific case, you can also appreciate the possibility of defining a hierarchy between traces (for example, inside the “execute” function, both the “invoke_llm” and “print_output” functions were invoked). This makes it easy to identify what generated an anomaly or a performance bottleneck.
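The decorator also accepts optional arguments that make traces easier to read in the console, such as a custom name and a run type (both documented in the LangSmith SDK), applied here to the invoke_llm function from the example above:
@traceable(run_type="chain", name="Generate social post")
def invoke_llm():
    return structured_llm.invoke("Can you write a post about a beach holiday?")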
With its advanced debugging tools and a user-friendly interface, LangSmith is positioned as one of the most promising platforms in the field of model monitoring, helping you produce reliable, high-quality software.