Using the openai client with Paradigm, you can leverage the power of asynchronous programming in Python to speed up your API requests. With asyncio and the AsyncOpenAI client, you can send multiple requests concurrently, significantly reducing the total wait time for bulk operations.
Asynchronous requests allow for non-blocking operations, making your application more efficient and scalable, especially when dealing with multiple simultaneous API calls.
Prerequisites
- Ensure you have the openai library installed (e.g. via pip install --upgrade openai) with support for asynchronous operations.
- An API key stored securely, preferably in the PARADIGM_API_KEY environment variable.
Asynchronous Programming with asyncio
asyncio is a Python library that provides a framework for writing concurrent code using the async and await syntax. It is particularly useful for I/O-bound and high-level structured network code. You can find more information in the official asyncio documentation.
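As a minimal standalone sketch, independent of Paradigm, the snippet below uses asyncio.sleep to stand in for an I/O-bound call such as an API request: three one-second waits finish in about one second total because they run concurrently.

import asyncio
import time

async def fake_io_call(i):
    # asyncio.sleep stands in for a network request
    await asyncio.sleep(1)
    return f"result {i}"

async def demo():
    start = time.time()
    results = await asyncio.gather(*(fake_io_call(i) for i in range(3)))
    print(results, f"in {time.time() - start:.2f}s")  # ~1s, not ~3s

asyncio.run(demo())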
Setting Up AsyncOpenAI Client
To make asynchronous calls, use AsyncOpenAI instead of OpenAI. Initialize the client with your API key and the appropriate endpoint.
from openai import AsyncOpenAI
import asyncio
import os
import time

# Initialize the asynchronous client.
# If you have a private Paradigm deployment,
# replace the base_url argument with your own.
async_client = AsyncOpenAI(api_key=os.getenv("PARADIGM_API_KEY"), base_url="https://paradigm.lighton.ai/api/v2")
names = ["Alice", "Bob", "Charlie", "David", "Emma", "Frank", "Grace", "Hannah", "Ian", "Jessica", "Kevin", "Linda", "Michael", "Nancy", "Olivia", "Peter", "Quincy", "Rachel", "Samuel", "Tiffany"]
# As an example, we will use this batch of messages
messages_batch = [[{"role": "user", "content": f"Say hello to the new user on Paradigm! Give a one sentence short highly personalized welcome to: {name}"}] for name in names]
Asynchronous API Calls
Here's how to implement asynchronous API calls:
- Define an Async Function for Sending Messages:
Create an async function that sends a message to the API and awaits the response.
async def send_message(messages, **kwargs):
    # Await the response from the asynchronous client
    response = await async_client.chat.completions.create(messages=messages, **kwargs)
    return response
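On its own, this coroutine can be tried on a single request before batching. A minimal usage sketch, reusing messages_batch from above:

async def single_example():
    response = await send_message(messages_batch[0], model="alfred-40b-1123", temperature=0.4)
    print(response.choices[0].message.content)

asyncio.run(single_example())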
- Sending Requests Concurrently:
Use asyncio.gather to send multiple requests concurrently. This function waits for all futures (asynchronous operations) to complete.
async def main():
    start_time = time.time()
    tasks = [send_message(messages, model="alfred-40b-1123", temperature=0.4) for messages in messages_batch]
    responses = await asyncio.gather(*tasks)
    duration = time.time() - start_time
    print(f"Async execution took {duration} seconds.")
    for response in responses:
        print(response.choices[0].message.content)
    return responses
- Running the Asynchronous Main Function:
Use asyncio.run() to execute the main function, which handles all asynchronous operations.
if __name__ == "__main__":
    asyncio.run(main())
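If you need to cap how many requests are in flight at once, for example to respect rate limits, a common pattern is an asyncio.Semaphore. The sketch below builds on the async_client and messages_batch defined above; the limit of 5 is an arbitrary example value.

async def limited_main(max_concurrent=5):
    # The semaphore caps the number of simultaneous requests
    semaphore = asyncio.Semaphore(max_concurrent)

    async def send_limited(messages):
        async with semaphore:
            return await async_client.chat.completions.create(
                messages=messages, model="alfred-40b-1123", temperature=0.4
            )

    tasks = [send_limited(messages) for messages in messages_batch]
    return await asyncio.gather(*tasks)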
Comparison with Synchronous Execution
When comparing asynchronous execution to traditional synchronous (sequential) execution, asynchronous operations generally complete in significantly less time: roughly 3 times faster in this example, with potential for even greater improvements depending on the length of the individual requests. This is particularly true for I/O-bound tasks such as API requests. The efficiency gains stem from the non-blocking nature of asynchronous execution, which allows other tasks to proceed without waiting for I/O operations to finish.
- Always use await with async functions to avoid runtime errors (see the error-handling sketch after this list).
- Use AsyncOpenAI for asynchronous operations to ensure non-blocking calls.
- For Jupyter notebooks, run asynchronous code via a separate Python script, executed with !python file_to_execute.py in a cell, to avoid event loop issues.
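By default, asyncio.gather raises the first exception it encounters, which discards the results of requests that did succeed. Passing return_exceptions=True collects failures alongside successful responses instead. A sketch, reusing send_message and messages_batch from above:

async def main_with_error_handling():
    tasks = [send_message(messages, model="alfred-40b-1123", temperature=0.4) for messages in messages_batch]
    # Exceptions are returned as results instead of being raised
    results = await asyncio.gather(*tasks, return_exceptions=True)
    for result in results:
        if isinstance(result, Exception):
            print(f"Request failed: {result}")
        else:
            print(result.choices[0].message.content)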
By incorporating asynchronous requests into your application, you can achieve greater efficiency and scalability, particularly when dealing with large numbers of API calls.
Full example for comparison
To compare synchronous and asynchronous API calls in a practical scenario, you can use the following snippet. It creates a Python file, speed_test.py, implementing both synchronous and asynchronous API requests. You can then run this script to observe the difference in execution time, demonstrating the efficiency of asynchronous programming for batched requests.
%%writefile speed_test.py
import os
import time
from openai import OpenAI, AsyncOpenAI
import asyncio
# Synchronous Client Setup
# If you have a private Paradigm deployment,
# replace the base_url argument with your own.
sync_client = OpenAI(api_key=os.getenv("PARADIGM_API_KEY"), base_url="https://paradigm.lighton.ai/api/v2")

# Asynchronous Client Setup
# If you have a private Paradigm deployment,
# replace the base_url argument with your own.
async_client = AsyncOpenAI(api_key=os.getenv("PARADIGM_API_KEY"), base_url="https://paradigm.lighton.ai/api/v2")
names = ["Alice", "Bob", "Charlie", "David", "Emma", "Frank", "Grace", "Hannah", "Ian", "Jessica", "Kevin", "Linda", "Michael", "Nancy", "Olivia", "Peter", "Quincy", "Rachel", "Samuel", "Tiffany"]
# Synchronous function to send messages
def sync_send_message(name):
    messages = [{"role": "user", "content": f"Say hello to the new user on Paradigm! Give a one sentence short highly personalized welcome to: {name}"}]
    response = sync_client.chat.completions.create(messages=messages, model="alfred-40b-1123", temperature=0.4)
    return response

# Asynchronous function to send messages
async def async_send_message(messages, **kwargs):
    response = await async_client.chat.completions.create(messages=messages, **kwargs)
    return response
def sync_main():
    responses = []
    start_time = time.time()
    for name in names:
        response = sync_send_message(name)
        responses.append(response)
    duration = time.time() - start_time
    print(f"Synchronous execution took {duration} seconds.")
async def async_main():
    start_time = time.time()
    tasks = [async_send_message([{"role": "user", "content": f"Say hello to the new user on Paradigm! Give a one sentence short highly personalized welcome to: {name}"}], model="alfred-40b-1123", temperature=0.4) for name in names]
    responses = await asyncio.gather(*tasks)
    duration = time.time() - start_time
    print(f"Asynchronous execution took {duration} seconds.")
if __name__ == "__main__":
    async_start_time = time.time()
    asyncio.run(async_main())
    async_time = time.time() - async_start_time

    sync_start_time = time.time()
    sync_main()
    sync_time = time.time() - sync_start_time

    improvement_factor = sync_time / async_time
    print(f"Improvement factor: {improvement_factor}")
To execute the comparison:
- Run the above code snippet in a Jupyter notebook cell to create speed_test.py.
- Execute the script within the Jupyter notebook or a terminal using the command !python speed_test.py.
This script first runs the asynchronous version, printing its total execution time, then runs the synchronous version and does the same. Comparing the two execution times illustrates the efficiency gains achievable with asynchronous API calls.
In our case we got the following output:
Asynchronous execution took 7.048963308334351 seconds.
Synchronous execution took 19.48901128768921 seconds.
Improvement factor: 2.7641254222771914
Conclusion
Leveraging asynchronous API requests via the AsyncOpenAI client can drastically improve application performance and scalability. As demonstrated, asynchronous execution can be nearly 3 times faster than synchronous execution, offering significant efficiency gains. This approach is essential for handling high-volume API interactions efficiently.