By using asyncio and the aiohttp library, you can send multiple requests simultaneously, significantly reducing wait time and improving response time for batch operations.
Efficiency and Scalability: Asynchronous requests enable non-blocking operations, making your application more efficient and scalable, especially when making multiple simultaneous API calls.

Prerequisites

  • Ensure you have the aiohttp library installed: pip install aiohttp
  • An API key stored securely, preferably in a PARADIGM_API_KEY environment variable.

Asynchronous Programming with asyncio

asyncio is a Python library that provides a framework for writing concurrent code using the async and await syntax. It is particularly useful for high-level structured network code and I/O-bound operations. You can find more information in the official asyncio documentation.
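The non-blocking behavior can be seen in a minimal, self-contained sketch with no API involved: several simulated I/O waits run concurrently under asyncio.gather, so the total wall time is close to the longest single wait rather than the sum.

```python
import asyncio
import time

async def fake_request(i):
    # Stand-in for an I/O-bound call (e.g. an HTTP request)
    await asyncio.sleep(0.1)
    return i * 2

async def main():
    start = time.time()
    # gather schedules all five coroutines concurrently
    results = await asyncio.gather(*(fake_request(i) for i in range(5)))
    return results, time.time() - start

results, elapsed = asyncio.run(main())
print(results)  # [0, 2, 4, 6, 8]
print(f"{elapsed:.2f}s")  # close to 0.1 s, not the 0.5 s a sequential loop would take
```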

Configuring the Asynchronous HTTP Client

To use asynchronous features, use aiohttp.ClientSession. Initialize the session with your API key and the appropriate endpoint.
import aiohttp
import asyncio
import time
import os

# Retrieve authentication information
api_key = os.getenv("PARADIGM_API_KEY")
base_url = os.getenv("PARADIGM_BASE_URL", "https://paradigm.lighton.ai/api/v2")

names = ["Alice", "Bob", "Charlie", "David", "Emma", "Frank", "Grace", "Hannah", "Ian", "Jessica", "Kevin", "Linda", "Michael", "Nancy", "Olivia", "Peter", "Quincy", "Rachel", "Samuel", "Tiffany"]

# as an example we will use this batch of messages
messages_batch = [[{"role": "user", "content": f"Say hello to the new user on Paradigm! Give a short one-sentence highly personalized welcome to: {name}"}] for name in names]

Asynchronous API Calls

Here’s how to implement asynchronous API calls:
Create an async function that sends a message to the API and awaits the response.
async def send_message(session, messages, model="alfred-4.2", temperature=0.7, max_tokens=150):
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }

    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens
    }

    async with session.post(f"{base_url}/chat/completions", headers=headers, json=payload) as response:
        data = await response.json()
        return data
Use asyncio.gather to send multiple requests simultaneously. This function waits for all futures (asynchronous operations) to complete.
async def main():
    start_time = time.time()

    async with aiohttp.ClientSession() as session:
        tasks = [send_message(session, messages, model="alfred-4.2", temperature=0.4) for messages in messages_batch]
        responses = await asyncio.gather(*tasks)

    duration = time.time() - start_time
    print(f"Asynchronous execution took {duration:.2f} seconds.")

    for response in responses:
        print(response["choices"][0]["message"]["content"])

    return responses
Use asyncio.run() to execute the main function, which handles all asynchronous operations.
if __name__ == "__main__":
    asyncio.run(main())

Complete Example

Here is the complete example code:
import aiohttp
import asyncio
import time
import os

# Retrieve authentication information
api_key = os.getenv("PARADIGM_API_KEY")
base_url = os.getenv("PARADIGM_BASE_URL", "https://paradigm.lighton.ai/api/v2")

names = ["Alice", "Bob", "Charlie", "David", "Emma", "Frank", "Grace", "Hannah", "Ian", "Jessica", "Kevin", "Linda", "Michael", "Nancy", "Olivia", "Peter", "Quincy", "Rachel", "Samuel", "Tiffany"]

# as an example we will use this batch of messages
messages_batch = [[{"role": "user", "content": f"Say hello to the new user on Paradigm! Give a short one-sentence highly personalized welcome to: {name}"}] for name in names]

async def send_message(session, messages, model="alfred-4.2", temperature=0.7, max_tokens=150):
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }

    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens
    }

    async with session.post(f"{base_url}/chat/completions", headers=headers, json=payload) as response:
        data = await response.json()
        return data

async def main():
    start_time = time.time()

    async with aiohttp.ClientSession() as session:
        tasks = [send_message(session, messages, model="alfred-4.2", temperature=0.4) for messages in messages_batch]
        responses = await asyncio.gather(*tasks)

    duration = time.time() - start_time
    print(f"Asynchronous execution took {duration:.2f} seconds.")

    for response in responses:
        print(response["choices"][0]["message"]["content"])

    return responses

if __name__ == "__main__":
    asyncio.run(main())

Comparison with Synchronous Execution

Compared with traditional synchronous (sequential) execution, asynchronous operations generally complete in far less time, and the gain grows with the number and duration of the individual requests. This is particularly true for I/O-bound tasks like API requests: because asynchronous execution is non-blocking, other tasks proceed while each request waits on the network.
Best Practices
  • Always use await with async functions to avoid runtime errors.
  • Reuse the same ClientSession for multiple requests to improve performance.
  • Always close the session properly using the async with context manager.
  • For Jupyter notebooks, run asynchronous code from a separate Python script, e.g. with !python file_to_execute.py in a cell: the notebook already runs its own event loop, which conflicts with asyncio.run().
By incorporating asynchronous requests into your application, you can achieve greater efficiency and scalability, particularly when handling a large number of API calls.
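For very large batches it can also help to cap how many requests are in flight at once, so the API is not flooded. Here is a minimal sketch using asyncio.Semaphore, with asyncio.sleep standing in for the aiohttp request; the limit of 3 is an arbitrary example value.

```python
import asyncio

async def limited_call(semaphore, i):
    # At most max_concurrency coroutines pass this point at once
    async with semaphore:
        await asyncio.sleep(0.05)  # stand-in for an API request
        return i

async def run_batch(n=9, max_concurrency=3):
    semaphore = asyncio.Semaphore(max_concurrency)
    return await asyncio.gather(*(limited_call(semaphore, i) for i in range(n)))

results = asyncio.run(run_batch())
print(results)  # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```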

Complete Comparison Example

To compare synchronous and asynchronous API calls in a practical scenario, save the following code as speed_test.py. It implements both synchronous and asynchronous API requests; running it shows the difference in execution time and demonstrates the efficiency of asynchronous programming for batch requests.
speed_test.py
import os
import time
import requests
import aiohttp
import asyncio

# Configuration
api_key = os.getenv("PARADIGM_API_KEY")
base_url = os.getenv("PARADIGM_BASE_URL", "https://paradigm.lighton.ai/api/v2")

names = ["Alice", "Bob", "Charlie", "David", "Emma", "Frank", "Grace", "Hannah", "Ian", "Jessica", "Kevin", "Linda", "Michael", "Nancy", "Olivia", "Peter", "Quincy", "Rachel", "Samuel", "Tiffany"]

# Synchronous function to send messages
def sync_send_message(name):
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }

    messages = [{"role": "user", "content": f"Say hello to the new user on Paradigm! Give a short one-sentence highly personalized welcome to: {name}"}]

    payload = {
        "model": "alfred-4.2",
        "messages": messages,
        "temperature": 0.4,
        "max_tokens": 150
    }

    response = requests.post(f"{base_url}/chat/completions", headers=headers, json=payload)
    return response.json()

# Asynchronous function to send messages
async def async_send_message(session, messages, model="alfred-4.2", temperature=0.4, max_tokens=150):
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }

    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens
    }

    async with session.post(f"{base_url}/chat/completions", headers=headers, json=payload) as response:
        data = await response.json()
        return data

def sync_main():
    responses = []
    start_time = time.time()
    for name in names:
        response = sync_send_message(name)
        responses.append(response)
    duration = time.time() - start_time
    print(f"Synchronous execution took {duration:.2f} seconds.")

async def async_main():
    start_time = time.time()

    async with aiohttp.ClientSession() as session:
        tasks = [
            async_send_message(
                session,
                [{"role": "user", "content": f"Say hello to the new user on Paradigm! Give a short one-sentence highly personalized welcome to: {name}"}],
                model="alfred-4.2",
                temperature=0.4
            )
            for name in names
        ]
        responses = await asyncio.gather(*tasks)

    duration = time.time() - start_time
    print(f"Asynchronous execution took {duration:.2f} seconds.")

if __name__ == "__main__":
    async_start_time = time.time()
    asyncio.run(async_main())
    async_times = time.time() - async_start_time

    sync_start_time = time.time()
    sync_main()
    sync_times = time.time() - sync_start_time

    improvement_factor = sync_times / async_times

    print(f"Improvement factor: {improvement_factor:.2f}")
To run the comparison:
  1. Save the code above as speed_test.py.
  2. Run it from a terminal with python speed_test.py, or from a Jupyter notebook cell with !python speed_test.py.
This script first runs the asynchronous version and prints its total execution time, then does the same for the synchronous version. Comparing the two times illustrates the efficiency gains achievable with asynchronous API calls. In our case, we obtained the following output:
Asynchronous execution took 1.86 seconds.
Synchronous execution took 10.33 seconds.
Improvement factor: 5.55

Error Handling and Timeout

It’s important to add robust error handling for asynchronous API calls:
async def send_message_with_retry(session, messages, model="alfred-4.2", temperature=0.7, max_tokens=150, max_retries=3):
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }

    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens
    }

    for attempt in range(max_retries):
        try:
            async with session.post(
                f"{base_url}/chat/completions",
                headers=headers,
                json=payload,
                timeout=aiohttp.ClientTimeout(total=30)
            ) as response:
                response.raise_for_status()
                data = await response.json()
                return data
        except (aiohttp.ClientError, asyncio.TimeoutError) as e:
            if attempt == max_retries - 1:
                raise
            await asyncio.sleep(2 ** attempt)  # Exponential backoff
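When gathering a whole batch, a single request that exhausts its retries makes asyncio.gather raise and discards the other results. Passing return_exceptions=True returns exceptions as values instead, so successes and failures can be collected separately. A minimal sketch, with a stand-in coroutine in place of a real call like send_message_with_retry:

```python
import asyncio

async def flaky_call(i):
    # Stand-in for a real API call such as send_message_with_retry
    if i % 2:
        raise RuntimeError(f"request {i} failed")
    return {"id": i}

async def run_batch(n=4):
    # return_exceptions=True yields exceptions as results instead of raising,
    # so one failure does not discard the rest of the batch
    results = await asyncio.gather(*(flaky_call(i) for i in range(n)),
                                   return_exceptions=True)
    ok = [r for r in results if not isinstance(r, Exception)]
    failed = [r for r in results if isinstance(r, Exception)]
    return ok, failed

ok, failed = asyncio.run(run_batch())
print(f"{len(ok)} succeeded, {len(failed)} failed")  # 2 succeeded, 2 failed
```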

Conclusion

Leveraging asynchronous API requests via aiohttp can significantly improve application performance and scalability. As demonstrated, asynchronous execution was more than 5 times faster than the synchronous equivalent in our test. This approach is essential for handling high-volume API interactions efficiently.