O.RI AF [ AGENTIC FRAMEWORK ]

Overview

The current landscape of agentic frameworks faces two significant challenges: inconsistent agent performance and communication bottlenecks between agents. To address these issues, we are developing a novel approach that uses routing intelligence to select the most suitable model for each step in a task chain, coupled with an evaluation layer that ensures output quality at every step. This contrasts with existing frameworks, which rely on a single language model for all tasks and perform no intermediate evaluation. The new approach also leverages advanced hardware, specifically the ATLAS Cerebras cluster, to increase inference speed by 7-15x, reaching up to 1,300 tokens per second; this directly addresses the communication bottleneck caused by slower inference in traditional setups. To support this framework, a comprehensive suite of services, tools, and APIs is being integrated, covering search, social media, image and video processing, web scraping, music, sports, finance, and various utility functions. Finally, the ecosystem encourages community contributions through an API Hub and a Confidential Cloud Functions Hub, incentivizing rapid expansion of capabilities while maintaining user privacy and data security.

  • Missing pieces in current industry agentic frameworks
  • Current agentic frameworks do not work well for two reasons:
    i. Agent performance across the chain steps is inconsistent.
    ii. Agent-to-agent communication is the bottleneck.
  • How we do things differently, and why our approach works:
    i. Current agentic frameworks use a single LLM to handle every task in the chain, with no evaluation layer between one agent's output and the next agent's input, which causes inconsistent agent performance. We instead use routing intelligence that selects the best model to handle each task in the chain, plus an evaluation layer, to ensure that each step's output is of the best possible quality (see the sketch after this list).
    ii. The main cause of the agent-to-agent communication bottleneck is limited inference speed. The average inference speed for a 7B LLM hosted on an NVIDIA H100 GPU is 80-200 tokens/sec. With our ATLAS Cerebras cluster, we solve this bottleneck with 7-15x faster inference (1,300 tokens/sec).
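
To make the routing-plus-evaluation pattern concrete, here is a minimal Python sketch. Every name in it (Model, MODEL_POOL, route_model, evaluate, run_chain) is a hypothetical illustration of the pattern, not O.RI AF's actual interface: a router picks the best-suited model for each chain step, and an evaluator gates each output before it becomes the next step's input.

```python
# Hypothetical sketch of per-step routing plus an evaluation layer.
# None of these names are O.RI AF's real interfaces; they only
# illustrate the pattern described above.

from dataclasses import dataclass

@dataclass
class Model:
    name: str
    strengths: set  # task categories this model handles well

    def generate(self, prompt: str) -> str:
        # Placeholder for a real inference call (e.g., to the Cerebras cluster).
        return f"[{self.name}] output for: {prompt[:40]}"

MODEL_POOL = [
    Model("summarizer-7b", {"summarize"}),
    Model("coder-13b", {"code"}),
    Model("generalist-70b", {"summarize", "code", "reason"}),
]

def route_model(task_category: str) -> Model:
    """Routing intelligence: select the best-suited model for this step."""
    candidates = [m for m in MODEL_POOL if task_category in m.strengths]
    return candidates[0] if candidates else MODEL_POOL[-1]

def evaluate(output: str) -> float:
    """Evaluation layer: score a step's output before it feeds the next step.
    A real evaluator might be an LLM judge or task-specific checks."""
    return 1.0 if output.strip() else 0.0

def run_chain(steps, threshold=0.8):
    context = ""
    for category, instruction in steps:
        model = route_model(category)
        output = model.generate(f"{instruction} | context: {context}")
        if evaluate(output) < threshold:
            # Retry on the strongest model rather than passing a weak
            # output along, keeping per-step quality consistent.
            output = MODEL_POOL[-1].generate(f"{instruction} | context: {context}")
        context = output  # evaluated output becomes the next step's input
    return context

print(run_chain([("summarize", "Summarize the report"),
                 ("reason", "Draft action items")]))
```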

Services, Tools & APIs

  • Search: Exa (https://exa.ai/), SerpAPI, You.com API
  • Social media: Twitter API, Instagram API, LinkedIn API, TikTok API
  • Image: image generation APIs (Stable Diffusion, Midjourney)
  • Video: video summary API; video generation APIs such as Pika and Runway
  • Web scraping: Diffbot API
  • Music: Deezer API, Spotify API, Apple Music API, Shazam API
  • Sports: Sports Data API, LiveScore API
  • Financial: Yahoo Finance API, Quandl API, Bloomberg Open API, Tiingo API, AlphaSense API
  • Life/utility: Weather API, currency conversion API, Google Translate API, Google Maps API
  • Natural disaster: Natural Disaster API
  • News and events: GDELT
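
As an illustration of how such services could plug into the framework, here is a hedged Python sketch of a uniform tool-registration layer. The decorator/registry interface and the function names are assumptions for illustration, not the documented o.xyz integration API; wttr.in is a real public endpoint, used here only as a stand-in for the Weather API above.

```python
# Hypothetical tool-registration layer: each external service is wrapped
# as a named function that agents can discover and call uniformly.
# The registry and decorator are assumed for illustration only.

import urllib.request
from typing import Callable

TOOL_REGISTRY: dict = {}

def tool(name: str):
    """Register a wrapper function under a tool name."""
    def decorator(fn: Callable) -> Callable:
        TOOL_REGISTRY[name] = fn
        return fn
    return decorator

@tool("weather")
def weather(city: str) -> str:
    # wttr.in is a real public endpoint, used here as an example weather API.
    url = f"https://wttr.in/{city}?format=3"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode()

@tool("currency")
def currency(amount: float, rate: float) -> str:
    # A real integration would fetch `rate` from a currency-conversion API.
    return f"{amount * rate:.2f}"

# An agent resolves tools by name at runtime:
print(TOOL_REGISTRY["currency"](100.0, 0.92))
```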

API Hub & Confidential Cloud Functions Hub

This hub will grow quickly because O rewards each new API integration. Anyone in the community can build an API integration for any service in the world and contribute it to the o.xyz ecosystem in exchange for $O Coin, triggering a network effect of rapid implementation driven by the community.
  • The Confidential Cloud Functions Hub provides secure cloud functions that protect user data and ensure privacy.
  • These confidential functions enable users to perform sensitive operations with confidence, knowing that their data is safeguarded.
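
To illustrate the confidentiality idea only, here is a heavily hedged Python sketch using symmetric encryption (the `cryptography` package's Fernet scheme). The design is entirely assumed, not the Hub's actual mechanism: in this toy model, the payload is encrypted before it leaves the caller and decrypted only inside the confidential function, standing in for whatever enclave or key-management scheme the Hub uses in practice.

```python
# Toy model of a confidential cloud function: plaintext exists only on the
# client and inside the trusted function. The key handling and function
# names are assumptions, not the Hub's actual design.

from cryptography.fernet import Fernet  # pip install cryptography

SESSION_KEY = Fernet.generate_key()  # in practice, negotiated with the trusted environment

def confidential_function(ciphertext: bytes) -> bytes:
    """Runs inside the trusted environment; decrypted data never leaves it."""
    box = Fernet(SESSION_KEY)
    data = box.decrypt(ciphertext).decode()
    result = data.upper()  # stand-in for the sensitive operation
    return box.encrypt(result.encode())

# Client side: sensitive input is sealed before the call...
sealed = Fernet(SESSION_KEY).encrypt(b"user bank statement")
# ...and only the client can open the sealed response.
print(Fernet(SESSION_KEY).decrypt(confidential_function(sealed)).decode())
```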