
Performance Engineering Platform

Fixstars AI Booster

Install it on your GPU server and let it gather runtime data from your AI workloads. It identifies bottlenecks, automatically improves performance, and provides detailed insights for further manual optimization.

Download Free Now | Live Demo

You can download the Quick Start Guide, case studies, and other resources here.

What is Fixstars AI Booster?

Whether in the cloud or on-premises, simply install Fixstars AI Booster on your GPU servers to gather detailed performance data from active AI workloads, visualizing bottlenecks clearly.

Use these insights to drive performance improvements, creating a continuous cycle of monitoring and optimization—accelerating AI training and inference while significantly reducing infrastructure costs.

Free for permanent use
Performance Observability
  • Monitors and visualizes the performance of AI training and inference.
  • Identifies bottlenecks and performance issues.
Paid with free trial
Performance Intelligence
  • Provides a suite of tools for automatic acceleration based on collected performance data.
  • Lets users manually accelerate their AI workloads for further improvements, guided by Performance Observability data.

Free for permanent use

Performance Observability

Continuous Monitoring of Hardware Usage and AI Workloads
  • Efficiently collects hardware and AI workload metrics as time-series data.
  • Supports multiple platforms (AWS, Azure, GCP, and on-premises), seamlessly monitoring diverse system architectures in one place.
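To illustrate what time-series collection of this kind looks like, here is a minimal Python sketch of a polling sampler. The metric names and sampler functions are placeholders for illustration only, not AI Booster's actual collectors or data format.

```python
import time
from collections import defaultdict

def sample_metrics(samplers, interval_s, n_samples, store,
                   clock=time.time, sleep=time.sleep):
    """Poll each sampler every `interval_s` seconds and append
    (timestamp, value) pairs to the per-metric store."""
    for _ in range(n_samples):
        ts = clock()
        for name, fn in samplers.items():
            store[name].append((ts, fn()))
        sleep(interval_s)

# Hypothetical samplers standing in for real GPU/host counters.
store = defaultdict(list)
sample_metrics(
    {"gpu_util_pct": lambda: 87.5, "host_mem_used_gib": lambda: 42.0},
    interval_s=0.0, n_samples=3, store=store,
)
```

A real collector would plug hardware counters (e.g. GPU utilization, memory bandwidth) into the `samplers` dictionary and write the pairs to a time-series database instead of an in-memory list.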
Profiling of Running Applications
  • Continuously saves flame graphs, breaking down application processing time to visualize internal processing details.
  • Identifies which functions or libraries in the program are bottlenecks.
  • Analyzes differences in application configurations under varying hardware utilization conditions.
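Flame graphs are commonly rendered from "folded" stack samples: semicolon-joined call frames followed by a sample count. The sketch below shows how such samples can be aggregated to surface the hottest functions; it uses the generic folded-stack text format, not AI Booster's internal representation.

```python
from collections import Counter

def hot_frames(folded_samples):
    """Sum sample counts per leaf frame from folded stack lines
    ('frame1;frame2;leaf count'), the text format commonly used
    to render flame graphs."""
    totals = Counter()
    for line in folded_samples:
        stack, count = line.rsplit(" ", 1)
        leaf = stack.split(";")[-1]
        totals[leaf] += int(count)
    return totals.most_common()

samples = [
    "main;train_step;forward;matmul 70",
    "main;train_step;backward;matmul 50",
    "main;train_step;data_load 30",
]
# matmul accounts for 120 of 150 samples, marking it as the bottleneck
print(hot_frames(samples))
```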
Paid with free trial

Performance Intelligence

Workflow
  • 1
    Data Analysis
    • Calculates training efficiency (identifies potential for acceleration)
    • Identifies areas needing acceleration from performance data
  • 2
    Acceleration
    • Provides a suite of tools for automatic acceleration based on performance analysis.
    • Offers necessary documentation to assist users in achieving manual acceleration.
  • Performance Engineering Services (Contact us for details)
    Fixstars acceleration experts will improve your performance based on AI Booster analysis data, tailored to your environment and requirements.
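Step 1's training-efficiency calculation can be illustrated with the widely used Model FLOPs Utilization (MFU) metric for dense-transformer training. The formula (the ~6 × parameters FLOPs-per-token rule of thumb) and all numbers below are generic assumptions for illustration, not figures from an actual AI Booster analysis.

```python
def model_flops_utilization(params, tokens_per_s, peak_flops_per_s, n_gpus):
    """Estimate MFU: achieved training FLOP/s as a fraction of the
    cluster's peak, using the ~6 * params FLOPs-per-token rule."""
    achieved = 6.0 * params * tokens_per_s
    return achieved / (peak_flops_per_s * n_gpus)

# Illustrative numbers: a 70B-parameter model on 8 GPUs,
# each rated at a hypothetical 989 TFLOP/s peak (BF16).
mfu = model_flops_utilization(
    params=70e9, tokens_per_s=7_000,
    peak_flops_per_s=989e12, n_gpus=8,
)
print(f"MFU = {mfu:.1%}")
```

A low MFU indicates headroom for acceleration; the data-analysis step then narrows down whether compute, memory, communication, or I/O is the limiting factor.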
Example of acceleration methods
  • Model optimization
  • Hyperparameter tuning
  • Selecting optimal parallelization techniques
  • Kernel parameter optimization
  • Communication library optimization
  • File systems optimized for the workload
  • Improving memory efficiency
  • OS, system and driver optimizations

Performance Engineering Cycle

Performance is not constant—it evolves due to new model adoption, parameter changes, and infrastructure updates. By continuously running the performance improvement cycle, you can prevent degradation and always achieve peak performance.

Factors Contributing to Performance Degradation
  • Adoption of New Models/Methods
    Updates to Transformer architectures and multimodalization change computation patterns, disrupting the balance of GPU utilization and memory bandwidth.
  • Changes in Hardware Configuration/Cloud Plans
    Changes in instance types, price revisions, and region migrations can make previously cost-optimized configurations obsolete, leading to over-provisioning or performance bottlenecks.
  • Library/Framework Updates
    Version updates to CUDA, cuDNN, PyTorch, and similar components can alter internal algorithms and memory management, causing unexpected latency increases or a worse memory footprint.
By incorporating a continuous performance engineering cycle, you can consistently achieve optimal performance.

Proven Performance Improvements

Broadcasting Company - LLM 70B Continued Pre-training
Telecom Company - LLM 70B Continued Pre-training
LLM7B Model Training
LLM Single-batch Inference
LLM Multi-batch Inference

Note: These results include both automatic accelerations by Fixstars AI Booster and additional manual accelerations based on collected performance data.

Software Configuration

An example of a multi-node configuration

Fixstars AI Booster (AI Booster) consists of two main components:

  • AI Booster Agent: Collects performance telemetry data from individual nodes.
  • AI Booster Server: Stores data and provides clear visualizations via an intuitive dashboard.

Typically, one AI Booster Server is installed on a management node, while multiple AI Booster Agents run on compute nodes. This configuration enables comprehensive monitoring of multiple nodes from a single management point, visualizing performance across your entire infrastructure on one unified dashboard.

You can centrally visualize server groups distributed across multiple locations, whether in a multi-cloud environment spanning multiple cloud vendors or a hybrid environment combining on-premises and cloud.

For simpler setups, AI Booster also supports a local configuration, where both Server and Agent run together on a single node.
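Conceptually, the Agent-to-Server flow resembles agents pushing metric payloads to a central collector. The Python sketch below is a toy model of that pattern; the endpoint path, payload shape, and transport are invented for illustration and do not describe AI Booster's real protocol.

```python
import json, threading, urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# In-memory list standing in for the server's time-series store.
received = []

class MetricsHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"]))
        received.append(json.loads(body))
        self.send_response(204)
        self.end_headers()

    def log_message(self, *args):  # silence per-request logging
        pass

# "Management node": bind an ephemeral port and serve in the background.
server = HTTPServer(("127.0.0.1", 0), MetricsHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

def agent_push(node, metrics):
    """One agent report: POST a JSON metrics payload to the server."""
    req = urllib.request.Request(
        f"http://127.0.0.1:{port}/ingest",
        data=json.dumps({"node": node, "metrics": metrics}).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

# Two "compute nodes" reporting to the one management node.
agent_push("gpu-node-01", {"gpu_util_pct": 91.0})
agent_push("gpu-node-02", {"gpu_util_pct": 78.5})
server.shutdown()
print(received)
```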

Software Configuration Example

Single Node Configuration 1 - Direct AI Booster Usage on a Local Workstation

Install both AI Booster Server and AI Booster Agent on a single GPU-equipped workstation or server. Connect a monitor and check performance information directly via the dashboard. This setup offers the quickest route when you want to "just try running it" on offline test machines or benchmarking systems. No network configuration is required.

Single Node Configuration 2 - Multi-User Performance Dashboard Viewing

Install both AI Booster Server and AI Booster Agent on a single GPU-equipped workstation or server.
Users can access the dashboard provided by the server through their personal PCs using a browser via TCP port 3000.
This configuration is ideal for small-scale proof-of-concept (PoC) projects requiring dashboard viewing by multiple users.

Multi-Node Configuration 1 - Centralized Performance Monitoring with Dedicated Management Node

Install AI Booster Server on a dedicated management node, and install AI Booster Agent on each GPU compute node.
Users can access the management node’s dashboard from their personal PCs via a browser through TCP port 3000.
This configuration is recommended for most GPU cluster server systems.

Multi-Node Configuration 2 - GPU Compute Node Serving as Performance Monitoring Node

If there is no dedicated management node, select one GPU-equipped node and install both AI Booster Server and its own AI Booster Agent. Install only the AI Booster Agent on the remaining GPU-equipped nodes.
Users can access the dashboard provided by the GPU node with AI Booster Server installed, via their personal PCs through a browser using TCP port 3000.

FAQ

Q. What's the overhead of Fixstars AI Booster?

The software runs as a Linux daemon, so it is always active, yet its overhead is minimal; we refer to it as having "near-zero overhead."

Q. What environments does Fixstars AI Booster run on?

It runs on Debian-based Linux environments. We have verified operation on Ubuntu 22.04 LTS. It can also run without an NVIDIA GPU, but the available data and functionality will be limited.

Q. Is Fixstars AI Booster free?

Fixstars AI Booster is free to use. However, the Performance Intelligence (PI) feature is available at no cost for the first month after activation and becomes a paid feature thereafter. Please refer to the Fixstars AI Booster End User License Agreement for details.

Q. Does Fixstars collect data from my environment?

Fixstars does not collect user-specific data (such as your application data or detailed analysis results). We only gather general usage statistics for product improvement purposes. Contact us for more details.

Q. How is it different from existing monitoring tools?

Traditional tools (e.g., DataDog, New Relic) show hardware utilization, but Fixstars AI Booster additionally captures detailed AI workload data. It analyzes this data to identify and resolve performance bottlenecks.

Q. What does the Performance Intelligence feature do?

It optimizes performance by analyzing data from Performance Observability (PO). This includes changing infrastructure configurations, tuning parameters, and optimizing source code to maximize GPU utilization.

Q. How is it different from existing profiling tools?

Profiling tools (like NVIDIA Nsight) capture "snapshots" triggered by specific commands. In contrast, AI Booster continuously captures detailed performance data, enabling historical analysis and identification of performance degradation. AI Booster's automatic acceleration suggestions and implementations are unique features.

Q. Can it accelerate workloads other than LLMs?

Yes. Because the underlying technology is broadly applicable, other AI or GPU-accelerated workloads can also benefit. The exact improvements depend on your specific workload; please contact us for details.

Any other questions? Please contact us.

Performance Engineering with
Fixstars AI Booster

Detect hidden bottlenecks and automatically accelerate your AI workloads.
Achieve further acceleration manually by utilizing acquired performance data.

Download Free Now