Inside the Build: Bringing the nAG Library to Enterprise-Scale Arm Linux in the Cloud


Published 04/11/2025 By nAG

Bridging the Gap Between Numerical Precision and Cloud Efficiency

As demand grows for efficient, cloud-based computation, organisations are exploring new processor architectures that promise improved performance per cost. Among them, Arm-based instances are increasingly compelling for large-scale numerical workloads.

Following on from earlier posts in this series, this post describes how the nAG Engineering Team developed a bespoke implementation of the nAG Library for a leading global financial institution migrating analytical workloads to Arm cloud infrastructure: in this instance, a hybrid cloud spanning Azure (Ampere Altra and Cobalt 100) and AWS (Graviton2 through Graviton4). The work exemplifies how disciplined numerical engineering can bridge the gap between the precision expected in high-performance computing and the efficiency demanded by modern enterprise infrastructure.

Both the client’s x86-64 and Arm-based systems conform to the same IEEE/ISO floating-point standards, and both produced mathematically correct results. The challenge in this project was the client’s requirement for bitwise-identical results across architectures and compiler configurations, which is stricter than the accuracy thresholds normally applied to numerical results.
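To make the distinction concrete, the short Python sketch below contrasts a tolerance check with a bit-pattern check. It is an illustration only and is not taken from the project: the helper name `bits` and the chosen tolerance are assumptions. It relies on the fact that floating-point addition is not associative, so two evaluation orders, of the kind different compilers or optimisation levels may choose, can both be accurate yet differ in the last bit.

```python
import math
import struct


def bits(x: float) -> str:
    """Return the 64-bit IEEE 754 pattern of a double as a hex string."""
    return struct.pack(">d", x).hex()


# The same three-term sum, evaluated in two different orders.
a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)

# A typical accuracy threshold accepts both values...
within_tolerance = math.isclose(a, b, rel_tol=1e-12)

# ...but a bitwise comparison does not.
bitwise_identical = bits(a) == bits(b)

print(f"a = {a!r}  bits = {bits(a)}")
print(f"b = {b!r}  bits = {bits(b)}")
print("within tolerance: ", within_tolerance)
print("bitwise identical:", bitwise_identical)
```

Both values are correct to within rounding error, which is why a conventional accuracy test passes while the bitwise test fails.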

Technical Context

The client sought to maintain identical numerical results while reducing cloud compute expenditure. Because Arm instances can offer cost advantages of 20–40% compared with traditional x86-64 hardware, an optimised and validated Arm build was essential. During the work, however, we encountered a small number of version-specific behaviours in the GNU Fortran toolchain that required investigation.

Engineering Approach

Our team began by analysing the standard x86-64 build of the nAG Library and mapping routines sensitive to compiler optimisation levels. Using internal tooling, we performed a systematic reduction of optimisation parameters on a per-routine basis until each example and stringent test matched baseline results. The process combined automated detection with manual review to eliminate subtle numerical discrepancies.
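The per-routine reduction described above can be sketched roughly as follows. This is a hypothetical outline and not nAG's internal tooling: the flag ladder, the routine names, and the `matches_baseline` callback are all assumptions made for illustration. The idea is simply to keep each routine at the most aggressive optimisation level at which its tests still reproduce the baseline, and to flag anything that never matches for manual review.

```python
from typing import Callable, Dict, Iterable, Optional, Sequence

# Hypothetical flag ladder, most aggressive first. The flags actually used
# in the project are not stated in this article.
LEVELS: Sequence[str] = ("-O3", "-O2", "-O1", "-O0")


def tune_routines(
    routines: Iterable[str],
    matches_baseline: Callable[[str, str], bool],
    levels: Sequence[str] = LEVELS,
) -> Dict[str, Optional[str]]:
    """Pick, per routine, the most aggressive level whose outputs match.

    `matches_baseline(routine, level)` is an assumed callback that rebuilds
    the routine at `level`, runs its tests, and compares the results bit for
    bit with the reference. None marks a routine for manual review.
    """
    chosen: Dict[str, Optional[str]] = {}
    for routine in routines:
        chosen[routine] = next(
            (level for level in levels if matches_baseline(routine, level)),
            None,
        )
    return chosen


# Stand-in for a real build-and-test step: pretend one routine only
# reproduces the baseline at -O1 or below. Routine names are made up.
def fake_check(routine: str, level: str) -> bool:
    if routine == "sensitive_routine":
        return level in ("-O1", "-O0")
    return True


plan = tune_routines(["robust_routine", "sensitive_routine"], fake_check)
print(plan)
```

In a real pipeline the callback would be the expensive part, so results would normally be cached per routine and compiler version; the sketch leaves that out.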

All examples and stringent validation programs were executed against the newly compiled library. A build was accepted only when no differences were observed relative to the reference outputs produced on x86-64 Linux. Particular care was taken to avoid instruction-set extensions, such as SVE/SVE2, in order to maintain a portable baseline across the client’s mixed Arm systems.
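An acceptance rule of this kind ("no differences relative to the reference outputs") might look like the sketch below. It assumes, purely for illustration, that each example or test program writes its results to a file and that reference and candidate outputs sit in two directory trees with matching relative paths; the function names are hypothetical. Files are compared byte for byte, with no numerical tolerance.

```python
from pathlib import Path
from typing import List, Set


def _relative_files(root: Path) -> Set[str]:
    """Collect the relative paths of all regular files under `root`."""
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def differing_outputs(reference_dir: Path, candidate_dir: Path) -> List[str]:
    """Return paths whose bytes differ or that exist on only one side."""
    ref = _relative_files(reference_dir)
    cand = _relative_files(candidate_dir)
    bad: List[str] = []
    for name in sorted(ref | cand):
        if name not in ref or name not in cand:
            bad.append(name)  # missing or unexpected output
        elif (reference_dir / name).read_bytes() != (candidate_dir / name).read_bytes():
            bad.append(name)  # contents differ
    return bad


def accept_build(reference_dir: Path, candidate_dir: Path) -> bool:
    """Accept a build only when there are no differences at all."""
    return not differing_outputs(reference_dir, candidate_dir)
```

Comparing printed output is only as strict as the number of digits printed, so a stricter variant would compare binary dumps of the results.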

Collaboration and Testing

The client’s central development environment team provided compiler version specifications and defined the reproducibility requirements. Interim builds were tested both in nAG’s internal Arm environment and within the client’s cloud instances. Feedback cycles focused on validation of numerical equivalence rather than performance alone.

Early tests from this collaboration indicate that the approach achieves reproducible results across environments. Performance metrics will continue to be gathered as the client scales deployment, but initial benchmarks indicate competitive throughput with the added benefit of predictable, repeatable output.

Outcomes and Benefits

The project delivered:

  • A production-ready Arm Linux build of the nAG Library for use across the enterprise.
  • Updated “routine-specific optimisation tuning” tooling capable of supporting future exotic builds.
  • Validation methodology transferable to other architectures (e.g., Apple Silicon, hybrid CPU-GPU systems), and to linking against vendor performance libraries.
  • Demonstrated cost efficiency through architecture diversification without compromising result integrity.

Strategic Significance

This engagement illustrates the viability and value of bespoke numerical builds for enterprise clients. As hybrid and cloud-based infrastructures expand, the ability to tailor core numerical components will become increasingly strategic. The engineering discipline applied here enables reproducible, architecture-specific implementations at manageable cost — creating both commercial opportunity and technical resilience for future clients.

Conclusion

The nAG Engineering Team continues to refine its toolchain and processes to advance cross-architecture reproducibility. The lessons from this Arm Linux implementation reaffirm a simple principle: numerical accuracy and reproducibility must remain paramount, even as computing environments evolve.

Our experience demonstrates that with disciplined engineering, close collaboration, and rigorous validation, reproducibility and efficiency can coexist. We have the expertise and infrastructure to help enterprises achieve these outcomes — bridging the gap between precise numerical performance and modern computational efficiency.

What’s next?

If this work aligns with your organisation’s goals or challenges, we invite you to contact us to arrange a technical discussion with the team.
