
High Performance Computing Systems and Applications

(Supercomputer, Lawrence Livermore National Laboratory)
 


- Overview

High Performance Computing (HPC) most generally refers to the practice of aggregating computing power in a way that delivers much higher performance than one could get out of a typical desktop computer or workstation in order to solve large problems in science, engineering, or business. 

HPC is the use of parallel processing and supercomputers to run advanced, complex application programs. The discipline centers on building parallel processing systems that integrate system administration and computational methods. For decades, HPC and supercomputing were intrinsically linked.
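
As a minimal sketch of the parallel-processing idea (not any particular HPC system), the Python example below splits an embarrassingly parallel workload across local CPU cores with the standard multiprocessing module; real HPC jobs extend the same principle across thousands of nodes with frameworks such as MPI.

  # Minimal sketch: spread independent chunks of work across CPU cores.
  # Real HPC codes use MPI, job schedulers, etc. to span many nodes;
  # the workload here is purely illustrative.
  from multiprocessing import Pool
  import math

  def heavy_task(n):
      # Stand-in for an expensive per-chunk computation.
      return sum(math.sqrt(i) for i in range(n))

  if __name__ == "__main__":
      workloads = [2_000_000] * 8      # eight independent chunks of work
      with Pool() as pool:             # one worker process per available core
          results = pool.map(heavy_task, workloads)
      print(f"processed {len(results)} chunks in parallel")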

Please refer to the following for more information:

 

- Advanced AI Solutions and High Performance Architecture

Artificial intelligence (AI) solutions require purpose-built architecture to enable rapid model development, training, tuning and real-time intelligent interaction. 

High Performance Architecture (HPA) is critical at every stage of the AI workflow, from model development to deployment. Without the right foundation, companies will struggle to realize the full value of their investments in AI applications.

HPA strategically integrates high-performance computing workflows, AI/ML development workflows, and core IT infrastructure elements into a single architectural framework designed to meet the intense data requirements of advanced AI solutions. 

  • High-Performance Computing (HPC): Training and running modern AI engines requires the right amount of combined CPU and GPU processing power (see the sketch after this list). 
  • High-performance storage: Training AI/ML models requires the ability to reliably store, clean, and scan large amounts of data. 
  • High-performance networks: AI/ML applications require extremely high-bandwidth and low-latency network connections. 
  • Automation, orchestration, software and applications: HPA requires the right mix of data science tools and infrastructure optimization. 
  • Policy and governance: Data must be protected and easily accessible to systems and users. 
  • Talent and skills: Building AI models and maintaining HPA infrastructure requires the right mix of experts.
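
To make the compute point above concrete, here is a minimal, hedged sketch, assuming the PyTorch library is available; it only shows how a training script selects between GPU and CPU resources, which is the kind of capacity an HPA compute layer has to provision.

  # Minimal sketch (assumes PyTorch is installed): pick the best available
  # compute device for AI model training, falling back from GPU to CPU.
  import torch
  import torch.nn as nn

  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

  model = nn.Linear(128, 10).to(device)        # toy model, illustrative only
  batch = torch.randn(32, 128, device=device)  # synthetic input batch
  output = model(batch)                        # one forward pass on the device
  print(f"ran a forward pass on: {device}")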

 

- High-Performance Computing (HPC) vs. Supercomputers

High-performance computing (HPC) and supercomputers are both types of computing that use advanced techniques and systems to solve complex problems. 

HPC is sometimes used as a synonym for supercomputing. However, in other contexts, "supercomputer" is used to refer to a more powerful subset of "high-performance computers". 

Here are some differences between HPC and supercomputers:

  • Hardware configuration and architecture: HPC systems are typically more flexible and scalable than supercomputers.
  • Hardware: Supercomputers are made up of thousands of compute nodes that work together to complete tasks. HPC systems are built using massive clusters of servers with one or more central processing units (CPUs).
  • Security: HPC environments are harder to secure. With connections that may run the gamut from secure on-site servers to publicly managed cloud instances and private cloud deployments, ensuring security across the HPC continuum is inherently challenging.


HPC systems are designed to handle large amounts of data, perform intricate simulations, and deliver results at high speeds. They are often measured in terms of floating-point operations per second (FLOPS). 
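
As a rough worked example of the FLOPS metric, theoretical peak performance is commonly estimated as nodes × cores per node × clock rate × floating-point operations per cycle; the figures below are made up for illustration, not taken from any real system.

  # Rough, illustrative estimate of a cluster's theoretical peak FLOPS.
  # All figures are invented for the example, not from any real machine.
  nodes = 100
  cores_per_node = 64
  clock_hz = 2.5e9          # 2.5 GHz
  flops_per_cycle = 16      # e.g. wide vector units with fused multiply-add

  peak = nodes * cores_per_node * clock_hz * flops_per_cycle
  print(f"theoretical peak: {peak:.2e} FLOPS ({peak / 1e12:.0f} teraFLOPS)")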

Supercomputers are a class of computing machines that are built to handle tasks of unprecedented complexity. Supercomputing clusters are used by AI researchers to test model designs, hyperparameters, and datasets. They can improve AI models and advance AI capabilities. 

HPC can be used for tasks such as handling the complex partial differential equations that express the physics of weather, while supercomputers can be used for tasks such as real-time AI applications like autonomous cars and robotics.
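
As a toy illustration of the kind of partial-differential-equation work mentioned above (nothing close to a real weather model), the sketch below steps a 1-D heat equation forward in time with an explicit finite-difference scheme, assuming NumPy is available.

  # Toy 1-D heat equation solver (explicit finite differences) with NumPy.
  # Real weather codes solve vastly larger coupled PDE systems across many nodes.
  import numpy as np

  nx, nt = 100, 500                 # grid points and time steps (toy sizes)
  alpha, dx, dt = 0.01, 1.0, 1.0    # diffusivity, grid spacing, time step
  u = np.zeros(nx)
  u[nx // 2] = 100.0                # initial heat spike in the middle

  for _ in range(nt):
      # u_t = alpha * u_xx, discretized with central differences
      u[1:-1] += alpha * dt / dx**2 * (u[2:] - 2 * u[1:-1] + u[:-2])

  print(f"peak temperature after {nt} steps: {u.max():.2f}")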

 

[Lake Brienz, Switzerland]

- Why Do We Need HPC?

Specialized computing resources are necessary to help researchers and scientists extract insights from massive data sets. Scientists need HPC because they hit a tipping point. At some point in research, there is a need to:

  • Expand the current study area (regional → national → global)
  • Integrate new data
  • Increase model resolution

 

But … processing on a single desktop or server no longer works. Some typical computational barriers:

  • Time – processing on local systems is too slow or not feasible
  • CPU Capacity – can only run one model at a time
  • Tools and Techniques – need to develop, implement, and disseminate state-of-the-art techniques and tools so that models are more effectively applied to today’s decision-making
  • Management of Computer Systems – science groups don’t want to purchase and manage local computer systems; they want to focus on science

 

- Exascale Computing

A supercomputer can contain hundreds of thousands of processor cores, require an entire building to house and cool it, and cost millions of dollars to build and maintain. Despite these challenges, more and more such systems will come online as the U.S. and China develop new "exascale" supercomputers that promise to boost performance fivefold compared to current leading systems.

Exascale computing is the next milestone in supercomputer development. Exascale computers, capable of processing information faster than today's most powerful supercomputers, will give scientists a new tool to address some of the biggest challenges facing our world, from climate change to understanding cancer to designing new materials.

Exascale computers are digital computers, broadly similar to today's computers and supercomputers, but with more powerful hardware. This makes them different from quantum computers, which represent an entirely new approach to building computers suitable for specific types of problems.
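
As a back-of-the-envelope illustration (the workload size is invented for the example), the arithmetic below shows why the jump from petascale to exascale matters: an exaFLOPS machine performs 10^18 floating-point operations per second, a thousand times more than a petaFLOPS machine.

  # Back-of-the-envelope: how long a fixed (invented) workload takes at
  # petascale versus exascale. 1 exaFLOPS = 10**18 floating-point ops/second.
  work = 1e21                # total floating-point operations in the job
  petaflops = 1e15           # 1 petaFLOPS machine
  exaflops = 1e18            # 1 exaFLOPS machine

  print(f"at 1 petaFLOPS: {work / petaflops / 3600:.0f} hours")
  print(f"at 1 exaFLOPS:  {work / exaflops / 60:.1f} minutes")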

 

- Memory Centric Computing

Driven by the development of AI and 5G networks, the ICT industry is reaching a new turning point. Intelligent services have entered daily life, and the world we live in is becoming ever more intelligent, yet many problems remain to be solved. Artificial intelligence computing systems will play a pivotal role in the "next intelligent society", which is why much research across related industries is aimed at greatly improving their performance. 

While various approaches are being developed to build systems for an intelligent society, data transfer between processors and memory still hinders system performance, and the power consumed by ever-increasing bandwidth remains an issue. The memory industry is developing high-bandwidth memory capable of meeting these system requirements and improving performance, but now is the time to develop more innovative memory technologies that meet the demands of an intelligent society. 

To achieve this, we need not only to improve memory itself, but also to combine computation with memory to minimize unnecessary bandwidth usage. In other words, we need system-in-package technologies that place memory close to the processor. In addition, the resulting changes to computing system architecture make cross-industry collaboration essential.
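
As a small, hedged demonstration of that processor-memory bottleneck, the sketch below (assuming NumPy is available) runs the same arithmetic on a cache-resident array and on one that must stream from main memory; on most machines the achieved throughput drops noticeably in the second case, even though the processor does identical work.

  # Small demonstration of the "memory wall": identical arithmetic runs more
  # slowly once data no longer fits in the CPU caches. Timings vary by machine.
  import time
  import numpy as np

  def gflops(n, repeats):
      # Time repeated element-wise multiply-adds on arrays of n doubles.
      x, y = np.random.rand(n), np.random.rand(n)
      t0 = time.perf_counter()
      for _ in range(repeats):
          y += 2.0 * x               # 2 floating-point ops per element
      dt = time.perf_counter() - t0
      return 2.0 * n * repeats / dt / 1e9

  small = gflops(n=50_000, repeats=10_000)    # data fits in cache
  large = gflops(n=20_000_000, repeats=25)    # data streams from main memory
  print(f"cache-resident:  {small:.1f} GFLOP/s")
  print(f"memory-resident: {large:.1f} GFLOP/s")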

 

 

[More to come ...]

 

 
