Memory-Centric Computing Systems: What's Old Is New Again
The "memory wall" problem, identified by Wulf and McKee in the mid-1990s, observed that microprocessor performance was improving far faster than DRAM speed. This trend made the memory subsystem one of the main system-level performance bottlenecks. Alongside this memory bandwidth wall, computer engineers have also observed a new memory wall emerging within datacenters: the memory capacity wall.
- Memory Wall
- Benefits of Memory-Centric Computing Architectures for Advanced Computing
- Exploring Memory-Centric Architectural Designs
- Application Areas for Memory-Centric Architectures
- PIM and Memory Disaggregation for Overcoming the Memory Wall
- Remaining Challenges in Memory-Centric Computing Systems
- Conclusion: IP at the Core of Advanced Computing Innovations
Memory Wall
The "memory wall" problem was identified roughly three decades ago. The growing gap between processing speed and memory speed means that processor cores increasingly sit idle waiting for data. The problem has become more severe with the advent of multi-core CPUs, GPUs, and accelerators, which have drastically less memory bandwidth per core. The bottleneck is shifting from FLOPS to memory.
Making the memory landscape even richer are new possibilities in memory subsystems, such as pioneering 3D-stacked high-bandwidth memory (HBM) and disaggregated structures.
We have created analysis tools and techniques to study memory-usage patterns at both the system and application level. Building on this research, we have developed specialized hardware and software modules that reduce latency, increase the available bandwidth, and expand memory capacity for data-intensive applications.
The capacity wall stems from the growing imbalance between peak compute and memory capacity, which has forced hyperscalers to overprovision each server's memory for its worst-case usage, resulting in significant memory underutilization. To overcome these memory walls, computer scientists have pursued two major lines of research: Processing-in-Memory, which addresses the memory bandwidth problem, and memory disaggregation, which tackles the memory capacity problem.
Benefits of Memory-Centric Computing Architectures for Advanced Computing
Memory-centric designs offer many advantages, notably in power efficiency and performance, both vital to the advancement of computing. From an IP standpoint, protecting this type of computing innovation helps establish your position in the marketplace and builds the foundation for a technology portfolio that provides a competitive edge.
- Enhanced performance: By limiting data movement and speeding up memory access, memory-centric designs deliver significant performance improvements, particularly for memory-bound workloads. Computing units can reach data faster, resulting in faster data processing and better overall system performance. In this area, IP can protect specific algorithmic and architectural improvements, safeguarding the competitive advantages these innovations offer.
- Enhanced energy efficiency: Energy efficiency is a crucial aspect of contemporary computing. Memory-centric designs conserve energy by reducing the need for large-scale data movement and improving memory access patterns. This lowers power usage and aligns with the sustainability goals of advanced computing. Using patents to safeguard these energy-efficient technologies not only secures the underlying advances but also positions companies in eco-friendly markets.
Exploring Memory-Centric Architectural Designs
A variety of innovative concepts have set the standard in memory-centric design, including Processing-in-Memory (PIM), computational storage, and cache-coherent memory architectures. Each design offers unique advantages, pushing the limits of what is feasible in advanced computing while creating IP protection opportunities.
Processing in Memory (PIM)
PIM designs are distinguished by the integration of computing components directly into memory modules. This approach allows processing to happen where the data is stored, reducing data movement and latency. Technology companies such as Samsung, SK Hynix, and UPMEM are pioneering PIM devices aimed at AI and deep learning, achieving substantial performance gains by cutting down on data transfers. Securing intellectual property rights for these techniques is crucial to staying competitive in the constantly evolving AI and machine-learning world.
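As a rough illustration of why this helps, the toy model below (not any vendor's actual design; all names and sizes are hypothetical) contrasts a conventional host that pulls every element across the memory bus with a PIM-style scheme in which each bank reduces its data locally and ships back only one partial result:

```python
# Toy model: bytes moved over the host link for a conventional
# read-everything approach vs. a PIM-style per-bank reduction.
# Purely illustrative; all names and sizes are hypothetical.

NUM_BANKS = 16
ELEMS_PER_BANK = 1024
ELEM_BYTES = 4

banks = [[i % 7 for i in range(ELEMS_PER_BANK)] for _ in range(NUM_BANKS)]

def host_side_sum(banks):
    """Conventional: every element crosses the memory bus to the CPU."""
    moved = sum(len(b) for b in banks) * ELEM_BYTES
    total = sum(sum(b) for b in banks)
    return total, moved

def pim_side_sum(banks):
    """PIM-style: each bank reduces locally; one partial per bank moves."""
    partials = [sum(b) for b in banks]   # computed "inside" each bank
    moved = len(partials) * ELEM_BYTES   # only partial sums cross the bus
    return sum(partials), moved

t1, m1 = host_side_sum(banks)
t2, m2 = pim_side_sum(banks)
assert t1 == t2  # same answer either way
print(f"host bytes moved: {m1}, PIM bytes moved: {m2}")
```

The arithmetic result is identical in both paths; only the traffic differs, which is the essence of the bandwidth argument for PIM.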
Computational Storage
Computational storage is an important advance in memory-centric system design. It moves computing functions directly into storage devices, eliminating the need to transfer data from storage to the CPU or GPU for processing.
By embedding processing power, such as CPUs, FPGAs (field-programmable gate arrays), or ASICs (application-specific integrated circuits), within the storage devices themselves, computational storage can perform tasks like data compression, encryption, and indexing right at the data source.
This approach reduces latency and improves system performance and efficiency in advanced computing. Patents filed in this field can cover the novel integration methods and specific techniques used to build these storage capabilities, further advancing the field.
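As a minimal sketch of the idea, the toy `ComputationalDrive` below (a hypothetical class, not a real device API) compresses a block near the data using Python's standard `zlib`, so only the compressed bytes would cross the storage interconnect:

```python
import zlib

# Toy model of computational storage: the "drive" compresses data in place,
# so only compressed bytes cross the storage interconnect.
# Purely illustrative; the class and method names are hypothetical.

class ComputationalDrive:
    def __init__(self):
        self._blocks = {}

    def write(self, key, data: bytes):
        self._blocks[key] = data

    def read_raw(self, key) -> bytes:
        """Conventional path: ship the raw block to the host."""
        return self._blocks[key]

    def read_compressed(self, key) -> bytes:
        """Computational-storage path: compress near the data first."""
        return zlib.compress(self._blocks[key], level=6)

drive = ComputationalDrive()
drive.write("log", b"GET /index.html 200\n" * 10_000)

raw = drive.read_raw("log")
packed = drive.read_compressed("log")
assert zlib.decompress(packed) == raw  # host recovers identical data
print(f"raw: {len(raw)} B, over the wire: {len(packed)} B")
```

For repetitive data like logs, the bytes that actually cross the link shrink dramatically, which is the latency and bandwidth win the paragraph above describes.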
Cache Coherent Memory Architectures
Cache-coherent memory architectures ensure that caches remain coherent across multiple CPUs or computing units that share memory. This approach reduces data shuffling between local caches and improves overall system performance. One example is NVIDIA's NVLink, a high-speed, cache-coherent interconnect that enables efficient data sharing and synchronization across several GPUs, increasing parallel-processing capability. IP protection here could cover the interconnect technology itself and the specific mechanisms used to maintain cache coherence.
Application Areas for Memory Centric Architectures
Memory-centric architectures have the potential to transform various fields by improving efficiency and performance in data-intensive tasks. Some of the most significant application areas include:
1. Big Data Analytics
Big data analytics is being transformed by innovations such as memory-centric designs, which speed up the analysis and processing of large data sets. By limiting data movement and optimizing memory access, they open the door to real-time insights and rapid decision-making in data-intensive tasks.
2. Machine Learning and AI
Memory-centric designs enhance the performance of machine-learning and AI applications by providing faster access to memory. They streamline data retrieval, improve the efficiency of training and inference, and boost the performance of sophisticated deep-learning models.
3. In-Memory Databases
In-memory databases, which store and process data entirely in memory, remove the need for disk access during data retrieval. This yields faster query response times and a substantial increase in database performance. Memory-centric systems enable this efficient storage and processing of data, transforming database design.
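A small, runnable taste of the concept: SQLite, bundled with Python's standard library, can hold an entire database in RAM via the special `:memory:` connection string (the table and data here are illustrative):

```python
import sqlite3

# An in-memory database: SQLite with the special ":memory:" path
# keeps all tables in RAM, so queries never touch disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("alice", 3.0), ("bob", 5.5), ("alice", 1.5)],
)

# Aggregation runs entirely against memory-resident pages.
rows = conn.execute(
    "SELECT user, SUM(amount) FROM events GROUP BY user ORDER BY user"
).fetchall()
print(rows)  # [('alice', 4.5), ('bob', 5.5)]
conn.close()
```

Production in-memory databases add persistence, replication, and concurrency on top, but the core property is the same: no disk I/O on the query path.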
4. Scientific Simulations
Scientific simulations involving complex computations and large data sets benefit from memory-centric structures. These structures improve computational efficiency by decreasing data movement, optimizing memory access, and reducing latency. The result is shorter simulation times, support for larger simulations, and faster research and exploration.
5. General-Purpose Computing
The fundamentals of memory-centric architecture also apply to general-purpose computer systems. By prioritizing memory resources and optimizing memory access, they increase overall system performance and energy efficiency across a broad range of applications.
See the official Compute Express Link (CXL) page for more technical insights.
PIM and Memory Disaggregation for Overcoming the Memory Wall
1. The development of Processing-in-Memory (PIM) dates back to the 1970s. By placing lightweight compute logic near or inside memory, PIM helps alleviate the memory bandwidth limits of traditional von Neumann architectures. Stone's Logic-in-Memory Computer is one of the earliest studies, adding an array of processing engines and small caches to the memory arrays.
Attracted by this potential, a variety of studies expanded on PIM methods; the best-known examples include Computational RAM, IRAM, Active Pages, and DIVA. Despite the academic excitement, however, the computing industry was slow to adopt PIM-like methods for commercial applications because of their engineering overhead (e.g., loss of DRAM density, thermal challenges) and their invasiveness to the standard hardware/software interface (e.g., PIM ISA support, managing data coherence, and so on).
2. Memory disaggregation aims to solve memory underutilization by removing the physical coupling of a datacenter's compute and memory resources. Datacenters have traditionally been built as a set of individual servers, each with fixed CPU and memory resources. By disaggregating CPUs and memory into distinct physical units, a CPU process can dynamically (de)allocate resources from a memory pool shared with other processes on demand, enabling much higher memory utilization. In contrast to PIM, which lacked a revolutionary application to justify its design complexity, hyperscalers have an evident reason to adopt disaggregated server structures: they help optimize a datacenter's TCO (total cost of ownership).
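The dynamic (de)allocation idea can be sketched as a shared pool from which hosts borrow and return capacity on demand, instead of each host being provisioned for its worst case. This is a toy model with assumed sizes; all names are hypothetical:

```python
# Toy model of a disaggregated memory pool: hosts borrow capacity on demand
# instead of each being overprovisioned for its own worst case.
# Purely illustrative; all names and sizes are hypothetical.

class MemoryPool:
    def __init__(self, total_gb):
        self.total_gb = total_gb
        self.allocations = {}  # host -> GB currently borrowed

    def free_gb(self):
        return self.total_gb - sum(self.allocations.values())

    def allocate(self, host, gb):
        if gb > self.free_gb():
            raise MemoryError(f"pool exhausted: only {self.free_gb()} GB free")
        self.allocations[host] = self.allocations.get(host, 0) + gb

    def release(self, host, gb):
        self.allocations[host] -= gb

# Three hosts share one 512 GB pool rather than each carrying 512 GB locally.
pool = MemoryPool(total_gb=512)
pool.allocate("host-a", 300)   # a burst on host-a...
pool.allocate("host-b", 100)
pool.release("host-a", 200)    # ...is returned when the job finishes
pool.allocate("host-c", 250)   # freed capacity is immediately reusable
print(f"free: {pool.free_gb()} GB")
```

The same peak demands served from per-server memory would require each host to own its worst-case capacity, which is exactly the overprovisioning the paragraph above describes.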
3. Intel's Rack Scale architecture, HP's The Machine, and Facebook's Disaggregated Rack are among the prominent strategies aiming to create server platforms better suited to datacenters' dynamic resource-usage patterns. One of the major obstacles to high-performance disaggregated server designs, however, has been the network: to get the best performance from an application, the network infrastructure must provide low-latency communication. Disaggregated servers inherently experience heavy inter-node traffic between memory and compute nodes, which can impose a severe performance penalty.
4. The absence of a truly innovative application to justify the required software and hardware changes was the primary barrier to commercializing PIM. The increasing popularity of memory-bound AI workloads, however, has prompted key industry players to adopt and build PIM-like memory solutions. Samsung's HBM-PIM is one of the notable efforts: it incorporates a DRAM bank-level SIMD processing unit that can handle large-scale PIM operations, demonstrating remarkable energy-efficiency improvements on real silicon for memory-bound AI applications. Samsung is also working on a near-memory processor known as AxDIMM, which, in collaboration with Facebook, proved effective at speeding up AI-based recommendation workloads.
Although not focused specifically on AI applications, the French company UPMEM has launched a PIM platform that integrates a per-bank processing engine (the DPU, or DRAM Processing Unit) to speed up memory-bound general-purpose applications. A report from ETH Zurich offers a thorough description and analysis of the UPMEM platform's pros and cons.
5. In addition to the rise of PIM-enabled memory solutions, the newly standardized CXL (Compute Express Link) interconnect has opened new opportunities to break through the memory (capacity) wall via disaggregated/pooled memory over a high-speed, scalable communication network. CXL, an open, industry-backed interconnect standard, builds on the PCIe (Gen5) interface and allows processors to access disaggregated memory nodes in a high-bandwidth, low-latency manner. CXL also incorporates other important features, such as advanced security functions for memory management (e.g., secure memory allocation, secure address translation) and logical partitioning of a device's memory pool across multiple computing hosts. This is not the first time a high-bandwidth, cache-coherent interconnect has embraced the concept of disaggregated memory.
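As a loose sketch of logical partitioning (this is not the CXL protocol itself; the class and names are hypothetical), one pooled device can be carved into fixed-size regions that are assigned to, and reclaimed from, individual hosts:

```python
# Toy model of logically partitioning one pooled memory device across hosts:
# each region is owned by at most one host at a time.
# Purely illustrative; not the actual CXL protocol, and all names are hypothetical.

class PooledMemoryDevice:
    def __init__(self, num_regions, region_gb):
        self.region_gb = region_gb
        self.owner = [None] * num_regions  # region index -> host or None

    def assign(self, host, regions_needed):
        """Give `regions_needed` free regions to `host`; return their indices."""
        free = [i for i, o in enumerate(self.owner) if o is None]
        if len(free) < regions_needed:
            raise MemoryError("not enough free regions")
        taken = free[:regions_needed]
        for i in taken:
            self.owner[i] = host
        return taken

    def reclaim(self, host):
        """Return all of `host`'s regions to the free pool."""
        self.owner = [None if o == host else o for o in self.owner]

dev = PooledMemoryDevice(num_regions=8, region_gb=64)  # one 512 GB device
a = dev.assign("host-a", 3)   # host-a sees 192 GB
b = dev.assign("host-b", 2)   # host-b sees 128 GB
dev.reclaim("host-a")         # host-a's regions return to the pool
c = dev.assign("host-c", 5)   # possible only because host-a's regions were freed
print(dev.owner)
```

A real implementation also needs the address translation and security checks mentioned above so that a host can never read a region it does not own.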
CXL appears to be among the first such technologies to receive large-scale, industry-wide backing, not only from processor companies like Intel, AMD, and Arm and memory companies such as Samsung, SK Hynix, and Micron, but also from hyperscalers, who will be the largest users of these devices in mass-scale deployments.
Remaining Challenges in Memory-Centric Computing Systems
1. Even though high-end PIM and memory-disaggregation products exist, researchers still need to resolve several issues before these memory-centric architectures see widespread use in common platforms, including ease of programming, compiler support, hardware support for virtual memory, coherence support, and so on.
Below is an extract on these technical issues from an official report released by Facebook, "First-Generation Inference Accelerator Deployment at Facebook."
2. "We have examined the possibility of applying processing-in-memory (PIM) to our work and found a variety of difficulties in employing these techniques. The biggest issue with PIM is how programmers can use its capabilities. It is difficult to predict future compression methods for models, so developers need programmability to adapt to changes. PIM should also support flexible parallelization, since it is difficult to know how each dimension (the number of tables, hash sizes, or embedding dimensions) will grow over time (for instance, TensorDIMM exploits parallelism only over embedding dimensions). Additionally, TensorDIMM and RecNMP use multi-rank parallelization within the same DIMM, but the number of ranks per DIMM is not enough to offer a significant speedup."
3. It will likely take many years to understand the implications of adding PIM or disaggregated memory to an entire system architecture, how best to make use of it, and what additional problems remain for researchers. However, the rise of these widely available memory-centric technologies shows that PIM and disaggregation are no longer one-off academic endeavors; they offer exciting research opportunities and the potential for real-world application at enormous scale.
Conclusion: IP at the Core of Advanced Computing Innovations
Memory-centric architectures are paving the way for massive change in computer-system design. By prioritizing memory resources, these developments create new opportunities to boost performance, power efficiency, and scalability. From speeding up data analysis and powering AI applications to improving general-purpose computing, memory-centric systems have the potential to change how we approach complex computational challenges. As the technology keeps evolving, staying abreast of the latest research trends and industry developments is essential to unlocking the full power of these architectures.
Additionally, this paradigm shift will not only improve performance and efficiency but also open up new possibilities for innovation that can be protected through IP rights. Strategic IP management is essential to advancing technology and maintaining a competitive advantage in the age of advanced computing. As computing technology advances in the coming years, a strong IP strategy will be essential to capitalizing on the technology revolution.
With the rapidly changing world of goods and services and the rapid growth of new technologies, firms must continually find ways to improve their offerings. This is where Sagacious IP plays a pivotal part, providing customized technology-scouting solutions that allow companies to identify and profit from the latest technologies, ensuring continuous growth and a competitive advantage.