What is “Supply Chain and Technological Basis for Economic Security”? - Securing the future of computing -

Date December 13, 2023
Speaker Mukesh KHARE (General Manager, IBM Semiconductors and Vice President of Hybrid Cloud Research, IBM)
Moderator NISHIKAWA Kazumi (Principal Director, Office of Economic Security, Minister's Secretariat, METI)

The convergence of high precision “Bits,” AI-driven “Neurons,” and quantum “Qubits” will define the upcoming new era in computing. The hybrid cloud, integrating these facets, is pivotal for industry evolution. AI computing, poised to significantly boost global GDP, faces challenges, including its sizable carbon footprint. Sustainable practices in the semiconductor industry must be evaluated, and collaborative efforts between private and public sectors are key to achieving this goal. AI development should be approached from a full stack perspective to achieve maximum efficiency in computing. Accessible AI hardware is also a priority to help foster innovation in this space. Reduced precision computing, exemplified by IBM’s AIU hardware, enhances AI efficiency. The journey will involve reduced precision computing, heterogeneous integration, and a transition to analog computing. Analog AI will emulate the human brain’s workings and rely on materials innovation. Partnerships, responsible technology development, and comprehensive supply chain considerations are key elements for sustained growth and national security going forward.

Summary

New Era for Computing

IBM is the only IT company that is more than 100 years old. The company has survived for so long because of IBM Research and its continued focus on inventing the future of computing. The approach is academic, with an end-goal focus on business outcomes.

We are living in a unique time, as three aspects of computing are coming together: firstly, “Bits” or high precision computing which we all know, and which will continue into the future; secondly, “Neurons” which are represented by new forms of computing such as AI; and thirdly, “Qubits,” which is quantum computing. In the future all three parts will coexist and add to each other. All these elements of computing will be consumed within the hybrid cloud. Hybrid clouds are important because there must be a common platform for consumption of computing. This is the future of the industry.

AI Computing

There is a projection that generative AI could raise global GDP by 7% within 10 years. AI computing is just getting started and there is an opportunity ahead of us to participate in creating new and groundbreaking technology in a sustainable way. AI workloads are growing exponentially. The most advanced AI models use as many as 500 billion parameters, and the demand for computing is almost doubling every three months. This is impressive, but it also means that the energy demands are substantial.
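To make the doubling figure concrete, the following minimal sketch computes the growth implied by a fixed doubling period. It assumes the three-month doubling holds steadily over the whole period, which is a simplification for illustration only; the function name `growth_factor` is hypothetical.

```python
def growth_factor(months: float, doubling_period_months: float = 3.0) -> float:
    """Factor by which demand grows after `months`, assuming a constant doubling period."""
    return 2.0 ** (months / doubling_period_months)


# Illustrative only: assumes the three-month doubling continues unchanged.
for months in (3, 6, 12, 24):
    print(f"after {months:>2} months: {growth_factor(months):,.0f}x")
```

Under this assumption, demand would grow 16-fold in one year, which illustrates why energy efficiency becomes a central concern.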

How can we address the large carbon footprint generated by AI? Training one GPT-3 model generates a carbon footprint equivalent to three jetliner round trips between New York and San Francisco. The potential is incredible, but so is the challenge to sustainability.

Semiconductor Industry

At the foundation of these advances is the semiconductor industry. Semiconductor technology will enable this new and growing computing potential. Currently, the largest chips use as many as 20-50 billion transistors per chip. However, the industry is headed towards 1 trillion transistors per chip by the end of this decade. AI will be the workload that drives innovation in the next era, as well as the continuous growth of the semiconductor industry.

Another evolving area of semiconductor technology is chiplet technology. We will be able to mix and match chiplets that perform different functions. This will act as an accelerator, bringing us closer to the goal of 1 trillion transistors per chip by the end of the decade.

The semiconductor industry today is worth approximately 600 billion dollars, and it is projected to be a 1 trillion-dollar industry by the end of this decade.

Partnerships

To meet these goals, we need strong partnerships between government, the private sector, and academia. At IBM Research, we are participating in one of the largest and most advanced research and development centers in Albany, NY. This center is based on a partnership between IBM and NY Creates, which is run by the State of New York. The ecosystem partners include many stakeholders in the semiconductor industry who have come together to collaborate and develop new technologies and methods. This facility has some of the most advanced tooling capabilities available anywhere in the world and has recently welcomed Rapidus from Japan into the enterprise.

As part of this collaborative ecosystem, we created the world’s first 2nm chip technology, announced in 2021. This chip consumes 75% less power than 7nm technology and uses a new device structure called nanosheet technology. The technology won TIME Magazine’s Best Invention of 2022 award. Rapidus is a welcome new partner in this endeavor.

Creation of such technology requires robust public and private partnerships. IBM just announced another large partnership between the State of New York, IBM, and other partners to invest 10 billion dollars in creating advanced high-NA EUV lithography capabilities. This will be the world’s first such capability based on a public/private partnership model. These various investments will continue to support the computational demands of AI now, and into the future.

AI Hardware Center Initiative

IBM works on every aspect of the computing stack from process technology, chip design, and software, to system design. In the world of AI, it is crucial that we look at every part of the stack to maximize the technology.

In computing, the value will come from every part of the stack, which is why the “full stack approach” is essential. In collaboration with such initiatives, we continue to work with various commercial, academic, and government partners around the world to leverage the world’s best resources.

The Initiative includes four areas: cores and architecture, heterogeneous integration, analog elements, and end-user AI testbeds. IBM has an ambitious goal of improving computing efficiency by 1000x in 10 years. This is critical as the demand for AI workloads will continue to grow, along with opportunities for us to innovate. Beyond business success, these innovations will have an impact on economic growth as well as national security. Semiconductors, AI, and quantum technologies are and will continue to be areas of concern from a national security standpoint.

In the ten-year roadmap we created to reach this goal, we have two broad categories of focus. The first is digital, which is represented by approximate computing. This is a technology pioneered by IBM, where we build digital chips that can benefit from the use of reduced precision. That will result in at least a ten-fold improvement in computing efficiency and power reduction. The second is analog AI, which we are working on for the long term and which has the potential to give us a further 100-fold improvement in computing efficiency.

In that journey there are three steps. Step one is reduced precision arithmetic. The idea is to reduce precision without sacrificing accuracy. In the world of AI, we are not looking for high precision computing, but for the same answers with less data and effort. IBM designs chips that can leverage the power of reduced precision arithmetic because we are involved in the full stack. The next step is heterogeneous integration, where we bring memory closer to logic by using 3D integration, or chiplet technology. This is a semiconductor innovation that goes beyond design, into advanced packaging. The third step is analog computing. These advanced chips must still communicate with other chips digitally, but they will perform computation in the analog domain by leveraging material innovation. The key point will be to perform computation within memory itself, instead of performing the computation outside memory and transporting the information back and forth, as has been done traditionally.

Reduced precision computing

Today’s traditional computing uses between 16 and 32 bits for training and inference. However, IBM has been working on new algorithms so that fewer bits can be used to get the same result in both training and inference, without losing accuracy. By designing a chip that leverages reduced precision arithmetic, we can make AI compute faster and more efficiently. Every 2x reduction in the number of bits results in a 4x improvement in performance and a corresponding reduction in energy consumption.
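The rule of thumb quoted here, that each halving of bit width gives a fourfold gain, amounts to a gain that scales with the square of the bit-width ratio. The sketch below simply extrapolates that rule. It is an idealized illustration, not a model of any particular chip, and the function name `relative_gain` is hypothetical; gains measured on real hardware depend on many other factors.

```python
def relative_gain(baseline_bits: int, reduced_bits: int) -> float:
    """Gain implied by the rule of thumb: halving the bit width gives 4x.

    Applying 4x per halving is the same as squaring the bit-width ratio.
    """
    ratio = baseline_bits / reduced_bits
    return ratio ** 2


# Idealized extrapolation from a 32-bit baseline.
for bits in (16, 8, 4):
    print(f"32-bit -> {bits}-bit: {relative_gain(32, bits):.0f}x")
```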

Hardware is essential to realizing these ideas, and we created a hardware unit called the Artificial Intelligence Unit (AIU), which utilizes low precision computing and focuses on inferencing and fine tuning for generative AI workloads. This can provide a 10x improvement in energy efficiency. The AIU uses 32 AI cores and can plug into a standard PCIe slot. It supports the FP16, FP8, Int8, Int4, and Int2 data formats, so that when a model is converted to low-precision Int4 or Int2, it uses much less energy.
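As a rough illustration of what converting values to a low-precision integer format involves, here is a minimal sketch of symmetric uniform quantization in plain Python. This is a generic textbook technique, not IBM's algorithm or the AIU's actual number handling; the sample weights and the function names are made up for the example.

```python
def quantize(values, bits):
    """Symmetric uniform quantization of floats to signed `bits`-bit integer codes."""
    qmax = 2 ** (bits - 1) - 1              # 127 for 8 bits, 7 for 4 bits, 1 for 2 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs else 1.0
    # Round to the nearest code and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point values."""
    return [c * scale for c in codes]


weights = [0.82, -0.41, 0.07, -0.95, 0.33]   # hypothetical example values
for bits in (8, 4, 2):
    codes, scale = quantize(weights, bits)
    restored = dequantize(codes, scale)
    worst = max(abs(w - r) for w, r in zip(weights, restored))
    print(f"int{bits}: codes={codes} max error={worst:.3f}")
```

The error grows as the bit width shrinks, which is why the algorithmic work on preserving accuracy at low precision matters as much as the hardware.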

IBM introduced this low precision core into a microprocessor in 2021, which enabled real-time inference for applications such as fraud detection. In 2022, IBM built a multi-core chip dedicated to running deep learning models, which is faster and more efficient than a general-purpose CPU.

To translate the chip capabilities into a system, IBM Research is now building an AI system that can be deployed at a larger scale, to show a significant reduction in power consumption and improvement in efficiency at the system level while meeting workload demands. This is an example of the connection between semiconductors and AI technology.

We must invest significantly not only in hardware technology, but also in software. The AI chip must continuously be made easier for developers to use. We have partnered with the PyTorch 2.0 framework, which is the entry point for AI practitioners to utilize efficient hardware. AI is still an early-stage technology and there will be many further developments down the line. Therefore, the software that is developed must be open source and built on common software. An increasing number of AI chips are using PyTorch 2.0. This framework can enable other AI chips to be made available for use in the broader community.

Building the AIU from core to cluster

First, the AIU chip is built into an AIU PCIe card. Many cards are integrated into a node, many nodes into a rack, and many racks into a cluster, which then forms an AI data center.

IBM has created an early prototype of this AIU system, which is scalable and highly power efficient, and on which we are currently running AI workloads. It is an on-premises system deployed in our headquarters. The system consumes much less power than traditional AI systems, because we focused on innovation at the chip, design, process technology, algorithm, and system levels. By focusing on every individual step, we were able to generate the value needed for an efficient and sustainable future of computing.

In our roadmap, we are currently using a system-on-chip that utilizes advanced logic technology. The next step is to bring more high-bandwidth memory close to the AI chip, which will enable us to run bigger models. A longer-term goal is to use 3D chip technology to bring memory and logic chips together. That is the roadmap we are working on towards the high-level goal of 1000x improvement in 10 years. We are making progress every year in partnership with many collaborators across the industry.

Analog AI

Analog AI is changing the way we think about AI computing. Analog AI can provide as much as 100x improvement in energy efficiency because we will perform computation in the memory itself. We must learn from how the human brain works and transfer that into technology. This will also require innovation at the materials level.

We are performing basic computation like multiplication and addition using very simple functions, as compared to traditional methods of digital computation which require many thousands of cycles of moving data from memory to compute and back to perform one compute function.

The benefits of in-memory computing (IMC) include improved energy efficiency. IMC offers ultrafast computation, low latency, and almost zero standby leakage power. IBM can map a deep neural network onto a circuit, develop the synaptic elements, and, using phase-change memory, change the phase of the memory material to store information. Computation is then performed in one cycle instead of many thousands.
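The idea of computing inside memory can be sketched with an idealized crossbar model: weights are stored as conductances, inputs are applied as voltages, each cell contributes a current equal to voltage times conductance (Ohm's law), and the currents on each column add up (Kirchhoff's current law). The code below is a simplified numerical illustration under those assumptions, not a description of IBM's circuits. It ignores device noise, drift, and the handling of negative weights, and the function name and values are hypothetical.

```python
def crossbar_mvm(conductances, voltages):
    """Column currents of an idealized crossbar: I[j] = sum over i of V[i] * G[i][j].

    Each row i is driven by voltages[i]; each column j sums the cell currents,
    so the whole matrix-vector product happens in a single analog step.
    """
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


# Hypothetical 3x2 array of stored weights (conductances) and 3 input voltages.
G = [[1.0, 0.5],
     [0.0, 2.0],
     [0.5, 0.5]]
V = [1.0, 2.0, 4.0]
print(crossbar_mvm(G, V))   # [3.0, 6.5]
```

In a digital processor the same result requires fetching every weight from memory and running a multiply-accumulate loop; in the analog array the weights never move.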

Some of the work IBM has done in this space has been recognized worldwide. In one of our published papers, we described how we built analog AI chips in 14 nm technology and were able to demonstrate speech recognition and transcription using analog AI technology. In another paper, we described our work on a 64-core mixed-signal in-memory compute chip based on phase-change memory for deep neural network inference.

IBM Research is progressing development using state of the art methods for the next era of AI, which is analog. Our role is to conduct near, medium, and long-term research to build sustainable computing that can meet economic and national security needs.

We should strive to create the future of sustainable computing together, which will require cooperation between industry, government, and academia. Traditional, AI, and quantum computing are all coming together, and semiconductor technology is at the heart of these developments.

Q&A

Q:
Will AI help or hinder energy and climate change issues? Secondly, regarding energy and computation, there is no international security body to coordinate the semiconductor agency. Is such a platform required going forward?

Mukesh KHARE:
AI technology can create many opportunities and will help improve efficiency and productivity, reduce waste, and develop new technologies to help reduce power consumption. AI is a tool that will help to significantly improve productivity and reduce the cost of work. But since AI is an early-stage technology in terms of its development, it is our responsibility to develop sustainable and responsible technology. Just like any other tool, it has the potential to be misused. Therefore, partnerships between various stakeholders are important in creating guardrails and rules.

IBM believes in partnership, and we launched an initiative called the AI Alliance with more than 50 partners to create an open environment for many entities to participate in the creation of responsible and sustainable AI technologies. AI can be an accelerator and we must make sure it is developed for the benefit of society and the planet.

The semiconductor industry has a renewed sense of importance due to national security and supply chain-related challenges. Countries are partnering more, such as the partnership between the U.S. and Japan. We must think about how to create a collaboration between like-minded countries to develop international standards.

Q:
Traditional semiconductor manufacturing lines may become stranded assets. How can we prepare to avoid such redundancies?

Mukesh KHARE:
One type of computing will not replace another. They will co-exist. Traditional computing will continue alongside quantum computing, and the latter will not replace the former. Whatever investments companies are currently making will continue to be fruitful, as they will help meet the ever-growing demand for semiconductors. In fact, more investment, not less, will be required to create new semiconductor fabrication plants (fabs) and new materials.

Q:
Do you regard Japanese engineers as being excellent? How can Japan nurture its researchers?

Mukesh KHARE:
IBM has partnered with Japanese companies for the last 40 years and has worked with every major technology company in the country. Japanese engineers working at IBM Research are amongst the best in terms of their thoroughness and professionalism. Japan as a country must continue to invest in and foster engineering. Because of the current lack of Japanese semiconductor manufacturing companies, university research has shifted to the areas where investment is going. There is now a strong focus in Japan and the U.S. on creating university programs that train semiconductor engineers, which should be encouraged to meet what will be an increased demand.

Q:
Computing architecture has developed continuously over the last 100 years. What kind of supply chain is necessary for the future? What elements of the supply chain should be made resilient for the decades to come?

Mukesh KHARE:
We must pay attention to the end-to-end supply chain. Quantum computing, for example, requires different materials. We must look at each branch of computing differently, starting from materials, to manufacturing, to systems. Software is also essential because computing is consumed only by software. Then there is the human aspect which is the workforce. We must look at every aspect in each of these domains. In terms of AI or quantum, we must work towards developing not only chip technology but chip design, system design, and talent.

It is always important to look at the full stack. At IBM we are fortunate that we work on the full stack, because if one thing is missing, it has the potential to break the whole chain for economic growth and national security.

*This summary was compiled by RIETI Editorial staff.