NeuReality’s First AI Inference Server-on-a-Chip Validated and Moved to Production

NeuReality has moved the final, validated design of its 7nm AI-centric NR1 chip to TSMC for manufacturing, creating the world’s first AI-centric server-on-a-chip (SoC). A major step for the semiconductor industry, the NR1 will transform AI inference solutions used in a wide range of applications – from natural language processing and computer vision to speech recognition and recommendation systems.

With the mass deployment of AI as a service (AIaaS) and infrastructure-hungry applications such as ChatGPT, NeuReality’s solution is crucial for an industry urgently in need of affordable access to modernized AI inference infrastructure. In trials with AI-centric server systems, NeuReality’s NR1 chip demonstrated 10 times the performance at the same cost compared to conventional CPU-centric systems. These results position NeuReality’s technology as a bellwether for cost-effective, highly efficient execution of AI inference.

AI inference traditionally requires significant software activity at eye-watering costs. NeuReality’s final step from validated design to manufacturing – known in the industry as “tape-out” – signals a new era of highly integrated, highly scalable AI-centric server architecture.

The NR1 chip represents the world’s first NAPU (Network Addressable Processing Unit) and will be seen as an antidote to the outdated CPU-centric approach to AI inference, according to Moshe Tanach, Co-Founder and CEO of NeuReality. “In order for inference-specific deep learning accelerators (DLAs) to perform at full capacity, free of existing system bottlenecks and high overheads, our solution stack, coupled with any DLA technology out there, enables AI service requests to be processed faster and more efficiently,” said Tanach.

“Function for function, hardware runs faster and parallelizes much more than software. As an industry, we’ve proven this model by offloading the deep learning processing function from CPUs to DLAs such as GPU or ASIC solutions. As Amdahl’s law suggests, it is time to shift the acceleration focus to the other functions of the system to optimize AI inference processing as a whole. The NR1 offers an unprecedented competitive alternative to today’s general-purpose server solutions, setting a new standard for the direction our industry must take to fully support the AI Digital Age,” added Tanach.

NeuReality is moving the dial for the industry, empowering the transition from a largely software-centric approach to a hardware-offload approach in which multiple NR1 chips work in parallel to avoid system bottlenecks. Each NR1 chip is a network-attached heterogeneous compute device with multiple tiers of programmable compute engines, including a PCIe interface to host any DLA, an embedded network interface controller (NIC), and an embedded AI-hypervisor – a hardware-based sequencer that controls the compute engines and moves data structures between them. Hardware acceleration throughout NeuReality’s automated SDK flow lowers the barrier to entry for small, medium, and large organizations that need excellent performance, low power consumption, and affordable infrastructure – as well as ease of use for AI inference services.
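
As a rough illustration of that division of labor, the Python sketch below models the AI-hypervisor concept in software: a sequencer hands each inference job from one dedicated engine to the next, with no general-purpose CPU in the path. All class, function, and stage names are hypothetical and chosen purely for illustration; NeuReality’s actual hardware interfaces are not described in this announcement.

```python
# Illustrative sketch only: a software model of the "AI-hypervisor" idea,
# a hardware sequencer that moves each inference job between dedicated
# compute engines (decode, pre-process, DLA, post-process) without a
# general-purpose host CPU in the loop. All names are hypothetical.

from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Job:
    payload: bytes                 # e.g. a JPEG arriving via the embedded NIC
    results: list = field(default_factory=list)


class Sequencer:
    """Chains fixed-function engines, handing each job's output to the next stage."""

    def __init__(self, stages: List[Callable[[Job], Job]]):
        self.stages = stages

    def run(self, job: Job) -> Job:
        for stage in self.stages:
            job = stage(job)       # in hardware this hand-off would be a DMA, not a call
        return job


# Hypothetical engines standing in for the chip's programmable compute tiers.
def decode(job: Job) -> Job:
    job.results.append("decoded")
    return job

def preprocess(job: Job) -> Job:
    job.results.append("preprocessed")
    return job

def run_dla(job: Job) -> Job:
    job.results.append("dla-output")   # any PCIe-attached DLA could sit here
    return job

def postprocess(job: Job) -> Job:
    job.results.append("response")
    return job


pipeline = Sequencer([decode, preprocess, run_dla, postprocess])
print(pipeline.run(Job(payload=b"...")).results)
```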

“We are excited about our first-generation NAPU product – proven, tested, and ready for manufacturing. It’s full steam ahead as we reach this highly anticipated manufacturing stage with our TSMC partners. Our plan remains to start shipping product directly to customers by the end of the year,” said Tanach.

NeuReality Appoints Lynn Comp, Semiconductor Veteran and AMD Corporate VP, to its Board

NeuReality, an expert in AI-centric inference, has announced the appointment of Lynn A. Comp to its Board of Directors. A semiconductor veteran and leader with vast experience in data center infrastructure, Comp joins NeuReality’s experienced board, which includes industry heavyweights CJ Bruno, who brings a wealth of sales and marketing experience from a distinguished career at Intel, and AI luminary Naveen Rao, a leading industry voice in generative AI and deep learning. Comp brings invaluable knowledge of the technology underpinning inference and the virtual cloud, as well as the sector acumen needed to guide NeuReality through complex market dynamics and drive the company through its next integral period of growth.

As AI inference becomes highly sought after, in major part due to the proliferation of Large Language Models (LLMs) such as ChatGPT, AI computation systems must adapt to keep inference performing at higher efficiency and optimal Total Cost of Ownership (TCO). NeuReality is building a first-in-class capability to fortify the future of AI inference, improving the performance of machine learning systems by offloading from general-purpose CPUs the data-movement and processing tasks required to host Deep Learning Accelerators. Moving from the time- and resource-intensive CPU-centric model to NeuReality’s AI-centric model and server-on-a-chip solution helps lower cost, reduce required energy input, and increase throughput. Comp’s proven track record in go-to-market strategies, as well as her intimate understanding of and relationships within the compute industry, fits hand in glove with NeuReality’s vision and solutions, and will help ensure NeuReality’s solutions realize their full potential.

“I’m excited to be joining NeuReality’s board and welcome the challenge of helping steer the company through its next phase of growth,” said Comp. “The company’s innovative approach to AI-focused computing platforms offers the market an easily integrated, end-to-end inference solution that speeds up AI deployments while reducing data center footprint and power. I am passionate about simplifying the application of AI to IT infrastructure and take on this role with the intention of helping make this a reality,” she added.

“We are thrilled to welcome Lynn to our board,” said Moshe Tanach, Co-Founder and CEO of NeuReality. “Adding Lynn to our board of directors, with her extensive business experience with semiconductors at AMD as well as data centers for various market sectors, is one of multiple important steps we are taking as we shift from development to production. Drawing on Lynn’s unique expertise and leadership in the semiconductor market, we look forward to working together and refining the GTM strategy and execution for our disruptive AI inference products and services. AMD’s ambitions and priorities chime with our own vision of the market, which makes Lynn’s appointment, with her background at AMD, invaluable.”

During her 25-year tenure, Lynn Comp has held a number of key positions, including Vice President of the Data Center Group and General Manager of the Visual Cloud Division at Intel, as well as Corporate VP of the Cloud Business Unit and Server BU Marketing at AMD. She was instrumental in helping Intel build and foster strong relationships with top-tier customers and partners, and took an active role in promoting the company’s environmental and social responsibilities. Comp was also recognized in Connected World’s “2020 Women of Technology.”

NeuReality raises $35M to bring its inference chip to market

NeuReality, an AI hardware startup specializing in the next generation of AI inference platforms, is announcing a $35M Series A funding round. The round was led by Samsung Ventures, Cardumen Capital, Varana Capital, OurCrowd, and XT Hi-Tech. SK Hynix, Cleveland Avenue, Korean Investment Partners, StoneBridge, and Glory Ventures also participated. The round brings NeuReality’s total funding to $48 million.

The new funding will support NeuReality’s plans to start deploying its inference solutions in 2023. NeuReality’s system is built for optimized deployment in data centers and near-edge on-premises locations in need of higher performance, lower latency, and much higher efficiency compared to existing technologies. The company has already reported close collaboration with leading AI ecosystem partners and customers such as IBM, AMD, and Lenovo.

NeuReality makes AI easier to deploy in terms of usability, cost, scalability, and sustainability. The company uses a new type of processor, the Network Addressable Processing Unit (NAPU), optimized for deep learning inference use cases such as computer vision, natural language processing, and recommendation engines. The NAPU will bring AI to a broader set of less technical companies. It is also set to allow large-scale users, such as hyperscalers and next-wave data center customers, to support their growing scale of AI usage. NeuReality’s disruptive system approach includes hardware, software, and tools that work together to boost efficiency and simplify the adoption and deployment of AI in a wider range of real-world applications.

The funding will allow NeuReality to bring its first-in-class NAPU, the NR1, to the global market. This next-generation AI chip, based on NeuReality’s AI-centric architecture, pushes the performance and efficiency of AI systems to the next level. It does so by removing existing system bottlenecks to increase the utilization of current deep-learning processors, while lowering the latency of AI operations and saving on overall system cost and power consumption – factors critical to an improved total cost of ownership (TCO) for data centers and large-scale on-premises compute systems, and vital to the business models of many applications. NeuReality also raises the bar on the power efficiency of these power-hungry applications, a stepping stone toward the sustainability requirements being driven by governments around the globe.

“This investment is another sign of confidence in the talent and innovation that NeuReality and the Israeli tech industry offer the world,” stated Moshe Tanach, CEO and co-founder of NeuReality. “The high-profile investors in our Series A fundraising prove that NeuReality’s value proposition, architecture and flagship product are a viable reality which will transform the AI market.”

Yoav Sebba, Managing Director at XT Hi-Tech, added: “We are honored to join the NeuReality journey led by this exceptional team, headed by Moshe, Tzvika, Yossi, and Lior. High-performance, sustainable inference computing is critical for growth in the day-to-day usage of AI. We are seeing endless AI opportunities developed by software companies, but the existing hardware infrastructure is limiting the deployment of those use cases. Recently, new deep learning processors have been developed by multiple companies, serving as a new type of ‘brain’. NeuReality is the full body surrounding those brains, allowing them to serve intelligent voice, language, vision, and recommendation applications.”

Dr. Mingu Lee, Managing Partner at Cleveland Avenue Technology Investments, noted: “NeuReality is bringing ease of use and scalability to the deployment of AI inference solutions, and we see great synergy between their promising technology and the Fortune 500 enterprise companies we communicate with. We feel that investing in companies such as NeuReality is vital, not only to ensure the future of technology, but also in terms of sustainability.”

NeuReality, which raised its seed round in early 2020, is led by a seasoned management team with extensive experience in AI, data-center architecture, systems, and software. NeuReality was co-founded by CEO Moshe Tanach, formerly Director of Engineering at Marvell and Intel and AVP R&D at DesignArt-Networks (acquired by Qualcomm); VP Operations Tzvika Shmueli, formerly VP of Backend at Mellanox Technologies and VP of Engineering at Habana Labs; and VP VLSI Yossi Kasus, formerly Senior Director of Engineering at Mellanox and the head of VLSI at EZChip. The company’s leading team also includes CTO Lior Khermosh, former co-founder and Chief Scientist of ParallelM and a fellow at PMC Sierra.

Samsung Ventures invests in Israeli AI company NeuReality

NeuReality, an Israeli AI systems and semiconductor company, announced that Samsung Ventures has made an investment in the company. NeuReality makes inference technologies – such as computer vision, natural language processing, and recommendation engines – easier to implement for a broader set of less technical companies.

More than just a chip company, NeuReality’s comprehensive solution includes hardware, software and tools that work together to simplify and accelerate AI deployment. The company currently employs more than 30 employees and plans to double its size and recruit talent in VLSI chip design, AI, software, and hardware.

Ori Kirshner, head of Samsung Ventures in Israel, stated: “We see substantial and immediate need for higher efficiency and easy-to-deploy inference solutions for data centers and on-premises use cases, and this is why we are investing in NeuReality. The company’s innovative disaggregation, data movement and processing technologies improve computation flows, compute-storage flows, and in-storage compute – all of which are critical for the ability to adopt and grow AI solutions. Samsung Ventures is committed to invest in companies with strong synergy to Samsung’s core business and NeuReality fits well into this commitment.”

The adoption and growth of AI solutions face various obstacles, which prevent retail, manufacturing, healthcare, and other sectors from deploying AI inference capabilities into business workflows. While the company is new, NeuReality’s team draws from decades of experience in AI, data center systems, hardware design, and software development. As a result, NeuReality uses a system level approach that combines easy-to-use software with high efficiency deep learning and data handling acceleration hardware. This holistic approach dramatically simplifies and accelerates the adoption and mass deployment of inference technology.

Focusing on the growth of real-life AI applications, NeuReality’s solutions are purpose-built for a wide variety of sectors, including public safety, e-commerce, social networks, medical and healthcare, digital personal assistants, and more. The NR1 solution targets cloud and enterprise data centers, alongside carriers, telecom operators, and other near-edge compute solutions.

NeuReality emerged out of stealth last year with $8 million in seed funding from Cardumen Capital, OurCrowd and Varana Capital. In November 2021, NeuReality signed an agreement with IBM to develop the next generation of high-performance AI inference platforms that will deliver disruptive cost and power consumption improvements for deep learning use cases. NeuReality is also collaborating with AMD to deliver its first-generation AI-centric FPGA-based platforms for inference acceleration to customers.

NeuReality is creating purpose-built AI platforms for ultra-scalability of real-life AI applications and has positioned itself as a pioneer in the deep learning and AI solutions market. The NR1 is the company’s next-generation integrated circuit device, based on its AI-centric architecture. The SoC improves the utilization of currently deployed AI compute resources by removing existing system bottlenecks, lowering the latency of AI operations, and saving on overall system cost and power consumption. The company also develops complementary software tools and runtime libraries that make it easy for customers of various skill levels, across various deployment topologies, to adopt new AI-based services in their business workflows.

Moshe Tanach, CEO and co-founder of NeuReality, stated: “The investment from Samsung Ventures is a big vote of confidence in NeuReality’s technology. The funds will help us take the company to the next level and bring our NR1 SoC to production. This will enable our customers to evolve their system architecture, and this evolution will make it easier for them to scale and maintain their AI infrastructure, whether it is in their data center, in a cloud, or on-premises.”

NeuReality was founded in 2019 and is led by a seasoned management team with extensive experience in AI, data-center architecture, system, and software. NeuReality was co-founded by CEO Moshe Tanach, formerly Director of Engineering at Marvell and Intel and AVP R&D at DesignArt-Networks (later acquired by Qualcomm); VP Operations Tzvika Shmueli, formerly VP of Backend at Mellanox Technologies and VP of Engineering at Habana Labs; and VP VLSI Yossi Kasus, formerly Senior Director of Engineering at Mellanox and the head of VLSI at EZChip. The company’s leading team also includes CTO Lior Khermosh, former co-founder and Chief Scientist of ParallelM and a fellow at PMC Sierra.

IBM and NeuReality team up to build AI Server-on-a-Chip

IBM and NeuReality, an Israeli AI systems and semiconductor company, have signed an agreement to develop the next generation of high-performance AI inference platforms, delivering disruptive cost and power consumption improvements for deep learning use cases. IBM and NeuReality will enable critical sectors such as finance, insurance, healthcare, manufacturing, and smart cities to deploy computer vision, natural language processing, recommendation systems, and other AI use cases. The collaboration also aims to accelerate deployment of today’s ever-growing AI use cases, which are already running in public and private cloud data centers.

The agreement involves the NR1, NeuReality’s first server-on-a-chip ASIC implementation of its revolutionary AI-centric architecture. The NR1 is based on NeuReality’s first-generation FPGA-based NR1-P prototype platform, introduced earlier this year. The NR1 will be a new type of integrated circuit device with native AI-over-Fabric networking, full AI pipeline offload, and hardware-based AI-hypervisor capabilities. These capabilities remove the system bottlenecks of today’s solutions and provide disruptive cost and power consumption benefits for inference systems and services. The NR1-P platform will support software integration and system-level validation prior to the availability of the NR1 production platform next year.
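
To make the “network-addressable” idea more concrete, the brief Python sketch below shows what a client-side request sent directly to a NAPU endpoint over the network might look like, with no host CPU mediating the call. The endpoint name, port, and wire format are assumptions invented for this example and do not describe NeuReality’s actual protocol.

```python
# Illustrative sketch only: what "network-addressable" inference could look
# like from a client's point of view. The endpoint, port, and wire format
# are invented for this example; they are not NeuReality's actual protocol.

import json
import socket

NAPU_ADDR = ("napu-0.example.internal", 9000)   # hypothetical NAPU endpoint

def infer(model: str, payload: bytes) -> dict:
    """Send one inference request straight to the NAPU over TCP.

    The idea being illustrated: the device itself terminates the connection
    via its embedded NIC (AI-over-Fabric), rather than a host CPU doing so.
    """
    header = json.dumps({"model": model, "size": len(payload)}).encode()
    with socket.create_connection(NAPU_ADDR, timeout=5) as conn:
        conn.sendall(len(header).to_bytes(4, "big") + header + payload)
        raw = conn.recv(65536)
    return json.loads(raw)

# Usage (assumes such an endpoint actually exists):
# result = infer("resnet50-v1", open("cat.jpg", "rb").read())
```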

This partnership also marks NeuReality as the first start-up semiconductor product member of the IBM Research AI Hardware Center and licensee of the Center’s low-precision high performance Digital AI Cores. As part of the agreement, IBM becomes a design partner of NeuReality and will work on the product requirements for the NR1 chip, system, and SDK, that will be implemented in the next revision of the architecture. Together the two companies will evaluate NeuReality’s products for use in IBM’s Hybrid Cloud, including AI use cases, system flows, virtualization, networking, security and more.

The agreement with IBM marks the continued momentum for NeuReality. In February this year, the company emerged from stealth announcing its first-of-a-kind AI-centric architecture and laying out its roadmap, where NR1-P will be followed by NR1. Shortly after, in September, NeuReality announced that it is collaborating with Xilinx to deliver their new AI-centric FPGA based NR1-P platforms to market.

Moshe Tanach, CEO and co-founder of NeuReality, stated: “We are excited and deeply satisfied that a world-class multinational innovator like IBM is partnering with us. We believe our collaboration is a vote of confidence in our AI-centric technology and architecture and in its potential to power real-life AI use cases with unprecedented deep learning capabilities.” Tanach added: “Having the NR1-P FPGA platform available today allows us to develop IBM’s requirements and test them before the NR1 Server-on-a-Chip’s tape-out. Being able to develop, test, and optimize complex datacenter distributed features, such as Kubernetes, networking, and security before production is the only way to deliver high quality to our customers. I am extremely proud of our engineering team, who will deliver a new reality to datacenters and near-edge solutions. This new reality will allow many new sectors to deploy AI use cases more efficiently than ever before.”

Dr. Mukesh Khare, Vice President of Hybrid Cloud research at IBM Research, said: “In light of IBM’s vision to deliver the most advanced Hybrid Cloud and AI systems and services to our clients, teaming up with NeuReality, which brings a disruptive AI-centric approach to the table, is the type of industry collaboration we are looking for. The partnership with NeuReality is expected to drive a more streamlined and accessible AI infrastructure, which has the potential to enhance people’s lives.”

NeuReality collaborates with Xilinx to deliver first AI-centric server

NeuReality, an Israeli AI startup developing high-performance AI compute platforms for cloud data centers and edge nodes, is collaborating with Xilinx, Inc. to deliver new AI-centric platforms that empower, optimize, and tune real-world AI applications. The collaboration is based on NeuReality’s novel AI-centric inference platform, the NR1-P, which includes a new type of AI server-on-a-chip (SoC) developed by NeuReality and delivers all components necessary to deploy a complete inference solution. The platform targets high-volume AI applications in fields such as public safety, e-commerce, healthcare, retail, and many other computer-vision use cases.

NeuReality has worked closely with Xilinx to deliver the world’s first fully functional AI-centric server to the market. This breakthrough prototype platform will reduce the two key barriers that inhibit customers’ AI deployment today: cost and complexity. As part of the collaboration, the NR1-P platform, based on the Xilinx Versal ACAP architecture, can be purchased directly from NeuReality and will be fulfilled through Colfax International.

The NR1-P is NeuReality’s first implementation of the company’s new AI-centric architecture, with other implementations to follow. The new prototype platform is accessible for testing and evaluation via remote access. NeuReality’s NR1-P was built on the Xilinx Versal VCK5000 development card and, according to NeuReality, can deliver up to 3X greater performance per dollar per watt compared to the latest Nvidia A100- or T4-based systems. The complete prototype solution includes a 4U rack server chassis, 16 AI-centric modules based on the Xilinx development cards, an embedded Linux software stack with Kubernetes support, orchestration functionality, and a model database.
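
As a simple illustration of how requests might be spread across the 16 AI-centric modules in such a chassis, the Python sketch below assigns incoming requests to hypothetical module endpoints in round-robin order. The endpoint names and the scheduling policy are assumptions for illustration only and are not part of NeuReality’s software stack; the platform’s own orchestration and Kubernetes support would handle placement in practice.

```python
# Illustrative sketch only: fanning requests out across the 16 AI-centric
# modules described above. Endpoint names and the round-robin policy are
# assumptions for illustration, not part of NeuReality's software stack.

from itertools import cycle

# One hypothetical network address per module in the 4U chassis.
MODULES = [f"napu-{i}.chassis-0.example.internal:9000" for i in range(16)]
next_module = cycle(MODULES)

def dispatch(requests):
    """Assign each incoming request to a module in round-robin order.

    A real orchestrator (e.g. one driven by the platform's Kubernetes
    support) could replace this with load-aware placement.
    """
    return [(req_id, next(next_module)) for req_id in requests]

print(dispatch(range(5)))
```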

Moshe Tanach, CEO and co-founder of NeuReality, noted: “Working closely with Xilinx, the market leader of FPGAs for AI, has taken us one step closer to a new reality of AI-centric Server-on-Chip silicon devices that deliver best in class TCO, linear scalability, reduced latency and a simple user interface and experience. These can enable use cases such as healthcare, public safety and other applications that depend on higher efficiency and much lower cost solutions compared to the existing CPU-centric offerings from companies such as Nvidia.”

NeuReality Ltd. is an AI technology innovation company creating purpose-built AI-platforms for ultra-scalability of real-life AI applications. NeuReality is a pioneer in the deep learning and AI solutions market.

NeuReality was founded in 2019 and is led by a seasoned management team with extensive experience in data center architecture, systems, and software. The co-founders are CEO Moshe Tanach, VP Operations Tzvika Shmueli, and VP VLSI Yossi Kasus. Prior to founding NeuReality, Tanach served in several executive roles as Director of Engineering at Marvell and Intel and AVP R&D at DesignArt-Networks (later acquired by Qualcomm). Tzvika Shmueli served as VP of Backend at Mellanox Technologies and VP of Engineering at Habana Labs. Yossi Kasus served as Senior Director of Engineering at Mellanox and the head of VLSI at EZChip.