Red Hat’s latest posts about Performance, Scale, Chaos and more.
Virtualized database I/O performance improvements in RHEL 9.4
September 10, 2024 – Sanjay Rao, Stefan Hajnoczi
Databases are sensitive to disk I/O performance. The IOThread Virtqueue Mapping feature introduced in the QEMU virtual machine monitor in Red Hat Enterprise Linux (RHEL) 9.4 is designed to improve disk I/O performance for workloads that submit I/O from many vCPUs. In this article, we will look at how the IOThread Virtqueue Mapping feature can boost performance for database workloads running in QEMU/KVM guests. While RHEL guests have supported multi-queue virtio-blk devices for some time, the QEMU virtual machine monitor handled I/O requests in a single thread. This limits the performance benefit of having multiple queues, because the single thread can become a bottleneck…read more
Use kube-burner to measure Red Hat OpenShift VM and storage deployment at scale
September 4, 2024 – Jenifer Abrams
Scale testing is critical for understanding how a cluster will hold up under production load. Generally, you may want to scale test to reach a certain maximum density as the end goal, but it is often also useful to scale up from smaller batch sizes to observe how performance may change as the overall cluster becomes more loaded. Those who work in the area of performance analysis know there are many ways to measure a workload, and standardizing on a tool can help provide more comparable results across different configurations and environments…read more
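As a sketch of what such standardized scale testing looks like in practice, here is a minimal kube-burner job definition (field names follow kube-burner's configuration format; the template path, namespace, and counts are hypothetical):

```yaml
# Minimal kube-burner job: create batches of a VM template,
# rate-limited so the API server is loaded predictably.
jobs:
  - name: vm-density
    jobIterations: 100        # scale up in batches by raising this
    qps: 20                   # request rate toward the API server
    burst: 20
    namespacedIterations: true
    namespace: vm-density
    objects:
      - objectTemplate: templates/vm.yml   # hypothetical template path
        replicas: 1
```

Running the same job definition against different cluster configurations is what makes results comparable across environments.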
Scaling virtio-blk disk I/O with IOThread Virtqueue Mapping
September 5, 2024 – Stefan Hajnoczi, Kevin Wolf, Emanuele Giuseppe Esposito, Paolo Bonzini, and Peter Krempa
Modern storage has evolved to keep pace with growing numbers of CPUs by offering multiple queues through which I/O requests can be submitted. This allows CPUs to submit I/O requests and handle completion interrupts locally. The result is good performance and scalability on machines with many CPUs.
Although virtio-blk devices in KVM guests have multiple queues by default, they do not take advantage of multi-queue on the host. I/O requests from all queues are processed in a single thread on the host for guests with the <driver io=native …> libvirt domain XML setting. This single thread can become a bottleneck for I/O-bound workloads…read more
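For reference, the mapping described above is configured through libvirt domain XML roughly as in the following sketch (based on the libvirt syntax for assigning virtqueues to IOThreads; the disk path, queue count, and IOThread IDs are illustrative):

```xml
<domain type='kvm'>
  <!-- Pool of host IOThreads available to this guest's devices -->
  <iothreads>4</iothreads>
  <devices>
    <disk type='file' device='disk'>
      <!-- Spread this disk's virtqueues across IOThreads 1-4
           instead of processing every queue in one thread -->
      <driver name='qemu' type='raw' queues='8'>
        <iothreads>
          <iothread id='1'/>
          <iothread id='2'/>
          <iothread id='3'/>
          <iothread id='4'/>
        </iothreads>
      </driver>
      <source file='/var/lib/libvirt/images/db.img'/>
      <target dev='vdb' bus='virtio'/>
    </disk>
  </devices>
</domain>
```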
Generative AI fine-tuning of LLMs: Red Hat and Supermicro demonstrate outstanding results for efficient Llama-2-70b fine-tuning using LoRA in MLPerf Training v4.0
July 26, 2024 – Diane Feddema, Dr Nikola Nikolov
New generative AI (gen AI) training results were recently released by MLCommons in MLPerf Training v4.0. Red Hat, in collaboration with Supermicro, published outstanding MLPerf v4.0 Training results for fine-tuning of the large language model (LLM) llama-2-70b with LoRA.
LoRA (Low-Rank Adaptation of LLMs) is a cost-saving, parameter-efficient fine-tuning method that can save many hours of training time and reduce compute requirements. LoRA allows you to fine-tune a large model for your specific use case while updating only a small subset of parameters. Red Hat's llama2-70b with LoRA submission on Supermicro hardware demonstrates better performance, within 3.5% to 8.6% of submissions on similar hardware, while providing an improved developer, user and DevOps experience…read more
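To make the parameter savings concrete, here is a small back-of-the-envelope sketch in plain Python (the matrix dimensions and rank are illustrative, not the exact configuration of the llama-2-70b submission):

```python
# Instead of updating a full d x k weight matrix, LoRA learns two
# low-rank factors B (d x r) and A (r x k), with r << min(d, k).

def full_finetune_params(d: int, k: int) -> int:
    """Trainable parameters when updating the full weight matrix."""
    return d * k

def lora_params(d: int, k: int, r: int) -> int:
    """Trainable parameters for a rank-r LoRA adapter on the same matrix."""
    return r * (d + k)

# One hypothetical 8192 x 8192 projection matrix with rank 16:
full = full_finetune_params(8192, 8192)
lora = lora_params(8192, 8192, 16)
print(f"full: {full:,}  lora: {lora:,}  ratio: {full // lora}x")
```

Only the small B and A factors receive gradients, which is why optimizer state and training compute shrink so dramatically.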
Unleashing 100GbE network efficiency: SR-IOV in Red Hat OpenShift on OpenStack
July 25, 2024 – Pradipta Sahoo
Single Root I/O Virtualization (SR-IOV) is a technology that allows the isolation of PCI Express (PCIe) resources for better network performance. In this article, we explore a recent study by the Red Hat OpenStack Performance and Scale team, which demonstrated the capabilities of SR-IOV using 100GbE NVIDIA ConnectX-6 adapters within a Red Hat OpenShift on Red Hat OpenStack (ShiftonStack) setup…read more
Scaling Red Hat OpenStack Platform 17.1 to more than 1,000 virtual nodes
July 9, 2024 – Asma Suhani Syed Hameed, Rajesh Pulapakula
As Red Hat OpenStack Platform has evolved over the years to address a diverse range of customer needs, the demand for scalability has become increasingly important. Customers depend on Red Hat OpenStack Platform to deliver a resilient and adaptable cloud infrastructure, and as its usage expands, so does the need to deploy more extensive clusters.
Over the past years we have undertaken efforts to scale Red Hat OpenStack Platform 16.1 to more than 700 bare metal nodes. This year, the Red Hat Performance & Scale team has dedicated itself to pushing Red Hat OpenStack Platform's scalability to unprecedented heights. As demand for scaling Red Hat OpenStack Platform grew, we conducted an exercise to test the scalability of over 1,000 virtual computes. Testing at such large scales typically requires substantial hardware resources for bare metal setups. In this endeavor, we achieved a new milestone by successfully scaling to over 1,000 overcloud nodes on Red Hat OpenStack Platform 17.1…read more
Sharing is caring: How to make the most of your GPUs (part 1 – time-slicing)
July 2, 2024 – Carlos Camacho, Kevin Pouget, David Gray, Will McGrath
As artificial intelligence (AI) applications continue to advance, organizations often face a common predicament: a limited supply of powerful graphics processing unit (GPU) resources, coupled with increasing demand for their use. In this article, we explore various strategies for optimizing GPU utilization via oversubscription across workloads in Red Hat OpenShift AI clusters. OpenShift AI is an integrated MLOps platform for building, training, deploying and monitoring predictive and generative AI (GenAI) models at scale across hybrid cloud environments…read more
Scale testing image-based upgrades for single node OpenShift
June 28, 2024 – Alex Kros
Image-based upgrades (IBU) are a developer preview feature in Red Hat OpenShift Container Platform 4.15 that reduces the time required to upgrade a single node OpenShift cluster. The image-based upgrade can perform both Z and Y stream upgrades, include operator upgrades in the image, and roll back to the previous version manually or automatically upon failure. An image-based upgrade can also directly upgrade OpenShift Container Platform 4.y to 4.y+2, whereas a standard OpenShift upgrade requires two separate upgrades to achieve the same end result (4.y to 4.y+1 to 4.y+2)…read more
How to create and scale 6,000 virtual machines in 7 hours with Red Hat OpenShift Virtualization
June 25, 2024 – Boaz Ben Shabat
In the world of organizational infrastructure, sometimes there’s an urgent need to scale up rapidly. Organizations may have a limited amount of time to stand up new infrastructure, a problem which is compounded by the scale of the services in question.
In this learning path, we will explore a large-scale deployment scenario enabling 6,000 virtual machines (VMs) and 15,000 pods. This involves an external Red Hat® Ceph® Storage 12-node cluster and a 132-node Red Hat OpenShift® Virtualization cluster, integrated with the external Ceph Storage cluster…read more
Egress IP Scale Testing in OpenShift Container Platform
June 21, 2024 – Venkata Anil Kommaddi
This blog post explores how kube-burner-ocp, an opinionated wrapper built on top of kube-burner, can be used to simplify performance and scale testing, and how it can be leveraged to evaluate egress IP scalability in OpenShift's default CNI plugin, OVN-Kubernetes. We'll delve into the intricacies of the egress IP feature, its role in traffic management, and how kube-burner and kube-burner-ocp are helping us understand its behavior under load. This blog also serves as a classic example of how the Red Hat Performance and Scale team works with the development team to understand, test, characterize and improve features with a holistic approach, for the benefit of our customers who rely on OpenShift for their mission-critical workloads on-prem and in the cloud…read more
IPsec Performance on Red Hat Enterprise Linux 9: A Performance Analysis of AES-GCM
June 13, 2024 – Otto Sabart, Adam Okuliar
In today's digital landscape, securing data over insecure channels is more crucial than ever. Traditionally, this critical task has been handled by specialized hardware devices called concentrators, which come with substantial price tags. But what if you could achieve the same level of security and performance using readily available retail hardware? This article explores an exciting, cost-effective alternative: leveraging Red Hat Enterprise Linux 9 on modern, multicore CPUs. We'll dive into various configurations and encryption methods, and reveal how this approach can match the performance of high-end commercial devices. Astonishingly, it is possible to achieve 50 Gbps of IPsec AES-GCM traffic with a few security associations on commodity hardware…read more
Ensure a scalable and performant environment for ROSA with hosted control planes
May 30, 2024 – Russell Zaleski, Murali Krishnasamy, David Sanz Moreno, Mohit Sheth
Ensuring that OpenShift is performant and scalable is a core tenet of the OpenShift Performance and Scale team at Red Hat. Prior to its release (and still to this day), ROSA undergoes a vast array of performance and scale testing to ensure that it delivers industry-leading performance. These tests run the gamut from control plane and data path focus, to upgrades, to network performance. They have been used to help measure and improve the performance of "classic" ROSA, but what happens when we move to hosted control planes?…read more
Accelerating generative AI adoption: Red Hat OpenShift AI achieves impressive results in MLPerf inference benchmarks with vLLM runtime
April 24, 2024 – Mustafa Eyceoz, Michey Mehta, Diane Feddema, Ashish Kamra
Large language model (LLM) inference has emerged as a crucial technology in recent years, influencing how enterprises approach AI-driven solutions and driving new interest in integrating LLMs into enterprise applications. When deploying LLMs in production environments, performance becomes paramount, with throughput (measured in tokens generated per second) on a GPU serving as a key metric. In theory, a model with higher throughput can accommodate a larger user base on a given hardware infrastructure while meeting specific latency and accuracy requirements, which ultimately reduces the cost of model deployment for end users…read more
Red Hat Enterprise Linux Performance Results on 5th Gen Intel® Xeon® Scalable Processors
April 4, 2024 – Bill Gray, David Dumas, Douglas Shakshober, Michey Mehta
Intel recently introduced the 5th generation of Intel® Xeon® Scalable processors (Intel Xeon SP), code-named Emerald Rapids: a family of high-end, enterprise-focused processors targeted at a diverse range of workloads. To explore how Intel's new chips measure up, we've worked with Intel and others to run benchmarks with Red Hat Enterprise Linux 8.8 / 9.2 and later…read more
Optimizing Quay/Clair: Database profiling results
March 19, 2024 – Vishnu Challa
Welcome to the second part of our exploration. In this continuation of our previous article, we will delve deeper into the results of our database profiling efforts and discuss strategies for further optimizing overall application performance…read more
Optimizing Quay/Clair: Profiling, performance, and efficiency
March 19, 2024 – Vishnu Challa
Red Hat Quay (also offered as a service via Quay.io) is a cloud-based container registry service that allows users to store, manage, and distribute container images. It provides a platform for hosting, sharing, and securing container images across multiple environments, including on-premise data centers, public cloud platforms, and hybrid cloud deployments…read more
Save memory with OpenShift Virtualization using Free Page Reporting
March 13, 2024 – Robert Krawitz
OpenShift Virtualization, a feature of Red Hat OpenShift, allows running virtual machines (VMs) alongside containers on the same platform, simplifying management. It enables the use of VMs in containerized environments by running them the same way as any other pod, so that organizations with significant investments in virtualization, or that want the greater isolation VMs provide for legacy workloads, can use them in an orchestrated containerized environment…read more
Test Kubernetes performance and scale with kube-burner
March 4, 2024 – Sai Sindhur Malleni, Vishnu Challa, Raul Sevilla Canavate
Three years ago, we introduced kube-burner to the Kubernetes performance and scale communities. Since then, kube-burner has steadily continued its journey, adding a diverse range of features that help solve unique challenges in running and analyzing the results of performance and scale tests on Kubernetes and Red Hat OpenShift.
Over the past few years, several new features and usability improvements have been added to kube-burner. In this article, we will go beyond the basics, exploring some of the bells and whistles recently added to the tool and laying out our vision for the future…read more
5 ways we work to optimize Red Hat Satellite
March 4, 2024 – Imaanpreet Kaur, Jan Hutař, Pablo Mendez Hernandez
In the ever-evolving landscape of technology, the quest for optimal performance and scale is a constant challenge. With Red Hat Satellite versions 6.13 and 6.14, we have embarked on an exciting journey to push the boundaries and elevate our capabilities. In this article, we'll take you behind the scenes to explore how we work to improve performance and scale our operations…read more
Best practices for OpenShift Data Foundation disaster recovery resource planning
February 22, 2024 – Elvir Kuric
Red Hat OpenShift Data Foundation is a key storage component of Red Hat OpenShift. It offers unified block, file, and object storage capabilities to support a wide range of applications.
One of the exciting new features in OpenShift Data Foundation 4.14 is OpenShift Data Foundation Regional Disaster Recovery (RDR), which offers RDR capabilities for Rados Block Device (RBD) pools and Ceph File System (CephFS) pools (via volsync replication). With RBD images replicated between clusters, OpenShift Data Foundation RDR protects customers from catastrophic failures. With OpenShift Data Foundation RDR, we can…read more
DPDK latency in OpenShift – Part II
February 20, 2024 – Rafael Folco, Karl Rister, Andrew Theurer
In a previous article, we shared the results of DPDK latency tests conducted on a Single Node OpenShift (SNO) cluster. We were able to demonstrate that a packet can be transmitted and received back in as little as 3 µs, mostly under 7 µs and, in the worst case, 12 µs. These numbers represent the round-trip latencies in OpenShift for single-queue transmission of a 64-byte packet, forwarding packets using an Intel E810 dual-port adapter…read more
Correlating QPS rate with resource utilization in self-managed Red Hat OpenShift with Hosted Control Planes
January 23, 2024 – Guoqing Li
The general availability of hosted control planes (HCP) for self-managed Red Hat OpenShift Virtualization (KubeVirt) is an exciting milestone. However, the real test lies in system performance and scalability, both crucial factors that determine success. Understanding and pushing those limits is essential for making informed decisions. This article offers a comprehensive analysis and general sizing insights for consolidating existing bare metal resources using hosted control planes for self-managed OpenShift Virtualization. It delves into the resource usage patterns of hosted control planes, examining their relationship with the KubeAPIServer QPS rate. Through various experiments, we established a linear regression model between the KubeAPIServer QPS rate and CPU/memory/etcd storage utilization, providing valuable insights for efficient resource consolidation and node capacity planning…read more
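A linear model of that kind can be sketched in a few lines of ordinary least squares; the QPS and CPU numbers below are invented for illustration and are not measurements from the article:

```python
# Fit y = slope * x + intercept to (QPS, CPU) samples with plain OLS.

def fit_line(xs, ys):
    """Ordinary least squares for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical samples: (KubeAPIServer QPS, control plane CPU cores used)
qps = [50, 100, 150, 200, 250]
cpu = [1.2, 1.9, 2.6, 3.3, 4.0]
slope, intercept = fit_line(qps, cpu)
print(f"cores = {slope:.3f} * QPS + {intercept:.2f}")
```

With a fitted slope and intercept, capacity planning reduces to plugging an expected QPS into the line and adding headroom.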
Continuous performance and scale validation of the Red Hat OpenShift AI model-serving stack
January 17, 2024 – Kevin Pouget
The model serving stack included in OpenShift AI is generally available (GA) as of December 2023 (release 2.5), meaning that OpenShift AI is fully operational to deploy and serve inference models via the KServe API. You may have read my colleague David Gray's article about the performance of this model serving stack for large language models (LLMs). This article provides a different look at that same model serving stack. It discusses how we stress tested the model deployment and model serving controllers to confirm that they perform and scale well in single-model, multi-model and many-model environments…read more
Kube-burner: Fanning the flames of innovation in the CNCF Sandbox
January 16, 2024 – Sai Sindhur Malleni
We are thrilled to share some exciting news with you all: kube-burner, the powerful performance and scale testing tool for Kubernetes and Red Hat OpenShift, has officially achieved CNCF Sandbox status! In the evolving landscape of cloud-native technologies, the Cloud Native Computing Foundation (CNCF) serves as a hub for incubating and nurturing innovative projects. Kube-burner has become the first and only performance and scale testing tool to achieve this recognition, elevating the importance of performance and scale in the cloud native landscape…read more
Evaluating LLM inference performance on Red Hat OpenShift AI
January 16, 2024 – David Gray
The generative artificial intelligence (AI) landscape has undergone rapid evolution over the past year. As the capability of generative large language models (LLMs) grows, organizations increasingly seek to harness them to meet business needs. Because of the intense computational demands of running LLMs, deploying them on a performant and reliable platform is critical to making cost-effective use of the underlying hardware, especially GPUs.
This article introduces the methodology and results of performance testing the Llama-2 models deployed on the model serving stack included with Red Hat OpenShift AI. OpenShift AI is a flexible, scalable MLOps platform with tools to build, deploy and manage AI-enabled applications. Built using open source technologies, it provides trusted, operationally consistent capabilities for teams to experiment, serve models and deliver innovative apps…read more
Operating Tekton at scale: 10 lessons learned from Red Hat Trusted Application Pipeline
January 11, 2024 – Pradeep Surisetty, Gabe Montero, Pavel Macik, Ann Marie Fred, Jan Hutař
Red Hat Trusted Application Pipeline is built on top of Red Hat OpenShift Pipelines and its upstream project, Tekton. We use Tekton's Pipelines as Code, Tekton Chains and Tekton Results capabilities to provide a more scalable build environment with an enhanced security posture to power Red Hat's next-generation build systems.
This blog shares some generic lessons applicable to Trusted Application Pipeline, OpenShift Pipelines, and any large workload-distributing system on OpenShift or Kubernetes…read more
Behind the scenes: Introducing OpenShift Virtualization Performance and Scale
January 9, 2024 – Jenifer Abrams
Red Hat OpenShift Virtualization helps remove workload boundaries by unifying virtual machine (VM) deployment and management alongside containerized applications in a cloud-native way. As part of the larger Performance and Scale team, we have been deeply involved in the measurement and analysis of VMs running on OpenShift since the early days of the KubeVirt open source project, and have helped drive product maturity through new feature evaluation, workload tuning and scale testing. This article dives into several of our focus areas and shares additional insights into running and tuning VM workloads on OpenShift…read more
KrknChaos is joining the CNCF Sandbox
January 9, 2024 – Naga Ravi Chaitanya Elluri, Brian Riordan, Pradeep Surisetty
We are excited to announce that krknChaos, a chaos engineering tool for Kubernetes focused on improving resilience and performance, has been accepted as a Sandbox project by the Cloud Native Computing Foundation (CNCF). Additional details can be found in the proposal. We would like to thank the TAG App Delivery team (to name a few, Josh Gavant and Karena Angell and team) and the CNCF Technical Oversight Committee (TOC) for their valuable guidance and support throughout the process, and of course the team and community for their valuable contributions, which are key to making this happen…read more
Supercharging chaos testing using AI
January 8, 2024 – Naga Ravi Chaitanya Elluri, Mudit Verma, Sandeep Hans
There has been a huge increase in demand for running complex applications with tens to hundreds of microservices at massive scale. End users expect 24/7 availability of the services they depend on, so even a few minutes of downtime matters. A proactive chaos engineer helps meet user expectations by identifying bottlenecks and hardening services before downtime occurs in a production environment. Chaos engineering is essential to avoid losing your end users' trust…read more
Quantifying performance of Red Hat OpenShift for Machine Learning (ML) Training on Supermicro A+ Servers with MLPerf Training v3.1
November 23, 2023 – Diane Feddema
We are proud to announce the first MLPerf Training submission on Red Hat OpenShift, which is also the first MLPerf Training submission on a variant of Kubernetes. Red Hat collaborated with Supermicro on this submission and ran the benchmarks on a Supermicro GPU A+ Server with 8x NVIDIA H100 GPUs. Red Hat OpenShift helps you easily run, reproduce and monitor AI/ML workloads, while adding minimal overhead to your training jobs. In this blog we provide the performance numbers from our recent submission to MLPerf v3.1 Training…read more
OpenShift Cluster Manager API: Load-testing, breaking, and improving it
October 26, 2023 – Vicente Zepeda Mas
Red Hat OpenShift Cluster Manager (OCM) is a managed service where you can install, modify, operate, and upgrade your Red Hat OpenShift clusters. Because OCM is a cornerstone of Red Hat's hybrid cloud strategy, we strive to ensure that it is scalable enough to handle peak traffic and to find bottlenecks that can be fixed to offer a pleasant experience for our customers. The Performance & Scale team discovered a performance problem that affected a core component of the API. In this blog post, we discuss how we identified the problem, how we worked as a cross-functional team to identify and fix it, and the measures we implemented to prevent similar incidents from happening in the future…read more
Data Plane Development Kit (DPDK) latency in Red Hat OpenShift – Part I
October 11, 2023 – Rafael Folco, Karl Rister, Andrew Theurer
In this article, we present the results of DPDK latency tests conducted on a single node OpenShift (SNO) cluster. The tests were performed using the MoonGen traffic generator, which uses hardware timestamping support to measure packet latencies as they cross the network adapters. The results of these tests provide insight into the performance of DPDK in a real-world environment and offer guidance for network architects and administrators seeking to optimize network latency…read more
Running 2,500 pods per node on OCP 4.13
August 22, 2023 – Andrew Collins
There is a default limit of 250 pods per node (PPN). Customers who exceed 250 ask whether they can scale beyond the published maximum of 500. They ask, "How can we better utilize the capacity of our large bare metal machines?"…read more
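On OpenShift, the pods-per-node limit is raised with a KubeletConfig custom resource targeted at a machine config pool; a minimal sketch (the pool label here is illustrative):

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: set-max-pods
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: large-pods   # label your worker MCP accordingly
  kubeletConfig:
    maxPods: 2500
```

Note that when `podsPerCore` is also set, the lower of the two limits wins, so dense bare metal nodes need both values sized consistently.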
Bulk API in Automation Controller
August 9, 2023 – Nikhil Jain
Automation controller has a rich RESTful API. REST stands for Representational State Transfer and is sometimes spelled "ReST". It relies on a stateless, client-server, cacheable communications protocol, typically HTTP. REST APIs provide access to resources (data entities) via URI paths. You can browse the automation controller REST API in a web browser at: https://<server name>/api/…read more
Red Hat Enterprise Linux achieves significant performance gains with Intel's 4th Generation Xeon Scalable Processors
April 20, 2023 – Michey Mehta, Bill Gray, David Dumas, Douglas Shakshober
Intel recently introduced the 4th generation of Intel® Xeon® Scalable processors, a family of high-end, enterprise-focused processors targeted at a diverse range of workloads. To explore how Intel's new chips measure up, we've worked with Intel and others to run benchmarks with Red Hat Enterprise Linux 8.4, 8.5, 8.6, 9.0 and 9.1, as well as CentOS Stream 9.2 (which will become Red Hat Enterprise Linux 9.2)…read more
OpenShift/Kubernetes Chaos Stories
March 15, 2023 – Naga Ravi Chaitanya Elluri
With the increase in adoption of and reliance on digital technology and microservices architecture, the uptime of an application has never been more important. Downtime of even a few minutes can lead to massive revenue loss and, most importantly, lost trust. This is exactly why we proactively focus on identifying bottlenecks and improving the resilience and performance of OpenShift under chaotic conditions…read more
Enhancing/Maximizing your scaling capability with Automation Controller 2.3
March 13, 2023 – Nikhil Jain
Red Hat Ansible Automation Platform 2 is the next-generation automation platform from Red Hat's trusted enterprise technology experts. We are excited to announce that the Ansible Automation Platform 2.3 release includes automation controller 4.3…read more
Red Hat's new benchmark results on AMD EPYC 4th Gen (Genoa) processors
January 6, 2023 – Red Hat Performance Team
Red Hat has continued to work with our partners to enable world-class performance. Recently, AMD released its 4th Gen EPYC "Genoa" data center CPU, known as the AMD EPYC 9004 Series. Built on a 5nm process, AMD increased the core count to 96 cores and 192 threads per socket, with 384 MB of L3 cache…read more
A Guide to Scaling OpenShift Data Science to Hundreds of Users and Notebooks
December 13, 2022 – Kevin Pouget
Red Hat OpenShift Data Science provides a fully managed cloud service environment for data scientists and developers of intelligent applications. It offers a fully supported environment in which to rapidly develop, train, and test machine learning (ML) models before deploying them in production…read more
Run Windows workloads on OpenShift Container Platform
November 30, 2022 – Krishna Harsha Voora, Venkata Anil Kommaddi, Sai Sindhur Malleni
OpenShift helps bring the power of cloud-native technology and containerization to your applications, no matter what underlying operating systems they rely on. For use cases that require both Linux and Windows workloads, Red Hat OpenShift allows you to deploy Windows workloads running on Windows Server while also supporting traditional Linux workloads…read more
A Guide to Functional and Performance Testing of the NVIDIA DGX A100
June 23, 2022 – Kevin Pouget
This blog post, part of a series on the DGX-A100 OpenShift launch, presents the functional and performance assessment we performed to validate the behavior of the DGX™ A100 system, including its eight NVIDIA A100 GPUs. This study was performed on OpenShift 4.9 with the GPU computing stack deployed by NVIDIA GPU Operator v1.9…read more
Scaling Automation Controller for API Driven Workloads
June 20, 2022 – Elijah Delee
When scaling automation controller in an enterprise organization, administrators are confronted with more clients automating their interactions with its REST API. As with any web application, automation controller has a finite capacity to serve web requests, and web clients can…read more
Performance Improvements in Automation Controller 4.1
February 28, 2022 – Nikhil Jain
With the release of Ansible Automation Platform 2.1, customers now have access to the latest control plane, automation controller 4.1. Automation controller 4.1 provides significant performance improvements compared to its predecessor, Ansible Tower 3.8. To put this into context, we used Ansible Tower 3.8 to run jobs and capture various metrics…read more
The Curious Case of the CPU Eating Gunicorn
June 2, 2022 – Gonza Rafuls
We decided to take a first hands-on approach following the future QUADS roadmap and re-architect our legacy landing/requests portal application, previously a trusty LAMP stack, into a fully rewritten Flask / SQLAlchemy / Gunicorn / Nginx next-gen platform…read more
Entitlement-Free Deployment of the NVIDIA GPU Operator on OpenShift
December 14, 2021 – Kevin Pouget
Version 1.9.0 of the GPU Operator has just landed in the OpenShift OperatorHub, with many different updates. We're proud to announce that this version comes with support for entitlement-free deployment of the NVIDIA GPU Driver…read more
Red Hat collaborates with NVIDIA to ship record-breaking STAC-A2 Market Risk benchmark
November 9, 2021 – Sebastian Jug
We are happy to announce record-breaking performance with NVIDIA in the STAC-A2 benchmark, affirming Red Hat OpenShift's ability to run compute-heavy, high-performance workloads. The Securities Technology Analysis Center (STAC®) facilitates a large group of financial firms and technology vendors…read more
Red Hat Satellite 6.9 with Puma Web Server
September 15, 2021 – Imaanpreet Kaur
Until Red Hat Satellite 6.8, the Passenger web/app server was a core component of Red Hat Satellite. Satellite used Passenger to run Ruby applications such as Foreman. Satellite 6.9 no longer uses the Passenger web server. The Foreman application (main UI and API server) was ported to use the Puma project…read more
Using NVIDIA A100's Multi-Instance GPU to Run Multiple Workloads in Parallel on a Single GPU
August 26, 2021 – Kevin Pouget
The new Multi-Instance GPU (MIG) feature lets GPUs based on the NVIDIA Ampere architecture run multiple GPU-accelerated CUDA applications in parallel in a fully isolated manner. The compute units of the GPU, as well as its memory, can be partitioned into multiple MIG instances…read more
Multi-Instance GPU Support with the GPU Operator v1.7.0
June 15, 2021 – Kevin Pouget
Version 1.7.0 of the GPU Operator has just landed in the OpenShift OperatorHub, with many different updates. We are proud to announce that this version comes with support for the NVIDIA Multi-Instance GPU (MIG) feature for the A100 and A30 Ampere cards…read more
Making Chaos Part of Kubernetes/OpenShift Performance and Scalability Tests
March 17, 2021 – Naga Ravi Chaitanya Elluri
While we know how important performance and scale are, how can we engineer for them when chaos becomes commonplace in complex systems? What role does chaos/resiliency testing play during performance and scalability evaluation? Let's take a look at the approach we need to embrace to mimic a real-world production environment, to find the bottlenecks and fix them before they impact users and customers…read more
Demonstrating Performance Capabilities of Red Hat OpenShift for Running Scientific HPC Workloads
November 11, 2020 – David Gray and Kevin Pouget
This blog post is a follow-up to the previous blog post on running GROMACS on Red Hat OpenShift Container Platform (OCP) using the Lustre filesystem. In this post, we will show how we ran two scientific HPC workloads on a 38-node OpenShift cluster using CephFS with OpenShift Container Storage in external mode…read more
A Complete Guide for Running Specfem Scientific HPC Workload on Red Hat OpenShift
November 11, 2020 – Kevin Pouget
Specfem3D_Globe is a scientific high-performance computing (HPC) code that simulates seismic wave propagation at a global or regional scale (site and repository). It relies on a 3D crustal model and takes into account parameters such as Earth density, topography/bathymetry, rotation, oceans, and self-gravitation…read more
Running HPC workloads with Red Hat OpenShift Using MPI and Lustre Filesystem
October 29, 2020 – David Gray
The requirements associated with data science and AI/ML applications have pushed organizations toward using highly parallel and scalable hardware that often resembles high performance computing (HPC) infrastructure. HPC has been around for a while and has evolved to include extremely large supercomputers that run massively parallel tasks and operate at exascale (able to perform a quintillion operations per second)…read more
Introduction to Kraken, a Chaos Tool for OpenShift/Kubernetes
October 8, 2020 – Yashashree Suresh and Paige Rubendall
Chaos engineering helps boost confidence in a system's resilience by "breaking things on purpose." While it may seem counterintuitive, it is important to deliberately inject failures into a complex system like OpenShift/Kubernetes and verify whether the system recovers gracefully…read more