Release Notes 2.2.77
Upgrades to DDN Infinia 2.2.77
DDN Infinia 2.2.77 supports both online and offline upgrade methods:
- Online upgrades — The cluster remains operational throughout the process.
- Offline upgrades — The cluster is taken offline for the duration of the upgrade.
The table below outlines the supported upgrade paths to version 2.2.77:
For detailed upgrade instructions (both online and offline), see Upgrade DDN Infinia.
New Features
New Features in DDN Infinia 2.2.77
None.
New Features in DDN Infinia 2.2
The following are the new features in DDN Infinia 2.2:
- Data Services Features
- Supports native “C” SDK.
- Supports object and bucket level tagging and versioning.
- PVC resizing and quota management for Kubernetes workloads.
- Enterprise Features
- Supports embedded DNS server with integrated health checks and configurable failover settings.
- Expanded support for block metrics.
- Documentation support for troubleshooting issues with DDN Infinia.
- API and SDK related documentation for developers to integrate and automate workflows.
- Improved stability and reliability.
Preview Features in DDN Infinia 2.2
Features listed in this section are available for preview only. To run any of these features, contact your DDN sales representative or field support engineer. These preview features are not supported for production use, and hot fixes will not be provided for any issues found.
The following are the preview features in DDN Infinia 2.2:
- Data Services (Data Access)
- CSI Data Service (block)
- Native Hadoop Data Service
Security Enhancements and Hardening
DDN Infinia 2.2 continues DDN’s commitment to security-first engineering by prioritizing internal security review of new components and reinforcing baseline protections through targeted scanning and hardening. This release introduces Tenable Vulnerability Management for OS-level CVE detection and CIS benchmark reporting, complementing continued use of OpenSCAP for configuration assessment. These efforts are guided by industry-recognized frameworks, including NIST SP 800-209 and CIS Level 1 recommendations.
The release of DDN Infinia 2.2 introduces a series of robust security enhancements as follows:
-
OS-Level Vulnerability Assessment Using Tenable Vulnerability Management
DDN Infinia 2.2 introduces Tenable Vulnerability Management for scanning OS-level vulnerabilities on release-targeted system images. The most recent scans reported no critical or high-severity CVEs, confirming a clean baseline at release. These assessments are conducted regularly to guide patching priorities and support secure platform maintenance.
-
CIS Benchmark-Based Hardening
DDN Infinia 2.2 expands its benchmark-driven hardening efforts by incorporating Tenable Vulnerability Management alongside OpenSCAP to generate CIS Level 1 reports and guide configuration tuning. The use of both tools improves coverage and cross-validation of system posture. Configuration adjustments are tested for compatibility and stability across deployment environments.
Fixed Issues
The following sections list the issues fixed in each release of DDN Infinia.
Fixed in DDN Infinia 2.2.77
- Resolved a performance regression issue observed with small objects.
Fixed in DDN Infinia 2.2
- Upgrades no longer require a manual step to update the reverse proxy configuration on each node to expose the metrics telemetry data.
Fixed in DDN Infinia 2.1
- Incorrect quotas were applied to buckets created through the CLI or API.
- The reds3 client could not initialize with a large number of subtenants.
Fixed in DDN Infinia 2.0
- The RED API container sometimes crashed due to congestion or network instability, causing all ongoing redcli commands to become unresponsive. This particularly affected long-running tasks, such as redcli logs upload operations.
- redcli commands sometimes stopped responding because log rotation was not working correctly.
- redcli realm configuration updates sometimes failed with a 504 Gateway Timeout error.
Fixed in DDN Infinia 1.3
- Creating and deleting a bucket or dataset with the same name in a cycle sometimes caused errors.
- Large multipart upload runs sometimes failed, either because data checks reported unexpected values or because of ENOSPC errors. In some cases, the S3 client crashed and restarted.
- Performance improvements and enhancements.
- Scalability improvements.
Known Issues and Limitations
The following are known issues and limitations of DDN Infinia 2.2.77.
Stretch Cluster
-
Issue: In a stretch cluster configuration, if the network connection between the sites goes down, one site will continue to serve I/O operations. However, the CLI on the other site will remain unresponsive until the connection is restored.
-
Issue: In a stretch cluster configuration, if the link between sites goes down but both sites can communicate with the witness node, the cluster may still become unavailable.
-
Issue: Downloads may fail or take longer than expected after taking a site down.
-
Issue: When using a stretch cluster, it’s important that the correct data protection profile is applied to new buckets. To do this, create buckets using redcli dataset create as shown in this example:
-
Issue: Cluster recovery after site failure: To complete a cluster recovery after failover or failback, a realm configuration update is required to ensure that etcd is correctly relocated. If the user does not already have a realm configuration file, it should first be generated using redcli realm config generate. This will create a file named realm_config.yaml, which can then be used to perform the update using redcli realm config update -f realm_config.yaml.
-
Issue: In some multi-site configurations, the etcd node may not fail over or redistribute to the third site, even when it is technically possible. This behavior is observed when a site is taken down and then brought back up—etcd node distribution remains unbalanced.
Workaround: Manual intervention is required to restore balance. This can be done by reapplying the realm configuration or restarting one of the nodes on the site that hosts more than one etcd node. If this does not result in an even distribution across sites, restart the other node running etcd on that site.
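The realm configuration update described above for failover or failback recovery can be sketched as a small shell function. It uses only the two redcli commands named in these notes. The function name, the REALM_CONFIG variable, and the assumption that redcli realm config generate writes realm_config.yaml to the current directory are illustrative.

```shell
# Sketch: reapply the realm configuration after a failover or failback.
# Assumption: "redcli realm config generate" writes realm_config.yaml
# into the current working directory.
REALM_CONFIG="realm_config.yaml"

reapply_realm_config() {
    # Generate the configuration file only if one does not already exist.
    if [ ! -f "$REALM_CONFIG" ]; then
        redcli realm config generate || return 1
    fi
    # Apply the configuration so that etcd is relocated correctly.
    redcli realm config update -f "$REALM_CONFIG"
}
```

Calling reapply_realm_config from the directory that holds (or should hold) the configuration file runs the generate step at most once and then performs the update.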
Other
-
Issue: Data services using the Container Storage Interface (CSI) don’t function.
Workaround: Don’t upgrade to DDN Infinia 2.2.77.
-
Issue: The REDMON * Trend dashboard doesn’t show any data.
Workaround: Contact DDN Support for a corrected set of Grafana files.
-
Issue: During an online upgrade, the instance.log file may show a “segmentation fault” message.
Workaround: The node should recover on its own. If it doesn’t, perform the upgrade again or perform an offline upgrade.
-
Issue: During an online upgrade, if you run redcli task show on the upgrade task, you may see that the upgrade has failed in the STAGE_UPGRADE_CORE_SERVICES stage. In addition, the instance.log file may show a critical internal error, “assertion failed.”
Workaround: Try the upgrade again.
-
Issue: During upgrades, if you are using Infinia with the integrated DNS round-robin load balancing, there is a slight delay between the reds3 process terminating and the node’s IP being removed from DNS. During this time, you may experience intermittent connection errors such as “Connection reset by peer” or “Connection refused” as your application sends requests to the restarting reds3 servers.
Workaround: Most of these errors should be masked when using S3 recommended retry settings. For mission-critical applications, a traditional load balancer can be used to ensure requests are not routed to restarting servers.
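These notes do not list the recommended retry values. As an illustration only, and assuming the S3 client is the AWS CLI or an AWS SDK that honors the standard retry environment variables, client-side retries could be enabled as follows. The attempt count shown is an assumed value, not a DDN recommendation.

```shell
# Illustrative client-side retry settings for the AWS CLI and AWS SDKs.
# The values below are assumptions, not DDN recommendations.
export AWS_RETRY_MODE=standard   # retry with exponential backoff
export AWS_MAX_ATTEMPTS=10       # total attempts, including the first call
```

Other S3 clients expose equivalent settings under different names, so check the client documentation before relying on these variables.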
-
Issue: When performing an NVMe drive firmware update, a rare HMI poller crash may cause CATs to remain evicted and drives to have a status of missing or failed even after the firmware update completes successfully.
Workaround: The final status of the firmware update should be FIRMWARE_UPDATE_STATUS_NO_UPDATE_TASK. Wait at least ten minutes after the firmware update is complete to see if any of the following symptoms remain:
- CATs show as Evicted
- Drives show old firmware
- Drives show as missing or failed
To resolve the problem, perform the following steps:
-
Issue: In certain scenarios, nameserver monitoring may be disabled following an upgrade. You can verify this by executing redcli nameserver status, which will report DNS monitoring as disabled if the issue is present.
Workaround: To restore monitoring, execute redcli nameserver monitoring enable.
-
Issue: Systems using Broadcom cards on Ubuntu 24.04 could encounter continuous reboots during cluster creation if they aren’t using the Hardware Enablement kernel.
Workaround: Install the Hardware Enablement kernel using the following commands:
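As a sketch only: on stock Ubuntu 24.04, the Hardware Enablement kernel is provided by the linux-generic-hwe-24.04 metapackage, so the installation might look like the following. The function name is illustrative, and the exact commands DDN recommends may differ.

```shell
# Assumption: the standard Ubuntu 24.04 HWE metapackage is what is needed;
# the commands recommended by DDN may differ.
install_hwe_kernel() {
    # Refresh package lists, then install the HWE kernel metapackage.
    sudo apt-get update &&
    sudo apt-get install -y --install-recommends linux-generic-hwe-24.04
}
```

Run install_hwe_kernel on each affected node, reboot, and confirm the running kernel with uname -r before creating the cluster.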
-
Issue: CSI data service quota not honored: When creating a new block data service using redcli service create, the initial quota for the data service may not be properly set.
Workaround: The quota must be manually updated using redcli service update with the -b argument to specify the desired quota.
-
Issue: Executing redcli cat delete may not delete an existing CAT: Under certain conditions, attempting to delete an existing CAT from the cluster using the redcli cat delete command may fail silently. The CAT is not removed, and no error is returned.
Workaround: Delete the CAT from the inventory using redcli inventory delete and then re-run redcli cat delete to complete the removal.
-
Issue: Unable to authenticate via redcli as an AD user after adding an LDAP identity provider: LDAP or AD federated users are authenticated by binding to their Distinguished Name (DN) attribute. If the userAttr field in the identity provider (IDP) configuration is set to a value other than cn, the DN constructed for the user does not match, causing authentication to fail.
Workaround: Set the userAttr field in the LDAP IDP configuration to cn. This enables DDN Infinia 2.2.77 to correctly construct the DN for user binding. After this change, use the cn value with the following commands:
Note that this is a temporary workaround. The DDN Infinia team is working on a permanent fix for this issue.
-
Issue: DDN Infinia may report INVALID readings for certain sensors: Under specific conditions, such as when the queried hardware is absent or the BMC lacks monitoring capabilities, the redcli server sensor-list command may return INVALID readings. In such cases, ipmitool will return ‘na’.
Workaround: This issue has been reported to the hardware manufacturer. Currently, there is no workaround available.
-
Issue: If a cluster operates with non-uniform fault domains (for example, racks with different node counts) over an extended period, experiences drive failures, and eventually fills up, it may enter a state where some drives become full before others. This can result in the cluster reporting a full state even though overall capacity is still available.
Workaround: To prevent this condition, ensure fault domains are evenly balanced and failed drives are promptly replaced.
-
Issue: After executing the redcli realm stop command, the redcli realm start command fails.
Workaround: Stopping the cluster should be done with caution as stopping data services or cluster components could result in data unavailability. If you need to stop the entire cluster for maintenance reasons, use redcli cluster stop instead. If you have already used redcli realm stop, contact DDN Support for help restarting the cluster.
-
Issue: The reds3 instance restarts intermittently due to a panic on bucket deletion.
-
Issue: Active Directory (AD)/LDAP authentication fails when the user’s relative distinguished name (RDN) contains space characters.
-
Issue: Updating S3 SSL certificates via redcli certificate install or the equivalent API call(s) causes a momentary outage of S3 and ancillary services in DDN Infinia.
-
Issue: Attempting to update S3 SSL certificates via redcli certificate install or the equivalent API call(s) may fail if DDN Infinia management network interfaces are offline on any nodes in the realm.
-
Issue: The following are limitations for forwarding logs via OpenTelemetry protocol:
- DDN Infinia 2.2.77 supports the OpenTelemetry HTTP protocol otlphttp only.
- Only a single logs backend is currently supported. If the command is run a second time, it will overwrite the configuration and use only the new logs exporter endpoint configuration.
- The configuration does not currently persist as part of the realm. Therefore, if nodes are added to the cluster or a node fails and is replaced, the command must be run again to regenerate the configuration and restart the telemetry agent collector on each node in the cluster.
-
Issue: Multiple group membership in Active Directory is not supported.
-
Issue: Groups mapped to nested groups in Active Directory are not supported.
-
Issue: Only a single default system pool is supported.
-
Issue: Dynamic cluster expansion is limited. At present, the number of nodes that can be added to a cluster at one time is limited to half of the cluster size at the time of expansion.
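To plan a larger expansion under this limit, the number of expansion rounds can be estimated with a short calculation. The sketch below assumes that half of an odd cluster size is rounded down, which these notes do not state; the function name is illustrative.

```shell
# Count the expansion rounds needed to grow a cluster from $1 nodes to
# $2 nodes when each round may add at most half of the current node count.
# Assumption: half of an odd node count is rounded down.
expansion_rounds() {
    current=$1
    target=$2
    rounds=0
    while [ "$current" -lt "$target" ]; do
        step=$((current / 2))
        # A cluster of fewer than two nodes could never grow under this rule.
        [ "$step" -ge 1 ] || return 1
        remaining=$((target - current))
        # Never add more nodes than are still needed.
        [ "$step" -le "$remaining" ] || step=$remaining
        current=$((current + step))
        rounds=$((rounds + 1))
    done
    echo "$rounds"
}
```

For example, expansion_rounds 8 20 prints 3: the cluster grows from 8 to 12 nodes, then to 18, then to 20.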
-
Issue: DDN Infinia doesn’t support multiple network interfaces in the same subnet. Deployment will not be blocked if multiple network interfaces per subnet are identified, but the system will be unstable. A sysadmin event is triggered (“Should not have more than one interface per virtual network in v1. Networking could be unstable”). The system will function if the network interfaces are configured in separate subnets.
-
Issue: Creating, deleting, and recreating a tenant or subtenant with the same name in a cycle can sometimes cause errors when creating new buckets under the same subtenant.
-
Issue: Multiple drive eviction and reinsertion cycles are not fully supported. DDN Infinia might get into a blocking state, requiring a reboot. To avoid this situation, insert devices one by one. Add a device and wait until it reaches the joined state. Once the device is joined, proceed to add the next device.
-
Issue: After a node failure, a cluster may experience transient timeouts of up to 90 seconds, during which I/O operations may fail. You can retry the failed operation.
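One way to retry automatically is a small wrapper that repeats a command until it succeeds or a time budget runs out. This is a minimal sketch: the function name and its arguments are illustrative, and the budget should be chosen to exceed the 90-second window mentioned above.

```shell
# Retry a command until it succeeds or a time budget is spent.
# Usage: retry_for <budget_seconds> <pause_seconds> <command> [args...]
retry_for() {
    budget=$1
    pause=$2
    shift 2
    start=$(date +%s)
    while :; do
        # Run the command; stop as soon as it succeeds.
        "$@" && return 0
        now=$(date +%s)
        # Give up once the budget has elapsed.
        [ $((now - start)) -lt "$budget" ] || return 1
        sleep "$pause"
    done
}
```

For example, retry_for 120 5 followed by the client command retries that command every 5 seconds for up to about 2 minutes.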
-
Issue: DDN Infinia 2.2.77 allows only one log upload job at a time. If you run concurrent log upload jobs, then redapi will display an error “CallHome: Log upload job is already running”.
Workaround: Check log upload status using the redcli task show <task_id> command. Do not start a log upload until a previous one has finished.
-
Issue: Following a complete node failure, when the failing node rejoins, S3 workloads may terminate.
Workaround: As a realm admin, run redcli realm restart -s reds3 on the node.
-
Issue: DDN Infinia instance-to-instance communication supports both RoCE and TCP/IP. Application connections to S3 services and NVMe-oF support only TCP/IP.
-
Issue: Listing many buckets (for example, hundreds of thousands of buckets) might block the application requesting the listing.
-
Issue: CAT usage shown in the following two CLI commands is different.
- redcli tenant list: This command shows the actual data stored for each tenant, including upsert metadata + bulk (erasure + replicated) data.
- redcli cat list: This command shows the actual data stored by each CAT, including all the data for tenants + CAT internal data stored by various subsystems (for example, CAT, RFS, DLM, and so on).
On any cluster, typically the total CAT usage will be higher than any single tenant usage.

