Overview and Training Materials

Technical Documentation

Technical Diagrams

Pattern: Data to Compute

In many existing enterprise and research deployments, HPC clusters are separated from long term data storage technologies. When data needs to be moved to HPC and back again, the Data to Compute design pattern can leverage metadata-driven workflows and automate the execution of an organization's data management policy.
Pattern: Compute to Data

When data is stored in specific locations due to a requirement for specialized software or hardware or only because it is too big and expensive to move, compute requests can be routed to the appropriate location automatically. This metadata-driven computation design pattern could serve as a bridge until the time services are more fully containerized.
Capability: Automated Ingest - Filesystem Scanner

When an organization first discovers iRODS, it is usually true that the organization already has a lot of data in disparate storage systems. The automated ingest framework is based on RQ (Redis Queue) and can scale workers to bring large filesystems under management quickly.
Capability: Automated Ingest - Landing Zone

iRODS is often deployed to capture products from systems that generate new files in a regular way (sequencers, telescopes, microscopes, sensor networks, etc.). The automated ingest framework can be configured to watch locations for new files and extract metadata, define manifests, and otherwise prepare them for use by the rest of the system.
Capability: Storage Tiering

The storage tiering framework provides efficient policy-driven storage utilization by automatically moving data between any number of identified tiers of storage within a configured tiering group. To define a storage tiering group, selected storage resources are labeled with metadata which define their place in the group and how long data should reside in that tier before being migrated to the next tier.
Core: Integration Layer

iRODS provides a layer of abstraction which integrates with your pre-existing infrastructure. This flexibility allows your infrastructure to continue to change over time.

Data Lifecycle

As data matures and reaches a broader community, data management policy must also evolve to meet these additional requirements. iRODS virtualizes the stages of the data lifecycle through policy evolution.
From Prototype to Production

Deploying iRODS requires making decisions about how quickly and how deeply to integrate with existing systems. The flexibility of iRODS allows for a dynamic approach that supports building confidence and trust in the software. Tighter integration and automation can lead to better performance and stronger assertions about what has happened to your data throughout its lifecycle.
Metadata Templates

With metadata being a central part of how iRODS fosters best practices in workflows and provenance, it is also important to encourage good metadata curation. Metadata templates afford iRODS a friendly UI for specifying requirements, validation, and standardization.
Multipart Data Objects: Transfer

An effort to improve reliability and predictability of large file transfers in iRODS, multipart data objects will improve transport speeds (parallel and/or multisource), allow for cache-free object storage plugins, and provide natural support for reliable restarts.

User Group Meetings

Blog Posts

Communication with the iRODS Community

Common Citations (iRODS 4.x)

Hao Xu, Ben Keller, Antoine de Torcy, Jason Coposky (2016) QueryArrow: Bidirectional Integration of Multiple Metadata Sources. 8th iRODS User Group Meeting, University of North Carolina at Chapel Hill. June 2016. (PDF)

Reagan W. Moore, Hao Xu, Mike Conway, Arcot Rajasekar, Jon Crabtree, Helen Tibbo (2016) Trustworthy Policies for Distributed Repositories. 133pp. (publisher)

Hao Xu, Jason Coposky, Ben Keller, Terrell Russell (2015) Pluggable Rule Engine Architecture. 7th iRODS User Group Meeting, University of North Carolina at Chapel Hill. June 2015. (PDF)

Hao Xu, Jason Coposky, Dan Bedard, Jewel H. Ward, Terrell Russell, Arcot Rajasekar, Reagan Moore, Ben Keller, Zoey Greer (2015) A Method for the Systematic Generation of Audit Logs in a Digital Preservation Environment and Its Experimental Implementation In a Production Ready System. 12th International Conference on Digital Preservation, University of North Carolina at Chapel Hill. November 2-6, 2015. (PDF) (direct link)

Terrell Russell, Jason Coposky, Harry Johnson, Ray Idaszak, Charles Schmitt (2013) iRODS Composable Resources. 5th iRODS User Group Meeting, University of North Carolina at Chapel Hill. June 2013. (PDF)

Reagan Moore, Arcot Rajasekar, Hao Xu (2015) DataNet Federation Consortium Preservation Policy Toolkit. 12th International Conference on Digital Preservation, University of North Carolina at Chapel Hill. November 2-6, 2015. (PDF) (direct link)

Arcot Rajasekar, Terrell Russell, Jason Coposky, Antoine de Torcy, Hao Xu, Michael Wan, Reagan W. Moore, Wayne Schroeder, Sheau-Yen Chen, Mike Conway, Jewel H. Ward (2015) The integrated Rule-Oriented Data System (iRODS 4.0) Microservice Workbook. 248pp. (PDF) (amazon)


Papers and White Papers

Media & more ...