iRODS Development Update: January 2015

This is the first installment of a series of blog posts detailing the ongoing efforts of the iRODS Consortium development team.  We intend to highlight new directions in the design and specifics in the implementation as well as encourage further discussion within the iRODS community. There is a comments section below if you have questions or opinions about anything iRODS.

iRODS Configuration Moving to a JSON Format


The next several posts will cover the move to JSON as the format for iRODS configuration.  We decided to unify the syntax of all iRODS configuration files rather than maintain several different parser implementations, each with its own personality and quirks.

JSON was chosen for a number of reasons:

  • We can maintain an independently versioned history of schemas
  • The configuration files can be validated for correctness
  • There is consistent, reliable parsing
  • JSON is web friendly
  • JSON has wide language support
  • JSON will be easier to automatically upgrade with new additions

We currently have the v1 release of the iRODS schemas, which has been included for use within the independently released Zone Report tool for iRODS 4.0.3.  iRODS 4.1 will leverage a new v2 release for the 4.1 Zone Report and for all configuration files used by iRODS. These include the client environment (irods_environment.json), the server configuration (server_config.json), the irods_hosts.json file, and the host_access_control.json file.

Backward compatibility for the client environment and the server configuration will be maintained in the upcoming 4.1 release.  On startup, iRODS will first look for the JSON version of the configuration file.  If the JSON configuration file is not available, the server will attempt to load and translate the legacy version of the file into the new in-memory data structure.  Support for legacy configuration will be deprecated and eventually removed.
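
The lookup order described above can be sketched roughly as follows. This is a minimal Python illustration, not the server's actual implementation: the function names load_environment and parse_legacy are hypothetical, the legacy file is assumed to hold one whitespace-separated key/value pair per line, and the rename table contains only the two renames mentioned in this post.

```python
import json
import os

# Hypothetical rename table; only the two pairs named in this post are included.
LEGACY_TO_NEW = {
    "irodsHost": "irods_host",
    "irodsAuthFileName": "irods_authentication_file",
}


def parse_legacy(text):
    """Parse legacy text, ASSUMED to be one "key value" pair per line."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip lines that do not look like "key value"
        key, value = parts
        # Unknown keys keep their old name in this sketch.
        config[LEGACY_TO_NEW.get(key, key)] = value.strip("'\"")
    return config


def load_environment(json_path, legacy_path):
    """Prefer the JSON file; fall back to translating the legacy file."""
    if os.path.exists(json_path):
        with open(json_path) as handle:
            return json.load(handle)
    if os.path.exists(legacy_path):
        with open(legacy_path) as handle:
            return parse_legacy(handle.read())
    raise FileNotFoundError("no configuration file found")
```

Either branch returns the same kind of dictionary, so the rest of a program would not need to know which file format was found on disk.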

Configuring the Client Environment

A few changes have been made to the iRODS client environment.  The client environment file still resides in $HOME/.irods, but it is now named $HOME/.irods/irods_environment.json rather than $HOME/.irods/.irodsEnv.  For consistency, the PID files, which hold the current working directory set by a given use of icd, are likewise named irods_environment.json.pid and irods_environment.json.cwd.

For clarity, the naming convention for the entries within the client environment has also changed to an all lower case, underscore-separated format with no abbreviations.  For example, “irodsHost” is now “irods_host”, and “irodsAuthFileName” is now “irods_authentication_file”.  Additionally, any client environment value provided as a shell environment variable follows the same convention, but in upper case letters: for example, IRODS_HOST or IRODS_AUTHENTICATION_FILE.
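
To make the naming rule concrete, here is a small Python sketch that derives the shell variable name from a client environment key. The helper names env_var_name and apply_env_overrides are hypothetical, and the precedence shown (a shell variable wins over the file value) is an assumption made for illustration; this post does not state which source takes priority.

```python
import os


def env_var_name(key):
    """Return the shell variable name for a client environment key."""
    return key.upper()


def apply_env_overrides(config, environ=None):
    """Return a copy of config with matching shell variables applied.

    ASSUMPTION: shell variables take precedence over values from the file.
    Values read from the shell are always strings in this sketch.
    """
    environ = os.environ if environ is None else environ
    merged = dict(config)
    for key in config:
        name = env_var_name(key)
        if name in environ:
            merged[key] = environ[name]
    return merged
```

For instance, env_var_name("irods_authentication_file") yields IRODS_AUTHENTICATION_FILE, matching the example in the paragraph above.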

The following is a quick review of some of the default client environment parameters:

irods_user_name – the username within iRODS for this account
irods_host – a fully qualified domain name for the given iRODS server
irods_port – the port number for the given iRODS Zone
irods_home – the home directory within the iRODS Zone for a given user
irods_cwd – the current working directory within iRODS
irods_authentication_scheme – this user’s iRODS authentication method, currently: “pam”, “krb”, “gsi” or “native”
irods_default_resource – the name of the resource used for iRODS operations if one is not specified
irods_zone – the name of the iRODS Zone in question
irods_authentication_file – fully qualified path to a file holding the credentials of an authenticated iRODS user
irods_log_level – desired verbosity of the iRODS logging
irods_debug – desired verbosity of the debug logging level
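
As an illustration of how these parameters might fit together, the following Python sketch builds a small client environment and serializes it as JSON. Every value is a placeholder invented for the example (user, host, zone, and resource names included), not a shipped default. Note that irods_port is written as an integer rather than a string, as a reader points out in the comments below, where the later rename of irods_zone is also mentioned.

```python
import json

# Placeholder values for illustration only; none are shipped defaults.
environment = {
    "irods_user_name": "alice",
    "irods_host": "data.example.org",
    "irods_port": 1247,  # an integer, not the string "1247"
    "irods_home": "/exampleZone/home/alice",
    "irods_cwd": "/exampleZone/home/alice",
    "irods_authentication_scheme": "native",
    "irods_default_resource": "exampleResc",
    "irods_zone": "exampleZone",
}

# The resulting text is what would be saved as irods_environment.json.
text = json.dumps(environment, indent=4, sort_keys=True)
print(text)
```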

We have also added two other parameters for the advanced client-server negotiation:

irods_client_server_negotiation – set to “request_server_negotiation” indicating advanced negotiation is desired, for use in enabling SSL and other technologies
irods_client_server_policy – “CS_NEG_REFUSE” for no SSL, “CS_NEG_REQ” to demand SSL, or “CS_NEG_DONT_CARE” to allow the server to decide

Since parallel transfer does not make use of SSL, we have added a full set of encryption parameters for this method of transport:

irods_encryption_key_size – key size for parallel transfer encryption
irods_encryption_salt_size – salt size for parallel transfer encryption
irods_encryption_num_hash_rounds – number of hash rounds for parallel transfer encryption
irods_encryption_algorithm – EVP supplied encryption algorithm for parallel transfer encryption

iRODS 4.x also supports configurable hashing for checksums:

irods_default_hash_scheme – currently either MD5 or SHA256
irods_match_hash_policy – ‘strict’ to refuse defaulting to another scheme or ‘compatible’ for supporting alternate schemes
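
Several of the parameters above accept only a fixed set of values, which makes them easy to check before use. The Python sketch below is a hand-rolled check, not the iRODS schema validation; the function name check_values is hypothetical, the allowed sets are copied from the descriptions in this post, and exact, case-sensitive matching is an assumption.

```python
# Allowed values as listed in this post; matching is ASSUMED case-sensitive.
ALLOWED_VALUES = {
    "irods_authentication_scheme": {"pam", "krb", "gsi", "native"},
    "irods_client_server_policy": {
        "CS_NEG_REFUSE",
        "CS_NEG_REQ",
        "CS_NEG_DONT_CARE",
    },
    "irods_default_hash_scheme": {"MD5", "SHA256"},
    "irods_match_hash_policy": {"strict", "compatible"},
}


def check_values(config):
    """Return a list of problem descriptions; the list is empty if none."""
    problems = []
    for key, allowed in ALLOWED_VALUES.items():
        # Keys absent from the config are not treated as errors here.
        if key in config and config[key] not in allowed:
            problems.append("%s: unexpected value %r" % (key, config[key]))
    return problems
```

Returning a list rather than raising on the first error lets a caller report every problem in a file at once.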

iRODS 4.1 and beyond will also include the SSL configuration within the client environment:

irods_ssl_ca_certificate_path – location of a directory containing CA certificates in PEM format. Each file contains one CA certificate. The files are looked up by the CA subject name hash value, which must therefore be available. If more than one CA certificate with the same name hash value exists, the extensions must differ (e.g. 9d66eef0.0, 9d66eef0.1, etc.). The search is performed in the order of the extension number, regardless of other properties of the certificates. Use the ‘c_rehash’ utility to create the necessary links.

irods_ssl_ca_certificate_file – location of a file of trusted CA certificates in PEM format. Note that the certificates in this file are used in conjunction with the system default trusted certificates.

irods_ssl_verify_server – what level of server certificate based authentication to perform. ‘none’ means not to perform any authentication at all. ‘cert’ means to verify the certificate validity (i.e. that it was signed by a trusted CA). ‘hostname’ means to validate the certificate and to verify that the irods_host’s FQDN matches either the common name or one of the subjectAltNames of the certificate. ‘hostname’ is the default setting.

irods_ssl_certificate_chain_file – the file containing the server’s certificate chain. The certificates must be in PEM format and must be sorted starting with the subject’s certificate (actual client or server certificate), followed by intermediate CA certificates if applicable, and ending at the highest level (root) CA.

irods_ssl_certificate_key_file – private key corresponding to the server’s certificate in the certificate chain file.

irods_ssl_dh_params_file – the Diffie-Hellman parameter file location
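
The three irods_ssl_verify_server levels correspond to distinctions that most TLS libraries draw. As a rough analogy only, the Python sketch below maps them onto the standard ssl module; the function name make_context is hypothetical, and this is not how the iRODS client itself is implemented.

```python
import ssl


def make_context(verify_server="hostname"):
    """Build a client-side SSLContext for the given verification level."""
    context = ssl.create_default_context()
    if verify_server == "none":
        # check_hostname must be disabled before verify_mode can be CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    elif verify_server == "cert":
        # Verify the chain against trusted CAs, but ignore the host name.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_REQUIRED
    elif verify_server == "hostname":
        # Verify the chain and that the certificate matches the host name.
        context.check_hostname = True
        context.verify_mode = ssl.CERT_REQUIRED
    else:
        raise ValueError("unknown verification level: %r" % verify_server)
    return context
```

Calling make_context() with no argument gives the strictest behavior, mirroring the ‘hostname’ default described above.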

Next Time…

The next development update will review the new iRODS server_config.json and database_config.json files.  Please add comments with any questions, feedback, or requests for topics to be covered in later updates.

  • Generally looking good with some progress in this area! I wonder though, since AFAIK JSON is not super human-friendly for editing (it is rather easy to break the format), will there be some kind of tool for editing the config?

  • Andrey Nevolin

    Just want to remind you about a non-trivial bug related to “*.pid” files: https://github.com/irods/irods/issues/2406

    Since you are redesigning configuration files and structures, maybe it makes sense to reconsider the concept of “*.pid” files. Or just fix it somehow (for example by introducing cleanup at each iRODS server restart).

  • Andrey Nevolin

    Could you please clarify the meaning of ‘irods_cwd’ client configuration parameter?
    Does it change with each ‘icd’ command?

  • Andrey Nevolin

    Didn’t catch the meaning of ‘irods_client_server_negotiation’…

  • jason coposky

    Answering the questions bottom to top:

    We will not provide a tool to edit the JSON, but we will ship the schema to allow validation of the JSON, which will show syntax errors, missing elements, or invalid values.

    We will reconsider the use of the PID files in possible future releases. As icommands have historically been used in scripts, we will need some way to maintain state for multiple processes in flight from the same user account. We can investigate tools to clean up orphaned PID files to prevent the use of an inappropriate irods_cwd value.

    The irods_cwd is the same parameter as irodsCwd from the previous implementation. Since we do not want to move things too quickly, we have made a one-to-one mapping from the old environment file to the new one, with a few additions. If there are no PID files, then the irods_cwd is used as the current working directory. This value can then be overridden by a PID file, should one exist.

    The irods_client_server_negotiation is a configuration parameter used to request the advanced negotiation. This is a simple protocol we added before authentication which is used to request SSL, among other things.

    • Andrey Nevolin

      Thanks!

  • Colin

    Since this post was written, the parameter for defining the zone has changed from `irods_zone` to `irods_zone_name` (see the “Configuration Files” page in the configuration manual).

    It’s also worth noting that the `irods_port` argument must be an int, not a string.
