rucio

Technology from the
Science Mesh Data Source

Built on more than a decade of experience, Rucio serves the data needs of modern scientific experiments: large amounts of data, countless files, heterogeneous storage systems, globally distributed data centres, monitoring, and analytics, all coming together in a modular solution to fit your needs.

Extremely scalable

Need to search through billions of files? Need to transfer petabytes of data? Rucio has got you covered. Our largest installation, for the ATLAS Experiment, is responsible for more than 450 petabytes of data, stored in a billion files and distributed over 120 data centres globally, and orchestrates an exabyte of data access and transfer per year.

Policy-driven

Declarative data management lets you say what you want and lets Rucio figure out the details of how to do it. Manage your data with expressive statements. Three copies of my file on different continents, plus one backup on tape? Automatically remove it once its access popularity drops to zero? No problem.
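To make the declarative idea concrete, here is a minimal, self-contained sketch in plain Python. It is not the Rucio API: the `Storage` and `Rule` classes, the `plan_replicas` function, and the storage names are all hypothetical, invented only to show how a statement such as "three copies on different continents plus one on tape" can be written as data and resolved by the system rather than by the user.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Storage:
    """Hypothetical storage endpoint (not a Rucio class)."""
    name: str
    continent: str
    is_tape: bool = False


@dataclass(frozen=True)
class Rule:
    """Hypothetical declarative rule: states what is wanted, not how."""
    copies: int
    distinct_continents: bool = False
    tape_copies: int = 0


def plan_replicas(rule, storages):
    """Pick storage names that satisfy the rule, or raise ValueError."""
    disks = [s for s in storages if not s.is_tape]
    tapes = [s for s in storages if s.is_tape]

    chosen = []
    used_continents = set()
    for s in disks:
        if len(chosen) == rule.copies:
            break
        # Skip a disk if the rule demands one copy per continent
        # and this continent already holds a copy.
        if rule.distinct_continents and s.continent in used_continents:
            continue
        chosen.append(s)
        used_continents.add(s.continent)

    if len(chosen) < rule.copies:
        raise ValueError("not enough disk storage to satisfy the rule")
    if len(tapes) < rule.tape_copies:
        raise ValueError("not enough tape storage to satisfy the rule")

    chosen.extend(tapes[:rule.tape_copies])
    return [s.name for s in chosen]


# Illustrative inventory; all names are made up.
STORAGES = [
    Storage("DISK_EU_1", "Europe"),
    Storage("DISK_EU_2", "Europe"),
    Storage("DISK_NA_1", "North America"),
    Storage("DISK_AS_1", "Asia"),
    Storage("TAPE_EU_1", "Europe", is_tape=True),
]

# "Three copies on different continents, plus one backup on tape."
RULE = Rule(copies=3, distinct_continents=True, tape_copies=1)
PLAN = plan_replicas(RULE, STORAGES)
print(PLAN)
```

The user only writes the `Rule`; choosing the concrete storage endpoints is left to the planner. A real deployment would also have to account for quotas, existing replicas, and transfer scheduling, none of which this sketch attempts.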

Insights and analytics

Follow your data evolution over time, so you can keep control. From the popularity of your files, to the storage space and tape accounting of your data centres. Fully integrated with Graphite, ElasticSearch, and Hadoop.

FAIR

Rucio supports the FAIR data principles that promote maximal use of research data!

Smart namespace

Organise your files in datasets and containers, create virtual overlaps, distribute them by scope, or attach important metadata.
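The namespace ideas above can be illustrated with a small sketch in plain Python. This is an assumption-laden toy model, not Rucio's implementation or client API: the `Namespace` class, its methods, and the `scope:name` identifiers used here are hypothetical. It shows files grouped into datasets, datasets grouped into a container, one file shared by two datasets (an overlap), and metadata attached to an identifier.

```python
from collections import defaultdict


class Namespace:
    """Toy hierarchical namespace (hypothetical, for illustration only)."""

    def __init__(self):
        self._children = defaultdict(list)   # parent id -> child ids
        self._metadata = defaultdict(dict)   # id -> key/value pairs

    @staticmethod
    def did(scope, name):
        """Build an identifier of the form 'scope:name'."""
        return f"{scope}:{name}"

    def attach(self, parent, child):
        """Attach a child to a parent; a child may have several parents."""
        if child not in self._children[parent]:
            self._children[parent].append(child)

    def set_metadata(self, did, key, value):
        self._metadata[did][key] = value

    def get_metadata(self, did):
        return dict(self._metadata[did])

    def list_files(self, did):
        """Resolve an identifier to its files, without duplicates.

        Simplification: anything with no children is treated as a file,
        and cycles are not detected.
        """
        children = self._children.get(did)
        if not children:
            return [did]
        files = []
        for child in children:
            for f in self.list_files(child):
                if f not in files:
                    files.append(f)
        return files


ns = Namespace()
f1 = Namespace.did("user.alice", "run1.raw")
f2 = Namespace.did("user.alice", "run2.raw")
ds_all = Namespace.did("user.alice", "all_runs")
ds_good = Namespace.did("user.alice", "good_runs")
container = Namespace.did("user.alice", "campaign")

ns.attach(ds_all, f1)
ns.attach(ds_all, f2)
ns.attach(ds_good, f2)          # f2 belongs to two datasets: an overlap
ns.attach(container, ds_all)
ns.attach(container, ds_good)
ns.set_metadata(f2, "quality", "good")

FILES = ns.list_files(container)
print(FILES)
```

Because membership is stored as links rather than copies, the shared file appears once when the container is resolved, even though two datasets reference it.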

Storage support

Rucio connects your existing storage, and you can easily add new and different systems, including tape, cloud-based storage, and supercomputers. We want you to have a choice and not be locked into a single solution.

Easy integration

Existing applications and workflow systems can be integrated easily through our open libraries and REST servers. Rucio will not disrupt your experiment's workflows.

Authentication and authorisation

The classic username/password, x509 certificates with proxy support, GSS/Kerberos, SSH public keys, OpenID Connect, and SAML are all supported.

Monitoring

Directly integrated with ElasticSearch and Graphite, so you will never lose track of your data. Follow system performance from a single file to the global overview.

Open source powered

Robust code written in the Python language, unit-tested, PEP-certified. Deploy with pip or containers. It's free as in freedom (Apache v2) and open source!

Consistency

Data loss happens every day, and Rucio is prepared. Smart consistency and recovery mechanisms help you avoid losing your data!

Proven track record

Originally built to meet the requirements of the high-energy physics experiment ATLAS, Rucio is scalable and robust. It also serves smaller communities and makes a scientist's life easier.

