Help:Dat

From CECS wiki

Dat is a data distribution tool with a version control feature for tracking changes and publishing datasets. It is primarily used for data-driven science, but it can track changes in any dataset. As a distributed revision control system, it aims for speed, simplicity, security, and support for distributed, non-linear workflows. For a more detailed description of Dat itself, see wikipedia:Dat.

View online documentation
https://docs.datproject.org/

Installing Dat

There are several ways to use Dat:

  • Via the Command Line
    • Dat is built on Node.js. The command line tools let you share, download, and seed data, much as BitTorrent does. Seeding datasets when possible helps the scientific community: more seeders mean faster downloads, more reliable availability, and less strain on the creator's systems.
    • If you would just like to seed data (please!), or if you are a system administrator with spare room on a data repository or server, you can also run a Dat server. This is particularly useful for institutions that use datasets published via Dat.
  • Via a Desktop Application
    • The Dat Project also provides desktop apps for Linux and macOS. These offer much of the functionality of the command line tools, with a user-friendly interface.
    • The Beaker Browser is a web browser with Dat functionality built in. Its primary goal is the realization of a decentralized Web. It is available for Linux, macOS, and Windows.

Installation instructions for the official command line and desktop applications can be found in the Dat documentation.
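
The command line workflow described above (sharing, downloading, and seeding) can be sketched roughly as follows. This is a minimal illustration rather than an official recipe: it assumes the dat command line tool is already installed (for example with npm install -g dat), and the function names and directory arguments are hypothetical helpers invented for this page, not part of Dat itself. Check the Dat documentation for the options supported by your installed version.

```shell
#!/bin/sh
# Sketch of a basic Dat command line workflow.
# Assumption: the `dat` CLI is installed and on the PATH.
# The function names below are illustrative wrappers, not Dat commands.

# Publish a directory. Dat prints a dat:// link that others can use
# to download the data; the process keeps running while it shares.
share_dataset() {
    (cd "$1" && dat share)
}

# Download a dataset from its dat:// link into a target directory.
download_dataset() {
    dat clone "$1" "$2"
}

# Keep a previously downloaded dataset in sync with its source.
# While this runs, the local copy stays connected to the network,
# which is what allows it to help serve the data to other peers.
seed_dataset() {
    (cd "$1" && dat sync)
}
```

For example, after sourcing this file, download_dataset followed by a dat:// link and a target directory fetches a copy, and seed_dataset with that same directory keeps it connected. Leaving seed_dataset running on a machine with spare bandwidth is one simple way to contribute as a seeder.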

Publishing Data via Dat

Dat was created with publishing scientific data in mind. If you would like to publish your data via Dat, see the Dat documentation. It is also worth speaking to your IT department or server administrator about running a Dat server for your published datasets. The Dat Project also hosts a registry, DatBase, which provides a searchable database of published datasets.