Clusters

Clusters define the hardware specifications for compute jobs run on Kaspian.

Clusters are configured in the Settings > Clusters tab on the left-hand side; click the + symbol next to New Cluster to create one. Kaspian currently supports two types of compute clusters: single-node clusters, which are useful for running any Python job, including ETL on small datasets and PyTorch deep learning models, and Spark clusters, which are useful for running large-scale ETL jobs.
Single-Node Clusters

Single-node clusters are used for running jobs on a single machine, such as Python ETL on small datasets or PyTorch model training.
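
As a rough illustration (not Kaspian-specific), the sketch below shows the kind of self-contained Python job a single-node cluster might run: a small ETL step that fits comfortably in one machine's memory. The file paths and column names are hypothetical placeholders.

```python
import pandas as pd

# Hypothetical input/output paths; replace with your own data locations.
INPUT_PATH = "events.csv"
OUTPUT_PATH = "daily_totals.csv"

def main() -> None:
    # Load a small dataset into memory; this is fine on a single machine.
    events = pd.read_csv(INPUT_PATH, parse_dates=["timestamp"])

    # Simple ETL step: aggregate event counts per day.
    daily = (
        events
        .assign(day=events["timestamp"].dt.date)
        .groupby("day")
        .size()
        .reset_index(name="event_count")
    )

    daily.to_csv(OUTPUT_PATH, index=False)

if __name__ == "__main__":
    main()
```

Because the whole job runs in one process on one machine, there is no cluster coordination to configure; the cluster definition only determines the CPU, memory, and (if applicable) GPU available to it.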

Spark Clusters

Spark clusters are used for running distributed Spark jobs across multiple machines.
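
For contrast, here is a minimal PySpark sketch of the same aggregation written as a distributed job. It assumes the job is submitted to an already-provisioned Spark cluster, so the code itself contains no hardware details; the S3 paths and column names are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hypothetical input/output paths; replace with your own data locations.
INPUT_PATH = "s3://my-bucket/events/"
OUTPUT_PATH = "s3://my-bucket/daily_totals/"

def main() -> None:
    # On a Spark cluster, the session attaches to the cluster's existing
    # master; hardware is defined by the cluster, not by the job.
    spark = SparkSession.builder.appName("daily-totals").getOrCreate()

    # Read a large dataset; Spark partitions it across the worker nodes.
    events = spark.read.parquet(INPUT_PATH)

    # Distributed ETL step: aggregate event counts per day.
    daily = (
        events
        .withColumn("day", F.to_date("timestamp"))
        .groupBy("day")
        .agg(F.count("*").alias("event_count"))
    )

    daily.write.mode("overwrite").parquet(OUTPUT_PATH)
    spark.stop()

if __name__ == "__main__":
    main()
```

The logic mirrors the single-node example, but each transformation executes in parallel across the cluster's workers, which is what makes Spark clusters suitable for large-scale ETL.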