Introduction to Cloud Computing

The simplest way to think of the cloud is that it's like a laptop or desktop computer that you connect to remotely. At a finer grain, the cloud is a way of submitting a single task to a computer, or a set of computers, on demand without having to host them or keep them running. Cloud services can also be optimized to run many machines in parallel for large cluster-like tasks, similar to what is done in High Performance Computing (HPC).

There are three major commercial cloud providers: Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure. These services cost money to use, but all three offer free credits to researchers (GCP, AWS, Azure) through short applications.

The big cloud providers have been replicating publicly owned data sets, e.g. 40+ years of NASA and European Space Agency (ESA) Earth observation system data, on their cloud-hosted data storage services in the hope that researchers and businesses will pay to run cloud computing on these data. NASA recently announced a plan to move hundreds of petabytes of its data to AWS.

Some cloud services are free, like Google’s Earth Engine; others have limited sandboxes that are useful for training but may not fit your needs for larger-scale data analyses. Launching IDEs like Jupyter Notebooks or RStudio in the cloud is possible with platforms like MyBinder and Colab.

There are also options where your institution can stand up a cloud service on its own hardware, such as OpenStack and VMware. CyVerse and XSEDE operate multiple OpenStack clouds, which they provide free to researchers as services called Atmosphere (CyVerse) and Jetstream (XSEDE).

We will focus the workshop time on applications of ‘containers’ like Docker and Singularity, which are a way of packaging your research and running analyses on ANY cloud provider. The rapid development of containers and container orchestration is due to the rise of the cloud. The usefulness of containers to researchers for sharing and reproducible research is a fortunate benefit.
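
To make the idea of running the same analysis on any provider a little more concrete, here is a minimal sketch in Python that only builds (and prints) a `docker run` command line; it does not start a container. The helper name `docker_run_command`, the image `python:3.11-slim`, the script `analysis.py`, and the paths are all assumptions chosen for illustration, not part of the workshop materials.

```python
import shlex


def docker_run_command(image, script, host_dir, container_dir="/work"):
    """Build (but do not execute) a `docker run` command line.

    All arguments are illustrative placeholders; substitute your own
    image, script, and project directory.
    """
    return [
        "docker", "run",
        "--rm",                               # remove the container when it exits
        "-v", f"{host_dir}:{container_dir}",  # mount the project directory into the container
        "-w", container_dir,                  # use the mounted directory as the working directory
        image,                                # the container image that holds the software
        "python", script,                     # the command to run inside the container
    ]


# Hypothetical example values (assumptions, not taken from the workshop).
cmd = docker_run_command("python:3.11-slim", "analysis.py", "/home/user/project")
print(shlex.join(cmd))
```

Because the software environment is fixed by the image rather than by the host, the same command line should in principle work on a laptop, an OpenStack instance, or a commercial cloud VM, provided Docker is installed there.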

