Cloud computing is not a new network technology but a concept for delivering applications over the network. Its core idea is to centre services on the Internet and provide fast, secure computing and data storage online, so that anyone with Internet access can use the large computing resources and data centres behind it. With this approach, tens of thousands of data records can be processed within a few seconds, which makes powerful network services possible. Cloud computing services take the following three forms:
IaaS (Infrastructure-as-a-Service): Infrastructure as a service. Consumers can obtain services from a complete computer infrastructure through the Internet.
PaaS (Platform-as-a-Service): Platform as a service. The provider offers a software development and deployment platform as a service, delivered to users over the Internet in the same way as SaaS.
SaaS (Software-as-a-Service): Software as a service. It is a mode of providing software through the Internet. Users do not need to purchase software, but rent web-based software from providers to manage business activities.
Among the three types of cloud computing services, Infrastructure as a Service (IaaS) is often built with OpenStack, an open source cloud computing management platform, and SaaS offerings can in turn run on top of such infrastructure. OpenStack itself is written in Python.
The Python programming language is flexible and easy to use, and it is one of the most widely used languages for data query, analysis, visualization and mining. Learning Python is therefore a useful step before learning cloud computing.
Common Python data analysis libraries:
Pandas is an open source Python library that uses powerful data structures to provide high-performance data manipulation and analysis tools. Before Pandas, Python was mainly used for data munging and preparation and contributed little to data analysis itself. Pandas closes this gap. With Pandas, the five typical steps of data processing and analysis (loading, preparing, manipulating, modelling and analysing) can be completed regardless of where the data comes from. Pandas is used in a wide range of academic and commercial fields, including finance, economics, statistics and analytics. In general, Pandas is best suited to data preprocessing and to working with structured data.
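As a minimal sketch of the load, prepare, manipulate and analyse steps described above, the following example runs a tiny Pandas workflow. The sales data, the column names and the variable names are invented for illustration; the data is inlined as text so that no external file is assumed.

```python
import io

import pandas as pd

# Hypothetical sales data, inlined so the example needs no external file.
CSV_TEXT = """region,units,price
north,10,2.5
south,,3.0
north,4,2.5
south,6,3.0
"""

# Load: parse the CSV text into a DataFrame.
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Prepare: drop the row whose unit count is missing.
clean = df.dropna(subset=["units"]).copy()

# Manipulate: derive a revenue column from units and price.
clean["revenue"] = clean["units"] * clean["price"]

# Analyse: total revenue per region.
revenue_by_region = clean.groupby("region")["revenue"].sum()
print(revenue_by_region)
```

A real project would usually load from a file or database (for example with pd.read_csv on a path) and add a modelling step between manipulation and analysis.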
NumPy is a Python package whose name stands for "Numerical Python". It is a library consisting of multidimensional array objects and a collection of routines for processing those arrays. NumPy has built-in functions for linear algebra and random number generation. In general, NumPy is suitable for large-scale computing work such as scientific computing and machine learning, and it is often used as an alternative to MATLAB.
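The features mentioned above (multidimensional arrays, linear algebra and random number generation) can be sketched in a few lines. The matrices, the seed and the variable names below are arbitrary choices made for illustration.

```python
import numpy as np

# A 2 x 3 multidimensional array and a column-wise reduction over it.
grid = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
column_sums = grid.sum(axis=0)

# Linear algebra: solve the system  2x + y = 3,  x + 3y = 5.
coefficients = np.array([[2.0, 1.0], [1.0, 3.0]])
constants = np.array([3.0, 5.0])
solution = np.linalg.solve(coefficients, constants)

# Random number generation with a fixed seed for repeatable results.
rng = np.random.default_rng(seed=0)
samples = rng.normal(loc=0.0, scale=1.0, size=1000)

print(column_sums, solution, samples.mean())
```

Operations such as the column sum run over whole arrays at once, which is the main reason NumPy is preferred over plain Python lists for numerical work.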
SciPy is an open source, BSD-licensed library for mathematics, science and engineering. It depends on NumPy, which provides convenient and fast N-dimensional array operations. SciPy is built to work with NumPy arrays and provides many user-friendly and efficient numerical routines, such as routines for numerical integration and optimization.
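The two kinds of routine named above, numerical integration and optimization, might be used as follows. This is only a sketch: the functions parabola and shifted_square are made-up examples, not part of SciPy.

```python
from scipy import integrate, optimize


def parabola(x):
    """Example integrand: f(x) = x^2."""
    return x ** 2


def shifted_square(x):
    """Example objective with its minimum value 1 at x = 2."""
    return (x - 2.0) ** 2 + 1.0


# Numerical integration of x^2 over [0, 1]; the exact value is 1/3.
area, abs_error = integrate.quad(parabola, 0.0, 1.0)

# One-dimensional minimization of the example objective.
result = optimize.minimize_scalar(shifted_square)

print(area, result.x, result.fun)
```

quad returns both the estimate and an error bound, and minimize_scalar returns a result object whose x and fun attributes hold the location and value of the minimum.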
Matplotlib is a Python 2D plotting library that can generate publication-quality figures in a variety of hardcopy formats and in interactive environments across platforms. Matplotlib can be used in Python scripts, the Python and IPython shells, Jupyter notebooks, web application servers, and four graphical user interface toolkits.
Matplotlib is mainly used for data visualization at the end of a data analysis. There are also many alternatives to Matplotlib, such as PyChart and ECharts (available in Python through pyecharts).
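To illustrate Matplotlib's role at the end of an analysis, the sketch below plots a sine curve and renders it as a PNG image in memory. The Agg backend is an assumption made here so that the example runs without a display; in a Jupyter notebook or desktop session the default backend would normally be used, and the figure would be saved to a file or shown with plt.show().

```python
import io
import math

import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

# Sample sin(x) on [0, 6.2] in steps of 0.1.
xs = [i * 0.1 for i in range(63)]
ys = [math.sin(x) for x in xs]

# Draw the curve with labelled axes, a title and a legend.
fig, ax = plt.subplots()
ax.plot(xs, ys, label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Sine wave")
ax.legend()

# Render the figure to PNG bytes in memory instead of writing a file.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
plt.close(fig)
```

Passing a different format to savefig, such as "pdf" or "svg", produces other hardcopy formats from the same figure.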