.. _ch:mpi_first:

Hello MPI
===========

Memory
---------

In the context of parallel computing, various memory domains are used to store and manage data.

**Shared Memory:**

Shared memory is accessible by all processors or threads in a parallel system. It simplifies communication and data sharing but can lead to issues like data races and requires careful synchronization.
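
As a minimal sketch of what this means in practice (plain C++ threads, not MPI): two threads increment the same counter living in shared memory, and a ``std::mutex`` provides the synchronization needed to avoid a data race.

.. code-block:: c++

  #include <iostream>
  #include <mutex>
  #include <thread>

  int main() {
    long l_counter = 0; // lives in memory shared by both threads
    std::mutex l_mutex; // protects l_counter

    auto l_work = [&]() {
      for (int l_i = 0; l_i < 100000; l_i++) {
        std::lock_guard<std::mutex> l_lock(l_mutex); // synchronized access
        l_counter++;
      }
    };

    std::thread l_t0(l_work);
    std::thread l_t1(l_work);
    l_t0.join();
    l_t1.join();

    // without the mutex, the result would be unpredictable (data race)
    std::cout << "counter: " << l_counter << std::endl;
    return 0;
  }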

**Distributed Memory:**

Distributed memory systems give each process in a parallel system its own separate memory space.
These memory spaces are not directly accessible to other processes, so communication between them typically involves *message passing*.
Examples of distributed memory systems include clusters and supercomputers.


Hello World
------------

`MPI <https://www.mpi-forum.org/>`_, which stands for Message Passing Interface, is commonly used in high-performance computing (HPC) and parallel computing environments to facilitate communication and data exchange between different processes running on multiple computing nodes of a cluster or supercomputer. It supports point-to-point communication, collective communication, and synchronization mechanisms.
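
As a first, hedged glimpse of point-to-point communication, the sketch below sends a single integer from the process with rank 0 to the process with rank 1; the surrounding program structure (``MPI_Init``, ``MPI_Finalize``) and the notion of a rank are introduced step by step below. The sketch needs at least two processes to run.

.. code-block:: c++

  #include <iostream>
  #include <mpi.h>

  int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int l_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &l_rank);

    if (l_rank == 0) {
      int l_data = 42;
      // send one int to rank 1 with message tag 0
      MPI_Send(&l_data, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (l_rank == 1) {
      int l_data = 0;
      // receive one int from rank 0 with message tag 0
      MPI_Recv(&l_data, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
      std::cout << "Rank 1 received " << l_data << " from rank 0" << std::endl;
    }

    MPI_Finalize();
    return 0;
  }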


**Common Steps:**

The following three steps are essential when using MPI in C/C++.

#. ``#include <mpi.h>``: Makes MPI's declarations visible in C/C++.
#. ``MPI_Init(&argc, &argv)``: Initializes the MPI environment. It is a crucial step that must happen before other MPI functions are used. ``argc`` is a pointer to the number of arguments and ``argv`` is a pointer to the argument vector.
#. ``MPI_Finalize()``: Finalizes the MPI environment, ensuring all MPI-related resources are released properly. It should be called at the end of your program to avoid resource leaks.


**Rank and Communicator:**

A *communicator* groups a set of processes that can communicate with each other; the predefined communicator ``MPI_COMM_WORLD`` contains all processes started by ``mpirun``. Within a communicator, every process is identified by a unique *rank* ranging from 0 to the communicator size minus one. The following program queries both values and prints a greeting per process.

.. code-block:: c++

  #include <iostream>
  #include <mpi.h>

  int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int l_rank;
    int l_comm_size;

    MPI_Comm_rank(MPI_COMM_WORLD, &l_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &l_comm_size);

    std::cout << "Process " << l_rank
              << " out of " << l_comm_size << " processes says: Hello, MPI!" << std::endl;

    MPI_Finalize();

    return 0;
  }

**Compiling:**

When compiling MPI programs in C/C++, you typically use a compiler wrapper provided by your MPI implementation rather than invoking the compiler directly.

**Open MPI**: `Open MPI <https://docs.open-mpi.org/en/v5.0.x/index.html>`_ is a widely used implementation of the MPI standard and comes with its own C/C++ compiler wrappers (``mpicxx`` or ``mpic++``). You can use these wrappers to compile your C/C++ MPI programs. For example:

.. code-block:: shell

  module load mpi/openmpi/<version>
  mpicxx -o <output_name> <file_name>

``mpicxx`` is a compiler wrapper provided by the MPI implementation; it simplifies compiling and linking C++ programs that use MPI for distributed computing.
When you use ``mpicxx``, it invokes your system's C/C++ compiler (such as ``g++``) and adds the include paths, flags, and libraries required for MPI support. With Open MPI you can run ``mpicxx --showme`` to display the compiler and linker flags that ``mpicxx`` would use.

.. code-block:: shell

  $ mpicxx --showme

  g++ -I/usr/local/include -pthread -Wl,-rpath -Wl,/usr/local/lib \
      -Wl,--enable-new-dtags -L/usr/local/lib -lmpi_cxx -lmpi

- ``g++`` is the underlying C/C++ compiler.
- ``-I/usr/local/include`` specifies the include directories.
- ``-pthread`` is used for multithreading support.
- ``-Wl,...`` options pass flags through to the linker.
- ``-L/usr/local/lib`` specifies the library directories.
- ``-lmpi_cxx`` and ``-lmpi`` are the necessary MPI libraries for your program.


**Executing:**

To run your code after compiling: 

.. code-block:: bash

  mpirun -np <number_of_processes> ./<output_name>
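
For example, assuming the rank-and-communicator program above was compiled to an executable named ``hello_mpi`` (a name chosen here purely for illustration), a run with four processes might produce output along these lines; the order of the lines is not deterministic:

.. code-block:: shell

  $ mpirun -np 4 ./hello_mpi

  Process 1 out of 4 processes says: Hello, MPI!
  Process 0 out of 4 processes says: Hello, MPI!
  Process 3 out of 4 processes says: Hello, MPI!
  Process 2 out of 4 processes says: Hello, MPI!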
  

**Error Handling:**

For error handling in C/C++ programs, one option is to use the ``<cassert>`` header, which provides the ``assert`` macro. ``assert`` is a debugging tool that lets you place conditional checks in your code: if the condition evaluates to false, the program terminates with a diagnostic message.

.. code-block:: c++

  #include <cassert>

  int main() {
    // ...
    assert(some_condition); // aborts the program if some_condition is false
    // ...
    return 0;
  }

For example, when initializing MPI:

.. code-block:: c++

  int l_ret = MPI_Init(&argc, &argv);
  assert(l_ret == MPI_SUCCESS); // Ensure MPI_Init succeeded
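
Note that ``assert`` is removed by the preprocessor when ``NDEBUG`` is defined, which is common for optimized release builds. A hedged sketch of an alternative for checks that should always run is to test the return code explicitly and terminate all processes with ``MPI_Abort`` on failure:

.. code-block:: c++

  int l_ret = MPI_Comm_rank(MPI_COMM_WORLD, &l_rank);
  if (l_ret != MPI_SUCCESS) {
    std::cerr << "MPI_Comm_rank failed" << std::endl;
    MPI_Abort(MPI_COMM_WORLD, 1); // terminates all processes in the communicator
  }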
  
.. _fig:linear_proc:

.. figure:: ../images/linearProcesses.svg
   :align: center
   :width: 70 %
   
   Illustration of a linear arrangement of four processes. Arrows indicate communication between processes.

.. admonition:: Task

  1. Research and provide an overview of two different MPI implementations (e.g., Intel MPI and Open MPI).
  2. Write an MPI program that prints a message for each process, indicating its rank and its neighbors.

     - Assume a linear arrangement of processes within the communicator, where each process except for the first and last has two neighbors. An example with four processes is shown in :numref:`fig:linear_proc`.
     - For process 2, the output could be something like: "Hello from process rank 2 to my right neighbor (rank 3) and my left neighbor (rank 1)."

  3. Execute your program with 8 processes.

Time Measurement
----------------------

When working with MPI programs, it is often essential to measure the elapsed time of specific code segments for performance analysis.
MPI provides a convenient function ``MPI_Wtime()`` for this purpose.

To measure the time taken by a particular block of code, you can use the following pattern:

.. code-block:: cpp

  // Synchronize all processes before starting the timer
  MPI_Barrier(MPI_COMM_WORLD);

  double start_time = MPI_Wtime();

  // Code you want to measure

  // Synchronize again so that the slowest process determines the elapsed time
  MPI_Barrier(MPI_COMM_WORLD);

  double end_time = MPI_Wtime();
  double elapsed_time = end_time - start_time;

  if (rank == 0) { // rank as obtained from MPI_Comm_rank
    std::cout << "Elapsed time: " << elapsed_time << " seconds" << std::endl;
  }
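
Putting these pieces into a complete program, a minimal self-contained sketch could look as follows; the summation loop is only a stand-in for the code whose runtime you actually want to measure.

.. code-block:: cpp

  #include <iostream>
  #include <mpi.h>

  int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int l_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &l_rank);

    // synchronize all processes before starting the timer
    MPI_Barrier(MPI_COMM_WORLD);
    double l_start = MPI_Wtime();

    // dummy workload standing in for the code to be measured
    double l_sum = 0.0;
    for (long l_i = 1; l_i <= 10000000; l_i++) {
      l_sum += 1.0 / l_i;
    }

    // synchronize again so that the slowest process determines the elapsed time
    MPI_Barrier(MPI_COMM_WORLD);
    double l_end = MPI_Wtime();

    if (l_rank == 0) {
      std::cout << "Elapsed time: " << (l_end - l_start) << " seconds"
                << " (checksum: " << l_sum << ")" << std::endl;
    }

    MPI_Finalize();
    return 0;
  }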