7. Point-to-Point Communication

Point-to-Point communication in MPI refers to the process of sending and receiving messages between two specific processes in a parallel computing environment.

7.1. Blocking

Blocking communication is one form of point-to-point communication: a blocking call does not return until it is safe to reuse (or read) the message buffer.

  • MPI_Send is used to send a message from the sender process to the receiver process.

  • MPI_Recv is used to receive a message in the receiving process.

MPI_Send:

MPI_Send(
  &data_to_send, // Pointer to Data
  1,             // Count of Data Items
  MPI_INT,       // Data Type
  receiver_rank, // Receiver's Rank
  0,             // Message Tag
  MPI_COMM_WORLD // Communicator
);
  • &data_to_send: This is a pointer to the data you want to send.

  • 1: This parameter specifies how many data items you are sending. In this example, we are sending one integer, so the count is set to 1.

  • MPI_INT: This parameter defines the data type of the data you are sending. Here, we’re sending an integer, so we specify MPI_INT to indicate the data type.

  • receiver_rank: This is the rank of the process that will receive the data.

  • 0: The message tag is an integer used to label the message. It helps you distinguish different messages.

  • MPI_COMM_WORLD: This parameter specifies the communicator over which the communication is taking place. MPI_COMM_WORLD is a predefined communicator that represents all the processes in the MPI program. It is a way of specifying which group of processes is involved in the communication.

Some examples:

Sending an integer (my_val) to processor with rank 2:

int my_val = 42;
int receiver_rank = 2;
MPI_Send(&my_val, 1, MPI_INT, receiver_rank, 0, MPI_COMM_WORLD);

Sending an array of integers (values) to processor with rank 1:

int values[5] = {10, 20, 30, 40, 50};
int receiver_rank = 1;
MPI_Send(values, 5, MPI_INT, receiver_rank, 0, MPI_COMM_WORLD);

Sending an array of doubles (double_values) to processor with rank 10 using tag 6:

double double_values[3] = {3.14, 2.71, 1.618};
int receiver_rank = 10;
int message_tag = 6;
MPI_Send(double_values, 3, MPI_DOUBLE, receiver_rank, message_tag, MPI_COMM_WORLD);

MPI_Recv:

MPI_Recv(
  &received_data, // Pointer to Data
  1,              // Count of Data Items
  MPI_INT,        // Data Type
  sender_rank,    // Sender's Rank
  0,              // Message Tag
  MPI_COMM_WORLD, // Communicator
  &status         // Status Variable
);
  • &received_data: This is a pointer to the location where the received data will be stored.

  • 1: This parameter specifies how many data items you expect to receive.

  • MPI_INT: This parameter defines the data type of the received data. Here, we expect to receive an integer, so we specify MPI_INT to indicate the data type.

  • sender_rank: This is the rank of the process that will send the data.

  • 0: The message tag is an integer used to match the received message with the one sent. It helps ensure that the received message corresponds to the expected message.

  • MPI_COMM_WORLD: This parameter specifies the communicator over which the communication is taking place. MPI_COMM_WORLD is a predefined communicator that represents all the processes in the MPI program. It is a way of specifying which group of processes is involved in the communication.

  • &status: This is a pointer to a status variable that will store information about the received message, such as the source rank and message tag.

Some examples:

Receiving an integer from process 5 with tag 0:

int received_value;
int sender_rank = 5;
int message_tag = 0;
MPI_Status status;
MPI_Recv(&received_value, 1, MPI_INT, sender_rank, message_tag, MPI_COMM_WORLD, &status);

Receiving an array of 3 doubles from any process with any tag:

double received_values[3];
int sender_rank = MPI_ANY_SOURCE;  // Receive from any source
int message_tag = MPI_ANY_TAG;
MPI_Recv(received_values, 3, MPI_DOUBLE, sender_rank,
         message_tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
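
Blocking sends and receives come in matching pairs. The following minimal program (a sketch; run it with at least two processes) shows both sides together: rank 0 sends one integer and rank 1 receives it.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);

  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  if (rank == 0) {
    int my_val = 42;
    // Send one integer to rank 1 with tag 0
    MPI_Send(&my_val, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
  } else if (rank == 1) {
    int received_value;
    MPI_Status status;
    // Receive one integer from rank 0 with tag 0
    MPI_Recv(&received_value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
    printf("Rank 1 received %d from rank %d\n", received_value, status.MPI_SOURCE);
  }

  MPI_Finalize();
  return 0;
}
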
[Figure: ring.svg]

Fig. 7.1.1 Illustration of a logical ring of four processes. Arrows indicate communication between processes.
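
The neighbor relations in such a logical ring follow from simple modular arithmetic. A minimal sketch (the variable names right_neighbor and left_neighbor are chosen for illustration):

int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);

// Right and left neighbors on the ring; the modulo wraps the last
// rank back around to rank 0 (and rank 0 back to the last rank).
int right_neighbor = (rank + 1) % size;
int left_neighbor  = (rank - 1 + size) % size;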

Task

  1. Implement an algorithm that calculates the sum of all ranks in a communicator at the root process (usually process rank 0).

  2. Use a logical ring arrangement of processes within the communicator. An example with four processes is shown in Fig. 7.1.1.

  3. Initialize an integer with value 0 on the root process. This data must be propagated through the communicator with the following restrictions:

  • Starting from the root process, each process sends the data to its right neighbor.

  • Each process, except for the root process, sends the data once it has received it from its left neighbor.

  • Before a process sends the data, it adds its own rank to the data.

  • The algorithm terminates when the root process receives the data.

  4. Each process must print the data it receives and the sender's rank.

  5. Each process must print the data it sends and the receiver's rank.

  6. Consider error handling.

  7. Execute your code with at least four processes.

7.2. Non-Blocking

Non-blocking point-to-point communication refers to a mode of communication where a process can initiate data transfer with another process and continue its computation without waiting for the transfer to complete. This allows for better overlap of computation and communication, improving overall program efficiency.

  • MPI_Isend: This function is used for non-blocking message sending in MPI. It allows the sender process to initiate a data transfer without waiting for its completion.

  • MPI_Irecv: This function is employed for non-blocking message receiving, enabling the receiving process to post a request for data and continue its computation without blocking.

MPI_Isend:

int MPI_Isend(
  const void *buf,     // Pointer to Data
  int count,           // Count of Data Items
  MPI_Datatype datatype, // Data Type
  int dest,            // Receiver's Rank
  int tag,             // Message Tag
  MPI_Comm comm,       // Communicator
  MPI_Request *request // Request Object
);
  • const void *buf: This is a pointer to the data you want to send. The data is of type void *, allowing you to send data of various types by specifying the MPI_Datatype.

  • int count: This parameter specifies how many data items you are sending. In this example, it represents the number of data elements you want to send.

  • MPI_Datatype datatype: This parameter defines the data type of the data you are sending. It is an MPI data type that describes the type of data being sent (e.g., MPI_INT for integers).

  • int dest: The rank of the destination process that will receive the data.

  • int tag: The message tag is an integer used to label the message. It helps you distinguish different messages and is used for matching with the corresponding MPI_Irecv call on the receiving side.

  • MPI_Comm comm: This parameter specifies the communicator over which the communication is taking place. MPI_COMM_WORLD is a predefined communicator that represents all the processes in the MPI program.

  • MPI_Request *request: This is an MPI request object that can be used to check the status of the non-blocking send operation or to wait for its completion using MPI_Wait or test its status using MPI_Test.

Some examples:

Sending an array of integers to a specific rank:

int data_array[5] = {1, 2, 3, 4, 5};
int receiver_rank = 1;
MPI_Request request;
MPI_Isend(data_array, 5, MPI_INT, receiver_rank, 0, MPI_COMM_WORLD, &request);

Sending a struct to another process:

typedef struct {
    int x;
    double y;
} MyStruct;

MyStruct data;
data.x = 42;
data.y = 3.141;
int receiver_rank = 0;
MPI_Request request;

// my_struct_type is a user-defined MPI datatype describing MyStruct
// (see the sketch below for one way to create it).
MPI_Isend(&data, 1, my_struct_type, receiver_rank, 0, MPI_COMM_WORLD, &request);
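
The datatype my_struct_type is not an MPI built-in; it has to be constructed from the struct layout before it can be used. A minimal sketch using MPI_Type_create_struct (the offsets are taken with offsetof so that padding is handled correctly):

#include <stddef.h>  // offsetof
// ...
MPI_Datatype my_struct_type;
int          block_lengths[2] = {1, 1};
MPI_Aint     displacements[2] = {offsetof(MyStruct, x), offsetof(MyStruct, y)};
MPI_Datatype types[2]         = {MPI_INT, MPI_DOUBLE};

MPI_Type_create_struct(2, block_lengths, displacements, types, &my_struct_type);
MPI_Type_commit(&my_struct_type);   // the type must be committed before use
// ... use my_struct_type in MPI_Isend / MPI_Irecv ...
MPI_Type_free(&my_struct_type);     // release the type when no longer needed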

Sending a string (character array) to another process:

#include <string.h>
// ...
char message[] = "Hello, World!";
int receiver_rank = 3;
MPI_Request request;

MPI_Isend(message, strlen(message) + 1, MPI_CHAR,
          receiver_rank, 0, MPI_COMM_WORLD, &request);

MPI_Irecv:

int MPI_Irecv(
  void *buf,             // Pointer to Data
  int count,             // Count of Data Items
  MPI_Datatype datatype, // Data Type
  int source,            // Sender's Rank
  int tag,               // Message Tag
  MPI_Comm comm,         // Communicator
  MPI_Request *request   // Request Object
);
  • void *buf: This is a pointer to the buffer where received data will be stored. It should be appropriately allocated and large enough to hold the incoming data. The data type of the buffer is determined by the MPI_Datatype parameter.

  • int count: This parameter specifies how many data items you are expecting to receive. It represents the number of data elements you want to receive.

  • MPI_Datatype datatype: This parameter defines the data type of the data you are expecting to receive, and it should match the data type used by the sending process.

  • int source: The rank of the source process from which you expect to receive data. If you set source to MPI_ANY_SOURCE, it means you’re willing to receive data from any source.

  • int tag: The message tag is an integer used to label the message. It helps you distinguish different messages and is used for matching with the corresponding MPI_Isend call on the sending side.

  • MPI_Comm comm: This parameter specifies the communicator over which the communication is taking place. It should match the communicator used by the sending process.

  • Note that, unlike MPI_Recv, MPI_Irecv takes no status argument. Information about the received message, such as its source and tag, becomes available through the status object passed to MPI_Wait or MPI_Test once the operation has completed.

  • MPI_Request *request: This is an MPI request object that can be used to check the status of the non-blocking receive operation or to wait for its completion using MPI_Wait or test its status using MPI_Test.

Some examples:

Receiving an integer from a specific sender:

int received_value;
int sender_rank = 1;
int message_tag = 0;
MPI_Request request;

MPI_Irecv(&received_value, 1, MPI_INT, sender_rank, message_tag, MPI_COMM_WORLD, &request);

Receiving an array of doubles from any sender:

double received_data[10];
int message_tag = 0;
MPI_Request request;

MPI_Irecv(received_data, 10, MPI_DOUBLE, MPI_ANY_SOURCE,
          message_tag, MPI_COMM_WORLD, &request);
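
With MPI_ANY_SOURCE (or MPI_ANY_TAG), the actual sender and tag are only known after the operation has completed. A minimal sketch continuing the example above, using MPI_Wait (described in the next section) and MPI_Get_count:

MPI_Status status;
int count;

// Block until the wildcard receive has completed, then inspect the status.
MPI_Wait(&request, &status);
int actual_sender = status.MPI_SOURCE;        // who actually sent the message
int actual_tag    = status.MPI_TAG;           // which tag it carried
MPI_Get_count(&status, MPI_DOUBLE, &count);   // how many doubles arrived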

7.3. Synchronization

MPI is designed to support both blocking and non-blocking communication. While functions like MPI_Send and MPI_Recv block the process until the message exchange is complete, MPI_Isend and MPI_Irecv are non-blocking and return immediately, allowing the sender and receiver to continue their computation. MPI_Wait and MPI_Waitall are used to synchronize and wait for the completion of these non-blocking operations.

MPI_Wait and MPI_Waitall are used to manage and synchronize non-blocking communication, helping to balance workloads and to improve the overall performance and responsiveness of parallel MPI programs.

MPI_Wait:

You use MPI_Wait when you have initiated a single non-blocking operation using functions like MPI_Isend or MPI_Irecv and want to ensure that this particular operation has completed before proceeding further.

int MPI_Wait(
  MPI_Request *request,  // Request Object
  MPI_Status *status     // Status Object
);
  • MPI_Request *request: This is a pointer to the request object that corresponds to a previous non-blocking operation (e.g., an MPI_Isend or MPI_Irecv call). MPI_Wait will block until the operation associated with this request is completed.

  • MPI_Status *status: This is an optional status object that can provide information about the completed operation, such as the source and tag of the message.

For example:

#include <mpi.h>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);

  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  int data;
  MPI_Request send_request, recv_request;
  MPI_Status send_status, recv_status;

  if (rank == 0) {
      // Sender process
      data = 42;
      MPI_Isend(&data, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &send_request);
      // Continue computation while the send is in progress

      // Wait for the send operation to complete
      MPI_Wait(&send_request, &send_status);
      // Continue with other computation or communication
  } else if (rank == 1) {
      // Receiver process
      MPI_Irecv(&data, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &recv_request);
      // Continue computation while the receive is in progress

      // Wait for the receive operation to complete
      MPI_Wait(&recv_request, &recv_status);
      // Process received data
  }

  MPI_Finalize();
  return 0;
}

MPI_Waitall:

You use MPI_Waitall when you have initiated several non-blocking operations using functions like MPI_Isend or MPI_Irecv and want to wait for all of them to complete. This is especially useful when you have a sequence of non-blocking operations or when you need to synchronize multiple operations before continuing.

int MPI_Waitall(
 int count,                         // Number of Requests
 MPI_Request array_of_requests[],   // Array of Request Objects
 MPI_Status array_of_statuses[]     // Array of Status Objects
);
  • int count: The number of request objects in array_of_requests.

  • MPI_Request array_of_requests[]: An array of request objects to be waited on. These request objects correspond to multiple non-blocking operations.

  • MPI_Status array_of_statuses[]: An array of status objects that can provide information about the completed operations. The size of this array should be at least equal to the count parameter.

For example:

#include <mpi.h>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);

  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  int data[2];
  // Initialize all requests to MPI_REQUEST_NULL so that every rank can
  // safely wait on the entries it did not use (null requests complete
  // immediately in MPI_Waitall).
  MPI_Request send_requests[2] = {MPI_REQUEST_NULL, MPI_REQUEST_NULL};
  MPI_Request recv_requests[2] = {MPI_REQUEST_NULL, MPI_REQUEST_NULL};
  MPI_Status send_statuses[2], recv_statuses[2];

  if (rank == 0) {
    // Sender process
    data[0] = 42;
    data[1] = 24;
    MPI_Isend(&data[0], 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &send_requests[0]);
    MPI_Isend(&data[1], 1, MPI_INT, 2, 0, MPI_COMM_WORLD, &send_requests[1]);
    // Continue computation while the sends are in progress
  } else if (rank == 1) {
    // Receiver process
    MPI_Irecv(&data[0], 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &recv_requests[0]);
    // Continue computation while the receive is in progress
  } else if (rank == 2) {
    // Receiver process
    MPI_Irecv(&data[1], 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &recv_requests[1]);
    // Continue computation while the receive is in progress
  }

  // Wait for all send and receive operations to complete
  MPI_Waitall(2, send_requests, send_statuses);
  MPI_Waitall(2, recv_requests, recv_statuses);

  MPI_Finalize();
  return 0;
}

MPI_Test:

In addition to the MPI_Wait and MPI_Waitall functions, MPI also provides the MPI_Test function to check for the completion of a non-blocking operation without blocking the progress of the program. The MPI_Test function is useful when you want to test whether a specific non-blocking operation has completed without waiting for it to finish. Calling MPI_Test progresses MPI messages and allows for increased overlap of computation and communication.

int MPI_Test(
  MPI_Request *request,  // Request Object
  int *flag,             // Completion Flag
  MPI_Status *status     // Status Object
);
  • request: A pointer to the request object associated with the non-blocking operation.

  • flag: A pointer to an integer flag that is set to true (non-zero) if the operation has completed, and false (zero) otherwise.

  • status: A pointer to a status object that can be queried for information about the completed operation.

In the following example, the MPI_Test function is used to check whether the non-blocking operation initiated by MPI_Isend or MPI_Irecv has completed. If the operation has completed, further processing can be performed. This allows for increased flexibility in managing communication and computation in MPI programs.

#include <mpi.h>

int main() {
  MPI_Init(NULL, NULL);

  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  int data = 42;
  // Start from MPI_REQUEST_NULL so that ranks other than 0 and 1 pass the
  // test loop below immediately instead of testing an undefined request.
  MPI_Request request = MPI_REQUEST_NULL;
  MPI_Status status;

  if (rank == 0) {
    // Non-blocking send
    MPI_Isend(&data, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &request);
  } else if (rank == 1) {
    // Non-blocking receive
    MPI_Irecv(&data, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &request);
  }

  // Check if the non-blocking operation has completed
  int flag = 0;
  while(flag == 0) {
    MPI_Test(&request, &flag, &status);

    if (flag) {
      // The non-blocking operation has completed
      // Additional processing can be performed here
    }
    else {
      // Execute work which does not depend on completion
    }
  }

  MPI_Finalize();
  return 0;
}

MPI_Test vs MPI_Wait:

MPI_Test:

  • MPI_Test is non-blocking and returns immediately, allowing you to check the completion status without waiting. It is suitable for scenarios where you want to overlap communication and computation.

  • It returns immediately and provides a flag indicating whether the operation associated with the request has finished.

  • If the operation is complete, the flag is set to true (non-zero), allowing you to proceed with further computation.

  • If the operation is not complete, the flag is set to false (zero), and the program can continue with other tasks.

MPI_Wait:

  • MPI_Wait is blocking and ensures that the program does not proceed until the specified non-blocking operation is complete. It is useful when you need to synchronize and coordinate the flow of your program.

  • It is used when you want to ensure that a specific non-blocking operation has finished before continuing with the program.

  • If the operation is already complete, MPI_Wait returns immediately; otherwise, it waits for the completion of the operation.

Task

Task: Implement a summation function named my_custom_reduction with non-blocking point-to-point communication.

  1. Initialize a random double value in each process.

  2. Send the values to the root process and store them there in one array, sorted by the senders' ranks.

  3. Sum up all the values of the array.

  4. Print the array and the summed value.

  5. Implement the receiving process (the root process) in one file and the sending processes in a separate file.

  6. Test your implementation.

  7. Research and identify an MPI command that resembles this function. Write a brief description of its functionality and usage.

Note:

  • Use only non-blocking point-to-point communication.

  • The following command runs MPI with multiple files: % mpirun [options] [program-name] : [options2] [program-name2]
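
For example, assuming the two executables are called root_receiver and sender (the names are chosen for illustration), the run could look like this, launching one root process and three sender processes in a single MPI_COMM_WORLD:

% mpirun -np 1 ./root_receiver : -np 3 ./sender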

7.4. Blocking - Non-blocking

(Optional)

The flexibility of MPI lies in the ability to mix these two types of communication. This flexibility can lead to various advantages:

  • Overlap of Computation and Communication: By integrating non-blocking operations, you can overlap computation tasks with communication tasks, optimizing the overall program’s performance.

  • Improved Responsiveness: Non-blocking operations can make your application more responsive as processes can continue working while waiting for communication to complete.

  • Reduced Latency: Mixing non-blocking communication can help minimize communication latency by allowing processes to initiate data transfers as soon as the data is available.

  • Load Balancing: In some cases, a mix of blocking and non-blocking operations can help achieve better load balancing in distributed computing environments.

Keep in mind that while mixing these communication modes can be beneficial in many cases, it requires careful programming and synchronization to ensure data integrity and correctness. Choosing the right combination of blocking and non-blocking communication routines depends on your application’s specific requirements and the desired balance between performance and reliability.
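
As a small illustration of such a mix (a sketch, not a complete program; local_result, incoming, right_neighbor, left_neighbor, and do_independent_work are placeholders), a process can post a non-blocking send, do independent work, and then use a blocking receive for data it cannot proceed without:

MPI_Request request;

// Non-blocking send: the buffer local_result must not be modified
// until MPI_Wait confirms that the send has completed.
MPI_Isend(&local_result, 1, MPI_DOUBLE, right_neighbor, 0, MPI_COMM_WORLD, &request);

do_independent_work();  // computation that does not touch local_result

// Blocking receive: this data is needed before the next step can start.
MPI_Recv(&incoming, 1, MPI_DOUBLE, left_neighbor, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

// Make sure the send buffer is safe to reuse before overwriting it.
MPI_Wait(&request, MPI_STATUS_IGNORE);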