6. Point-to-Point Communication
Point-to-Point communication in MPI refers to the process of sending and receiving messages between two specific processes in a parallel computing environment.
6.1. Blocking
Blocking communication is one mode of point-to-point communication.
MPI_Send is used to send a message from the sender process to the receiver process. MPI_Recv is used to receive a message in the receiving process.
MPI_Send:
MPI_Send(
&data_to_send, // Pointer to Data
1, // Count of Data Items
MPI_INT, // Data Type
receiver_rank, // Receiver's Rank
0, // Message Tag
MPI_COMM_WORLD // Communicator
);
&data_to_send
: This is a pointer to the data you want to send.
1
: This parameter specifies how many data items you are sending. In this example, we are sending one integer, so the count is set to 1.
MPI_INT
: This parameter defines the data type of the data you are sending. Here, we’re sending an integer, so we specify MPI_INT to indicate the data type.
receiver_rank
: This is the rank of the process that will receive the data.
0
: The message tag is an integer used to label the message. It helps you distinguish different messages.
MPI_COMM_WORLD
: This parameter specifies the communicator over which the communication is taking place. MPI_COMM_WORLD is a predefined communicator that represents all the processes in the MPI program. It is a way of specifying which group of processes is involved in the communication.
Some examples:
Sending an integer (my_val) to processor with rank 2:
int my_val = 42;
int receiver_rank = 2;
MPI_Send(&my_val, 1, MPI_INT, receiver_rank, 0, MPI_COMM_WORLD);
Sending an array of integers (values) to processor with rank 1:
int values[5] = {10, 20, 30, 40, 50};
int receiver_rank = 1;
MPI_Send(values, 5, MPI_INT, receiver_rank, 0, MPI_COMM_WORLD);
Sending an array of doubles (double_values) to processor with rank 10 using tag 6:
double double_values[3] = {3.14, 2.71, 1.618};
int receiver_rank = 10;
int message_tag = 6;
MPI_Send(double_values, 3, MPI_DOUBLE, receiver_rank, message_tag, MPI_COMM_WORLD);
MPI_Recv:
MPI_Recv(
&received_data, // Pointer to Data
1, // Count of Data Items
MPI_INT, // Data Type
sender_rank, // Sender's Rank
0, // Message Tag
MPI_COMM_WORLD, // Communicator
&status // Status Variable
);
&received_data
: This is a pointer to the location where the received data will be stored.
1
: This parameter specifies how many data items you expect to receive.
MPI_INT
: This parameter defines the data type of the received data. Here, we expect to receive an integer, so we specify MPI_INT to indicate the data type.
sender_rank
: This is the rank of the process that will send the data.
0
: The message tag is an integer used to match the received message with the one sent. It helps ensure that the received message corresponds to the expected message.
MPI_COMM_WORLD
: This parameter specifies the communicator over which the communication is taking place. MPI_COMM_WORLD is a predefined communicator that represents all the processes in the MPI program. It is a way of specifying which group of processes is involved in the communication.
&status
: This is a pointer to a status variable that will store information about the received message, such as the source rank and message tag.
Some examples:
Receiving an integer from process 5 with tag 0:
int received_value;
int sender_rank = 5;
int message_tag = 0;
MPI_Status status;
MPI_Recv(&received_value, 1, MPI_INT, sender_rank, message_tag, MPI_COMM_WORLD, &status);
Receiving an array of 3 doubles from any process with any tag:
double received_values[3];
int sender_rank = MPI_ANY_SOURCE; // Receive from any source
int message_tag = MPI_ANY_TAG;
MPI_Recv(received_values, 3, MPI_DOUBLE, sender_rank,
         message_tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
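Putting the two calls together, the following is a minimal sketch of a complete blocking exchange between rank 0 and rank 1 (run with at least two processes). It also shows how the fields of the status object described above (status.MPI_SOURCE and status.MPI_TAG) can be read after MPI_Recv, which is useful when receiving with MPI_ANY_SOURCE or MPI_ANY_TAG:
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        // Blocking send: returns once the send buffer may be reused
        int my_val = 42;
        MPI_Send(&my_val, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        // Blocking receive: returns once the message has arrived
        int received_value;
        MPI_Status status;
        MPI_Recv(&received_value, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                 MPI_COMM_WORLD, &status);
        // The status object records who actually sent the message
        printf("Received %d from rank %d with tag %d\n",
               received_value, status.MPI_SOURCE, status.MPI_TAG);
    }

    MPI_Finalize();
    return 0;
}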
Task
Logical Ring Arrangement
Assume a logical ring arrangement of processes within the communicator, where each process is a neighbor to the one on its left and right.
Consider that we have an integer value initialized in the root process (usually process rank 0) and this data has to propagate through the communicator following some restrictions:
Each process sends data to its right neighbor.
Each process gets data just from its previous one (left neighbor).
The last process adds its rank to the received data and sends it back to the root.
Each process has to print what it receives from which process and what it sends to which process.
Consider error handling.
Consider that you need at least two processes.
Don’t run your code on the login node!
6.2. Non-Blocking
Non-blocking point-to-point communication refers to a mode of communication where a process can initiate data transfer with another process and continue its computation without waiting for the transfer to complete. This allows for better overlap of computation and communication, improving overall program efficiency.
MPI_Isend
: This function is used for non-blocking message sending in MPI. It allows the sender process to initiate a data transfer without waiting for its completion.
MPI_Irecv
: This function is employed for non-blocking message receiving, enabling the receiving process to post a request for data and continue its computation without blocking.
MPI_Isend:
int MPI_Isend(
const void *buf, // Pointer to Data
int count, // Count of Data Items
MPI_Datatype datatype, // Data Type
int dest, // Receiver's Rank
int tag, // Message Tag
MPI_Comm comm, // Communicator
MPI_Request *request // Request Object
);
const void *buf
: This is a pointer to the data you want to send. The data is of type void *, allowing you to send data of various types by specifying the MPI_Datatype.
int count
: This parameter specifies how many data items you are sending. It represents the number of data elements you want to send.
MPI_Datatype datatype
: This parameter defines the data type of the data you are sending. It is an MPI data type that describes the type of data being sent (e.g., MPI_INT for integers).
int dest
: The rank of the destination process that will receive the data.
int tag
: The message tag is an integer used to label the message. It helps you distinguish different messages and is used for matching with the corresponding MPI_Irecv call on the receiving side.
MPI_Comm comm
: This parameter specifies the communicator over which the communication is taking place. MPI_COMM_WORLD is a predefined communicator that represents all the processes in the MPI program.
MPI_Request *request
: This is an MPI request object that can be used to check the status of the non-blocking send operation or to wait for its completion using MPI_Wait or test its status using MPI_Test.
Some examples:
Sending an array of integers to a specific rank:
int data_array[5] = {1, 2, 3, 4, 5};
int receiver_rank = 1;
MPI_Request request;
MPI_Isend(data_array, 5, MPI_INT, receiver_rank, 0, MPI_COMM_WORLD, &request);
Sending a struct to another process:
typedef struct {
    int x;
    double y;
} MyStruct;

MyStruct data;
data.x = 42;
data.y = 3.141;
int receiver_rank = 0;
MPI_Request request;
// my_struct_type must be an MPI derived datatype that was previously
// created (e.g., with MPI_Type_create_struct) and committed with MPI_Type_commit
MPI_Isend(&data, 1, my_struct_type, receiver_rank, 0, MPI_COMM_WORLD, &request);
Sending a string (character array) to another process:
#include <string.h>
// ...
char message[] = "Hello, World!";
int receiver_rank = 3;
MPI_Request request;
MPI_Isend(message, strlen(message) + 1, MPI_CHAR,
          receiver_rank, 0, MPI_COMM_WORLD, &request);
MPI_Irecv:
int MPI_Irecv(
void *buf, // Pointer to Data
int count, // Count of Data Items
MPI_Datatype datatype, // Data Type
int source, // Sender's Rank
int tag, // Message Tag
MPI_Comm comm, // Communicator
MPI_Request *request // Request Object
);
void *buf
: This is a pointer to the buffer where received data will be stored. It should be appropriately allocated and large enough to hold the incoming data. The data type of the buffer is determined by the MPI_Datatype parameter.
int count
: This parameter specifies how many data items you are expecting to receive. It represents the number of data elements you want to receive.
MPI_Datatype datatype
: This parameter defines the data type of the data you are expecting to receive, and it should match the data type used by the sending process.
int source
: The rank of the source process from which you expect to receive data. If you set source to MPI_ANY_SOURCE, it means you’re willing to receive data from any source.
int tag
: The message tag is an integer used to label the message. It helps you distinguish different messages and is used for matching with the corresponding MPI_Isend call on the sending side.
MPI_Comm comm
: This parameter specifies the communicator over which the communication is taking place. It should match the communicator used by the sending process.
MPI_Request *request
: This is an MPI request object that can be used to check the status of the non-blocking receive operation or to wait for its completion using MPI_Wait or test its status using MPI_Test. Note that, unlike MPI_Recv, MPI_Irecv takes no MPI_Status argument; the status information becomes available once the request has been completed with MPI_Wait or MPI_Test.
Some examples:
Receiving an integer from a specific sender:
int received_value;
int sender_rank = 1;
int message_tag = 0;
MPI_Status status;
MPI_Request request;
MPI_Irecv(&received_value, 1, MPI_INT, sender_rank, message_tag, MPI_COMM_WORLD, &request);
Receiving an array of doubles from any sender:
double received_data[10];
int message_tag = 0;
MPI_Status status;
MPI_Request request;
MPI_Irecv(received_data, 10, MPI_DOUBLE, MPI_ANY_SOURCE,
          message_tag, MPI_COMM_WORLD, &request);
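A non-blocking call only initiates the transfer; the receive buffer must not be read (and a send buffer must not be overwritten) until the corresponding request has completed. As a sketch, the request posted in the example above could later be completed like this (MPI_Wait is covered in the next section):
// ... computation that does not depend on received_data ...

// Complete the receive before reading received_data; after MPI_Wait,
// status.MPI_SOURCE tells us which rank the data actually came from
MPI_Wait(&request, &status);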
6.3. Synchronization
MPI is designed to support both blocking and non-blocking communication. While functions like MPI_Send and MPI_Recv block the process until the message exchange is complete, MPI_Isend and MPI_Irecv are non-blocking and return immediately, allowing the sender and receiver to continue their computation. MPI_Wait and MPI_Waitall are used to synchronize and wait for the completion of these non-blocking operations.
MPI_Wait and MPI_Waitall are used to manage and synchronize non-blocking communication, helping to balance workloads and improve the overall performance and responsiveness of parallel MPI programs.
MPI_Wait:
You use MPI_Wait when you have initiated a single non-blocking operation using functions like MPI_Isend or MPI_Irecv and want to ensure that this particular operation has completed before proceeding further.
int MPI_Wait(
MPI_Request *request, // Request Object
MPI_Status *status // Status Object
);
MPI_Request *request
: This is a pointer to the request object that corresponds to a previous non-blocking operation (e.g., an MPI_Isend or MPI_Irecv call). MPI_Wait will block until the operation associated with this request is completed.
MPI_Status *status
: This is an optional status object that can provide information about the completed operation, such as the source and tag of the message.
For example:
#include <mpi.h>
int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int data;
    MPI_Request send_request, recv_request;
    MPI_Status send_status, recv_status;

    if (rank == 0) {
        // Sender process
        data = 42;
        MPI_Isend(&data, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &send_request);
        // Continue computation while the send is in progress
        // Wait for the send operation to complete
        MPI_Wait(&send_request, &send_status);
        // Continue with other computation or communication
    } else if (rank == 1) {
        // Receiver process
        MPI_Irecv(&data, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &recv_request);
        // Continue computation while the receive is in progress
        // Wait for the receive operation to complete
        MPI_Wait(&recv_request, &recv_status);
        // Process received data
    }

    MPI_Finalize();
    return 0;
}
MPI_Waitall:
You use MPI_Waitall when you have initiated several non-blocking operations using functions like MPI_Isend or MPI_Irecv and want to wait for all of them to complete. This is especially useful when you have a sequence of non-blocking operations or when you need to synchronize multiple operations before continuing.
int MPI_Waitall(
int count, // Number of Requests
MPI_Request array_of_requests[], // Array of Request Objects
MPI_Status array_of_statuses[] // Array of Status Objects
);
int count
: The number of request objects in array_of_requests.
MPI_Request array_of_requests[]
: An array of request objects to be waited on. These request objects correspond to multiple non-blocking operations.
MPI_Status array_of_statuses[]
: An array of status objects that can provide information about the completed operations. The size of this array should be at least equal to the count parameter.
For example:
#include <mpi.h>
int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int data[2];
    // Start all requests as MPI_REQUEST_NULL so that every rank can call
    // MPI_Waitall below, even for operations it did not post itself.
    MPI_Request send_requests[2] = {MPI_REQUEST_NULL, MPI_REQUEST_NULL};
    MPI_Request recv_requests[2] = {MPI_REQUEST_NULL, MPI_REQUEST_NULL};
    MPI_Status send_statuses[2], recv_statuses[2];

    if (rank == 0) {
        // Sender process (this example needs at least three processes)
        data[0] = 42;
        data[1] = 24;
        MPI_Isend(&data[0], 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &send_requests[0]);
        MPI_Isend(&data[1], 1, MPI_INT, 2, 0, MPI_COMM_WORLD, &send_requests[1]);
        // Continue computation while the sends are in progress
    } else if (rank == 1) {
        // Receiver process
        MPI_Irecv(&data[0], 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &recv_requests[0]);
        // Continue computation while the receive is in progress
    } else if (rank == 2) {
        // Receiver process
        MPI_Irecv(&data[1], 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &recv_requests[1]);
        // Continue computation while the receive is in progress
    }

    // Wait for all send and receive operations to complete
    // (null requests complete immediately)
    MPI_Waitall(2, send_requests, send_statuses);
    MPI_Waitall(2, recv_requests, recv_statuses);

    MPI_Finalize();
    return 0;
}
MPI_Test:
In addition to the MPI_Wait and MPI_Waitall functions, MPI also provides the MPI_Test function to check for the completion of a non-blocking operation without blocking the progress of the program. The MPI_Test function is useful when you want to test whether a specific non-blocking operation has completed without waiting for it to finish. Calling MPI_Test progresses MPI messages and allows for increased overlap of computation and communication.
int MPI_Test(
MPI_Request *request, // Request Object
int *flag, // Completion Flag
MPI_Status *status // Status Object
);
request
: A pointer to the request object associated with the non-blocking operation.
flag
: A pointer to an integer flag that is set to true (non-zero) if the operation has completed, and false (zero) otherwise.
status
: A pointer to a status object that can be queried for information about the completed operation.
In the following example, the MPI_Test function is used to check whether the non-blocking operation initiated by MPI_Isend or MPI_Irecv has completed. If the operation has completed, further processing can be performed. This allows for increased flexibility in managing communication and computation in MPI programs.
#include <mpi.h>
int main() {
    MPI_Init(NULL, NULL);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int data = 42;
    // Start with MPI_REQUEST_NULL so that ranks which post no operation
    // can still call MPI_Test safely (it completes immediately).
    MPI_Request request = MPI_REQUEST_NULL;
    MPI_Status status;

    if (rank == 0) {
        // Non-blocking send
        MPI_Isend(&data, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &request);
    } else if (rank == 1) {
        // Non-blocking receive
        MPI_Irecv(&data, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &request);
    }

    // Check if the non-blocking operation has completed
    int flag = 0;
    while (flag == 0) {
        MPI_Test(&request, &flag, &status);
        if (flag) {
            // The non-blocking operation has completed
            // Additional processing can be performed here
        } else {
            // Execute work which does not depend on completion
        }
    }

    MPI_Finalize();
    return 0;
}
MPI_Test vs MPI_Wait:
MPI_Test:
MPI_Test is non-blocking and returns immediately, allowing you to check the completion status without waiting. It is suitable for scenarios where you want to overlap communication and computation.
It provides a flag indicating whether the operation associated with the request has finished.
If the operation is complete, the flag is set to true (non-zero), allowing you to proceed with further computation.
If the operation is not complete, the flag is set to false (zero), and the program can continue with other tasks.
MPI_Wait:
MPI_Wait is blocking and ensures that the program does not proceed until the specified non-blocking operation is complete. It is useful when you need to synchronize and coordinate the flow of your program.
It is used when you want to ensure that a specific non-blocking operation has finished before continuing with the program.
If the operation is already complete, MPI_Wait returns immediately; otherwise, it waits for the completion of the operation.
Task
Implement a simple non-blocking summation function named my_custom_reduction.
In each process (except the root), initialize a random double value.
Every process sends its own value to the root process.
The root process gathers these values from all other processes and stores them in one array, sorted by the senders’ ranks.
The root process prints out the received array and the summed value.
Try out the function.
Provide a screenshot of your results.
Upload the used SLURM script, output and error files.
Research and identify an MPI command that closely resembles this specified operation, offering a brief description of its functionality and usage.
Note:
You are only allowed to use non-blocking communication functions.
6.4. Blocking - Non-blocking
(Optional)
The flexibility of MPI lies in the ability to mix these two types of communication. This flexibility can lead to various advantages:
Overlap of Computation and Communication: By integrating non-blocking operations, you can overlap computation tasks with communication tasks, optimizing the overall program’s performance.
Improved Responsiveness: Non-blocking operations can make your application more responsive as processes can continue working while waiting for communication to complete.
Reduced Latency: Mixing non-blocking communication can help minimize communication latency by allowing processes to initiate data transfers as soon as the data is available.
Load Balancing: In some cases, a mix of blocking and non-blocking operations can help achieve better load balancing in distributed computing environments.
Keep in mind that while mixing these communication modes can be beneficial in many cases, it requires careful programming and synchronization to ensure data integrity and correctness. Choosing the right combination of blocking and non-blocking communication routines depends on your application’s specific requirements and the desired balance between performance and reliability.
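As a rough illustration of such a mix, the following sketch (assuming exactly two processes) has each rank post a non-blocking receive for its partner’s value, send its own value with a plain blocking MPI_Send, optionally do independent work, and only then wait for the receive to complete:
#include <mpi.h>
#include <stdio.h>

// Sketch: two ranks exchange one integer each.
// The receive is non-blocking so it can overlap with other work,
// while the send is an ordinary blocking MPI_Send.
int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int partner = (rank == 0) ? 1 : 0;   // assumes exactly two processes
    int my_value = rank + 100;
    int partner_value;

    MPI_Request request;

    // Post the receive first (non-blocking), so the incoming message
    // has somewhere to go while this rank is still busy.
    MPI_Irecv(&partner_value, 1, MPI_INT, partner, 0, MPI_COMM_WORLD, &request);

    // Blocking send of our own value
    MPI_Send(&my_value, 1, MPI_INT, partner, 0, MPI_COMM_WORLD);

    // ... independent computation could run here ...

    // Only now is the partner's value needed, so complete the receive
    MPI_Wait(&request, MPI_STATUS_IGNORE);
    printf("Rank %d received %d from rank %d\n", rank, partner_value, partner);

    MPI_Finalize();
    return 0;
}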