Diffstat (limited to 'gprofng')
-rw-r--r-- | gprofng/examples/mxv-pthreads/README.md | 158
-rwxr-xr-x | gprofng/examples/mxv-pthreads/experiments/profile.sh | 79
-rw-r--r-- | gprofng/examples/mxv-pthreads/src/Makefile | 70
-rw-r--r-- | gprofng/examples/mxv-pthreads/src/main.c | 374
-rw-r--r-- | gprofng/examples/mxv-pthreads/src/manage_data.c | 148
-rw-r--r-- | gprofng/examples/mxv-pthreads/src/mxv.c | 78
-rw-r--r-- | gprofng/examples/mxv-pthreads/src/mydefs.h | 117
-rw-r--r-- | gprofng/examples/mxv-pthreads/src/workload.c | 91
8 files changed, 1115 insertions, 0 deletions
diff --git a/gprofng/examples/mxv-pthreads/README.md b/gprofng/examples/mxv-pthreads/README.md
new file mode 100644
index 0000000..28450a6
--- /dev/null
+++ b/gprofng/examples/mxv-pthreads/README.md
@@ -0,0 +1,158 @@
+# README for the matrix-vector multiplication demo code
+
+## Synopsis
+
+This program implements the multiplication of a matrix and a vector. It is
+written in C and has been parallelized using the Pthreads parallel programming
+model. Each thread gets assigned a contiguous set of rows of the matrix to
+work on and the results are stored in the output vector.
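As a rough illustration of the row assignment described above, the sketch below splits the rows over the threads so that the first `rows % threads` threads each get one extra row. The helper name `row_range` and its parameters are hypothetical and only mirror the idea; the demo's own logic lives in `workload.c`.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the demo sources): compute the inclusive
 * row range [*start, *end] for thread `tid` when `rows` rows are divided over
 * `threads` threads. Assumes 1 <= threads <= rows and 0 <= tid < threads.
 */
void row_range (int64_t tid, int64_t rows, int64_t threads,
                int64_t *start, int64_t *end)
{
  int64_t base  = rows / threads;   /* rows every thread gets at least */
  int64_t extra = rows % threads;   /* leftover rows, one per low-numbered thread */

  if (tid < extra)
    {
      /* This thread and all threads before it own base + 1 rows. */
      *start = tid * (base + 1);
      *end   = *start + base;
    }
  else
    {
      /* Skip the `extra` larger chunks, then the smaller chunks before us. */
      *start = extra * (base + 1) + (tid - extra) * base;
      *end   = *start + base - 1;
    }
}
```

With 10 rows and 4 threads, for example, the ranges come out as 0-2, 3-5, 6-7 and 8-9, so the chunks are contiguous and differ in size by at most one row.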
+
+The code initializes the data, executes the matrix-vector multiplication, and
+checks the correctness of the results. In case of an error, a message to that
+effect is printed and the program aborts. Otherwise it prints a one-line
+message on the screen.
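The correctness check can be pictured as a comparison of each computed element against a reference value, using a relative error with a small tolerance. The sketch below is an assumption-laden illustration: the name `count_mismatches` is hypothetical, and the tolerance values follow the ones visible in `check_results` in `main.c`, but the real function also prints the offending entries.

```c
#include <float.h>
#include <stdint.h>

/* Absolute value without pulling in libm; enough for this sketch. */
static double abs_value (double x)
{
  return (x < 0.0) ? -x : x;
}

/*
 * Hypothetical helper: count the entries of c[0..m-1] that differ from
 * ref[0..m-1] by more than a relative tolerance. When the reference value is
 * very close to zero, the absolute error is used to avoid dividing by it.
 */
int64_t count_mismatches (int64_t m, const double *c, const double *ref)
{
  const double tol  = 100.0 * DBL_EPSILON;  /* accepted relative error */
  const double tiny = 100.0 * DBL_MIN;      /* below this, use absolute error */
  int64_t errors = 0;

  for (int64_t i = 0; i < m; i++)
    {
      double diff = abs_value (c[i] - ref[i]);
      double err  = (abs_value (ref[i]) > tiny) ? diff / abs_value (ref[i])
                                                : diff;
      if (err > tol)
        errors++;
    }

  return errors;
}
```

A return value of zero corresponds to the "error check passed" case.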
+
+## About this code
+
+This is a standalone code, not a library. It is meant as a simple example to
+experiment with gprofng.
+
+## Directory structure
+
+There are four directories:
+
+1. `bindir` - after the build, it contains the executable.
+
+2. `experiments` - after the installation, it contains the executable and
+also has an example profiling script called `profile.sh`.
+
+3. `objects` - after the build, it contains the object files.
+
+4. `src` - contains the source code and the make file to build, install,
+and check correct functioning of the executable.
+
+## Code internals
+
+This is the main execution flow:
+
+* Parse the user options.
+* Compute the internal settings for the algorithm.
+* Initialize the data and compute the reference results needed for the correctness
+check.
+* Create and execute the threads. Each thread performs the matrix-vector
+multiplication on a pre-determined set of rows.
+* Verify the results are correct.
+* Print statistics and release the allocated memory.
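The "create and execute the threads" step of the flow above can be sketched as follows. This is a simplified illustration, not the demo's implementation: the names `work_item`, `worker` and `parallel_mxv` are hypothetical, and the matrix is assumed to be stored as one flat row-major array, whereas the demo uses an array of row pointers and repeats the computation `-r` times.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-thread argument block (not the demo's thread_data). */
typedef struct
{
  int64_t row_start;   /* first row to compute, inclusive */
  int64_t row_end;     /* last row to compute, inclusive */
  int64_t n;           /* number of columns */
  const double *A;     /* m x n matrix, flat row-major storage (assumption) */
  const double *b;     /* input vector of length n */
  double *c;           /* output vector of length m */
} work_item;

/* Thread entry point: compute c[i] for the rows assigned to this thread. */
static void *worker (void *arg)
{
  work_item *w = (work_item *) arg;

  for (int64_t i = w->row_start; i <= w->row_end; i++)
    {
      double row_sum = 0.0;
      for (int64_t j = 0; j < w->n; j++)
        row_sum += w->A[i * w->n + j] * w->b[j];
      w->c[i] = row_sum;
    }
  return NULL;
}

/* Returns 0 on success and -1 on failure. Assumes 1 <= threads <= m. */
int parallel_mxv (int64_t m, int64_t n, const double *A, const double *b,
                  double *c, int64_t threads)
{
  pthread_t *ids   = malloc (threads * sizeof (pthread_t));
  work_item *items = malloc (threads * sizeof (work_item));
  int64_t base = m / threads, extra = m % threads;
  int64_t next = 0, created = 0;
  int status = 0;

  if (ids == NULL || items == NULL)
    {
      free (ids);
      free (items);
      return -1;
    }

  for (int64_t t = 0; t < threads; t++)
    {
      /* Contiguous chunk; the first `extra` threads get one more row. */
      int64_t chunk = base + ((t < extra) ? 1 : 0);
      items[t] = (work_item) { next, next + chunk - 1, n, A, b, c };
      next += chunk;

      if (pthread_create (&ids[t], NULL, worker, &items[t]) != 0)
        {
          status = -1;
          break;
        }
      created++;
    }

  /* Wait for every thread that was actually started. */
  for (int64_t t = 0; t < created; t++)
    pthread_join (ids[t], NULL);

  free (ids);
  free (items);
  return status;
}
```

Because each thread writes to its own disjoint range of `c`, no locking is needed in this sketch; the only synchronization is the `pthread_join` loop at the end.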
+
+## Installation
+
+The Makefile in the `src` subdirectory can be used to build, install and check the
+code.
+
+Use `make` at the command line to (re)build the executable called `mxv-pthreads`. It will be
+stored in the directory `bindir`:
+
+```
+$ make
+gcc -o ../objects/main.o -c -g -O -Wall -Werror=undef -Wstrict-prototypes main.c
+gcc -o ../objects/manage_data.o -c -g -O -Wall -Werror=undef -Wstrict-prototypes manage_data.c
+gcc -o ../objects/workload.o -c -g -O -Wall -Werror=undef -Wstrict-prototypes workload.c
+gcc -o ../objects/mxv.o -c -g -O -Wall -Werror=undef -Wstrict-prototypes mxv.c
+gcc -o ../bindir/mxv-pthreads ../objects/main.o ../objects/manage_data.o ../objects/workload.o ../objects/mxv.o -lm -lpthread
+ldd ../bindir/mxv-pthreads
+ linux-vdso.so.1 (0x0000ffff9ea8b000)
+ libm.so.6 => /lib64/libm.so.6 (0x0000ffff9e9ad000)
+ libc.so.6 => /lib64/libc.so.6 (0x0000ffff9e7ff000)
+ /lib/ld-linux-aarch64.so.1 (0x0000ffff9ea4e000)
+$
+```
+The `make install` command installs the executable in directory `experiments`.
+
+```
+$ make install
+Installed mxv-pthreads in ../experiments
+$
+```
+The `make check` command may be used to verify the program works as expected:
+
+```
+$ make check
+Running mxv-pthreads in ../experiments
+mxv: error check passed - rows = 1000 columns = 1500 threads = 2
+$
+```
+The `make clean` command removes the object files from the `objects` directory
+and the executable from the `bindir` directory.
+
+The `make veryclean` command implies `make clean`, but also removes the
+executable from directory `experiments`.
+
+## Usage
+
+The code takes several options, but all have a default value. If the code is
+executed without any options, these defaults will be used. To get an overview of
+all the options supported, and the defaults, use the `-h` option:
+
+```
+$ ./mxv-pthreads -h
+Usage: ./mxv-pthreads [-m <number of rows>] [-n <number of columns] [-r <repeat count>] [-t <number of threads] [-v] [-h]
+ -m - number of rows, default = 2000
+ -n - number of columns, default = 3000
+ -r - the number of times the algorithm is repeatedly executed, default = 200
+ -t - the number of threads used, default = 1
+ -v - enable verbose mode, off by default
+ -h - print this usage overview and exit
+$
+```
+
+For more extensive run time diagnostic messages use the `-v` option.
+
+As an example, these are the options to compute the product of a 2000x1000 matrix
+with a vector of length 1000 and use 4 threads. Verbose mode has been enabled:
+
+```
+$ ./mxv-pthreads -m 2000 -n 1000 -t 4 -v
+Verbose mode enabled
+Allocated data structures
+Initialized matrix and vectors
+Defined workload distribution
+Assigned work to threads
+Thread 0 has been created
+Thread 1 has been created
+Thread 2 has been created
+Thread 3 has been created
+Matrix vector multiplication has completed
+Verify correctness of result
+Error check passed
+mxv: error check passed - rows = 2000 columns = 1000 threads = 4
+$
+```
+
+## Executing the examples
+
+Directory `experiments` contains the `profile.sh` script. This script
+checks that gprofng can be found and that the executable has been installed.
+
+The script will then run a data collection experiment, followed by a series
+of invocations of `gprofng display text` to show various views. The results
+are printed on stdout.
+
+To include the commands executed in the output of the script, and store the
+results in a file called `LOG`, execute the script as follows:
+
+```
+$ bash -x ./profile.sh >& LOG
+```
+
+## Additional comments
+
+* The reason that compiler-based inlining is disabled is to make the call tree
+look more interesting. For the same reason, the core multiplication function
+`mxv_core` has inlining disabled through the `void __attribute__ ((noinline))`
+attribute. Of course you're free to change this. It certainly does not affect
+the workings of the code.
+
+* This distribution includes a script called `profile.sh`. It is in the
+`experiments` directory and meant as an example for (new) users of gprofng.
+It can be used to produce profiles at the command line. It is also suitable
+as a starting point to develop your own profiling script(s).
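To illustrate the first comment above about disabled inlining, the sketch below marks a small kernel with the GCC/Clang `noinline` function attribute so that it remains a separate function, and hence a separate entry in a profile's call tree, even when optimization is enabled. The functions `dot` and `repeated_dot` are hypothetical stand-ins for `mxv_core` and its driver.

```c
#include <stdint.h>

/*
 * Hypothetical kernel. The noinline attribute (a GCC/Clang extension) asks
 * the compiler not to merge this function into its callers.
 */
__attribute__ ((noinline))
double dot (int64_t n, const double *x, const double *y)
{
  double sum = 0.0;
  for (int64_t j = 0; j < n; j++)
    sum += x[j] * y[j];
  return sum;
}

/* Hypothetical driver: call the kernel `repeat` times, return the last result. */
double repeated_dot (int64_t repeat, int64_t n, const double *x,
                     const double *y)
{
  double result = 0.0;
  for (int64_t r = 0; r < repeat; r++)
    result = dot (n, x, y);
  return result;
}
```

Removing the attribute does not change the computed result; it only gives the compiler the freedom to inline the kernel, which may change how the time is attributed in the function and call-tree views.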
diff --git a/gprofng/examples/mxv-pthreads/experiments/profile.sh b/gprofng/examples/mxv-pthreads/experiments/profile.sh
new file mode 100755
index 0000000..f8812a2
--- /dev/null
+++ b/gprofng/examples/mxv-pthreads/experiments/profile.sh
@@ -0,0 +1,79 @@
+#
+# Copyright (C) 2021-2023 Free Software Foundation, Inc.
+#
+# This file is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; see the file COPYING3. If not see
+# <http://www.gnu.org/licenses/>.
+#
+#------------------------------------------------------------------------------
+# This script demonstrates how to use gprofng.
+#
+# After the experiment data has been generated, several views into the data
+# are shown.
+#------------------------------------------------------------------------------
+
+#------------------------------------------------------------------------------
+# Define the executable, algorithm parameters and gprofng settings.
+#------------------------------------------------------------------------------
+exe=../experiments/mxv-pthreads
+rows=4000
+columns=2000
+threads=2
+exp_directory=experiment.$threads.thr.er
+
+#------------------------------------------------------------------------------
+# Check if gprofng has been installed and can be executed.
+#------------------------------------------------------------------------------
+which gprofng > /dev/null 2>&1
+if (test $? -eq 0) then
+  echo ""
+  echo "Version information of the gprofng release used:"
+  echo ""
+  gprofng --version
+  echo ""
+else
+  echo "Error: gprofng cannot be found - if it was installed, check your path"
+  exit
+fi
+
+#------------------------------------------------------------------------------
+# Check if the executable is present.
+#------------------------------------------------------------------------------
+if (! test -x $exe) then
+  echo "Error: executable $exe not found - check the make install command"
+  exit
+fi
+
+echo "-------------- Collect the experiment data -----------------------------"
+gprofng collect app -O $exp_directory $exe -m $rows -n $columns -t $threads
+
+#------------------------------------------------------------------------------
+# Make sure that the collect experiment succeeded and created an experiment
+# directory with the performance data.
+#------------------------------------------------------------------------------
+if (! test -d $exp_directory) then
+  echo "Error: experiment directory $exp_directory not found"
+  exit
+fi
+
+echo "-------------- Show the function overview -----------------------------"
+gprofng display text -functions $exp_directory
+
+echo "-------------- Show the function overview limit to the top 5 -----------"
+gprofng display text -limit 5 -functions $exp_directory
+
+echo "-------------- Show the source listing of mxv_core ---------------------"
+gprofng display text -metrics e.totalcpu -source mxv_core $exp_directory
+
+echo "-------------- Show the disassembly listing of mxv_core ----------------"
+gprofng display text -metrics e.totalcpu -disasm mxv_core $exp_directory
diff --git a/gprofng/examples/mxv-pthreads/src/Makefile b/gprofng/examples/mxv-pthreads/src/Makefile
new file mode 100644
index 0000000..ef1c55a
--- /dev/null
+++ b/gprofng/examples/mxv-pthreads/src/Makefile
@@ -0,0 +1,70 @@
+#
+# Copyright (C) 2021-2023 Free Software Foundation, Inc.
+#
+# This file is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; see the file COPYING3. If not see
+# <http://www.gnu.org/licenses/>.
+
+CC = gcc
+WARNINGS = -Wall -Werror=undef -Wstrict-prototypes
+OPT = -g -O
+CFLAGS = $(OPT) $(WARNINGS)
+LDFLAGS =
+LIBS = -lm -lpthread
+OBJDIR = ../objects
+BINDIR = ../bindir
+EXPDIR = ../experiments
+
+EXE = mxv-pthreads
+OBJECTS = $(OBJDIR)/main.o $(OBJDIR)/manage_data.o $(OBJDIR)/workload.o $(OBJDIR)/mxv.o
+
+default: $(BINDIR)/$(EXE)
+
+$(BINDIR)/$(EXE): $(OBJECTS)
+ @mkdir -p $(BINDIR)
+ $(CC) -o $(BINDIR)/$(EXE) $(LDFLAGS) $(OBJECTS) $(LIBS)
+ ldd $(BINDIR)/$(EXE)
+
+$(OBJDIR)/main.o: main.c
+ @mkdir -p $(OBJDIR)
+ $(CC) -o $(OBJDIR)/main.o -c $(CFLAGS) main.c
+$(OBJDIR)/manage_data.o: manage_data.c
+ @mkdir -p $(OBJDIR)
+ $(CC) -o $(OBJDIR)/manage_data.o -c $(CFLAGS) manage_data.c
+$(OBJDIR)/workload.o: workload.c
+ @mkdir -p $(OBJDIR)
+ $(CC) -o $(OBJDIR)/workload.o -c $(CFLAGS) workload.c
+$(OBJDIR)/mxv.o: mxv.c
+ @mkdir -p $(OBJDIR)
+ $(CC) -o $(OBJDIR)/mxv.o -c $(CFLAGS) mxv.c
+
+$(OBJECTS): mydefs.h
+
+.c.o:
+ $(CC) -c -o $@ $(CFLAGS) $<
+
+check:
+ @echo "Running $(EXE) in $(EXPDIR)"
+ @./$(EXPDIR)/$(EXE) -m 1000 -n 1500 -t 2
+
+install: $(BINDIR)/$(EXE)
+ @/bin/cp $(BINDIR)/$(EXE) $(EXPDIR)
+ @echo "Installed $(EXE) in $(EXPDIR)"
+
+clean:
+ @/bin/rm -f $(BINDIR)/$(EXE)
+ @/bin/rm -f $(OBJECTS)
+
+veryclean:
+ @make clean
+ @/bin/rm -f $(EXPDIR)/$(EXE)
diff --git a/gprofng/examples/mxv-pthreads/src/main.c b/gprofng/examples/mxv-pthreads/src/main.c new file mode 100644 index 0000000..625c604 --- /dev/null +++ b/gprofng/examples/mxv-pthreads/src/main.c @@ -0,0 +1,374 @@ +/* Copyright (C) 2021-2023 Free Software Foundation, Inc. + Contributed by Oracle. + + This file is part of GNU Binutils. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, 51 Franklin Street - Fifth Floor, Boston, + MA 02110-1301, USA. */ + +/* +* ----------------------------------------------------------------------------- +* This program implements the multiplication of an m by n matrix with a vector +* of length n. The Posix Threads parallel programming model is used to +* parallelize the core matrix-vector multiplication algorithm. +* ----------------------------------------------------------------------------- +*/ + +#include "mydefs.h" + +int main (int argc, char **argv) +{ + bool verbose = false; + + thread_data *thread_data_arguments; + pthread_t *pthread_ids; + + int64_t remainder_rows; + int64_t rows_per_thread; + int64_t active_threads; + + int64_t number_of_rows; + int64_t number_of_columns; + int64_t number_of_threads; + int64_t repeat_count; + + double **A; + double *b; + double *c; + double *ref; + + int64_t errors; + +/* +* ----------------------------------------------------------------------------- +* Start the ball rolling - Get the user options and parse them. 
+* ----------------------------------------------------------------------------- +*/ + (void) get_user_options ( + argc, + argv, + &number_of_rows, + &number_of_columns, + &repeat_count, + &number_of_threads, + &verbose); + + if (verbose) printf ("Verbose mode enabled\n"); + +/* +* ----------------------------------------------------------------------------- +* Allocate storage for all data structures. +* ----------------------------------------------------------------------------- +*/ + (void) allocate_data ( + number_of_threads, number_of_rows, + number_of_columns, &A, &b, &c, &ref, + &thread_data_arguments, &pthread_ids); + + if (verbose) printf ("Allocated data structures\n"); + +/* +* ----------------------------------------------------------------------------- +* Initialize the data. +* ----------------------------------------------------------------------------- +*/ + (void) init_data (number_of_rows, number_of_columns, A, b, c, ref); + + if (verbose) printf ("Initialized matrix and vectors\n"); + +/* +* ----------------------------------------------------------------------------- +* Determine the main workload settings. 
+* ----------------------------------------------------------------------------- +*/ + (void) get_workload_stats ( + number_of_threads, number_of_rows, + number_of_columns, &rows_per_thread, + &remainder_rows, &active_threads); + + if (verbose) printf ("Defined workload distribution\n"); + + for (int64_t TID=active_threads; TID<number_of_threads; TID++) + { + thread_data_arguments[TID].do_work = false; + } + for (int64_t TID=0; TID<active_threads; TID++) + { + thread_data_arguments[TID].thread_id = TID; + thread_data_arguments[TID].verbose = verbose; + thread_data_arguments[TID].do_work = true; + thread_data_arguments[TID].repeat_count = repeat_count; + + (void) determine_work_per_thread ( + TID, rows_per_thread, remainder_rows, + &thread_data_arguments[TID].row_index_start, + &thread_data_arguments[TID].row_index_end); + + thread_data_arguments[TID].m = number_of_rows; + thread_data_arguments[TID].n = number_of_columns; + thread_data_arguments[TID].b = b; + thread_data_arguments[TID].c = c; + thread_data_arguments[TID].A = A; + } + + if (verbose) printf ("Assigned work to threads\n"); + +/* +* ----------------------------------------------------------------------------- +* Create and execute the threads. Note that this means that there will be +* <t+1> threads, with <t> the number of threads specified on the commandline, +* or the default if the -t option was not used. +* +* Per the pthread_create () call, the threads start executing right away. 
+* ----------------------------------------------------------------------------- +*/ + for (int TID=0; TID<active_threads; TID++) + { + if (pthread_create (&pthread_ids[TID], NULL, driver_mxv, + (void *) &thread_data_arguments[TID]) != 0) + { + printf ("Error creating thread %d\n", TID); + perror ("pthread_create"); exit (-1); + } + else + { + if (verbose) printf ("Thread %d has been created\n", TID); + } + } +/* +* ----------------------------------------------------------------------------- +* Wait for all threads to finish. +* ----------------------------------------------------------------------------- +*/ + for (int TID=0; TID<active_threads; TID++) + { + pthread_join (pthread_ids[TID], NULL); + } + + if (verbose) + { + printf ("Matrix vector multiplication has completed\n"); + printf ("Verify correctness of result\n"); + } + +/* +* ----------------------------------------------------------------------------- +* Check the numerical results. +* ----------------------------------------------------------------------------- +*/ + if ((errors = check_results (number_of_rows, number_of_columns, + c, ref)) == 0) + { + if (verbose) printf ("Error check passed\n"); + } + else + { + printf ("Error: %ld differences in the results detected\n", errors); + } + +/* +* ----------------------------------------------------------------------------- +* Print a summary of the execution. +* ----------------------------------------------------------------------------- +*/ + print_all_results (number_of_rows, number_of_columns, number_of_threads, + errors); + +/* +* ----------------------------------------------------------------------------- +* Release the allocated memory and end execution. 
+* ----------------------------------------------------------------------------- +*/ + free (A); + free (b); + free (c); + free (ref); + free (pthread_ids); + + return (0); +} + +/* +* ----------------------------------------------------------------------------- +* Parse user options and set variables accordingly. In case of an error, print +* a message, but do not bail out yet. In this way we can catch multiple input +* errors. +* ----------------------------------------------------------------------------- +*/ +int get_user_options (int argc, char *argv[], + int64_t *number_of_rows, + int64_t *number_of_columns, + int64_t *repeat_count, + int64_t *number_of_threads, + bool *verbose) +{ + int opt; + int errors = 0; + int64_t default_number_of_threads = 1; + int64_t default_rows = 2000; + int64_t default_columns = 3000; + int64_t default_repeat_count = 200; + bool default_verbose = false; + + *number_of_rows = default_rows; + *number_of_columns = default_columns; + *number_of_threads = default_number_of_threads; + *repeat_count = default_repeat_count; + *verbose = default_verbose; + + while ((opt = getopt (argc, argv, "m:n:r:t:vh")) != -1) + { + switch (opt) + { + case 'm': + *number_of_rows = atol (optarg); + break; + case 'n': + *number_of_columns = atol (optarg); + break; + case 'r': + *repeat_count = atol (optarg); + break; + case 't': + *number_of_threads = atol (optarg); + break; + case 'v': + *verbose = true; + break; + case 'h': + default: + printf ("Usage: %s " \ + "[-m <number of rows>] " \ + "[-n <number of columns] [-r <repeat count>] " \ + "[-t <number of threads] [-v] [-h]\n", argv[0]); + printf ("\t-m - number of rows, default = %ld\n", + default_rows); + printf ("\t-n - number of columns, default = %ld\n", + default_columns); + printf ("\t-r - the number of times the algorithm is " \ + "repeatedly executed, default = %ld\n", + default_repeat_count); + printf ("\t-t - the number of threads used, default = %ld\n", + default_number_of_threads); + 
printf ("\t-v - enable verbose mode, %s by default\n", + (default_verbose) ? "on" : "off"); + printf ("\t-h - print this usage overview and exit\n"); + + exit (0); + break; + } + } + +/* +* ----------------------------------------------------------------------------- +* Check for errors and bail out in case of problems. +* ----------------------------------------------------------------------------- +*/ + if (*number_of_rows <= 0) + { + errors++; + printf ("Error: The number of rows is %ld but should be strictly " \ + "positive\n", *number_of_rows); + } + if (*number_of_columns <= 0) + { + errors++; + printf ("Error: The number of columns is %ld but should be strictly " \ + "positive\n", *number_of_columns); + } + if (*repeat_count <= 0) + { + errors++; + printf ("Error: The repeat count is %ld but should be strictly " \ + "positive\n", *repeat_count); + } + if (*number_of_threads <= 0) + { + errors++; + printf ("Error: The number of threads is %ld but should be strictly " \ + "positive\n", *number_of_threads); + } + if (errors != 0) + { + printf ("There are %d input error (s)\n", errors); exit (-1); + } + + return (errors); +} + +/* +* ----------------------------------------------------------------------------- +* Print a summary of the execution status. +* ----------------------------------------------------------------------------- +*/ +void print_all_results (int64_t number_of_rows, + int64_t number_of_columns, + int64_t number_of_threads, + int64_t errors) +{ + printf ("mxv: error check %s - rows = %ld columns = %ld threads = %ld\n", + (errors == 0) ? "passed" : "failed", + number_of_rows, number_of_columns, number_of_threads); +} + +/* +* ----------------------------------------------------------------------------- +* Check whether the computations produced the correct results. 
+* ----------------------------------------------------------------------------- +*/ +int64_t check_results (int64_t m, int64_t n, double *c, double *ref) +{ + char *marker; + int64_t errors = 0; + double relerr; + double TOL = 100.0 * DBL_EPSILON; + double SMALL = 100.0 * DBL_MIN; + + if ((marker=(char *)malloc (m*sizeof (char))) == NULL) + { + perror ("array marker"); + exit (-1); + } + + for (int64_t i=0; i<m; i++) + { + if (fabs (ref[i]) > SMALL) + { + relerr = fabs ((c[i]-ref[i])/ref[i]); + } + else + { + relerr = fabs ((c[i]-ref[i])); + } + if (relerr <= TOL) + { + marker[i] = ' '; + } + else + { + errors++; + marker[i] = '*'; + } + } + if (errors > 0) + { + printf ("Found %ld differences in results for m = %ld n = %ld:\n", + errors,m,n); + for (int64_t i=0; i<m; i++) + printf (" %c c[%ld] = %f ref[%ld] = %f\n",marker[i],i,c[i],i,ref[i]); + } + + return (errors); +} diff --git a/gprofng/examples/mxv-pthreads/src/manage_data.c b/gprofng/examples/mxv-pthreads/src/manage_data.c new file mode 100644 index 0000000..3f2891c --- /dev/null +++ b/gprofng/examples/mxv-pthreads/src/manage_data.c @@ -0,0 +1,148 @@ +/* Copyright (C) 2021-2023 Free Software Foundation, Inc. + Contributed by Oracle. + + This file is part of GNU Binutils. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, 51 Franklin Street - Fifth Floor, Boston, + MA 02110-1301, USA. 
*/ + +#include "mydefs.h" + +bool verbose; + +/* +* ----------------------------------------------------------------------------- +* This function allocates the data and sets up the data structures to be used +* in the remainder. +* ----------------------------------------------------------------------------- +*/ +void allocate_data (int active_threads, + int64_t number_of_rows, + int64_t number_of_columns, + double ***A, + double **b, + double **c, + double **ref, + thread_data **thread_data_arguments, + pthread_t **pthread_ids) +{ + if ((*b = (double *) malloc (number_of_columns * sizeof (double))) == NULL) + { + printf ("Error: allocation of vector b failed\n"); + perror ("vector b"); + exit (-1); + } + else + { + if (verbose) printf ("Vector b allocated\n"); + } + + if ((*c = (double *) malloc (number_of_rows * sizeof (double))) == NULL) + { + printf ("Error: allocation of vector c failed\n"); + perror ("vector c"); + exit (-1); + } + else + { + if (verbose) printf ("Vector c allocated\n"); + } + + if ((*ref = (double *) malloc (number_of_rows * sizeof (double))) == NULL) + { + printf ("Error: allocation of vector ref failed\n"); + perror ("vector ref"); + exit (-1); + } + + if ((*A = (double **) malloc (number_of_rows * sizeof (double))) == NULL) + { + printf ("Error: allocation of matrix A failed\n"); + perror ("matrix A"); + exit (-1); + } + else + { + for (int64_t i=0; i<number_of_rows; i++) + { + if (((*A)[i] = (double *) malloc (number_of_columns + * sizeof (double))) == NULL) + { + printf ("Error: allocation of matrix A columns failed\n"); + perror ("matrix A[i]"); + exit (-1); + } + } + if (verbose) printf ("Matrix A allocated\n"); + } + + + if ((*thread_data_arguments = (thread_data *) malloc ((active_threads) + * sizeof (thread_data))) == NULL) + { + perror ("malloc thread_data_arguments"); + exit (-1); + } + else + { + if (verbose) printf ("Structure thread_data_arguments allocated\n"); + } + + if ((*pthread_ids = (pthread_t *) malloc 
((active_threads) + * sizeof (pthread_t))) == NULL) + { + perror ("malloc pthread_ids"); + exit (-1); + } + else + { + if (verbose) printf ("Structure pthread_ids allocated\n"); + } +} + +/* +* ----------------------------------------------------------------------------- +* This function initializes the data. +* ----------------------------------------------------------------------------- +*/ +void init_data (int64_t m, + int64_t n, + double **restrict A, + double *restrict b, + double *restrict c, + double *restrict ref) +{ + + (void) srand48 (2020L); + + for (int64_t j=0; j<n; j++) + b[j] = 1.0; + + for (int64_t i=0; i<m; i++) + { + ref[i] = n*i; + c[i] = -2022; + for (int64_t j=0; j<n; j++) + A[i][j] = drand48 (); + } + + for (int64_t i=0; i<m; i++) + { + double row_sum = 0.0; + for (int64_t j=0; j<n; j++) + row_sum += A[i][j]; + ref[i] = row_sum; + } +} diff --git a/gprofng/examples/mxv-pthreads/src/mxv.c b/gprofng/examples/mxv-pthreads/src/mxv.c new file mode 100644 index 0000000..1ccbbda --- /dev/null +++ b/gprofng/examples/mxv-pthreads/src/mxv.c @@ -0,0 +1,78 @@ +/* Copyright (C) 2021-2023 Free Software Foundation, Inc. + Contributed by Oracle. + + This file is part of GNU Binutils. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, 51 Franklin Street - Fifth Floor, Boston, + MA 02110-1301, USA. 
*/ + +#include "mydefs.h" + +/* +* ----------------------------------------------------------------------------- +* Driver for the core computational part. +* ----------------------------------------------------------------------------- +*/ +void *driver_mxv (void *thread_arguments) +{ + thread_data *local_data; + + local_data = (thread_data *) thread_arguments; + + bool do_work = local_data->do_work; + int64_t repeat_count = local_data->repeat_count; + int64_t row_index_start = local_data->row_index_start; + int64_t row_index_end = local_data->row_index_end; + int64_t m = local_data->m; + int64_t n = local_data->n; + double *b = local_data->b; + double *c = local_data->c; + double **A = local_data->A; + + if (do_work) + { + for (int64_t r=0; r<repeat_count; r++) + { + (void) mxv_core (row_index_start, row_index_end, m, n, A, b, c); + } + } + + return (0); +} + +/* +* ----------------------------------------------------------------------------- +* Computational heart of the algorithm. +* +* Disable inlining to avoid the repeat count loop is removed by the compiler. +* This is only done to make for a more interesting call tree. +* ----------------------------------------------------------------------------- +*/ +void __attribute__ ((noinline)) mxv_core (int64_t row_index_start, + int64_t row_index_end, + int64_t m, + int64_t n, + double **restrict A, + double *restrict b, + double *restrict c) +{ + for (int64_t i=row_index_start; i<=row_index_end; i++) + { + double row_sum = 0.0; + for (int64_t j=0; j<n; j++) + row_sum += A[i][j] * b[j]; + c[i] = row_sum; + } +} diff --git a/gprofng/examples/mxv-pthreads/src/mydefs.h b/gprofng/examples/mxv-pthreads/src/mydefs.h new file mode 100644 index 0000000..eae0834 --- /dev/null +++ b/gprofng/examples/mxv-pthreads/src/mydefs.h @@ -0,0 +1,117 @@ +/* Copyright (C) 2021-2023 Free Software Foundation, Inc. + Contributed by Oracle. + + This file is part of GNU Binutils. 
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software
+   Foundation, 51 Franklin Street - Fifth Floor, Boston,
+   MA 02110-1301, USA.  */
+
+#ifndef ALREADY_INCLUDED
+#define ALREADY_INCLUDED
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <string.h>
+#include <unistd.h>
+#include <float.h>
+#include <math.h>
+#include <malloc.h>
+#include <pthread.h>
+
+struct thread_arguments_data {
+  int thread_id;
+  bool verbose;
+  bool do_work;
+  int64_t repeat_count;
+  int64_t row_index_start;
+  int64_t row_index_end;
+  int64_t m;
+  int64_t n;
+  double *b;
+  double *c;
+  double **A;
+};
+
+typedef struct thread_arguments_data thread_data;
+
+void *driver_mxv (void *thread_arguments);
+
+void __attribute__ ((noinline)) mxv_core (int64_t row_index_start,
+                                          int64_t row_index_end,
+                                          int64_t m,
+                                          int64_t n,
+                                          double **restrict A,
+                                          double *restrict b,
+                                          double *restrict c);
+
+int get_user_options (int argc,
+                      char *argv[],
+                      int64_t *number_of_rows,
+                      int64_t *number_of_columns,
+                      int64_t *repeat_count,
+                      int64_t *number_of_threads,
+                      bool *verbose);
+
+void init_data (int64_t m,
+                int64_t n,
+                double **restrict A,
+                double *restrict b,
+                double *restrict c,
+                double *restrict ref);
+
+void allocate_data (int active_threads,
+                    int64_t number_of_rows,
+                    int64_t number_of_columns,
+                    double ***A,
+                    double **b,
+                    double **c,
+                    double **ref,
+                    thread_data **thread_data_arguments,
+                    pthread_t **pthread_ids);
+
+int64_t check_results (int64_t m,
+                       int64_t n,
+                       double *c,
+                       double *ref);
+
+void get_workload_stats (int64_t number_of_threads,
+                         int64_t number_of_rows,
+                         int64_t number_of_columns,
+                         int64_t *rows_per_thread,
+                         int64_t *remainder_rows,
+                         int64_t *active_threads);
+
+void determine_work_per_thread (int64_t TID,
+                                int64_t rows_per_thread,
+                                int64_t remainder_rows,
+                                int64_t *row_index_start,
+                                int64_t *row_index_end);
+
+void mxv (int64_t m,
+          int64_t n,
+          double **restrict A,
+          double *restrict b,
+          double *restrict c);
+
+void print_all_results (int64_t number_of_rows,
+                        int64_t number_of_columns,
+                        int64_t number_of_threads,
+                        int64_t errors);
+
+extern bool verbose;
+
+#endif
diff --git a/gprofng/examples/mxv-pthreads/src/workload.c b/gprofng/examples/mxv-pthreads/src/workload.c
new file mode 100644
index 0000000..fca0e81
--- /dev/null
+++ b/gprofng/examples/mxv-pthreads/src/workload.c
@@ -0,0 +1,91 @@
+/* Copyright (C) 2021-2023 Free Software Foundation, Inc.
+   Contributed by Oracle.
+
+   This file is part of GNU Binutils.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software
+   Foundation, 51 Franklin Street - Fifth Floor, Boston,
+   MA 02110-1301, USA.  */
+
+#include "mydefs.h"
+
+/*
+* -----------------------------------------------------------------------------
+* This function determines the number of rows each thread will be working on
+* and also how many threads will be active.
+* -----------------------------------------------------------------------------
+*/
+void get_workload_stats (int64_t number_of_threads,
+                         int64_t number_of_rows,
+                         int64_t number_of_columns,
+                         int64_t *rows_per_thread,
+                         int64_t *remainder_rows,
+                         int64_t *active_threads)
+{
+  if (number_of_threads <= number_of_rows)
+    {
+      *remainder_rows = number_of_rows%number_of_threads;
+      *rows_per_thread = (number_of_rows - (*remainder_rows))/number_of_threads;
+    }
+  else
+    {
+      *remainder_rows = 0;
+      *rows_per_thread = 1;
+    }
+
+  *active_threads = number_of_threads < number_of_rows
+                    ? number_of_threads : number_of_rows;
+
+  if (verbose)
+    {
+      printf ("Rows per thread = %ld remainder = %ld\n",
+              *rows_per_thread, *remainder_rows);
+      printf ("Number of active threads = %ld\n", *active_threads);
+    }
+}
+
+/*
+* -----------------------------------------------------------------------------
+* This function determines which rows each thread will be working on.
+* -----------------------------------------------------------------------------
+*/
+void determine_work_per_thread (int64_t TID, int64_t rows_per_thread,
+                                int64_t remainder_rows,
+                                int64_t *row_index_start,
+                                int64_t *row_index_end)
+{
+  int64_t chunk_per_thread;
+
+  if (TID < remainder_rows)
+    {
+      chunk_per_thread = rows_per_thread + 1;
+      *row_index_start = TID * chunk_per_thread;
+      *row_index_end = (TID + 1) * chunk_per_thread - 1;
+    }
+  else
+    {
+      chunk_per_thread = rows_per_thread;
+      *row_index_start = remainder_rows * (rows_per_thread + 1)
+                         + (TID - remainder_rows) * chunk_per_thread;
+      *row_index_end = remainder_rows * (rows_per_thread + 1)
+                       + (TID - remainder_rows) * chunk_per_thread
+                       + chunk_per_thread - 1;
+    }
+
+  if (verbose)
+    {
+      printf ("TID = %ld row_index_start = %ld row_index_end = %ld\n",
+              TID, *row_index_start, *row_index_end);
+    }
+}
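
The row partitioning in `workload.c` can be tried in isolation. The following is a minimal standalone sketch, not part of the commit: `partition_rows` and `print_partition` are hypothetical helpers that should compute the same inclusive row ranges as `determine_work_per_thread`, written in a more compact form. The sketch prints with `PRId64` from `<inttypes.h>`, an assumption made for portability; the commit itself uses `%ld`.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not in the commit): inclusive row range of thread
   `tid`, given the base chunk size and the number of leftover rows.
   The first `remainder_rows` threads each take one extra row.  */
static void
partition_rows (int64_t tid, int64_t rows_per_thread, int64_t remainder_rows,
                int64_t *row_index_start, int64_t *row_index_end)
{
  /* Number of extra rows handed out to threads with a lower id.  */
  int64_t extra_before = tid < remainder_rows ? tid : remainder_rows;
  int64_t chunk = rows_per_thread + (tid < remainder_rows ? 1 : 0);

  *row_index_start = tid * rows_per_thread + extra_before;
  *row_index_end = *row_index_start + chunk - 1;
}

/* Hypothetical helper: print the range of every active thread and return
   the number of rows covered, which should equal number_of_rows.  */
static int64_t
print_partition (int64_t number_of_threads, int64_t number_of_rows)
{
  /* As in get_workload_stats: never more active threads than rows.  */
  int64_t active = number_of_threads < number_of_rows
                   ? number_of_threads : number_of_rows;
  if (active <= 0)
    return 0;

  /* Dividing by the active count gives 1 row per thread and no remainder
     when there are more threads than rows, matching the else branch.  */
  int64_t rows_per_thread = number_of_rows / active;
  int64_t remainder_rows = number_of_rows % active;
  int64_t covered = 0;

  for (int64_t tid = 0; tid < active; tid++)
    {
      int64_t start, end;
      partition_rows (tid, rows_per_thread, remainder_rows, &start, &end);
      printf ("TID = %" PRId64 " rows %" PRId64 " .. %" PRId64 "\n",
              tid, start, end);
      covered += end - start + 1;
    }
  return covered;
}
```

Calling `print_partition (4, 10)` from a `main` function gives the row ranges 0..2, 3..5, 6..7 and 8..9: two rows per thread, with the two leftover rows going to threads 0 and 1.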