author Benjamin Kosnik <bkoz@redhat.com> 2008-03-20 14:20:49 +0000
committer Benjamin Kosnik <bkoz@gcc.gnu.org> 2008-03-20 14:20:49 +0000
commit 1285e2a25db39ca03eb0c0474a5d03c5a12782b4
tree f8420e783a074cdd0190903ad3f7a9c2aa2df111
parent 6fd85d214441ab1760f2d650399433fbcb7681d2
re PR libstdc++/35256 (Bad link on http://gcc.gnu.org/onlinedocs/libstdc++/parallel_mode.html)

2008-03-19  Benjamin Kosnik  <bkoz@redhat.com>

	PR libstdc++/35256
	* doc/xml/manual/parallel_mode.xml: Correct configuration documentation.
	* doc/html/manual/bk01pt12ch31s04.html: Regenerate.

From-SVN: r133378
-rw-r--r-- libstdc++-v3/ChangeLog 6
-rw-r--r-- libstdc++-v3/doc/html/manual/bk01pt12ch31s04.html 139
-rw-r--r-- libstdc++-v3/doc/xml/manual/parallel_mode.xml 290
3 files changed, 334 insertions, 101 deletions
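
A rough sketch of how the run-time settings documented by this change might be used. It assumes a libstdc++ build that ships the parallel mode headers. `tune_parallel_defaults` and its parameter are hypothetical names chosen for illustration. `_Settings::set`, `algorithm_strategy`, and the `[algorithm]_minimal_n` naming come from the patch text, while `_Settings::get` is taken from `parallel/settings.h` rather than from the patch itself.

```cpp
#include <parallel/settings.h>

// Hypothetical helper (not part of libstdc++): start from the currently
// active settings, select the heuristic strategy, and set the minimum
// problem size below which sort is not parallelized.
void tune_parallel_defaults(unsigned long long min_sort_size)
{
  // Copy the active settings so untouched members keep their values.
  __gnu_parallel::_Settings s = __gnu_parallel::_Settings::get();

  s.algorithm_strategy = __gnu_parallel::heuristic;
  s.sort_minimal_n = min_sort_size;

  // Install the modified copy as the new global settings.
  __gnu_parallel::_Settings::set(s);
}
```

Calling `tune_parallel_defaults(5000)` before the first algorithm call should leave the heuristic in place while keeping `sort` sequential for inputs smaller than 5000 elements. Copying the active settings first, instead of default-constructing a `_Settings` object as the patch's example does, is a design choice made here so that earlier tuning is not silently reset.
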
diff --git a/libstdc++-v3/ChangeLog b/libstdc++-v3/ChangeLog
index 996bed9..d794b80 100644
--- a/libstdc++-v3/ChangeLog
+++ b/libstdc++-v3/ChangeLog
@@ -1,3 +1,9 @@
+2008-03-19 Benjamin Kosnik <bkoz@redhat.com>
+
+ PR libstdc++/35256
+ * doc/xml/manual/parallel_mode.xml: Correct configuration documentation.
+ * doc/html/manual/bk01pt12ch31s04.html: Regenerate.
+
2008-03-18 Benjamin Kosnik <bkoz@redhat.com>
* configure.ac (libtool_VERSION): To 6:11:0.
diff --git a/libstdc++-v3/doc/html/manual/bk01pt12ch31s04.html b/libstdc++-v3/doc/html/manual/bk01pt12ch31s04.html
index 99c1356..3db7d91 100644
--- a/libstdc++-v3/doc/html/manual/bk01pt12ch31s04.html
+++ b/libstdc++-v3/doc/html/manual/bk01pt12ch31s04.html
@@ -1,9 +1,10 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>Design</title><meta name="generator" content="DocBook XSL Stylesheets V1.73.2" /><meta name="keywords" content="&#10; C++&#10; , &#10; library&#10; , &#10; parallel&#10; " /><meta name="keywords" content="&#10; ISO C++&#10; , &#10; library&#10; " /><link rel="start" href="../spine.html" title="The GNU C++ Library Documentation" /><link rel="up" href="parallel_mode.html" title="Chapter 31. Parallel Mode" /><link rel="prev" href="bk01pt12ch31s03.html" title="Using" /><link rel="next" href="bk01pt12ch31s05.html" title="Testing" /></head><body><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Design</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="bk01pt12ch31s03.html">Prev</a> </td><th width="60%" align="center">Chapter 31. Parallel Mode</th><td width="20%" align="right"> <a accesskey="n" href="bk01pt12ch31s05.html">Next</a></td></tr></table><hr /></div><div class="sect1" lang="en" xml:lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a id="manual.ext.parallel_mode.design"></a>Design</h2></div></div></div><p>
- </p><div class="sect2" lang="en" xml:lang="en"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.parallel_mode.design.intro"></a>Interface Basics</h3></div></div></div><p>All parallel algorithms are intended to have signatures that are
+ </p><div class="sect2" lang="en" xml:lang="en"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.parallel_mode.design.intro"></a>Interface Basics</h3></div></div></div><p>
+All parallel algorithms are intended to have signatures that are
equivalent to the ISO C++ algorithms replaced. For instance, the
-<code class="code">std::adjacent_find</code> function is declared as:
+<code class="function">std::adjacent_find</code> function is declared as:
</p><pre class="programlisting">
namespace std
{
@@ -57,36 +58,124 @@ parallel algorithms look like this:
ISO C++ signature to the correct parallel version. Also, some of the
algorithms do not have support for run-time conditions, so the last
overload is therefore missing.
-</p></div><div class="sect2" lang="en" xml:lang="en"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.parallel_mode.design.tuning"></a>Configuration and Tuning</h3></div></div></div><p> Some algorithm variants can be enabled/disabled/selected at compile-time.
-See <a class="ulink" href="latest-doxygen/compiletime__settings_8h.html" target="_top">
-<code class="code">&lt;compiletime_settings.h&gt;</code></a> and
-See <a class="ulink" href="latest-doxygen/compiletime__settings_8h.html" target="_top">
-<code class="code">&lt;features.h&gt;</code></a> for details.
+</p></div><div class="sect2" lang="en" xml:lang="en"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.parallel_mode.design.tuning"></a>Configuration and Tuning</h3></div></div></div><div class="sect3" lang="en" xml:lang="en"><div class="titlepage"><div><div><h4 class="title"><a id="parallel_mode.design.tuning.omp"></a>Setting up the OpenMP Environment</h4></div></div></div><p>
+Several aspects of the overall runtime environment can be manipulated
+by standard OpenMP function calls.
</p><p>
-To specify the number of threads to be used for an algorithm,
-use <code class="code">omp_set_num_threads</code>.
-To force a function to execute sequentially,
-even though parallelism is switched on in general,
-add <code class="code">__gnu_parallel::sequential_tag()</code>
-to the end of the argument list.
+To specify the number of threads to be used for an algorithm, use the
+function <code class="function">omp_set_num_threads</code>. An example:
+</p><pre class="programlisting">
+#include &lt;stdlib.h&gt;
+#include &lt;omp.h&gt;
+
+int main()
+{
+ // Explicitly set number of threads.
+ const int threads_wanted = 20;
+ omp_set_dynamic(false);
+ omp_set_num_threads(threads_wanted);
+ if (omp_get_max_threads() != threads_wanted)
+ abort();
+
+ // Do work.
+
+ return 0;
+}
+</pre><p>
+Other parts of the runtime environment that can be manipulated include
+nested parallelism (<code class="function">omp_set_nested</code>), schedule kind
+(<code class="function">omp_set_schedule</code>), and others. See the OpenMP
+documentation for more information.
+</p></div><div class="sect3" lang="en" xml:lang="en"><div class="titlepage"><div><div><h4 class="title"><a id="parallel_mode.design.tuning.compile"></a>Compile Time Switches</h4></div></div></div><p>
+To force an algorithm to execute sequentially, even though parallelism
+is switched on in general via the macro <code class="constant">_GLIBCXX_PARALLEL</code>,
+add <code class="classname">__gnu_parallel::sequential_tag()</code> to the end
+of the algorithm's argument list, or explicitly qualify the algorithm
+with the <code class="code">__gnu_serial::</code> namespace.
+</p><p>
+Like so:
+</p><pre class="programlisting">
+std::sort(v.begin(), v.end(), __gnu_parallel::sequential_tag());
+</pre><p>
+or
+</p><pre class="programlisting">
+__gnu_serial::sort(v.begin(), v.end());
+</pre><p>
+In addition, some parallel algorithm variants can be enabled/disabled/selected
+at compile-time.
+</p><p>
+See <a class="ulink" href="http://gcc.gnu.org/onlinedocs/libstdc++/latest-doxygen/a00446.html" target="_top"><code class="filename">compiletime_settings.h</code></a> and
+<a class="ulink" href="http://gcc.gnu.org/onlinedocs/libstdc++/latest-doxygen/a00505.html" target="_top"><code class="filename">features.h</code></a> for details.
+</p></div><div class="sect3" lang="en" xml:lang="en"><div class="titlepage"><div><div><h4 class="title"><a id="parallel_mode.design.tuning.settings"></a>Run Time Settings and Defaults</h4></div></div></div><p>
+The default parallelization strategy, the choice of specific algorithm
+strategy, the minimum threshold limits for individual parallel
+algorithms, and aspects of the underlying hardware can be specified as
+desired via manipulation
+of <code class="classname">__gnu_parallel::_Settings</code> member data.
</p><p>
-Parallelism always incurs some overhead. Thus, it is not
-helpful to parallelize operations on very small sets of data.
-There are measures to avoid parallelizing stuff that is not worth it.
-For each algorithm, a minimum problem size can be stated,
-usually using the variable
-<code class="code">__gnu_parallel::Settings::[algorithm]_minimal_n</code>.
-Please see <a class="ulink" href="latest-doxygen/settings_8h.html" target="_top">
-<code class="code">&lt;settings.h&gt;</code></a> for details.</p></div><div class="sect2" lang="en" xml:lang="en"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.parallel_mode.design.impl"></a>Implementation Namespaces</h3></div></div></div><p> One namespace contain versions of code that are explicitly sequential:
+First off, the choice of parallelization strategy: serial, parallel,
+or implementation-deduced. This corresponds
+to <code class="code">__gnu_parallel::_Settings::algorithm_strategy</code> and is a
+value of enum <span class="type">__gnu_parallel::_AlgorithmStrategy</span>
+type. Choices
+include: <span class="type">heuristic</span>, <span class="type">force_sequential</span>,
+and <span class="type">force_parallel</span>. The default is
+implementation-deduced, i.e. <span class="type">heuristic</span>.
+</p><p>
+Next, the sub-choices for algorithm implementation. Specific
+algorithms like <code class="function">find</code> or <code class="function">sort</code>
+can be implemented in multiple ways: when this is the case,
+a <code class="classname">__gnu_parallel::_Settings</code> member exists to
+pick the default strategy. For
+example, <code class="code">__gnu_parallel::_Settings::sort_algorithm</code> can
+have any value of
+enum <span class="type">__gnu_parallel::_SortAlgorithm</span>: <span class="type">MWMS</span>, <span class="type">QS</span>,
+or <span class="type">QS_BALANCED</span>.
+</p><p>
+Likewise for setting the minimal threshold for algorithm
+parallelization. Parallelism always incurs some overhead. Thus, it is
+not helpful to parallelize operations on very small sets of
+data. Because of this, measures are taken to avoid parallelizing below
+a certain, pre-determined threshold. For each algorithm, a minimum
+problem size is encoded as a variable in the
+active <code class="classname">__gnu_parallel::_Settings</code> object. This
+threshold variable uses the following naming scheme:
+<code class="code">__gnu_parallel::_Settings::[algorithm]_minimal_n</code>. So,
+for <code class="function">fill</code>, the threshold variable
+is <code class="code">__gnu_parallel::_Settings::fill_minimal_n</code>.
+</p><p>
+Finally, hardware details like L1/L2 cache size can be hardwired
+via <code class="code">__gnu_parallel::_Settings::L1_cache_size</code> and friends.
+</p><p>
+All these configuration variables can be changed by the user, if
+desired. Please
+see <a class="ulink" href="http://gcc.gnu.org/onlinedocs/libstdc++/latest-doxygen/a00640.html" target="_top"><code class="filename">settings.h</code></a>
+for complete details.
+</p><p>
+A small example of tuning the default:
+</p><pre class="programlisting">
+#include &lt;parallel/algorithm&gt;
+#include &lt;parallel/settings.h&gt;
+
+int main()
+{
+ __gnu_parallel::_Settings s;
+ s.algorithm_strategy = __gnu_parallel::force_parallel;
+ __gnu_parallel::_Settings::set(s);
+
+ // Do work... all algorithms will be parallelized, always.
+
+ return 0;
+}
+</pre></div></div><div class="sect2" lang="en" xml:lang="en"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.parallel_mode.design.impl"></a>Implementation Namespaces</h3></div></div></div><p> One namespace contains versions of code that are always
+explicitly sequential:
<code class="code">__gnu_serial</code>.
</p><p> Two namespaces contain the parallel mode:
<code class="code">std::__parallel</code> and <code class="code">__gnu_parallel</code>.
</p><p> Parallel implementations of standard components, including
template helpers to select parallelism, are defined in <code class="code">namespace
-std::__parallel</code>. For instance, <code class="code">std::transform</code> from
-&lt;algorithm&gt; has a parallel counterpart in
-<code class="code">std::__parallel::transform</code> from
-&lt;parallel/algorithm&gt;. In addition, these parallel
+std::__parallel</code>. For instance, <code class="function">std::transform</code> from <code class="filename">algorithm</code> has a parallel counterpart in
+<code class="function">std::__parallel::transform</code> from <code class="filename">parallel/algorithm</code>. In addition, these parallel
implementations are injected into <code class="code">namespace
__gnu_parallel</code> with using declarations.
</p><p> Support and general infrastructure is in <code class="code">namespace
diff --git a/libstdc++-v3/doc/xml/manual/parallel_mode.xml b/libstdc++-v3/doc/xml/manual/parallel_mode.xml
index 4236f63..0bcbbca 100644
--- a/libstdc++-v3/doc/xml/manual/parallel_mode.xml
+++ b/libstdc++-v3/doc/xml/manual/parallel_mode.xml
@@ -28,7 +28,7 @@ implementation of many algorithms the C++ Standard Library.
<para>
Several of the standard algorithms, for instance
-<code>std::sort</code>, are made parallel using OpenMP
+<function>std::sort</function>, are made parallel using OpenMP
annotations. These parallel mode constructs can be invoked by
explicit source declaration or by compiling existing sources with a
specific compiler flag.
@@ -39,52 +39,52 @@ specific compiler flag.
<title>Intro</title>
<para>The following library components in the include
-<code>&lt;numeric&gt;</code> are included in the parallel mode:</para>
+<filename class="headerfile">numeric</filename> are included in the parallel mode:</para>
<itemizedlist>
- <listitem><para><code>std::accumulate</code></para></listitem>
- <listitem><para><code>std::adjacent_difference</code></para></listitem>
- <listitem><para><code>std::inner_product</code></para></listitem>
- <listitem><para><code>std::partial_sum</code></para></listitem>
+ <listitem><para><function>std::accumulate</function></para></listitem>
+ <listitem><para><function>std::adjacent_difference</function></para></listitem>
+ <listitem><para><function>std::inner_product</function></para></listitem>
+ <listitem><para><function>std::partial_sum</function></para></listitem>
</itemizedlist>
<para>The following library components in the include
-<code>&lt;algorithm&gt;</code> are included in the parallel mode:</para>
+<filename class="headerfile">algorithm</filename> are included in the parallel mode:</para>
<itemizedlist>
- <listitem><para><code>std::adjacent_find</code></para></listitem>
- <listitem><para><code>std::count</code></para></listitem>
- <listitem><para><code>std::count_if</code></para></listitem>
- <listitem><para><code>std::equal</code></para></listitem>
- <listitem><para><code>std::find</code></para></listitem>
- <listitem><para><code>std::find_if</code></para></listitem>
- <listitem><para><code>std::find_first_of</code></para></listitem>
- <listitem><para><code>std::for_each</code></para></listitem>
- <listitem><para><code>std::generate</code></para></listitem>
- <listitem><para><code>std::generate_n</code></para></listitem>
- <listitem><para><code>std::lexicographical_compare</code></para></listitem>
- <listitem><para><code>std::mismatch</code></para></listitem>
- <listitem><para><code>std::search</code></para></listitem>
- <listitem><para><code>std::search_n</code></para></listitem>
- <listitem><para><code>std::transform</code></para></listitem>
- <listitem><para><code>std::replace</code></para></listitem>
- <listitem><para><code>std::replace_if</code></para></listitem>
- <listitem><para><code>std::max_element</code></para></listitem>
- <listitem><para><code>std::merge</code></para></listitem>
- <listitem><para><code>std::min_element</code></para></listitem>
- <listitem><para><code>std::nth_element</code></para></listitem>
- <listitem><para><code>std::partial_sort</code></para></listitem>
- <listitem><para><code>std::partition</code></para></listitem>
- <listitem><para><code>std::random_shuffle</code></para></listitem>
- <listitem><para><code>std::set_union</code></para></listitem>
- <listitem><para><code>std::set_intersection</code></para></listitem>
- <listitem><para><code>std::set_symmetric_difference</code></para></listitem>
- <listitem><para><code>std::set_difference</code></para></listitem>
- <listitem><para><code>std::sort</code></para></listitem>
- <listitem><para><code>std::stable_sort</code></para></listitem>
- <listitem><para><code>std::unique_copy</code></para></listitem>
+ <listitem><para><function>std::adjacent_find</function></para></listitem>
+ <listitem><para><function>std::count</function></para></listitem>
+ <listitem><para><function>std::count_if</function></para></listitem>
+ <listitem><para><function>std::equal</function></para></listitem>
+ <listitem><para><function>std::find</function></para></listitem>
+ <listitem><para><function>std::find_if</function></para></listitem>
+ <listitem><para><function>std::find_first_of</function></para></listitem>
+ <listitem><para><function>std::for_each</function></para></listitem>
+ <listitem><para><function>std::generate</function></para></listitem>
+ <listitem><para><function>std::generate_n</function></para></listitem>
+ <listitem><para><function>std::lexicographical_compare</function></para></listitem>
+ <listitem><para><function>std::mismatch</function></para></listitem>
+ <listitem><para><function>std::search</function></para></listitem>
+ <listitem><para><function>std::search_n</function></para></listitem>
+ <listitem><para><function>std::transform</function></para></listitem>
+ <listitem><para><function>std::replace</function></para></listitem>
+ <listitem><para><function>std::replace_if</function></para></listitem>
+ <listitem><para><function>std::max_element</function></para></listitem>
+ <listitem><para><function>std::merge</function></para></listitem>
+ <listitem><para><function>std::min_element</function></para></listitem>
+ <listitem><para><function>std::nth_element</function></para></listitem>
+ <listitem><para><function>std::partial_sort</function></para></listitem>
+ <listitem><para><function>std::partition</function></para></listitem>
+ <listitem><para><function>std::random_shuffle</function></para></listitem>
+ <listitem><para><function>std::set_union</function></para></listitem>
+ <listitem><para><function>std::set_intersection</function></para></listitem>
+ <listitem><para><function>std::set_symmetric_difference</function></para></listitem>
+ <listitem><para><function>std::set_difference</function></para></listitem>
+ <listitem><para><function>std::sort</function></para></listitem>
+ <listitem><para><function>std::stable_sort</function></para></listitem>
+ <listitem><para><function>std::unique_copy</function></para></listitem>
</itemizedlist>
<para>The following library components in the includes
-<code>&lt;set&gt;</code> and <code>&lt;map&gt;</code> are included in the parallel mode:</para>
+<filename class="headerfile">set</filename> and <filename class="headerfile">map</filename> are included in the parallel mode:</para>
<itemizedlist>
<listitem><para><code>std::(multi_)map/set&lt;T&gt;::(multi_)map/set(Iterator begin, Iterator end)</code> (bulk construction)</para></listitem>
<listitem><para><code>std::(multi_)map/set&lt;T&gt;::insert(Iterator begin, Iterator end)</code> (bulk insertion)</para></listitem>
@@ -113,23 +113,25 @@ It might work with other compilers, though.</para>
<sect2 id="parallel_mode.using.parallel_mode" xreflabel="using.parallel_mode">
<title>Using Parallel Mode</title>
-<para>To use the libstdc++ parallel mode, compile your application with
- the compiler flag <code>-D_GLIBCXX_PARALLEL -fopenmp</code>. This
+<para>
+ To use the libstdc++ parallel mode, compile your application with
+ the compiler flag <constant>-D_GLIBCXX_PARALLEL -fopenmp</constant>. This
will link in <code>libgomp</code>, the GNU OpenMP <ulink url="http://gcc.gnu.org/onlinedocs/libgomp">implementation</ulink>,
whose presence is mandatory. In addition, hardware capable of atomic
operations is mandatory. Actually activating these atomic
operations may require explicit compiler flags on some targets
- (like sparc and x86), such as <code>-march=i686</code>,
- <code>-march=native</code> or <code>-mcpu=v9</code>.
+ (like sparc and x86), such as <literal>-march=i686</literal>,
+ <literal>-march=native</literal> or <literal>-mcpu=v9</literal>.
</para>
-<para>Note that the <code>_GLIBCXX_PARALLEL</code> define may change the
+<para>Note that the <constant>_GLIBCXX_PARALLEL</constant> define may change the
sizes and behavior of standard class templates such as
- <code>std::search</code>, and therefore one can only link code
+ <function>std::search</function>, and therefore one can only link code
compiled with parallel mode and code compiled without parallel mode
if no instantiation of a container is passed between the two
translation units. Parallel mode functionality has distinct linkage,
- and cannot be confused with normal mode symbols.</para>
+ and cannot be confused with normal mode symbols.
+</para>
</sect2>
<sect2 id="manual.ext.parallel_mode.usings" xreflabel="using.specific">
@@ -420,9 +422,10 @@ It might work with other compilers, though.</para>
<title>Interface Basics</title>
-<para>All parallel algorithms are intended to have signatures that are
+<para>
+All parallel algorithms are intended to have signatures that are
equivalent to the ISO C++ algorithms replaced. For instance, the
-<code>std::adjacent_find</code> function is declared as:
+<function>std::adjacent_find</function> function is declared as:
</para>
<programlisting>
namespace std
@@ -506,39 +509,176 @@ overload is therefore missing.
<sect2 id="manual.ext.parallel_mode.design.tuning" xreflabel="Tuning">
<title>Configuration and Tuning</title>
-<para> Some algorithm variants can be enabled/disabled/selected at compile-time.
-See <ulink url="latest-doxygen/compiletime__settings_8h.html">
-<code>&lt;compiletime_settings.h&gt;</code></ulink> and
-See <ulink url="latest-doxygen/compiletime__settings_8h.html">
-<code>&lt;features.h&gt;</code></ulink> for details.
+
+<sect3 id="parallel_mode.design.tuning.omp" xreflabel="OpenMP Environment">
+ <title>Setting up the OpenMP Environment</title>
+
+<para>
+Several aspects of the overall runtime environment can be manipulated
+by standard OpenMP function calls.
+</para>
+
+<para>
+To specify the number of threads to be used for an algorithm, use the
+function <function>omp_set_num_threads</function>. An example:
+</para>
+
+<programlisting>
+#include &lt;stdlib.h&gt;
+#include &lt;omp.h&gt;
+
+int main()
+{
+ // Explicitly set number of threads.
+ const int threads_wanted = 20;
+ omp_set_dynamic(false);
+ omp_set_num_threads(threads_wanted);
+ if (omp_get_max_threads() != threads_wanted)
+ abort();
+
+ // Do work.
+
+ return 0;
+}
+</programlisting>
+
+<para>
+Other parts of the runtime environment that can be manipulated include
+nested parallelism (<function>omp_set_nested</function>), schedule kind
+(<function>omp_set_schedule</function>), and others. See the OpenMP
+documentation for more information.
+</para>
+
+</sect3>
+
+<sect3 id="parallel_mode.design.tuning.compile" xreflabel="Compile Switches">
+ <title>Compile Time Switches</title>
+
+<para>
+To force an algorithm to execute sequentially, even though parallelism
+is switched on in general via the macro <constant>_GLIBCXX_PARALLEL</constant>,
+add <classname>__gnu_parallel::sequential_tag()</classname> to the end
+of the algorithm's argument list, or explicitly qualify the algorithm
+with the <code>__gnu_serial::</code> namespace.
+</para>
+
+<para>
+Like so:
+</para>
+
+<programlisting>
+std::sort(v.begin(), v.end(), __gnu_parallel::sequential_tag());
+</programlisting>
+
+<para>
+or
+</para>
+
+<programlisting>
+__gnu_serial::sort(v.begin(), v.end());
+</programlisting>
+
+<para>
+In addition, some parallel algorithm variants can be enabled/disabled/selected
+at compile-time.
+</para>
+
+<para>
+See <ulink url="http://gcc.gnu.org/onlinedocs/libstdc++/latest-doxygen/a00446.html"><filename class="headerfile">compiletime_settings.h</filename></ulink> and
+<ulink url="http://gcc.gnu.org/onlinedocs/libstdc++/latest-doxygen/a00505.html"><filename class="headerfile">features.h</filename></ulink> for details.
+</para>
+</sect3>
+
+<sect3 id="parallel_mode.design.tuning.settings" xreflabel="_Settings">
+ <title>Run Time Settings and Defaults</title>
+
+<para>
+The default parallelization strategy, the choice of specific algorithm
+strategy, the minimum threshold limits for individual parallel
+algorithms, and aspects of the underlying hardware can be specified as
+desired via manipulation
+of <classname>__gnu_parallel::_Settings</classname> member data.
+</para>
+
+<para>
+First off, the choice of parallelization strategy: serial, parallel,
+or implementation-deduced. This corresponds
+to <code>__gnu_parallel::_Settings::algorithm_strategy</code> and is a
+value of enum <type>__gnu_parallel::_AlgorithmStrategy</type>
+type. Choices
+include: <type>heuristic</type>, <type>force_sequential</type>,
+and <type>force_parallel</type>. The default is
+implementation-deduced, i.e. <type>heuristic</type>.
+</para>
+
+
+<para>
+Next, the sub-choices for algorithm implementation. Specific
+algorithms like <function>find</function> or <function>sort</function>
+can be implemented in multiple ways: when this is the case,
+a <classname>__gnu_parallel::_Settings</classname> member exists to
+pick the default strategy. For
+example, <code>__gnu_parallel::_Settings::sort_algorithm</code> can
+have any value of
+enum <type>__gnu_parallel::_SortAlgorithm</type>: <type>MWMS</type>, <type>QS</type>,
+or <type>QS_BALANCED</type>.
+</para>
+
+<para>
+Likewise for setting the minimal threshold for algorithm
+parallelization. Parallelism always incurs some overhead. Thus, it is
+not helpful to parallelize operations on very small sets of
+data. Because of this, measures are taken to avoid parallelizing below
+a certain, pre-determined threshold. For each algorithm, a minimum
+problem size is encoded as a variable in the
+active <classname>__gnu_parallel::_Settings</classname> object. This
+threshold variable uses the following naming scheme:
+<code>__gnu_parallel::_Settings::[algorithm]_minimal_n</code>. So,
+for <function>fill</function>, the threshold variable
+is <code>__gnu_parallel::_Settings::fill_minimal_n</code>.
</para>
<para>
-To specify the number of threads to be used for an algorithm,
-use <code>omp_set_num_threads</code>.
-To force a function to execute sequentially,
-even though parallelism is switched on in general,
-add <code>__gnu_parallel::sequential_tag()</code>
-to the end of the argument list.
+Finally, hardware details like L1/L2 cache size can be hardwired
+via <code>__gnu_parallel::_Settings::L1_cache_size</code> and friends.
</para>
<para>
-Parallelism always incurs some overhead. Thus, it is not
-helpful to parallelize operations on very small sets of data.
-There are measures to avoid parallelizing stuff that is not worth it.
-For each algorithm, a minimum problem size can be stated,
-usually using the variable
-<code>__gnu_parallel::Settings::[algorithm]_minimal_n</code>.
-Please see <ulink url="latest-doxygen/settings_8h.html">
-<code>&lt;settings.h&gt;</code></ulink> for details.</para>
+All these configuration variables can be changed by the user, if
+desired. Please
+see <ulink url="http://gcc.gnu.org/onlinedocs/libstdc++/latest-doxygen/a00640.html"><filename class="headerfile">settings.h</filename></ulink>
+for complete details.
+</para>
+
+<para>
+A small example of tuning the default:
+</para>
+
+<programlisting>
+#include &lt;parallel/algorithm&gt;
+#include &lt;parallel/settings.h&gt;
+
+int main()
+{
+ __gnu_parallel::_Settings s;
+ s.algorithm_strategy = __gnu_parallel::force_parallel;
+ __gnu_parallel::_Settings::set(s);
+
+ // Do work... all algorithms will be parallelized, always.
+
+ return 0;
+}
+</programlisting>
+</sect3>
</sect2>
<sect2 id="manual.ext.parallel_mode.design.impl" xreflabel="Impl">
<title>Implementation Namespaces</title>
-<para> One namespace contain versions of code that are explicitly sequential:
+<para> One namespace contains versions of code that are always
+explicitly sequential:
<code>__gnu_serial</code>.
</para>
@@ -548,10 +688,8 @@ Please see <ulink url="latest-doxygen/settings_8h.html">
<para> Parallel implementations of standard components, including
template helpers to select parallelism, are defined in <code>namespace
-std::__parallel</code>. For instance, <code>std::transform</code> from
-&lt;algorithm&gt; has a parallel counterpart in
-<code>std::__parallel::transform</code> from
-&lt;parallel/algorithm&gt;. In addition, these parallel
+std::__parallel</code>. For instance, <function>std::transform</function> from <filename class="headerfile">algorithm</filename> has a parallel counterpart in
+<function>std::__parallel::transform</function> from <filename class="headerfile">parallel/algorithm</filename>. In addition, these parallel
implementations are injected into <code>namespace
__gnu_parallel</code> with using declarations.
</para>
@@ -588,7 +726,7 @@ the generated source documentation.
<para>
The log and summary files for conformance testing are in the
- <code>testsuite/parallel</code> directory.
+ <filename class="directory">testsuite/parallel</filename> directory.
</para>
<para>
@@ -596,13 +734,13 @@ the generated source documentation.
</para>
<screen>
- <userinput>check-performance-parallel</userinput>
+ <userinput>make check-performance-parallel</userinput>
</screen>
<para>
The result file for performance testing are in the
- <code>testsuite</code> directory, in the file
- <code>libstdc++_performance.sum</code>. In addition, the
+ <filename class="directory">testsuite</filename> directory, in the file
+ <filename>libstdc++_performance.sum</filename>. In addition, the
policy-based containers have their own visualizations, which have
additional software dependencies than the usual bare-boned text
file, and can be generated by using the <code>make