6 files changed, 268 insertions, 52 deletions
diff --git a/llvm/docs/CommandGuide/llvm-ir2vec.rst b/llvm/docs/CommandGuide/llvm-ir2vec.rst
index 55fe75d..f51da06 100644
--- a/llvm/docs/CommandGuide/llvm-ir2vec.rst
+++ b/llvm/docs/CommandGuide/llvm-ir2vec.rst
@@ -68,32 +68,52 @@ these two modes are used to generate the triplets and entity mappings.
 Triplet Generation
 ~~~~~~~~~~~~~~~~~~
 
-With the `triplets` subcommand, :program:`llvm-ir2vec` analyzes LLVM IR and extracts
-numeric triplets consisting of opcode IDs, type IDs, and operand IDs. These triplets
+With the `triplets` subcommand, :program:`llvm-ir2vec` analyzes LLVM IR or Machine IR
+and extracts numeric triplets consisting of opcode IDs and operand IDs. These triplets
 are generated in the standard format used for knowledge graph embedding training.
-The tool outputs numeric IDs directly using the ir2vec::Vocabulary mapping
-infrastructure, eliminating the need for string-to-ID preprocessing.
+The tool outputs numeric IDs directly using the vocabulary mapping infrastructure,
+eliminating the need for string-to-ID preprocessing.
 
-Usage:
+Usage for LLVM IR:
 
 .. code-block:: bash
 
-   llvm-ir2vec triplets input.bc -o triplets_train2id.txt
+   llvm-ir2vec triplets --mode=llvm input.bc -o triplets_train2id.txt
+
+Usage for Machine IR:
+
+.. code-block:: bash
+
+   llvm-ir2vec triplets --mode=mir input.mir -o triplets_train2id.txt
 
 Entity Mapping Generation
 ~~~~~~~~~~~~~~~~~~~~~~~~~
 
 With the `entities` subcommand, :program:`llvm-ir2vec` generates the entity mappings
-supported by IR2Vec in the standard format used for knowledge graph embedding
-training. This subcommand outputs all supported entities (opcodes, types, and
-operands) with their corresponding numeric IDs, and is not specific for an
-LLVM IR file.
+supported by IR2Vec or MIR2Vec in the standard format used for knowledge graph embedding
+training. This subcommand outputs all supported entities with their corresponding numeric IDs.
+
+For LLVM IR, entities include opcodes, types, and operands. For Machine IR, entities include
+machine opcodes, common operands, and register classes (both physical and virtual).
+
+Usage for LLVM IR:
 
-Usage:
+.. code-block:: bash
+
+   llvm-ir2vec entities --mode=llvm -o entity2id.txt
+
+Usage for Machine IR:
 
 .. code-block:: bash
 
-   llvm-ir2vec entities -o entity2id.txt
+   llvm-ir2vec entities --mode=mir input.mir -o entity2id.txt
+
+.. note::
+
+   For LLVM IR mode, the entity mapping is target-independent and does not require an input file.
+   For Machine IR mode, an input .mir file is required to determine the target architecture,
+   as entity mappings vary by target (different architectures have different instruction sets
+   and register classes).
 
 Embedding Generation
 ~~~~~~~~~~~~~~~~~~~~
@@ -222,12 +242,17 @@ Subcommand-specific options:
 
 .. option:: <input-file>
 
-   The input LLVM IR or bitcode file to process. This positional argument is
-   required for the `triplets` subcommand.
+   The input LLVM IR/bitcode file (.ll/.bc) or Machine IR file (.mir) to process. 
+   This positional argument is required for the `triplets` subcommand.
 
 **entities** subcommand:
 
-   No subcommand-specific options.
+.. option:: <input-file>
+
+   The input Machine IR file (.mir) to process. This positional argument is required
+   for the `entities` subcommand when using ``--mode=mir``, as the entity mappings
+   are target-specific. For ``--mode=llvm``, no input file is required as IR2Vec
+   entity mappings are target-independent.
 
 OUTPUT FORMAT
 -------------
@@ -240,19 +265,37 @@ metadata headers. The format includes:
 
 .. code-block:: text
 
-   MAX_RELATIONS=<max_relations_count>
+   MAX_RELATION=<max_relation_count>
    <head_entity_id> <tail_entity_id> <relation_id>
    <head_entity_id> <tail_entity_id> <relation_id>
    ...
 
 Each line after the metadata header represents one instruction relationship,
-with numeric IDs for head entity, relation, and tail entity. The metadata 
-header (MAX_RELATIONS) provides counts for post-processing and training setup.
+with numeric IDs for head entity, tail entity, and relation type. The metadata 
+header (MAX_RELATION) indicates the maximum relation ID used.
+
+**Relation Types:**
+
+For LLVM IR (IR2Vec):
+  * **0** = Type relationship (instruction to its type)
+  * **1** = Next relationship (sequential instructions)
+  * **2+** = Argument relationships (Arg0, Arg1, Arg2, ...)
+
+For Machine IR (MIR2Vec):
+  * **0** = Next relationship (sequential instructions)
+  * **1+** = Argument relationships (Arg0, Arg1, Arg2, ...)
+
+**Entity IDs:**
+
+For LLVM IR: Entity IDs represent opcodes, types, and operands as defined by the IR2Vec vocabulary.
+
+For Machine IR: Entity IDs represent machine opcodes, common operands (immediate, frame index, etc.),
+physical register classes, and virtual register classes as defined by the MIR2Vec vocabulary. The entity layout is target-specific.
 
 Entity Mode Output
 ~~~~~~~~~~~~~~~~~~
 
-In entity mode, the output consists of entity mapping in the format:
+In entity mode, the output consists of entity mappings in the format:
 
 .. code-block:: text
 
@@ -264,6 +307,13 @@ In entity mode, the output consists of entity mapping in the format:
 The first line contains the total number of entities, followed by one entity
 mapping per line with tab-separated entity string and numeric ID.
 
+For LLVM IR, entities include instruction opcodes (e.g., "Add", "Ret"), types 
+(e.g., "INT", "PTR"), and operand kinds.
+
+For Machine IR, entities include machine opcodes (e.g., "COPY", "ADD"), 
+common operands (e.g., "Immediate", "FrameIndex"), physical register classes 
+(e.g., "PhyReg_GR32"), and virtual register classes (e.g., "VirtReg_GR32").
+
 Embedding Mode Output
 ~~~~~~~~~~~~~~~~~~~~~
 
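The triplet format documented in this patch (a ``MAX_RELATION=`` header followed by ``head tail relation`` lines) can be consumed with a small reader. The following is a minimal sketch, not part of the tool: the names ``Triplet``, ``TripletFile``, and ``parseTriplets`` are hypothetical, and it assumes exactly one header line and whitespace-separated non-negative integer IDs.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container types for this sketch; not part of llvm-ir2vec.
struct Triplet {
  unsigned Head = 0;
  unsigned Tail = 0;
  unsigned Relation = 0;
};

struct TripletFile {
  unsigned MaxRelation = 0;
  std::vector<Triplet> Triplets;
};

// Reads "MAX_RELATION=<n>" followed by "<head> <tail> <relation>" lines.
// Returns false on a missing header, a malformed line, or a relation ID
// larger than the value announced in the header.
bool parseTriplets(std::istream &In, TripletFile &Out) {
  const std::string Key = "MAX_RELATION=";
  std::string Line;
  if (!std::getline(In, Line) || Line.compare(0, Key.size(), Key) != 0)
    return false;
  std::istringstream Header(Line.substr(Key.size()));
  if (!(Header >> Out.MaxRelation))
    return false;

  Out.Triplets.clear();
  while (std::getline(In, Line)) {
    if (Line.empty())
      continue; // Tolerate blank lines (an assumption of this sketch).
    std::istringstream Fields(Line);
    Triplet T;
    if (!(Fields >> T.Head >> T.Tail >> T.Relation))
      return false;
    if (T.Relation > Out.MaxRelation)
      return false;
    Out.Triplets.push_back(T);
  }
  return true;
}
```

Note the column order: the relation ID is the third field, after the head and tail entity IDs.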
diff --git a/llvm/docs/Extensions.rst b/llvm/docs/Extensions.rst
index 214323e..91a3ac0 100644
--- a/llvm/docs/Extensions.rst
+++ b/llvm/docs/Extensions.rst
@@ -416,7 +416,36 @@ as offsets relative to prior addresses.
 The following versioning schemes are currently supported (newer versions support
 features of the older versions).
 
-Version 3 (newest): Capable of encoding callsite offsets. Enabled by the 6th bit
+Version 4 (newest): Capable of encoding basic block hashes. This feature is
+enabled by the 7th bit of the feature byte.
+
+Example:
+
+.. code-block:: gas
+
+  .section  ".llvm_bb_addr_map","",@llvm_bb_addr_map
+  .byte     4                             # version number
+  .byte     96                            # feature byte
+  .quad     .Lfunc_begin0                 # address of the function
+  .byte     2                             # number of basic blocks
+  # BB record for BB_0
+   .byte     0                            # BB_0 ID
+   .uleb128  .Lfunc_begin0-.Lfunc_begin0  # BB_0 offset relative to function entry (always zero)
+   .byte     0                            # number of callsites in this block
+   .uleb128  .LBB_END0_0-.Lfunc_begin0    # BB_0 size
+   .byte     x                            # BB_0 metadata
+   .quad     9080480745856761856          # BB_0 hash
+  # BB record for BB_1
+   .byte     1                            # BB_1 ID
+   .uleb128  .LBB0_1-.LBB_END0_0          # BB_1 offset relative to the end of last block (BB_0).
+   .byte     2                            # number of callsites in this block
+   .uleb128  .LBB0_1_CS0-.LBB0_1          # offset of callsite end relative to the previous offset (.LBB0_1)
+   .uleb128  .LBB0_1_CS1-.LBB0_1_CS0      # offset of callsite end relative to the previous offset (.LBB0_1_CS0)
+   .uleb128  .LBB_END0_1-.LBB0_1_CS1      # BB_1 size offset (Offset of the block end relative to the previous offset).
+   .byte     y                            # BB_1 metadata
+   .quad     2363478788702666771          # BB_1 hash
+
+Version 3: Capable of encoding callsite offsets. Enabled by the 6th bit
 of the feature byte.
 
 Example:
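The feature-byte description above can be checked with a few lines of code. This is a minimal sketch under one assumption: bits are counted from 1 starting at the least significant bit, which is the reading consistent with the Version 4 example value 96 (32 + 64). The constant and function names are hypothetical and are not LLVM API.

```cpp
#include <cstdint>

// Hypothetical masks for this sketch, assuming 1-based bit numbering from
// the least significant bit.
constexpr std::uint8_t CallsiteOffsetsMask = 1u << 5; // 6th bit (Version 3)
constexpr std::uint8_t BBHashMask = 1u << 6;          // 7th bit (Version 4)

// True if the feature byte enables callsite offsets.
constexpr bool hasCallsiteOffsets(std::uint8_t FeatureByte) {
  return (FeatureByte & CallsiteOffsetsMask) != 0;
}

// True if the feature byte enables basic block hashes.
constexpr bool hasBBHash(std::uint8_t FeatureByte) {
  return (FeatureByte & BBHashMask) != 0;
}
```

Under this reading, the feature byte 96 in the Version 4 example enables both callsite offsets and basic block hashes.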
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 0339101..1c6823b 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -11455,9 +11455,9 @@ If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering
 <ordering>` and optional ``syncscope("<target-scope>")`` argument. The
 ``release`` and ``acq_rel`` orderings are not valid on ``load`` instructions.
 Atomic loads produce :ref:`defined <memmodel>` results when they may see
-multiple atomic stores. The type of the pointee must be an integer, pointer, or
-floating-point type whose bit width is a power of two greater than or equal to
-eight. ``align`` must be
+multiple atomic stores. The type of the pointee must be an integer, pointer,
+floating-point, or vector type whose bit width is a power of two greater than
+or equal to eight. ``align`` must be
 explicitly specified on atomic loads. Note: if the alignment is not greater or
 equal to the size of the `<value>` type, the atomic operation is likely to
 require a lock and have poor performance. ``!nontemporal`` does not have any
@@ -11594,9 +11594,9 @@ If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
 <ordering>` and optional ``syncscope("<target-scope>")`` argument. The
 ``acquire`` and ``acq_rel`` orderings aren't valid on ``store`` instructions.
 Atomic loads produce :ref:`defined <memmodel>` results when they may see
-multiple atomic stores. The type of the pointee must be an integer, pointer, or
-floating-point type whose bit width is a power of two greater than or equal to
-eight. ``align`` must be
+multiple atomic stores. The type of the pointee must be an integer, pointer,
+floating-point, or vector type whose bit width is a power of two greater than
+or equal to eight. ``align`` must be
 explicitly specified on atomic stores. Note: if the alignment is not greater or
 equal to the size of the `<value>` type, the atomic operation is likely to
 require a lock and have poor performance. ``!nontemporal`` does not have any
diff --git a/llvm/docs/MLGO.rst b/llvm/docs/MLGO.rst
index bf3de11..2443835 100644
--- a/llvm/docs/MLGO.rst
+++ b/llvm/docs/MLGO.rst
@@ -434,8 +434,27 @@ The latter is also used in tests.
 There is no C++ implementation of a log reader. We do not have a scenario
 motivating one.
 
-IR2Vec Embeddings
-=================
+Embeddings
+==========
+
+LLVM provides embedding frameworks to generate vector representations of code
+at different abstraction levels. These embeddings capture syntactic, semantic,
+and structural properties of the code and can be used as features for machine
+learning models in various compiler optimization tasks.
+
+Two embedding frameworks are available:
+
+- **IR2Vec**: Generates embeddings for LLVM IR
+- **MIR2Vec**: Generates embeddings for Machine IR
+
+Both frameworks follow a similar architecture with vocabulary-based embedding
+generation, where a vocabulary maps code entities to n-dimensional floating
+point vectors. These embeddings can be computed at multiple granularity levels
+(instruction, basic block, and function) and used for ML-guided compiler
+optimizations.
+
+IR2Vec
+------
 
 IR2Vec is a program embedding approach designed specifically for LLVM IR. It
 is implemented as a function analysis pass in LLVM. The IR2Vec embeddings
@@ -466,7 +485,7 @@ The core components are:
     compute embeddings for instructions, basic blocks, and functions.
 
 Using IR2Vec
-------------
+^^^^^^^^^^^^
 
 .. note::
 
@@ -526,7 +545,7 @@ embeddings can be computed and accessed via an ``ir2vec::Embedder`` instance.
    between different code snippets, or perform other analyses as needed.
 
 Further Details
----------------
+^^^^^^^^^^^^^^^
 
 For more detailed information about the IR2Vec algorithm, its parameters, and
 advanced usage, please refer to the original paper:
@@ -538,6 +557,123 @@ triplets from LLVM IR, see :doc:`CommandGuide/llvm-ir2vec`.
 The LLVM source code for ``IR2Vec`` can also be explored to understand the 
 implementation details.
 
+MIR2Vec
+-------
+
+MIR2Vec is an extension of IR2Vec designed specifically for LLVM Machine IR 
+(MIR). It generates embeddings for machine-level instructions, basic blocks, 
+and functions. MIR2Vec operates on the target-specific machine representation,
+capturing machine instruction semantics including opcodes, operands, and 
+register information at the machine level.
+
+MIR2Vec extends the vocabulary to include:
+
+- **Machine Opcodes**: Target-specific instruction opcodes derived from the
+  TargetInstrInfo, grouped by instruction semantics.
+
+- **Common Operands**: All common operand types (excluding register operands),
+  defined by the ``MachineOperand::MachineOperandType`` enum.
+
+- **Physical Register Classes**: Register classes defined by the target,
+  specialized for physical registers.
+
+- **Virtual Register Classes**: Register classes defined by the target,
+  specialized for virtual registers.
+
+The core components are:
+
+- **Vocabulary**: A mapping from machine IR entities (opcodes, operands, register
+  classes) to their vector representations. This is managed by 
+  ``MIR2VecVocabLegacyAnalysis`` for the legacy pass manager, with a 
+  ``MIR2VecVocabProvider`` that can be used standalone or wrapped by pass 
+  managers. The vocabulary (.json file) contains sections for opcodes, common 
+  operands, physical register classes, and virtual register classes.
+
+  .. note::
+    
+    The vocabulary file must contain all of these sections to be valid.
+
+- **Embedder**: A class (``mir2vec::MIREmbedder``) that uses the vocabulary to
+  compute embeddings for machine instructions, machine basic blocks, and 
+  machine functions. Currently, ``SymbolicMIREmbedder`` is the available 
+  implementation.
+
+Using MIR2Vec
+^^^^^^^^^^^^^
+
+.. note::
+
+   This section describes how to use MIR2Vec within LLVM passes. The
+   :program:`llvm-ir2vec` tool (see :doc:`CommandGuide/llvm-ir2vec`) can be
+   used to generate MIR2Vec embeddings from Machine IR files (.mir), which is
+   useful for generating embeddings outside of compiler passes.
+
+To generate MIR2Vec embeddings in a compiler pass, first obtain the vocabulary,
+then create an embedder instance to compute and access embeddings.
+
+1. **Get the Vocabulary**:
+   In a MachineFunctionPass, get the vocabulary from the analysis:
+
+   .. code-block:: c++
+
+      auto &VocabAnalysis = getAnalysis<MIR2VecVocabLegacyAnalysis>();
+      auto VocabOrErr = VocabAnalysis.getMIR2VecVocabulary(*MF.getFunction().getParent());
+      if (!VocabOrErr) {
+        // Handle error: vocabulary is not available or invalid
+        return;
+      }
+      const mir2vec::MIRVocabulary &Vocabulary = *VocabOrErr;
+
+   Note that ``MIR2VecVocabLegacyAnalysis`` is an immutable pass.
+
+2. **Create Embedder instance**:
+   With the vocabulary, create an embedder for a specific machine function:
+
+   .. code-block:: c++
+
+      // Assuming MF is a MachineFunction&
+      // For example, using MIR2VecKind::Symbolic:
+      std::unique_ptr<mir2vec::MIREmbedder> Emb =
+          mir2vec::MIREmbedder::create(MIR2VecKind::Symbolic, MF, Vocabulary);
+
+
+3. **Compute and Access Embeddings**:
+   Call ``getMFunctionVector()`` to get the embedding for the machine function.
+
+   .. code-block:: c++
+
+      mir2vec::Embedding FuncVector = Emb->getMFunctionVector();
+
+   Currently, ``MIREmbedder`` can generate embeddings at three levels: Machine
+   Instructions, Machine Basic Blocks, and Machine Functions. Appropriate 
+   getters are provided to access the embeddings at these levels.
+
+   .. note::
+
+    The validity of the ``MIREmbedder`` instance (and the embeddings it 
+    generates) is tied to the machine function it is associated with. If the 
+    machine function is modified, the embeddings may become stale and should 
+    be recomputed accordingly.
+
+4. **Working with Embeddings:**
+   Embeddings are represented as ``std::vector<double>``. These vectors can be
+   used as features for machine learning models, to compute similarity scores
+   between different code snippets, or for other analyses as needed.
+
+Further Details
+^^^^^^^^^^^^^^^
+
+For more detailed information about the MIR2Vec algorithm, its parameters, and
+advanced usage, please refer to the original paper:
+`RL4ReAl: Reinforcement Learning for Register Allocation <https://doi.org/10.1145/3578360.3580273>`_.
+
+For information about using the :program:`llvm-ir2vec` tool to generate
+embeddings from Machine IR, see :doc:`CommandGuide/llvm-ir2vec`.
+
+The LLVM source code for ``MIR2Vec`` can be explored to understand the 
+implementation details. See ``llvm/include/llvm/CodeGen/MIR2Vec.h`` and 
+``llvm/lib/CodeGen/MIR2Vec.cpp``.
+
 Building with ML support
 ========================
 
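The "Working with Embeddings" step in the MIR2Vec section mentions computing similarity scores between embeddings. The following is a minimal sketch of one common choice, cosine similarity, taking the plain ``std::vector<double>`` representation stated in that section. The function name ``cosineSimilarity`` is hypothetical and is not part of the MIR2Vec API; returning 0.0 for mismatched or zero-length inputs is a convention chosen for this sketch.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Cosine similarity of two embeddings of equal dimension.
// Returns 0.0 when the dimensions differ, the vectors are empty, or either
// vector has zero norm (a convention of this sketch, not of MIR2Vec).
double cosineSimilarity(const std::vector<double> &A,
                        const std::vector<double> &B) {
  if (A.size() != B.size() || A.empty())
    return 0.0;
  double Dot = 0.0, NormA = 0.0, NormB = 0.0;
  for (std::size_t I = 0; I < A.size(); ++I) {
    Dot += A[I] * B[I];
    NormA += A[I] * A[I];
    NormB += B[I] * B[I];
  }
  if (NormA == 0.0 || NormB == 0.0)
    return 0.0;
  return Dot / (std::sqrt(NormA) * std::sqrt(NormB));
}
```

Cosine similarity ignores vector magnitude, so it compares the direction of two function embeddings regardless of how many instructions contributed to each.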
diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md
index 754cd40..6bd3278 100644
--- a/llvm/docs/ReleaseNotes.md
+++ b/llvm/docs/ReleaseNotes.md
@@ -66,6 +66,7 @@ Changes to the LLVM IR
   `@llvm.masked.gather` and `@llvm.masked.scatter` intrinsics has been removed.
   Instead, the `align` attribute should be placed on the pointer (or vector of
   pointers) argument.
+* A `load atomic` may now be used with vector types on x86.
 
 Changes to LLVM infrastructure
 ------------------------------
diff --git a/llvm/docs/TableGen/index.rst b/llvm/docs/TableGen/index.rst
index e916c15..e334b2e 100644
--- a/llvm/docs/TableGen/index.rst
+++ b/llvm/docs/TableGen/index.rst
@@ -20,7 +20,7 @@ domain-specific information.  Because there may be a large number of these
 records, it is specifically designed to allow writing flexible descriptions and
 for common features of these records to be factored out.  This reduces the
 amount of duplication in the description, reduces the chance of error, and makes
-it easier to structure domain specific information.
+it easier to structure domain-specific information.
 
 The TableGen front end parses a file, instantiates the declarations, and
 hands the result off to a domain-specific `backend`_ for processing.  See
@@ -44,8 +44,8 @@ distribution, respectively.
 The TableGen program
 ====================
 
-TableGen files are interpreted by the TableGen program: `llvm-tblgen` available
-on your build directory under `bin`. It is not installed in the system (or where
+TableGen files are interpreted by the TableGen program: ``llvm-tblgen`` available
+in your build directory under ``bin``. It is not installed in the system (or where
 your sysroot is set to), since it has no use beyond LLVM's build process.
 
 Running TableGen
@@ -86,7 +86,7 @@ the `-dump-json` option.
 
 If you plan to use TableGen, you will most likely have to write a `backend`_
 that extracts the information specific to what you need and formats it in the
-appropriate way. You can do this by extending TableGen itself in C++, or by
+appropriate way. You can do this by extending TableGen itself in C++ or by
 writing a script in any language that can consume the JSON output.
 
 Example
@@ -152,7 +152,7 @@ of the x86 architecture.  ``def ADD32rr`` defines a record named
 ``ADD32rr``, and the comment at the end of the line indicates the superclasses
 of the definition.  The body of the record contains all of the data that
 TableGen assembled for the record, indicating that the instruction is part of
-the "X86" namespace, the pattern indicating how the instruction is selected by
+the ``X86`` namespace, the pattern indicating how the instruction is selected by
 the code generator, that it is a two-address instruction, has a particular
 encoding, etc.  The contents and semantics of the information in the record are
 specific to the needs of the X86 backend, and are only shown as an example.
@@ -175,13 +175,13 @@ TableGen, all of the information was derived from the following definition:
 This definition makes use of the custom class ``I`` (extended from the custom
 class ``X86Inst``), which is defined in the X86-specific TableGen file, to
 factor out the common features that instructions of its class share.  A key
-feature of TableGen is that it allows the end-user to define the abstractions
+feature of TableGen is that it allows the end user to define the abstractions
 they prefer to use when describing their information.
 
 Syntax
 ======
 
-TableGen has a syntax that is loosely based on C++ templates, with built-in
+TableGen has a syntax loosely based on C++ templates, with built-in
 types and specification. In addition, TableGen's syntax introduces some
 automation concepts like multiclass, foreach, let, etc.
 
@@ -193,41 +193,41 @@ which are considered 'records'.
 
 **TableGen records** have a unique name, a list of values, and a list of
 superclasses.  The list of values is the main data that TableGen builds for each
-record; it is this that holds the domain specific information for the
+record; it is this that holds the domain-specific information for the
 application.  The interpretation of this data is left to a specific `backend`_,
-but the structure and format rules are taken care of and are fixed by
+but the structure and format rules are taken care of and fixed by
 TableGen.
 
 **TableGen definitions** are the concrete form of 'records'.  These generally do
-not have any undefined values, and are marked with the '``def``' keyword.
+not have any undefined values and are marked with the '``def``' keyword.
 
 .. code-block:: text
 
   def FeatureFPARMv8 : SubtargetFeature<"fp-armv8", "HasFPARMv8", "true",
                                         "Enable ARMv8 FP">;
 
-In this example, FeatureFPARMv8 is ``SubtargetFeature`` record initialised
+In this example, ``FeatureFPARMv8`` is a ``SubtargetFeature`` record initialised
 with some values. The names of the classes are defined via the
-keyword `class` either on the same file or some other included. Most target
+keyword ``class`` either in the same file or in another included file. Most target
 TableGen files include the generic ones in ``include/llvm/Target``.
 
 **TableGen classes** are abstract records that are used to build and describe
 other records.  These classes allow the end-user to build abstractions for
-either the domain they are targeting (such as "Register", "RegisterClass", and
-"Instruction" in the LLVM code generator) or for the implementor to help factor
-out common properties of records (such as "FPInst", which is used to represent
+either the domain they are targeting (such as ``Register``, ``RegisterClass``, and
+``Instruction`` in the LLVM code generator) or for the implementor to help factor
+out common properties of records (such as ``FPInst``, which is used to represent
 floating point instructions in the X86 backend).  TableGen keeps track of all of
 the classes that are used to build up a definition, so the backend can find all
-definitions of a particular class, such as "Instruction".
+definitions of a particular class, such as ``Instruction``.
 
 .. code-block:: text
 
  class ProcNoItin<string Name, list<SubtargetFeature> Features>
        : Processor<Name, NoItineraries, Features>;
 
-Here, the class ProcNoItin, receiving parameters `Name` of type `string` and
-a list of target features is specializing the class Processor by passing the
-arguments down as well as hard-coding NoItineraries.
+Here, the class ``ProcNoItin``, receiving parameters ``Name`` of type ``string`` and
+a list of target features, specializes the class ``Processor`` by passing the
+arguments down as well as hard-coding ``NoItineraries``.
 
 **TableGen multiclasses** are groups of abstract records that are instantiated
 all at once.  Each instantiation can result in multiple TableGen definitions.
@@ -295,8 +295,8 @@ TableGen Deficiencies
 
 Despite being very generic, TableGen has some deficiencies that have been
 pointed out numerous times. The common theme is that, while TableGen allows
-you to build domain specific languages, the final languages that you create
-lack the power of other DSLs, which in turn increase considerably the size
+you to build domain-specific languages, the final languages that you create
+lack the power of other DSLs, which in turn considerably increases the size
 and complexity of TableGen files.
 
 At the same time, TableGen allows you to create virtually any meaning of
@@ -305,6 +305,6 @@ design and make it very hard for newcomers to understand the evil TableGen
 file.
 
 There are some in favor of extending the semantics even more, but making sure
-backends adhere to strict rules. Others are suggesting we should move to less,
+backends adhere to strict rules. Others suggest moving to fewer,
 more powerful DSLs designed with specific purposes, or even reusing existing
 DSLs.