riscv-gnu-toolchain/llvm.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author	Files	Lines
2016-04-13	Calculate __builtin_object_size when pointer depends on a condition	Petar Jovanovic	1	-14/+35
	This patch fixes calculating of builtin_object_size if it depends on a condition. Before this patch compiler did not know how to calculate the object size when it finds a condition that cannot be eliminated. This patch enables calculating of builtin_object_size even in case when condition cannot be eliminated by choosing minimum or maximum value as a result from condition. Choosing minimum or maximum value from condition is based on the second argument of __builtin_object_size function. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D18438 llvm-svn: 266193
2016-04-13	[InstCombine] We folded an fcmp to an i1 instead of a vector of i1	David Majnemer	3	-65/+71
	Remove an ad-hoc transform in InstCombine and replace it with more general machinery (ValueTracking, InstructionSimplify and VectorUtils). This fixes PR27332. llvm-svn: 266175
2016-04-12	Add space between words in verify-scev-maps option help message	Jeroen Ketema	1	-1/+1
	llvm-svn: 266149
2016-04-12	Add the allocsize attribute to LLVM.	George Burgess IV	1	-55/+101
	`allocsize` is a function attribute that allows users to request that LLVM treat arbitrary functions as allocation functions. This patch makes LLVM accept the `allocsize` attribute, and makes `@llvm.objectsize` recognize said attribute. The review for this was split into two patches for ease of reviewing: D18974 and D14933. As promised on the revisions, I'm landing both patches as a single commit. Differential Revision: http://reviews.llvm.org/D14933 llvm-svn: 266032
2016-04-11	This reverts commit r265913 and r265912	Sanjoy Das	2	-110/+5
	See PR27315 r265913: "[IndVars] Eliminate op.with.overflow when possible" r265912: "[SCEV] See through op.with.overflow intrinsics" llvm-svn: 265950
2016-04-11	[ThinLTO] Move summary computation from BitcodeWriter to new pass	Teresa Johnson	3	-0/+188
	Summary: This is the first step in also serializing the index out to LLVM assembly. The per-module summary written to bitcode is moved out of the bitcode writer and to a new analysis pass (ModuleSummaryIndexWrapperPass). The pass itself uses a new builder class to compute index, and the builder class is used directly in places where we don't have a pass manager (e.g. llvm-as). Because we are computing summaries outside of the bitcode writer, we no longer can use value ids created by the bitcode writer's ValueEnumerator. This required changing the reference graph edge type to use a new ValueInfo class holding a union between a GUID (combined index) and Value* (permodule index). The Value* are converted to the appropriate value ID during bitcode writing. Also, this enables removal of the BitWriter library's dependence on the Analysis library that was previously required for the summary computation. Reviewers: joker.eph Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18763 llvm-svn: 265941
2016-04-10	[SCEV] See through op.with.overflow intrinsics	Sanjoy Das	2	-5/+110
	Summary: This change teaches SCEV to see reduce `(extractvalue 0 (op.with.overflow X Y))` into `op X Y` (with a no-wrap tag if possible). Reviewers: atrick, regehr Subscribers: mcrosier, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18684 llvm-svn: 265912
2016-04-08	Refactor Threshold computation. NFC.	Easwaran Raman	1	-22/+35
	This is part of changes reviewed in http://reviews.llvm.org/D17584. llvm-svn: 265852
2016-04-08	Propagate Undef in llvm.cos Intrinsic	Sanjoy Das	1	-0/+5
	Summary: The llvm cos intrinsic currently does not propagate undef's. This change transforms cos(undef) to null value or 0. There are 2 test cases added as well. Patch by Anna Thomas! Reviewers: sanjoy Subscribers: majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D18863 llvm-svn: 265825
2016-04-08	Re-commit [SCEV] Introduce a guarded backedge taken count and use it in LAA ↵	Silviu Baranga	3	-89/+234
	and LV This re-commits r265535 which was reverted in r265541 because it broke the windows bots. The problem was that we had a PointerIntPair which took a pointer to a struct allocated with new. The problem was that new doesn't provide sufficient alignment guarantees. This pattern was already present before r265535 and it just happened to work. To fix this, we now separate the PointerToIntPair from the ExitNotTakenInfo struct into a pointer and a bool. Original commit message: Summary: When the backedge taken codition is computed from an icmp, SCEV can deduce the backedge taken count only if one of the sides of the icmp is an AddRecExpr. However, due to sign/zero extensions, we sometimes end up with something that is not an AddRecExpr. However, we can use SCEV predicates to produce a 'guarded' expression. This change adds a method to SCEV to get this expression, and the SCEV predicate associated with it. In HowManyGreaterThans and HowManyLessThans we will now add a SCEV predicate associated with the guarded backedge taken count when the analyzed SCEV expression is not an AddRecExpr. Note that we only do this as an alternative to returning a 'CouldNotCompute'. We use new feature in Loop Access Analysis and LoopVectorize to analyze and transform more loops. Reviewers: anemet, mzolotukhin, hfinkel, sanjoy Subscribers: flyingforyou, mcrosier, atrick, mssimpso, sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17201 llvm-svn: 265786
2016-04-08	Don't IPO over functions that can be de-refined	Sanjoy Das	9	-20/+22
	Summary: Fixes PR26774. If you're aware of the issue, feel free to skip the "Motivation" section and jump directly to "This patch". Motivation: I define "refinement" as discarding behaviors from a program that the optimizer has license to discard. So transforming: ``` void f(unsigned x) { unsigned t = 5 / x; (void)t; } ``` to ``` void f(unsigned x) { } ``` is refinement, since the behavior went from "if x == 0 then undefined else nothing" to "nothing" (the optimizer has license to discard undefined behavior). Refinement is a fundamental aspect of many mid-level optimizations done by LLVM. For instance, transforming `x == (x + 1)` to `false` also involves refinement since the expression's value went from "if x is `undef` then { `true` or `false` } else { `false` }" to "`false`" (by definition, the optimizer has license to fold `undef` to any non-`undef` value). Unfortunately, refinement implies that the optimizer cannot assume that the implementation of a function it can see has all of the behavior an unoptimized or a differently optimized version of the same function can have. This is a problem for functions with comdat linkage, where a function can be replaced by an unoptimized or a differently optimized version of the same source level function. For instance, FunctionAttrs cannot assume a comdat function is actually `readnone` even if it does not have any loads or stores in it; since there may have been loads and stores in the "original function" that were refined out in the currently visible variant, and at the link step the linker may in fact choose an implementation with a load or a store. As an example, consider a function that does two atomic loads from the same memory location, and writes to memory only if the two values are not equal. The optimizer is allowed to refine this function by first CSE'ing the two loads, and the folding the comparision to always report that the two values are equal. Such a refined variant will look like it is `readonly`. However, the unoptimized version of the function can still write to memory (since the two loads //can// result in different values), and selecting the unoptimized version at link time will retroactively invalidate transforms we may have done under the assumption that the function does not write to memory. Note: this is not just a problem with atomics or with linking differently optimized object files. See PR26774 for more realistic examples that involved neither. This patch: This change introduces a new set of linkage types, predicated as `GlobalValue::mayBeDerefined` that returns true if the linkage type allows a function to be replaced by a differently optimized variant at link time. It then changes a set of IPO passes to bail out if they see such a function. Reviewers: chandlerc, hfinkel, dexonsmith, joker.eph, rnk Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18634 llvm-svn: 265762
2016-04-07	Const correctness for BranchProbabilityInfo (NFC)	Mehdi Amini	1	-25/+26
	From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265731
2016-04-06	NFC: make AtomicOrdering an enum class	JF Bastien	3	-12/+11
	Summary: In the context of http://wg21.link/lwg2445 C++ uses the concept of 'stronger' ordering but doesn't define it properly. This should be fixed in C++17 barring a small question that's still open. The code currently plays fast and loose with the AtomicOrdering enum. Using an enum class is one step towards tightening things. I later also want to tighten related enums, such as clang's AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI' enum). This change touches a few lines of code which can be improved later, I'd like to keep it as NFC for now as it's already quite complex. I have related changes for clang. As a follow-up I'll add: bool operator<(AtomicOrdering, AtomicOrdering) = delete; bool operator>(AtomicOrdering, AtomicOrdering) = delete; bool operator<=(AtomicOrdering, AtomicOrdering) = delete; bool operator>=(AtomicOrdering, AtomicOrdering) = delete; This is separate so that clang and LLVM changes don't need to be in sync. Reviewers: jyknight, reames Subscribers: jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D18775 llvm-svn: 265602
2016-04-06	Revert r265535 until we know how we can fix the bots	Silviu Baranga	3	-234/+89
	llvm-svn: 265541
2016-04-06	[SCEV] Introduce a guarded backedge taken count and use it in LAA and LV	Silviu Baranga	3	-89/+234
	Summary: When the backedge taken codition is computed from an icmp, SCEV can deduce the backedge taken count only if one of the sides of the icmp is an AddRecExpr. However, due to sign/zero extensions, we sometimes end up with something that is not an AddRecExpr. However, we can use SCEV predicates to produce a 'guarded' expression. This change adds a method to SCEV to get this expression, and the SCEV predicate associated with it. In HowManyGreaterThans and HowManyLessThans we will now add a SCEV predicate associated with the guarded backedge taken count when the analyzed SCEV expression is not an AddRecExpr. Note that we only do this as an alternative to returning a 'CouldNotCompute'. We use new feature in Loop Access Analysis and LoopVectorize to analyze and transform more loops. Reviewers: anemet, mzolotukhin, hfinkel, sanjoy Subscribers: flyingforyou, mcrosier, atrick, mssimpso, sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17201 llvm-svn: 265535
2016-04-06	[SLPVectorizer] Vectorizing the libm sqrt to llvm's sqrt intrinsic requires nnan	David Majnemer	1	-1/+3
	To quote the langref "Unlike sqrt in libm, however, llvm.sqrt has undefined behavior for negative numbers other than -0.0 (which allows for better optimization, because there is no need to worry about errno being set). llvm.sqrt(-0.0) is defined to return -0.0 like IEEE sqrt." This means that it's unsafe to replace sqrt with llvm.sqrt unless the call is annotated with nnan. Thanks to Hal Finkel for pointing this out! llvm-svn: 265521
2016-04-06	[SLPVectorizer] Vectorize libcalls of sqrt	David Majnemer	1	-0/+4
	We didn't realize that we could transform the libcall into a vectorized intrinsic. llvm-svn: 265493
2016-04-05	[CFLAA] Fix PR27213; incorrect tagging of args/globals	George Burgess IV	1	-1/+5
	Prior to this patch, CFLAA wouldn't tag arguments/globals properly if it didn't find any "interesting" edges on them. This means that, if all you do is store constants to a global or argument, we would never actually treat it as a global/argument. Test case: define void @foo(i32* %A, i32* %B) #0 { entry: store i32 0, i32* %A, align 4 store i32 0, i32* %B, align 4 ret void } CFLAA would say that %A can't alias %B, because neither pointer was used in an interesting way. This patch makes us note whether something is an argument, global, ... regardless of how interesting CFLAA thinks its uses are. (For the record, using a value in an interesting way means loading from it, using it in a GEP, ...) llvm-svn: 265474
2016-04-05	Minor code cleanups. NFC.	Junmo Park	1	-2/+2
	llvm-svn: 265468
2016-04-05	IR: Introduce ConstantAggregate, NFC	Duncan P. N. Exon Smith	1	-4/+3
	Add a common parent class for ConstantArray, ConstantVector, and ConstantStruct called ConstantAggregate. These are the aggregate subclasses of Constant that take operands. This is mainly a cleanup, adding common `isa` target and removing duplicated code. However, it also simplifies caching which constants point transitively at `GlobalValue` (a possible future direction). llvm-svn: 265466
2016-04-04	[DependenceAnalysis] Check if result of getConstantPart is null	Brendon Cahoon	1	-0/+6
	A seg-fault occurs due to a reference of a null pointer, which is the value returned by getConstantPart. This function returns null if the constant part is not found. The code that calls this function needs to check for the null return value. Differential Revision: http://reviews.llvm.org/D18718 llvm-svn: 265319
2016-04-03	Mark some FP intrinsics as safe to speculatively execute	Peter Zotov	1	-4/+16
	Floating point intrinsics in LLVM are generally not speculatively executed, since most of them are defined to behave the same as libm functions, which set errno. However, the only error that can happen when executing ceil, floor, nearbyint, rint and round libm functions per POSIX.1-2001 is -ERANGE, and that requires the maximum value of the exponent to be smaller than the number of mantissa bits, which is not the case with any of the floating point types supported by LLVM. The trunc and copysign functions never set errno per per POSIX.1-2001. Differential Revision: http://reviews.llvm.org/D18643 llvm-svn: 265262
2016-04-02	Fix "warning: variabl 'XX’ set but not used" in release build (variable ↵	Mehdi Amini	1	-1/+1
	used in assertion, NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265220
2016-03-31	[NVPTX] Infer __nvvm_reflect as nounwind, readnone	David Majnemer	1	-1/+5
	This patch simply mirrors the attributes we give to @llvm.nvvm.reflect to the __nvvm_reflect libdevice call. This shaves about 30% of the code in libdevice away because of CSE opportunities. It's also helps us figure out that libdevice implementations of transcendental functions don't have side-effects. llvm-svn: 265060
2016-03-31	[InstCombine] Fix incorrect rule from rL236202	Sanjoy Das	1	-1/+2
	The rule for SMIN introduced in rL236202 doesn't work as advertised: the check for Pred == ICmpInst::ICMP_SGT was missing. llvm-svn: 264996
2016-03-31	Delete trailing whitespace	Sanjoy Das	1	-1/+1
	llvm-svn: 264995
2016-03-31	[SCEV] Track NoWrap properties using MatchBinaryOp, NFC	Sanjoy Das	1	-7/+16
	This way once we teach MatchBinaryOp to map more things into arithmetic, the non-wrapping add recurrence construction would understand it too. Right now MatchBinaryOp still only understands arithmetic, so this is solely a code-reorganization change. llvm-svn: 264994
2016-03-31	[SCEV] NFC code motion to simplify later change	Sanjoy Das	1	-77/+77
	llvm-svn: 264993
2016-03-30	[VectorUtils] Don't try and truncate PHIs to a smaller bitwidth	James Molloy	1	-0/+15
	We already try not to truncate PHIs in computeMinimalBitwidths. LoopVectorize can't handle it and we really don't need to, because both induction and reduction PHIs are truncated by other means. However, we weren't bailing out in all the places we should have, and we ended up by returning a PHI to be truncated, which has caused PR27018. This fixes PR17018. llvm-svn: 264852
2016-03-29	[SCEV] Extract out a MatchBinaryOp; NFCI	Sanjoy Das	1	-222/+284
	MatchBinaryOp abstracts out the IR instructions from the operations they represent. While this change is NFC, we will use this factoring later to map things like `(extractvalue 0 (sadd.with.overflow X Y))` to `(add X Y)`. llvm-svn: 264747
2016-03-29	[SCEV] Use Operator::getOpcode instead of manual dispatch; NFC	Sanjoy Das	1	-8/+3
	llvm-svn: 264746
2016-03-28	Fix Clang-tidy modernize-deprecated-headers warnings in some files; other ↵	Eugene Zelenko	1	-61/+63
	minor fixes. Differential revision: http://reviews.llvm.org/D18469 llvm-svn: 264598
2016-03-25	Allow value forwarding past release fences in GVN	Philip Reames	1	-0/+9
	A release fence acts as a publication barrier for stores within the current thread to become visible to other threads which might observe the release fence. It does not require the current thread to observe stores performed on other threads. As a result, we can allow store-load and load-load forwarding across a release fence. We choose to be much more conservative about stores. In theory, nothing prevents us from shifting a store from after a release fence to before it, and then eliminating the preceeding (previously fenced) store. Doing this without actually moving the second store is likely also legal, but we chose to be conservative at this time. The LangRef indicates only atomic loads and stores are effected by fences. This patch chooses to be far more conservative then that. This is the GVN companion to http://reviews.llvm.org/D11434 which applied the same logic in EarlyCSE and has been baking in tree for a while now. Differential Revision: http://reviews.llvm.org/D11436 llvm-svn: 264472
2016-03-25	IR: Reserve an MDKind for !llvm.loop; NFC	Duncan P. N. Exon Smith	1	-7/+4
	This reserves an MDKind for !llvm.loop, which allows callers to avoid a string-based lookup. I'm not sure why it was missing. There should be no functionality change here, just a small compile-time speedup. llvm-svn: 264371
2016-03-24	[LAA] Formatting fix in previous change	Adam Nemet	1	-2/+1
	llvm-svn: 264244
2016-03-24	[LAA] Support memchecks involving loop-invariant addresses	Adam Nemet	1	-17/+31
	We used to only allow SCEVAddRecExpr for pointer expressions in order to be able to compute the bounds. However this is also trivially possible for loop-invariant addresses (scUnknown) since then the bounds are the address itself. Interestingly, we used allow this for the special case when the loop-invariant address happens to also be an SCEVAddRecExpr (in an outer loop). There are a couple more loops that are vectorized in SPEC after this. My guess is that the main reason we don't see more because for example a loop-invariant load is vectorized into a splat vector with several vector-inserts. This is likely to make the vectorization unprofitable. I.e. we don't notice that a later LICM will move all of this out of the loop so the cost estimate should really be 0. llvm-svn: 264243
2016-03-23	Add getBlockProfileCount method to BlockFrequencyInfo	Easwaran Raman	1	-0/+14
	Differential Revision: http://reviews.llvm.org/D18233 llvm-svn: 264179
2016-03-23	[SCEV] Change the SCEV Predicates interfaces for conversion to AddRecExpr to ↵	Silviu Baranga	2	-5/+19
	return SCEVAddRecExpr* instead of SCEV* Summary: This changes the conversion functions from SCEV * to SCEVAddRecExpr from ScalarEvolution and PredicatedScalarEvolution to return a SCEVAddRecExpr* instead of a SCEV* (which removes the need of most clients to do a dyn_cast right after calling these functions). We also don't add new predicates if the transformation was not successful. This is not entirely a NFC (as it can theoretically remove some predicates from LAA when we have an unknown dependece), but I couldn't find an obvious regression test for it. Reviewers: sanjoy Subscribers: sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18368 llvm-svn: 264161
2016-03-22	Rename DenseMap::resize() into DenseMap::reserve() (NFC)	Mehdi Amini	1	-1/+1
	This is more coherent with usual containers. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 264026
2016-03-21	Implement constant folding for bitreverse	Matt Arsenault	1	-0/+3
	llvm-svn: 263945
2016-03-21	[IndVars] Fix PR26974: make sure replaceCongruentIVs doesn't break LCSSA	Silviu Baranga	1	-0/+1
	Summary: replaceCongruentIVs can break LCSSA when trying to replace IV increments since it tries to replace all uses of a phi node with another phi node while both of the phi nodes are not necessarily in the processed loop. This will cause an assert in IndVars. To fix this, we add a check to make sure that the replacement maintains LCSSA. Reviewers: sanjoy Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18266 llvm-svn: 263941
2016-03-18	[LoopDataPrefetch] Add TTI to limit the number of iterations to prefetch ahead	Adam Nemet	1	-0/+4
	Summary: It can hurt performance to prefetch ahead too much. Be conservative for now and don't prefetch ahead more than 3 iterations on Cyclone. Reviewers: hfinkel Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17949 llvm-svn: 263772
2016-03-18	[LoopDataPrefetch/Aarch64] Allow selective prefetching of large-strided accesses	Adam Nemet	1	-0/+4
	Summary: And use this TTI for Cyclone. As it was explained in the original RFC (http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758), the HW prefetcher work up to 2KB strides. I am also adding tests for this and the previous change (D17943): * Cyclone prefetching accesses with a large stride * Cyclone not prefetching accesses with a small stride * Generic Aarch64 subtarget not prefetching either Reviewers: hfinkel Subscribers: aemerson, rengolin, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17945 llvm-svn: 263771
2016-03-15	Add Rust's personality function to the list of known personality functions	Bjorn Steinbrink	1	-0/+1
	Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18192 llvm-svn: 263581
2016-03-14	Re-add ConstantFoldInstOperands form taking opcode and return type.	Manuel Jacob	1	-4/+13
	Summary: This form was replaced by a form taking an instruction instead of opcode and return type in r258391. After committing this change (and some depending, follow-up changes) it turned out in the review thread to be controversial. The discussion didn't come to a conclusion yet. I'm re-adding the old form to fix the API regression and to provide a better base for discussion, possibly on llvm-dev. A difference to the original function is that it can't be called with GEPs (similarly to how it was already the case for compares). In order to support opaque pointers in the future, folding GEPs needs to be passed the source element type, which is not possible with the current API. Reviewers: dberlin, reames Subscribers: dblaikie, eddyb Differential Revision: http://reviews.llvm.org/D17901 llvm-svn: 263501
2016-03-14	[AliasSetTracker] Do not strip pointer casts when processing MemSetInst	Michael Kuperstein	1	-2/+2
	This fixes PR26843. llvm-svn: 263462
2016-03-13	ConstantFoldInstruction: avoid wasted calls to ConstantFoldConstantExpression	Fiona Glaser	1	-5/+5
	Check to see if all operands are constant before calling simplify on them so that we don't perform wasted simplifications. llvm-svn: 263374
2016-03-11	[AA] Make BasicAA just require domtree.	Chandler Carruth	1	-4/+5
	This doesn't change how many times we construct domtrees in the normal pipeline, and it removes fragility and instability where basic-aa may not be run in time to see domtrees because they happen to be constructed afterward. This isn't quite as clean as the change to memdep because there is a mode where basic-aa specifically runs without domtrees -- in the hacking version used by function-attrs with the legacy pass manager. llvm-svn: 263234
2016-03-11	[memdep] Just require domtree for memdep.	Chandler Carruth	1	-16/+11
	This doesn't cause us to construct dominator trees any more often in the normal pipeline, and removes an entire mode of memdep that needed to be reasoned about and maintained. Perhaps more importantly, it removes the ability for the results of memdep to be different because of accidental pass scheduling goofs or the order of evaluation of 'getResult' calls. Essentially, 'getCachedResult', unless across IR-unit boundaries, is extremely dangerous. We need to work much harder to avoid it (or its analog in the old pass manager). llvm-svn: 263232
2016-03-11	[PM] Make the AnalysisManager parameter to run methods a reference.	Chandler Carruth	17	-53/+53
	This was originally a pointer to support pass managers which didn't use AnalysisManagers. However, that doesn't realistically come up much and the complexity of supporting it doesn't really make sense. In fact, many parts of the pass manager were just assuming the pointer was never null already. This at least makes it much more explicit and clear. llvm-svn: 263219