[OpenMP 6.0] Codegen for Reduction over private variables with reduction clause #134709

Merged 31 commits on Jun 11, 2025
Changes from 18 commits
Commits
a05af19
Codegen for Reduction over private variables with reduction clause
Apr 7, 2025
4e6eea6
review comment changes incorporated
Apr 8, 2025
18e1708
review comment , removing redundant code
Apr 9, 2025
59ab4be
fix for user-defined reduction op
Apr 10, 2025
e45c30a
Handle user-defined reduction and updated lit test
May 1, 2025
980bc06
conditional checks
May 1, 2025
a103dfa
lit update
May 1, 2025
526314c
Support for UDR for private variables
May 5, 2025
c77fb0e
Implicit reduction identifier fix
May 5, 2025
f202eaa
updated with comments, unified logic and docs
May 7, 2025
9d2370b
Update OpenMPSupport.rst
chandraghale May 7, 2025
0ca2f86
Handle UDR init and updated lit
May 7, 2025
9335af1
multiple reduced value change
May 8, 2025
e1a1998
UDR init logic leveraged from emitInitWithReductionInitializer fn
May 8, 2025
efd69bb
runtime tests
May 9, 2025
c01671e
Update omp_for_private_reduction.cpp
chandraghale May 9, 2025
ad0d2f0
Update omp_for_private_reduction.cpp
chandraghale May 9, 2025
4df2910
update test
May 9, 2025
2468be3
test update
May 9, 2025
9576c87
Resolve mergeconflict rel notes
May 9, 2025
7e324bd
Resolve mergeconflict rel notes
May 9, 2025
262a861
Release notes update
May 9, 2025
a0d29ab
address comments,support all types
May 13, 2025
0c2978c
complex type test for priv redn
May 13, 2025
384cd4a
add addtional complex test
May 14, 2025
76db75a
Merge branch 'main' into codegen_private_variable_reducn
chandraghale May 14, 2025
0b59740
format error fix
chandraghale May 14, 2025
694e241
Merge branch 'main' into codegen_private_variable_reducn
chandraghale May 15, 2025
4c36ba7
Format fix
May 23, 2025
b146a1a
Few more fixes with ref from spec
May 30, 2025
e1ca648
removing wrong asserts
May 30, 2025
clang/docs/OpenMPSupport.rst: 3 changes (2 additions, 1 deletion)
@@ -406,7 +406,8 @@ implementation.
+-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+
| Extensions to atomic construct | :none:`unclaimed` | :none:`unclaimed` | |
+-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+
-| Private reductions | :part:`partial` | :none:`unclaimed` | Parse/Sema:https://github.com/llvm/llvm-project/pull/129938 |
+| Private reductions | :good:`mostly` | :none:`unclaimed` | Parse/Sema:https://github.com/llvm/llvm-project/pull/129938 |
+| | | | Codegen: https://github.com/llvm/llvm-project/pull/134709 |
+-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+
| Self maps | :part:`partial` | :none:`unclaimed` | parsing/sema done: https://github.com/llvm/llvm-project/pull/129888 |
+-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+
clang/docs/ReleaseNotes.rst: 1 change (1 addition, 0 deletions)
@@ -530,6 +530,7 @@ OpenMP Support
- Added support 'no_openmp_constructs' assumption clause.
- Added support for 'self_maps' in map and requirement clause.
- Added support for 'omp stripe' directive.
- Added support for reduction over private variable with 'reduction' clause.

Improvements
^^^^^^^^^^^^
clang/lib/CodeGen/CGOpenMPRuntime.cpp: 272 changes (272 additions, 0 deletions)
@@ -4899,6 +4899,266 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF,
}
}

void CGOpenMPRuntime::emitPrivateReduction(
CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates,
const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) {

// Create a shared global variable (__shared_reduction_var) to accumulate the
// final result.
//
// Call __kmpc_barrier to synchronize threads before initialization.
//
// The master thread (thread_id == 0) initializes __shared_reduction_var
// with the identity value or initializer.
//
// Call __kmpc_barrier to synchronize before combining.
// For each i:
// - Thread enters critical section.
// - Reads its private value from LHSExprs[i].
// - Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i],
// LHSExprs[i]).
// - Exits critical section.
//
// Call __kmpc_barrier after combining.
//
// Each thread copies __shared_reduction_var[i] back to LHSExprs[i].
//
// Final __kmpc_barrier to synchronize after broadcasting
QualType PrivateType = Privates->getType();
llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType);

llvm::Constant *InitVal = nullptr;
const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps);
// Determine the initial value for the shared reduction variable
if (!UDR) {
InitVal = llvm::Constant::getNullValue(LLVMType);
if (const auto *DRE = dyn_cast<DeclRefExpr>(Privates)) {
if (const auto *VD = dyn_cast<VarDecl>(DRE->getDecl())) {
const Expr *InitExpr = VD->getInit();
if (InitExpr && !PrivateType->isAggregateType() &&
!PrivateType->isAnyComplexType()) {
Review thread:
Member: Complex types should be supported; the compiler should not drop them silently.
Contributor Author: Done. Added support for all types.
Member: Add at least a runtime test with complex types, if possible.
Contributor Author: Updated the runtime test case with a complex type test.

Expr::EvalResult Result;
if (InitExpr->EvaluateAsRValue(Result, CGF.getContext())) {
APValue &InitValue = Result.Val;
if (InitValue.isInt())
InitVal = llvm::ConstantInt::get(LLVMType, InitValue.getInt());
}
}
}
}
} else {
InitVal = llvm::Constant::getNullValue(LLVMType);
}
std::string ReductionVarNameStr;
if (const auto *DRE = dyn_cast<DeclRefExpr>(Privates->IgnoreParenCasts())) {
ReductionVarNameStr = DRE->getDecl()->getNameAsString();
} else {
ReductionVarNameStr = "unnamed_priv_var";
}

// Create an internal shared variable
std::string SharedName =
CGM.getOpenMPRuntime().getName({"internal_pivate_", ReductionVarNameStr});
llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable(
CGM.getModule(), LLVMType, false, llvm::GlobalValue::InternalLinkage,
InitVal, ".omp.reduction." + SharedName, nullptr,
llvm::GlobalVariable::NotThreadLocal);

SharedVar->setAlignment(
llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8));

Address SharedResult(SharedVar, SharedVar->getValueType(),
CGF.getContext().getTypeAlignInChars(PrivateType));

llvm::Value *ThreadId = getThreadID(CGF, Loc);
llvm::Value *BarrierLoc = emitUpdateLocation(CGF, Loc, OMP_ATOMIC_REDUCE);
llvm::Value *BarrierArgs[] = {BarrierLoc, ThreadId};

llvm::BasicBlock *InitBB = CGF.createBasicBlock("init");
llvm::BasicBlock *InitEndBB = CGF.createBasicBlock("init.end");

llvm::Value *IsWorker = CGF.Builder.CreateICmpEQ(
ThreadId, llvm::ConstantInt::get(ThreadId->getType(), 0));
CGF.Builder.CreateCondBr(IsWorker, InitBB, InitEndBB);

CGF.EmitBlock(InitBB);

auto EmitSharedInit = [&]() {
if (UDR) { // Check if it's a User-Defined Reduction
if (const Expr *UDRInitExpr = UDR->getInitializer()) {
std::pair<llvm::Function *, llvm::Function *> FnPair =
getUserDefinedReduction(UDR);
llvm::Function *InitializerFn = FnPair.second;
if (InitializerFn) {
if (const auto *CE =
dyn_cast<CallExpr>(UDRInitExpr->IgnoreParenImpCasts())) {
const auto *OutDRE = cast<DeclRefExpr>(
cast<UnaryOperator>(CE->getArg(0)->IgnoreParenImpCasts())
->getSubExpr());
const VarDecl *OutVD = cast<VarDecl>(OutDRE->getDecl());

CodeGenFunction::OMPPrivateScope LocalScope(CGF);
LocalScope.addPrivate(OutVD, SharedResult);

(void)LocalScope.Privatize();
if (const auto *OVE = dyn_cast<OpaqueValueExpr>(
CE->getCallee()->IgnoreParenImpCasts())) {
CodeGenFunction::OpaqueValueMapping OpaqueMap(
CGF, OVE, RValue::get(InitializerFn));
CGF.EmitIgnoredExpr(CE);
} else {
CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult,
PrivateType.getQualifiers(), true);
}
} else {
CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult,
PrivateType.getQualifiers(), true);
}
} else {
CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult,
PrivateType.getQualifiers(), true);
}
} else {
// EmitNullInitialization handles default construction for C++ classes
// and zeroing for scalars, which is a reasonable default.
CGF.EmitNullInitialization(SharedResult, PrivateType);
}
return; // UDR initialization handled
}
if (const auto *DRE = dyn_cast<DeclRefExpr>(Privates)) {
if (const auto *VD = dyn_cast<VarDecl>(DRE->getDecl())) {
const Expr *InitExpr = VD->getInit();
if (InitExpr && (PrivateType->isAggregateType() ||
PrivateType->isAnyComplexType())) {
CGF.EmitAnyExprToMem(InitExpr, SharedResult,
PrivateType.getQualifiers(), true);
return;
}
if (!InitVal->isNullValue()) {
CGF.EmitStoreOfScalar(InitVal,
CGF.MakeAddrLValue(SharedResult, PrivateType));
return;
}
}
}
CGF.EmitNullInitialization(SharedResult, PrivateType);
};
EmitSharedInit();
CGF.Builder.CreateBr(InitEndBB);
CGF.EmitBlock(InitEndBB);

CGF.EmitRuntimeCall(OMPBuilder.getOrCreateRuntimeFunction(
CGM.getModule(), OMPRTL___kmpc_barrier),
BarrierArgs);

const Expr *ReductionOp = ReductionOps;
const OMPDeclareReductionDecl *CurrentUDR = getReductionInit(ReductionOp);
LValue SharedLV = CGF.MakeAddrLValue(SharedResult, PrivateType);
LValue LHSLV = CGF.EmitLValue(LHSExprs);

auto EmitCriticalReduction = [&](auto ReductionGen) {
std::string CriticalName = getName({"reduction_critical"});
emitCriticalRegion(CGF, CriticalName, ReductionGen, Loc);
};

if (CurrentUDR) {
// Handle user-defined reduction.
auto ReductionGen = [&](CodeGenFunction &CGF, PrePostActionTy &Action) {
Action.Enter(CGF);
std::pair<llvm::Function *, llvm::Function *> FnPair =
getUserDefinedReduction(CurrentUDR);
if (FnPair.first) {
if (const auto *CE = dyn_cast<CallExpr>(ReductionOp)) {
const auto *OutDRE = cast<DeclRefExpr>(
cast<UnaryOperator>(CE->getArg(0)->IgnoreParenImpCasts())
->getSubExpr());
const auto *InDRE = cast<DeclRefExpr>(
cast<UnaryOperator>(CE->getArg(1)->IgnoreParenImpCasts())
->getSubExpr());
CodeGenFunction::OMPPrivateScope LocalScope(CGF);
LocalScope.addPrivate(cast<VarDecl>(OutDRE->getDecl()),
SharedLV.getAddress());
LocalScope.addPrivate(cast<VarDecl>(InDRE->getDecl()),
LHSLV.getAddress());
(void)LocalScope.Privatize();
emitReductionCombiner(CGF, ReductionOp);
}
}
};
EmitCriticalReduction(ReductionGen);
}
// Handle built-in reduction operations.
else {
const Expr *ReductionClauseExpr = ReductionOp->IgnoreParenCasts();
if (const auto *Cleanup = dyn_cast<ExprWithCleanups>(ReductionClauseExpr))
ReductionClauseExpr = Cleanup->getSubExpr()->IgnoreParenCasts();

const Expr *AssignRHS = nullptr;
if (const auto *BinOp = dyn_cast<BinaryOperator>(ReductionClauseExpr)) {
if (BinOp->getOpcode() == BO_Assign)
AssignRHS = BinOp->getRHS();
} else if (const auto *OpCall =
dyn_cast<CXXOperatorCallExpr>(ReductionClauseExpr)) {
if (OpCall->getOperator() == OO_Equal)
AssignRHS = OpCall->getArg(1);
}

if (!AssignRHS)
return;

const Expr *CombinerExpr = AssignRHS->IgnoreParenImpCasts();
if (const auto *MTE = dyn_cast<MaterializeTemporaryExpr>(CombinerExpr))
CombinerExpr = MTE->getSubExpr()->IgnoreParenImpCasts();
Review thread:
Member: What does it do? CombinerExpr is not used anywhere.
Contributor Author (chandraghale, May 13, 2025): Not used. It was redundant code left over from a previous rework. Removed.

auto ReductionGen = [&](CodeGenFunction &CGF, PrePostActionTy &Action) {
Action.Enter(CGF);
const auto *OmpOutDRE =
dyn_cast<DeclRefExpr>(LHSExprs->IgnoreParenImpCasts());
const auto *OmpInDRE =
dyn_cast<DeclRefExpr>(RHSExprs->IgnoreParenImpCasts());
if (!OmpOutDRE || !OmpInDRE)
return;
const VarDecl *OmpOutVD = cast<VarDecl>(OmpOutDRE->getDecl());
const VarDecl *OmpInVD = cast<VarDecl>(OmpInDRE->getDecl());
CodeGenFunction::OMPPrivateScope LocalScope(CGF);
LocalScope.addPrivate(OmpOutVD, SharedLV.getAddress());
LocalScope.addPrivate(OmpInVD, LHSLV.getAddress());
(void)LocalScope.Privatize();
// Emit the actual reduction operation
CGF.EmitIgnoredExpr(ReductionOp);
};
EmitCriticalReduction(ReductionGen);
}

CGF.EmitRuntimeCall(OMPBuilder.getOrCreateRuntimeFunction(
CGM.getModule(), OMPRTL___kmpc_barrier),
BarrierArgs);

// Broadcast final result
bool IsAggregate = PrivateType->isAggregateType();
LValue SharedLV1 = CGF.MakeAddrLValue(SharedResult, PrivateType);
llvm::Value *FinalResultVal = nullptr;
Address FinalResultAddr = Address::invalid();

if (IsAggregate)
FinalResultAddr = SharedResult;
else
FinalResultVal = CGF.EmitLoadOfScalar(SharedLV1, Loc);

LValue TargetLHSLV = CGF.EmitLValue(LHSExprs);
if (IsAggregate) {
CGF.EmitAggregateCopy(TargetLHSLV,
CGF.MakeAddrLValue(FinalResultAddr, PrivateType),
PrivateType, AggValueSlot::DoesNotOverlap, false);
} else {
CGF.EmitStoreOfScalar(FinalResultVal, TargetLHSLV);
}
// Final synchronization barrier
CGF.EmitRuntimeCall(OMPBuilder.getOrCreateRuntimeFunction(
CGM.getModule(), OMPRTL___kmpc_barrier),
BarrierArgs);
}
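
The sequence emitted by emitPrivateReduction above (thread 0 initializes the shared variable, a barrier, a combine step inside a critical region, a barrier, a copy back into each thread's private variable, and a final barrier) can be mimicked outside the compiler. The following is a minimal sketch using standard C++ threads, not the generated code: `SimpleBarrier`, `privateSumReduction`, and the choice of `+` over `int` are hypothetical stand-ins, with a mutex playing the role of the critical region and `SimpleBarrier::wait` playing the role of `__kmpc_barrier`.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Minimal reusable barrier (C++11). Hypothetical stand-in for __kmpc_barrier.
class SimpleBarrier {
public:
  explicit SimpleBarrier(std::size_t Count)
      : Threshold(Count), Remaining(Count) {}

  void wait() {
    std::unique_lock<std::mutex> Lock(M);
    std::size_t Gen = Generation;
    if (--Remaining == 0) {
      // Last arrival: open the barrier and reset it for the next phase.
      ++Generation;
      Remaining = Threshold;
      CV.notify_all();
    } else {
      CV.wait(Lock, [&] { return Gen != Generation; });
    }
  }

private:
  std::mutex M;
  std::condition_variable CV;
  std::size_t Threshold;
  std::size_t Remaining;
  std::size_t Generation = 0;
};

// Hypothetical helper: thread Tid contributes Tid + 1, and every thread ends
// up with the combined sum in its own private copy.
std::vector<int> privateSumReduction(unsigned NumThreads) {
  int Shared = 0;      // plays the role of the internal shared global
  std::mutex Critical; // plays the role of the reduction critical region
  SimpleBarrier Barrier(NumThreads);
  std::vector<int> Results(NumThreads, 0);
  std::vector<std::thread> Threads;

  for (unsigned Tid = 0; Tid < NumThreads; ++Tid) {
    Threads.emplace_back([&, Tid] {
      int Private = static_cast<int>(Tid) + 1; // this thread's partial value

      if (Tid == 0)
        Shared = 0; // thread 0 stores the identity value for '+'
      Barrier.wait();

      {
        std::lock_guard<std::mutex> Lock(Critical);
        Shared = Shared + Private; // combine under mutual exclusion
      }
      Barrier.wait();

      Private = Shared; // broadcast: copy the result into the private copy
      Barrier.wait();

      Results[Tid] = Private;
    });
  }
  for (std::thread &T : Threads)
    T.join();
  return Results;
}
```

With four threads, each private copy should end up holding 1 + 2 + 3 + 4 = 10. The barrier after the combine step is what makes the unsynchronized read of `Shared` in the broadcast step safe in this sketch.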

void CGOpenMPRuntime::emitReduction(CodeGenFunction &CGF, SourceLocation Loc,
ArrayRef<const Expr *> Privates,
ArrayRef<const Expr *> LHSExprs,
@@ -5201,6 +5461,18 @@ void CGOpenMPRuntime::emitReduction(CodeGenFunction &CGF, SourceLocation Loc,

CGF.EmitBranch(DefaultBB);
CGF.EmitBlock(DefaultBB, /*IsFinished=*/true);
if (Options.IsPrivateVarReduction) {
if (LHSExprs.empty() || Privates.empty() || ReductionOps.empty())
return;
if (LHSExprs.size() != Privates.size() ||
LHSExprs.size() != ReductionOps.size())
return;
for (unsigned I :
llvm::seq<unsigned>(std::min(ReductionOps.size(), LHSExprs.size()))) {
emitPrivateReduction(CGF, Loc, Privates[I], LHSExprs[I], RHSExprs[I],
ReductionOps[I]);
}
}
}

/// Generates unique name for artificial threadprivate variables.
clang/lib/CodeGen/CGOpenMPRuntime.h: 12 changes (12 additions, 0 deletions)
@@ -1201,8 +1201,20 @@ class CGOpenMPRuntime {
struct ReductionOptionsTy {
bool WithNowait;
bool SimpleReduction;
bool IsPrivateVarReduction;
OpenMPDirectiveKind ReductionKind;
};

/// Emits code for private variable reduction
/// \param Privates List of private copies for original reduction arguments.
/// \param LHSExprs List of LHS in \a ReductionOps reduction operations.
/// \param RHSExprs List of RHS in \a ReductionOps reduction operations.
/// \param ReductionOps List of reduction operations in form 'LHS binop RHS'
/// or 'operator binop(LHS, RHS)'.
void emitPrivateReduction(CodeGenFunction &CGF, SourceLocation Loc,
const Expr *Privates, const Expr *LHSExprs,
const Expr *RHSExprs, const Expr *ReductionOps);

/// Emit a code for reduction clause. Next code should be emitted for
/// reduction:
/// \code
clang/lib/CodeGen/CGStmtOpenMP.cpp: 12 changes (9 additions, 3 deletions)
@@ -1470,6 +1470,7 @@ void CodeGenFunction::EmitOMPReductionClauseFinal(
llvm::SmallVector<const Expr *, 8> LHSExprs;
llvm::SmallVector<const Expr *, 8> RHSExprs;
llvm::SmallVector<const Expr *, 8> ReductionOps;
llvm::SmallVector<bool, 8> IsPrivate;
bool HasAtLeastOneReduction = false;
bool IsReductionWithTaskMod = false;
for (const auto *C : D.getClausesOfKind<OMPReductionClause>()) {
@@ -1480,6 +1481,8 @@ void CodeGenFunction::EmitOMPReductionClauseFinal(
Privates.append(C->privates().begin(), C->privates().end());
LHSExprs.append(C->lhs_exprs().begin(), C->lhs_exprs().end());
RHSExprs.append(C->rhs_exprs().begin(), C->rhs_exprs().end());
IsPrivate.append(C->private_var_reduction_flags().begin(),
C->private_var_reduction_flags().end());
ReductionOps.append(C->reduction_ops().begin(), C->reduction_ops().end());
IsReductionWithTaskMod =
IsReductionWithTaskMod || C->getModifier() == OMPC_REDUCTION_task;
@@ -1499,9 +1502,11 @@ void CodeGenFunction::EmitOMPReductionClauseFinal(
bool SimpleReduction = ReductionKind == OMPD_simd;
// Emit nowait reduction if nowait clause is present or directive is a
// parallel directive (it always has implicit barrier).
bool IsPrivateVarReduction =
llvm::any_of(IsPrivate, [](bool IsPriv) { return IsPriv; });
CGM.getOpenMPRuntime().emitReduction(
*this, D.getEndLoc(), Privates, LHSExprs, RHSExprs, ReductionOps,
-{WithNowait, SimpleReduction, ReductionKind});
+{WithNowait, SimpleReduction, IsPrivateVarReduction, ReductionKind});
}
}
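
Because `ReductionOptionsTy` is filled with positional brace initialization, inserting `IsPrivateVarReduction` before `ReductionKind` means every call site has to pass the new flag in the third position, which is why the scan-based call sites in this file change as well. A small sketch with stand-in types illustrates the pattern; `DirectiveKind`, `ReductionOptions`, and `makeOptions` are hypothetical names used only for illustration, not Clang declarations.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for OpenMPDirectiveKind.
enum class DirectiveKind { Unknown, Simd, For };

// Mirrors the field order of ReductionOptionsTy as shown in the diff.
struct ReductionOptions {
  bool WithNowait;
  bool SimpleReduction;
  bool IsPrivateVarReduction;
  DirectiveKind ReductionKind;
};

// Collapses the per-item flags into one option, as the any_of call does in
// EmitOMPReductionClauseFinal, then fills the struct positionally.
ReductionOptions makeOptions(const std::vector<bool> &IsPrivate,
                             bool WithNowait, DirectiveKind Kind) {
  bool AnyPrivate = std::any_of(IsPrivate.begin(), IsPrivate.end(),
                                [](bool IsPriv) { return IsPriv; });
  bool Simple = Kind == DirectiveKind::Simd;
  // The order of the braces must match the field order exactly.
  return {WithNowait, Simple, AnyPrivate, Kind};
}
```

One consequence visible in this sketch is that a single private item among several reduction items sets the flag for the whole clause set; the per-item distinction is not carried in the options struct.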

@@ -3943,7 +3948,8 @@ static void emitScanBasedDirective(
PrivScope.Privatize();
CGF.CGM.getOpenMPRuntime().emitReduction(
CGF, S.getEndLoc(), Privates, LHSs, RHSs, ReductionOps,
-{/*WithNowait=*/true, /*SimpleReduction=*/true, OMPD_unknown});
+{/*WithNowait=*/true, /*SimpleReduction=*/true,
+/*IsPrivateVarReduction */ false, OMPD_unknown});
}
llvm::Value *NextIVal =
CGF.Builder.CreateNUWSub(IVal, llvm::ConstantInt::get(CGF.SizeTy, 1));
@@ -5747,7 +5753,7 @@ void CodeGenFunction::EmitOMPScanDirective(const OMPScanDirective &S) {
}
CGM.getOpenMPRuntime().emitReduction(
*this, ParentDir.getEndLoc(), Privates, LHSs, RHSs, ReductionOps,
-{/*WithNowait=*/true, /*SimpleReduction=*/true, OMPD_simd});
+{/*WithNowait=*/true, /*SimpleReduction=*/true, false, OMPD_simd});
for (unsigned I = 0, E = CopyArrayElems.size(); I < E; ++I) {
const Expr *PrivateExpr = Privates[I];
LValue DestLVal;