Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

[OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause #134709

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 31 commits into
base: main
Choose a base branch
Loading
from
Open
Show file tree
Hide file tree
Changes from 9 commits
Commits
Show all changes
31 commits
Select commit Hold shift + click to select a range
a05af19
Codegen for Reduction over private variables with reduction clause
Apr 7, 2025
4e6eea6
review comment changes incorporated
Apr 8, 2025
18e1708
review comment , removing redundant code
Apr 9, 2025
59ab4be
fix for user-defined reduction op
Apr 10, 2025
e45c30a
Handle user-defined reduction and updated lit test
May 1, 2025
980bc06
conditional checks
May 1, 2025
a103dfa
lit update
May 1, 2025
526314c
Support for UDR for private variables
May 5, 2025
c77fb0e
Implicit reduction identifier fix
May 5, 2025
f202eaa
updated with comments, unified logic and docs
May 7, 2025
9d2370b
Update OpenMPSupport.rst
chandraghale May 7, 2025
0ca2f86
Handle UDR init and updated lit
May 7, 2025
9335af1
multiple reduced value change
May 8, 2025
e1a1998
UDR init logic leveraged from emitInitWithReductionInitializer fn
May 8, 2025
efd69bb
runtime tests
May 9, 2025
c01671e
Update omp_for_private_reduction.cpp
chandraghale May 9, 2025
ad0d2f0
Update omp_for_private_reduction.cpp
chandraghale May 9, 2025
4df2910
update test
May 9, 2025
2468be3
test update
May 9, 2025
9576c87
Resolve mergeconflict rel notes
May 9, 2025
7e324bd
Resolve mergeconflict rel notes
May 9, 2025
262a861
Release notes update
May 9, 2025
a0d29ab
address comments,support all types
May 13, 2025
0c2978c
complex type test for priv redn
May 13, 2025
384cd4a
add addtional complex test
May 14, 2025
76db75a
Merge branch 'main' into codegen_private_variable_reducn
chandraghale May 14, 2025
0b59740
format error fix
chandraghale May 14, 2025
694e241
Merge branch 'main' into codegen_private_variable_reducn
chandraghale May 15, 2025
4c36ba7
Format fix
May 23, 2025
b146a1a
Few more fixes with ref from spec
May 30, 2025
3bb17c1
removing wrong asserts
May 30, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
234 changes: 234 additions & 0 deletions 234 clang/lib/CodeGen/CGOpenMPRuntime.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -4899,6 +4899,238 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF,
}
}

void CGOpenMPRuntime::emitPrivateReduction(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Add the comments, describing the logic of the function.
  2. How does it differ from the regular reductions codegen? Can you try to unify the logic to reduce the maintenance cost?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Updated the code with comments and clarified the logic
  • Updated OpenMPSupport.rst and the release notes.
  • Unified the logic as much as possible.
  • In regular reduction codegen, threads update a shared variable directly. With private reduction, each thread first works with its own private copy, and then all these partial results are combined into a shared variable. The final result is copied back to each thread’s original variable

CodeGenFunction &CGF, SourceLocation Loc, ArrayRef<const Expr *> Privates,
ArrayRef<const Expr *> LHSExprs, ArrayRef<const Expr *> RHSExprs,
ArrayRef<const Expr *> ReductionOps) {
if (LHSExprs.empty() || Privates.empty() || ReductionOps.empty())
return;

if (LHSExprs.size() != Privates.size() ||
LHSExprs.size() != ReductionOps.size())
return;

QualType PrivateType = Privates[0]->getType();
llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType);

llvm::Constant *InitVal = nullptr;
const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps[0]);

if (!UDR) {
InitVal = llvm::Constant::getNullValue(LLVMType);
if (const auto *DRE = dyn_cast<DeclRefExpr>(Privates[0])) {
if (const auto *VD = dyn_cast<VarDecl>(DRE->getDecl())) {
const Expr *InitExpr = VD->getInit();
if (InitExpr && !PrivateType->isAggregateType() &&
!PrivateType->isAnyComplexType()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Complex types should be supported, the compiler should not drop it silently

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done .. Added support for all types.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add at least a runtime test with complex types, if possible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated runtime test case a complex type test.

Expr::EvalResult Result;
if (InitExpr->EvaluateAsRValue(Result, CGF.getContext())) {
chandraghale marked this conversation as resolved.
Show resolved Hide resolved
APValue &InitValue = Result.Val;
if (InitValue.isInt()) {
InitVal = llvm::ConstantInt::get(LLVMType, InitValue.getInt());
}
}
}
}
}
} else {
InitVal = llvm::Constant::getNullValue(LLVMType);
chandraghale marked this conversation as resolved.
Show resolved Hide resolved
}

// Create an internal shared variable
std::string SharedName = getName({"internal_private_var"});
llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable(
CGM.getModule(), LLVMType, false, llvm::GlobalValue::CommonLinkage,
InitVal, ".omp.reduction." + SharedName, nullptr,
llvm::GlobalVariable::NotThreadLocal);

SharedVar->setAlignment(
llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8));
chandraghale marked this conversation as resolved.
Show resolved Hide resolved

Address SharedResult(SharedVar, SharedVar->getValueType(),
CGF.getContext().getTypeAlignInChars(PrivateType));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use CGF.MakeNaturalAlignRawAddrLValue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done !!


llvm::Value *ThreadId = getThreadID(CGF, Loc);
llvm::Value *BarrierLoc = emitUpdateLocation(CGF, Loc, OMP_ATOMIC_REDUCE);
llvm::Value *BarrierArgs[] = {BarrierLoc, ThreadId};

CGF.EmitRuntimeCall(OMPBuilder.getOrCreateRuntimeFunction(
CGM.getModule(), OMPRTL___kmpc_barrier),
BarrierArgs);

llvm::BasicBlock *InitBB = CGF.createBasicBlock("init");
llvm::BasicBlock *InitEndBB = CGF.createBasicBlock("init.end");

llvm::Value *IsWorker = CGF.Builder.CreateICmpEQ(
ThreadId, llvm::ConstantInt::get(ThreadId->getType(), 0));
CGF.Builder.CreateCondBr(IsWorker, InitBB, InitEndBB);

CGF.EmitBlock(InitBB);

if (const auto *DRE = dyn_cast<DeclRefExpr>(Privates[0])) {
if (const auto *VD = dyn_cast<VarDecl>(DRE->getDecl())) {
const Expr *InitExpr = VD->getInit();
if (InitExpr &&
(PrivateType->isAggregateType() || PrivateType->isAnyComplexType())) {
CGF.EmitAnyExprToMem(InitExpr, SharedResult,
PrivateType.getQualifiers(), true);
} else if (!InitVal->isNullValue()) {
CGF.EmitStoreOfScalar(InitVal,
CGF.MakeAddrLValue(SharedResult, PrivateType));
} else {
CGF.EmitNullInitialization(SharedResult, PrivateType);
}
} else {
CGF.EmitNullInitialization(SharedResult, PrivateType);
}
} else {
CGF.EmitNullInitialization(SharedResult, PrivateType);
}

CGF.Builder.CreateBr(InitEndBB);
CGF.EmitBlock(InitEndBB);
CGF.EmitRuntimeCall(OMPBuilder.getOrCreateRuntimeFunction(
CGM.getModule(), OMPRTL___kmpc_barrier),
BarrierArgs);
for (unsigned I :
llvm::seq<unsigned>(std::min(ReductionOps.size(), LHSExprs.size()))) {

const Expr *ReductionOp = ReductionOps[I];
LValue SharedLV = CGF.MakeAddrLValue(SharedResult, PrivateType);
LValue LHSLV = CGF.EmitLValue(LHSExprs[I]);
// If UDR
const OMPDeclareReductionDecl *CurrentUDR =
getReductionInit(ReductionOps[I]);
if (CurrentUDR) {
auto ReductionGen = [&](CodeGenFunction &CGF, PrePostActionTy &Action) {
Action.Enter(CGF);
std::pair<llvm::Function *, llvm::Function *> ReductionFnPair =
getUserDefinedReduction(CurrentUDR);
llvm::Function *CombinerFn = ReductionFnPair.first;
if (const auto *CE = dyn_cast<CallExpr>(ReductionOp)) {
if (CE && CombinerFn) {
const auto *CE = cast<CallExpr>(ReductionOps[I]);
const auto *OutDRE = cast<DeclRefExpr>(
cast<UnaryOperator>(CE->getArg(0)->IgnoreParenImpCasts())
->getSubExpr());
const auto *InDRE = cast<DeclRefExpr>(
cast<UnaryOperator>(CE->getArg(1)->IgnoreParenImpCasts())
->getSubExpr());
CodeGenFunction::OMPPrivateScope LocalScope(CGF);
LocalScope.addPrivate(cast<VarDecl>(OutDRE->getDecl()),
SharedLV.getAddress());
LocalScope.addPrivate(cast<VarDecl>(InDRE->getDecl()),
LHSLV.getAddress());
(void)LocalScope.Privatize();
emitReductionCombiner(CGF, ReductionOp);
}
}
};
std::string CriticalName = getName({"reduction_critical"});
emitCriticalRegion(CGF, CriticalName, ReductionGen, Loc);
} else {
// Built-in Operator Combination
const Expr *ReductionClauseExpr = ReductionOp->IgnoreParenCasts();
if (const auto *Cleanup = dyn_cast<ExprWithCleanups>(ReductionClauseExpr))
ReductionClauseExpr = Cleanup->getSubExpr()->IgnoreParenCasts();
const Expr *AssignRHS = nullptr;
if (const auto *BinOp = dyn_cast<BinaryOperator>(ReductionClauseExpr)) {
if (BinOp->getOpcode() == BO_Assign)
AssignRHS = BinOp->getRHS();
} else if (const auto *OpCall =
dyn_cast<CXXOperatorCallExpr>(ReductionClauseExpr)) {
if (OpCall->getOperator() == OO_Equal)
AssignRHS = OpCall->getArg(1);
}
if (!AssignRHS)
continue;
const Expr *ReductionCombinerExpr = AssignRHS->IgnoreParenImpCasts();
if (const auto *MTE =
dyn_cast<MaterializeTemporaryExpr>(ReductionCombinerExpr))
ReductionCombinerExpr = MTE->getSubExpr()->IgnoreParenImpCasts();

BinaryOperatorKind BO = BO_Assign;
RValue PrivateRV = CGF.EmitLoadOfLValue(LHSLV, Loc);
if (const auto *BinOp = dyn_cast<BinaryOperator>(ReductionCombinerExpr)) {
BO = BinOp->getOpcode();
auto UpdateOp = [&](RValue OldVal) {
if (BO == BO_Mul) {
llvm::Value *OldScalar = OldVal.getScalarVal();
llvm::Value *PrivateScalar = PrivateRV.getScalarVal();
llvm::Value *Result =
CGF.Builder.CreateMul(OldScalar, PrivateScalar);
return RValue::get(Result);
} else {
OpaqueValueExpr OVE(BinOp->getLHS()->getExprLoc(),
BinOp->getLHS()->getType(),
ExprValueKind::VK_PRValue);
CodeGenFunction::OpaqueValueMapping OldValMapping(CGF, &OVE,
OldVal);
return CGF.EmitAnyExpr(BinOp->getRHS());
}
};

(void)CGF.EmitOMPAtomicSimpleUpdateExpr(
SharedLV, PrivateRV, BO, true,
llvm::AtomicOrdering::SequentiallyConsistent, Loc, UpdateOp);
} else if (dyn_cast<CXXOperatorCallExpr>(ReductionClauseExpr)) {
// Implicit Reduction Identifiers ( openmp 6.0 sec 7.6.5 )
auto ReductionGen = [&](CodeGenFunction &CGF, PrePostActionTy &Action) {
Action.Enter(CGF);
const auto *OmpOutDRE =
dyn_cast<DeclRefExpr>(LHSExprs[I]->IgnoreParenImpCasts());
const auto *OmpInDRE =
dyn_cast<DeclRefExpr>(RHSExprs[I]->IgnoreParenImpCasts());

if (!OmpOutDRE || !OmpInDRE) {
return;
}
const VarDecl *OmpOutVD = cast<VarDecl>(OmpOutDRE->getDecl());
const VarDecl *OmpInVD = cast<VarDecl>(OmpInDRE->getDecl());
CodeGenFunction::OMPPrivateScope LocalScope(CGF);
LocalScope.addPrivate(OmpOutVD, SharedLV.getAddress());
LocalScope.addPrivate(OmpInVD, LHSLV.getAddress());
(void)LocalScope.Privatize();
CGF.EmitIgnoredExpr(ReductionOp);
};
std::string CriticalName = getName({"reduction_critical"});
emitCriticalRegion(CGF, CriticalName, ReductionGen, Loc);
}
}
}
// Final barrier
CGF.EmitRuntimeCall(OMPBuilder.getOrCreateRuntimeFunction(
CGM.getModule(), OMPRTL___kmpc_barrier),
BarrierArgs);

// Broadcast final result
bool IsAggregate = PrivateType->isAggregateType();
llvm::Value *FinalResultVal = nullptr;
LValue SharedLV = CGF.MakeAddrLValue(SharedResult, PrivateType);
Address FinalResultAddr = Address::invalid();
if (IsAggregate) {
FinalResultAddr = SharedResult;
} else {
FinalResultVal = CGF.EmitLoadOfScalar(SharedLV, Loc);
}

for (unsigned I : llvm::seq<unsigned>(Privates.size())) {
LValue TargetLHSLV = CGF.EmitLValue(LHSExprs[I]);
if (IsAggregate) {
CGF.EmitAggregateCopy(TargetLHSLV,
CGF.MakeAddrLValue(FinalResultAddr, PrivateType),
PrivateType, AggValueSlot::DoesNotOverlap, false);
} else {
CGF.EmitStoreOfScalar(FinalResultVal, TargetLHSLV);
}
}

// Final synchronization
CGF.EmitRuntimeCall(OMPBuilder.getOrCreateRuntimeFunction(
CGM.getModule(), OMPRTL___kmpc_barrier),
BarrierArgs);
}

void CGOpenMPRuntime::emitReduction(CodeGenFunction &CGF, SourceLocation Loc,
ArrayRef<const Expr *> Privates,
ArrayRef<const Expr *> LHSExprs,
Expand Down Expand Up @@ -5201,6 +5433,8 @@ void CGOpenMPRuntime::emitReduction(CodeGenFunction &CGF, SourceLocation Loc,

CGF.EmitBranch(DefaultBB);
CGF.EmitBlock(DefaultBB, /*IsFinished=*/true);
if (Options.IsPrivateVarReduction)
emitPrivateReduction(CGF, Loc, Privates, LHSExprs, RHSExprs, ReductionOps);
}

/// Generates unique name for artificial threadprivate variables.
Expand Down
14 changes: 14 additions & 0 deletions 14 clang/lib/CodeGen/CGOpenMPRuntime.h
Original file line number Diff line number Diff line change
Expand Up @@ -1201,8 +1201,22 @@ class CGOpenMPRuntime {
struct ReductionOptionsTy {
bool WithNowait;
bool SimpleReduction;
bool IsPrivateVarReduction;
OpenMPDirectiveKind ReductionKind;
};

/// Emits code for private variable reduction
/// \param Privates List of private copies for original reduction arguments.
/// \param LHSExprs List of LHS in \a ReductionOps reduction operations.
/// \param RHSExprs List of RHS in \a ReductionOps reduction operations.
/// \param ReductionOps List of reduction operations in form 'LHS binop RHS'
/// or 'operator binop(LHS, RHS)'.
void emitPrivateReduction(CodeGenFunction &CGF, SourceLocation Loc,
ArrayRef<const Expr *> Privates,
ArrayRef<const Expr *> LHSExprs,
ArrayRef<const Expr *> RHSExprs,
ArrayRef<const Expr *> ReductionOps);

/// Emit a code for reduction clause. Next code should be emitted for
/// reduction:
/// \code
Expand Down
12 changes: 9 additions & 3 deletions 12 clang/lib/CodeGen/CGStmtOpenMP.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1470,6 +1470,7 @@ void CodeGenFunction::EmitOMPReductionClauseFinal(
llvm::SmallVector<const Expr *, 8> LHSExprs;
llvm::SmallVector<const Expr *, 8> RHSExprs;
llvm::SmallVector<const Expr *, 8> ReductionOps;
llvm::SmallVector<bool, 8> IsPrivate;
bool HasAtLeastOneReduction = false;
bool IsReductionWithTaskMod = false;
for (const auto *C : D.getClausesOfKind<OMPReductionClause>()) {
Expand All @@ -1480,6 +1481,8 @@ void CodeGenFunction::EmitOMPReductionClauseFinal(
Privates.append(C->privates().begin(), C->privates().end());
LHSExprs.append(C->lhs_exprs().begin(), C->lhs_exprs().end());
RHSExprs.append(C->rhs_exprs().begin(), C->rhs_exprs().end());
IsPrivate.append(C->private_var_reduction_flags().begin(),
C->private_var_reduction_flags().end());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if there is a mix of private and non-private reductions in a single construct? IsPrivateVarReduction flag is set for all reduced value, not matter if they are private or not

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing, I have fixed it now. Mix of private of non-private reduction is now correctly populating.

ReductionOps.append(C->reduction_ops().begin(), C->reduction_ops().end());
IsReductionWithTaskMod =
IsReductionWithTaskMod || C->getModifier() == OMPC_REDUCTION_task;
Expand All @@ -1499,9 +1502,11 @@ void CodeGenFunction::EmitOMPReductionClauseFinal(
bool SimpleReduction = ReductionKind == OMPD_simd;
// Emit nowait reduction if nowait clause is present or directive is a
// parallel directive (it always has implicit barrier).
bool IsPrivateVarReduction =
llvm::any_of(IsPrivate, [](bool IsPriv) { return IsPriv; });
CGM.getOpenMPRuntime().emitReduction(
*this, D.getEndLoc(), Privates, LHSExprs, RHSExprs, ReductionOps,
{WithNowait, SimpleReduction, ReductionKind});
{WithNowait, SimpleReduction, IsPrivateVarReduction, ReductionKind});
}
}

Expand Down Expand Up @@ -3943,7 +3948,8 @@ static void emitScanBasedDirective(
PrivScope.Privatize();
CGF.CGM.getOpenMPRuntime().emitReduction(
CGF, S.getEndLoc(), Privates, LHSs, RHSs, ReductionOps,
{/*WithNowait=*/true, /*SimpleReduction=*/true, OMPD_unknown});
{/*WithNowait=*/true, /*SimpleReduction=*/true,
/*IsPrivateVarReduction */ false, OMPD_unknown});
chandraghale marked this conversation as resolved.
Show resolved Hide resolved
}
llvm::Value *NextIVal =
CGF.Builder.CreateNUWSub(IVal, llvm::ConstantInt::get(CGF.SizeTy, 1));
Expand Down Expand Up @@ -5747,7 +5753,7 @@ void CodeGenFunction::EmitOMPScanDirective(const OMPScanDirective &S) {
}
CGM.getOpenMPRuntime().emitReduction(
*this, ParentDir.getEndLoc(), Privates, LHSs, RHSs, ReductionOps,
{/*WithNowait=*/true, /*SimpleReduction=*/true, OMPD_simd});
{/*WithNowait=*/true, /*SimpleReduction=*/true, false, OMPD_simd});
chandraghale marked this conversation as resolved.
Show resolved Hide resolved
for (unsigned I = 0, E = CopyArrayElems.size(); I < E; ++I) {
const Expr *PrivateExpr = Privates[I];
LValue DestLVal;
Expand Down
Loading
Morty Proxy This is a proxified and sanitized view of the page, visit original site.